We welcome contributions to Apache Iceberg! To learn more about contributing to Apache Iceberg, please refer to the official Iceberg contribution guidelines. These guidelines are intended as helpful suggestions to make the contribution process as seamless as possible, and are not strict rules.
If you would like to discuss your proposed change before contributing, we encourage you to visit our Community page. There, you will find various ways to connect with the community, including Slack and our mailing lists. Alternatively, you can open a new issue directly in the GitHub repository.
For first-time contributors, feel free to check out our good first issues for an easy way to get started.
The Iceberg C++ Project is hosted on GitHub at https://github.com/apache/iceberg-cpp.
- CMake 3.25 or higher
- C++23 compliant compiler (GCC 14+, Clang 17+, MSVC 2022+)
- Git
Clone the repository for local development:
git clone https://github.com/apache/iceberg-cpp.git
cd iceberg-cpp
Build the core libraries:
cmake -S . -B build -G Ninja -DCMAKE_INSTALL_PREFIX=/path/to/install -DICEBERG_BUILD_STATIC=ON -DICEBERG_BUILD_SHARED=ON
cmake --build build
ctest --test-dir build --output-on-failure
cmake --install build
Build with bundled dependencies:
cmake -S . -B build -G Ninja -DCMAKE_INSTALL_PREFIX=/path/to/install -DICEBERG_BUILD_BUNDLE=ON
cmake --build build
ctest --test-dir build --output-on-failure
cmake --install build
We follow modern C++ best practices:
- C++23 Standard: Use C++23 features where appropriate
- Naming Conventions:
  - Classes: PascalCase (e.g., TableScanBuilder)
  - Functions/Methods: PascalCase (e.g., CreateNamespace, ExtractYear)
  - Trivial getters: snake_case (e.g., name(), type_id(), is_primitive())
  - Variables: snake_case (e.g., file_io)
  - Constants: k prefix with PascalCase (e.g., kHeaderContentType, kMaxPrecision)
- Memory Management: Prefer smart pointers (std::unique_ptr, std::shared_ptr)
- Error Handling: Use Result<T> types for error propagation
- Documentation: Use Doxygen-style comments for public APIs
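The conventions above can be sketched in one small, self-contained example. Everything named here is hypothetical and for illustration only: NamespaceEntry, CreateNamespaceEntry, kMaxNameLength, and the Error struct are not part of the Iceberg C++ API, and the Result<T> template below is a simplified stand-in built on std::variant, not the project's actual Result<T> type. The trailing underscore on member variables is also an assumption that the guidelines above do not state.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <variant>

// Hypothetical error type; the project's real error type may differ.
struct Error {
  std::string message;
};

// Simplified stand-in for a Result<T> type: holds either a value or an Error.
template <typename T>
class Result {
 public:
  Result(T value) : storage_(std::move(value)) {}
  Result(Error error) : storage_(std::move(error)) {}

  // Trivial getters use snake_case.
  bool has_value() const { return std::holds_alternative<T>(storage_); }
  const T& value() const { return std::get<T>(storage_); }
  const Error& error() const { return std::get<Error>(storage_); }

 private:
  std::variant<T, Error> storage_;
};

// Constant: k prefix with PascalCase.
constexpr std::size_t kMaxNameLength = 64;

/// \brief A named entry (hypothetical example class, PascalCase).
class NamespaceEntry {
 public:
  explicit NamespaceEntry(std::string name) : name_(std::move(name)) {}

  /// \brief Trivial getter, so snake_case.
  const std::string& name() const { return name_; }

 private:
  std::string name_;  // Trailing underscore is an assumption, not a stated rule.
};

/// \brief Create an entry, or return an Error if the name is invalid.
///
/// The function name is PascalCase, ownership is handed back through
/// std::unique_ptr, and failures are reported through the Result value.
Result<std::unique_ptr<NamespaceEntry>> CreateNamespaceEntry(
    const std::string& name) {
  if (name.empty()) {
    return Error{"namespace name must not be empty"};
  }
  if (name.size() > kMaxNameLength) {
    return Error{"namespace name is too long"};
  }
  return std::make_unique<NamespaceEntry>(name);
}
```

A caller would check has_value() before reading value(), and otherwise inspect error().message to decide how to propagate the failure.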
It is important to keep the C++ public API compatible across versions. Public methods should have no leading underscores and should not be removed without a deprecation notice.
If you want to remove a method, please add a deprecation notice:
[[deprecated("This method will be removed in version 2.0.0. Use new_method() instead.")]]
void old_method();
We use clang-format for code formatting. The configuration is defined in the .clang-format file.
Format your code before submitting:
clang-format -i src/**/*.{h,cc}
Run all tests:
ctest --test-dir build --output-on-failure
Run a specific test:
ctest --test-dir build -R test_name
Install the Python package pre-commit, then run pre-commit install once:
pip install pre-commit
pre-commit install
This sets up a Git pre-commit hook that runs on each commit and reports linting problems. To run all hooks on all files, use pre-commit run -a.
The Apache Iceberg C++ community has the following policy for AI-assisted PRs:
- The PR author should understand the core ideas behind the implementation end-to-end, and be able to justify the design and code during review.
- Call out unknowns and assumptions. It's okay to not fully understand some bits of AI-generated code. You should comment on these cases and point them out to reviewers so that they can use their knowledge of the codebase to clear up any concerns. For example, you might comment "calling this function here seems to work but I'm not familiar with how it works internally, I wonder if there's a race condition if it is called concurrently".
Today, AI tools cannot reliably make complex changes to the codebase on their own, which is why we rely on pull requests and code review.
The purposes of code review are:
- Finish the intended task.
- Share knowledge between authors and reviewers, as a long-term investment in the project. For this reason, even if someone familiar with the codebase can finish a task quickly, we're still happy to help a new contributor work on it even if it takes longer.
An AI dump for an issue doesn’t meet these purposes. Maintainers could finish the task faster by using AI directly, and submitters gain little knowledge if they act only as a pass-through proxy for the AI without understanding the change.
Please understand that reviewing capacity for the project is very limited, so large PRs that do not appear to reflect the requisite understanding may not be reviewed, and may eventually be closed or redirected.
It's recommended to write a high-quality issue with a clear problem statement and a minimal, reproducible example. This can make it easier for others to contribute.
- Fork the repository on GitHub
- Create a feature branch from main: git checkout -b feature/your-feature-name
- Make your changes following the coding standards
- Add tests for your changes
- Run tests to ensure everything passes
- Commit your changes with a clear commit message
- Push to your fork and create a Pull Request
Use clear, descriptive commit messages:
feat: add support for S3 file system
fix: resolve memory leak in table reader
docs: update API documentation
test: add unit tests for schema validation
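The examples above share a "type: summary" shape. As a rough illustration only, the sketch below checks a message against that shape. The function IsWellFormedCommitMessage and the constant kCommitTypes are hypothetical names, the list of types is taken only from the four examples shown here, and the project may accept other prefixes or enforce nothing of the kind.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Types taken from the examples above; assumed, not an official list.
constexpr std::array<std::string_view, 4> kCommitTypes = {"feat", "fix", "docs",
                                                          "test"};

/// \brief Return true if `message` looks like "<type>: <summary>".
///
/// The type must be one of kCommitTypes and the summary must be non-empty.
bool IsWellFormedCommitMessage(std::string_view message) {
  const std::size_t colon = message.find(": ");
  if (colon == std::string_view::npos) {
    return false;
  }
  const std::string_view type = message.substr(0, colon);
  const std::string_view summary = message.substr(colon + 2);
  if (summary.empty()) {
    return false;
  }
  for (const std::string_view known : kCommitTypes) {
    if (type == known) {
      return true;
    }
  }
  return false;
}
```

Under these assumptions, "feat: add support for S3 file system" is accepted, while a message with no prefix or with an empty summary is rejected.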
- Create a Pull Request with a clear description
- Link related issues if applicable
- Ensure CI passes - all tests must pass
- Request review from maintainers
- Address feedback and update the PR as needed
- Squash commits if requested by reviewers
The Apache Iceberg community is built on the principles described in the Apache Way and all who engage with the community are expected to be respectful, open, come with the best interests of the community in mind, and abide by the Apache Foundation Code of Conduct.
- Submit Issues: GitHub Issues for bug reports or feature requests
- Mailing List: dev@iceberg.apache.org for discussions
- Slack: Apache Iceberg Slack #cpp channel
New to the project? Check out our good first issues for an easy way to get started.
Releases are managed by the Apache Iceberg project maintainers. For information about the release process, please refer to the main Iceberg project documentation.