GH-49928: [C++][Parquet] Fix UB in UpdateLevelHistogram from nullptr std::span#50233

Open
Diveyam-Mishra wants to merge 1 commit into apache:main from Diveyam-Mishra:49928

@Diveyam-Mishra Diveyam-Mishra commented Jun 21, 2026

[Parquet] Guard against null level pointers in UpdateLevelHistogram

column_writer.cc

  1. Added ARROW_DCHECK and null guard in the add_levels lambda

size_statistics_test.cc

  2. Added regression test for empty/null span

Add an explicit null-pointer check before constructing a std::span in
TypedColumnWriterImpl::UpdateLevelHistogram. This avoids relying on the
implicit invariant that nullptr level pointers only occur when
max_level == 0.

Also add a regression test verifying that UpdateLevelHistogram handles
an empty span as a no-op.

@Diveyam-Mishra Diveyam-Mishra requested a review from wgtmac as a code owner June 21, 2026 20:40
Copilot AI review requested due to automatic review settings June 21, 2026 20:40
@Diveyam-Mishra Diveyam-Mishra requested a review from pitrou as a code owner June 21, 2026 20:40
@github-actions

⚠️ GitHub issue #49928 has been automatically assigned in GitHub to PR creator.

Copilot AI left a comment

Pull request overview

Fixes undefined behavior in Parquet C++ column writing by preventing construction of a std::span from a potentially null level pointer during size-statistics histogram updates.

Changes:

  • Add a null-pointer guard (and debug assertion) in TypedColumnWriterImpl::UpdateLevelHistogram before constructing a std::span.
  • Add a regression test ensuring parquet::UpdateLevelHistogram treats an empty std::span as a no-op.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| cpp/src/parquet/column_writer.cc | Adds null-guard/DCHECK around level histogram updates to avoid std::span UB. |
| cpp/src/parquet/size_statistics_test.cc | Adds regression coverage for empty-span histogram updates. |

Comment thread cpp/src/parquet/column_writer.cc Outdated
Comment thread cpp/src/parquet/size_statistics_test.cc

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.
