Skip to content

Add NullBuffer::try_from_unsliced helper and refactor call sites#9411

Open
Eyad3skr wants to merge 2 commits intoapache:mainfrom
Eyad3skr:feat/null-buffer-try-from-unsliced
Open

Add NullBuffer::try_from_unsliced helper and refactor call sites#9411
Eyad3skr wants to merge 2 commits intoapache:mainfrom
Eyad3skr:feat/null-buffer-try-from-unsliced

Conversation

@Eyad3skr
Copy link

Implements a helper to replace the pattern of creating a BooleanBuffer from an unsliced validity bitmap and filtering by null count. Previously this was done with BooleanBuffer::new(...) plus Some(NullBuffer::new(...)).filter(|n| n.null_count() > 0); now it is a single call to NullBuffer::try_from_unsliced(buffer, len), which returns Some(NullBuffer) when there are nulls and None when all values are valid.

  • Added try_from_unsliced in arrow-buffer/src/buffer/null.rs with tests for nulls, all valid, all null, empty
  • Refactor FixedSizeBinaryArray::try_from_iter_with_size and try_from_sparse_iter_with_size to use it
  • Refactor take_nulls in arrow-select to use it

Closes #9385

@github-actions github-actions bot added parquet Changes to the parquet crate arrow Changes to the arrow crate labels Feb 13, 2026
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Eyad3skr -- this looks nice to me

cc @liamzwbao in case you have time to help reivew

pub fn buffer(&self) -> &Buffer {
self.buffer.inner()
}
/// Create a [`NullBuffer`] from an *unsliced* validity bitmap (`offset = 0`) of length `len`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also update this comment to reflect the offset is in bits (not bytes)? Mixing the units is a common mistake so making sure the documentation is as clear as possible would help

/// Create a [`NullBuffer`] from an *unsliced* validity bitmap (`offset = 0`) of length `len`.
///
/// Returns `None` if there are no nulls (all values valid).
pub fn try_from_unsliced(buffer: impl Into<Buffer>, len: usize) -> Option<Self> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should call it from_buffer? try_* probably should be reserved for functions that return a result, and its probably better to be direct that we're constructing directly from a buffer?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a valid point tbh! on it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arrow Changes to the arrow crate parquet Changes to the parquet crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Introduce NullBuffer::try_from_unsliced to simplify array construction

3 participants