Add table_file_row_counts API #79

Open
robertbuessow wants to merge 1 commit into main from rb-get-rows-per-file

@robertbuessow
Contributor

Summary

  • Adds iceberg_table_file_row_counts FFI function that reads per-file row counts from Iceberg manifest metadata without reading any Parquet data files
  • Adds Julia binding table_file_row_counts(table) -> Dict{String, Int64} mapping file paths to their row counts
  • Uses boxed slices (not Box<Vec>) for the response arrays so Julia can index the data directly without indirecting through the Vec struct layout
  • Adds a test verifying correctness against the known TPC-H SF0.1 customer table
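
The boxed-slice point in the summary can be illustrated with a minimal sketch. This is not the PR's code: the struct `FileRowCounts`, its field names, and the helpers `into_response`, `entry`, and `free_response` are hypothetical names chosen for illustration. The idea it shows is that `Box<[T]>` converted to a raw pointer addresses the first array element, so a caller holding the pointer and a length can index elements directly, whereas a pointer to a boxed `Vec` would address the `Vec` header instead.

```rust
use std::ffi::{c_char, CStr, CString};

/// Hypothetical response layout (names are illustrative, not taken from the PR).
#[repr(C)]
pub struct FileRowCounts {
    pub paths: *mut *mut c_char, // address of the first path pointer
    pub counts: *mut i64,        // address of the first row count
    pub len: usize,              // number of entries in both arrays
}

/// Move (path, count) pairs into two heap arrays owned by the caller.
pub fn into_response(pairs: Vec<(String, i64)>) -> FileRowCounts {
    let len = pairs.len();
    let mut paths = Vec::with_capacity(len);
    let mut counts = Vec::with_capacity(len);
    for (path, count) in pairs {
        paths.push(CString::new(path).expect("path contains a NUL byte").into_raw());
        counts.push(count);
    }
    // A boxed slice is just the elements; casting the fat pointer to a thin
    // one yields the address of element 0.
    let paths = Box::into_raw(paths.into_boxed_slice()) as *mut *mut c_char;
    let counts = Box::into_raw(counts.into_boxed_slice()) as *mut i64;
    FileRowCounts { paths, counts, len }
}

/// Read entry `i` the way a foreign caller would: plain pointer arithmetic.
///
/// Safety: `r` must come from `into_response` and not have been freed.
pub unsafe fn entry(r: &FileRowCounts, i: usize) -> (String, i64) {
    assert!(i < r.len);
    unsafe {
        let path = CStr::from_ptr(*r.paths.add(i)).to_string_lossy().into_owned();
        (path, *r.counts.add(i))
    }
}

/// Rebuild the boxed slices from pointer + length and drop them.
///
/// Safety: `r` must come from `into_response` and be freed exactly once.
pub unsafe fn free_response(r: FileRowCounts) {
    unsafe {
        let paths = Box::from_raw(std::ptr::slice_from_raw_parts_mut(r.paths, r.len));
        for &p in paths.iter() {
            drop(CString::from_raw(p)); // reclaim each C string
        }
        drop(Box::from_raw(std::ptr::slice_from_raw_parts_mut(r.counts, r.len)));
    }
}
```

In this sketch the length has to travel alongside the pointers, because the free function needs it to reconstruct each boxed slice before dropping it.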

Test plan

  • make run-containers to start MinIO + Polaris
  • make test-dev — the new table_file_row_counts testset in scan_tests.jl should pass

🤖 Generated with Claude Code

…fest metadata

Adds iceberg_table_file_row_counts FFI function that calls plan_files() on the
current snapshot to collect (file_path, record_count) pairs from Iceberg manifest
metadata without reading any Parquet files. Includes a corresponding free function
and Julia bindings returning a Dict{String, Int64}.
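
As a rough sketch of the collection step described above, the following shows (file_path, record_count) pairs being folded into a path-to-count map, with counts converted to the signed 64-bit type the Julia side exposes. `PlannedFile` and `row_counts_by_path` are hypothetical stand-ins: the real task type yielded by `plan_files()` and the PR's actual function body are not shown here, and the overflow check is an assumption about how an out-of-range count might be handled.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for one planned file entry taken from manifest metadata.
pub struct PlannedFile {
    pub file_path: String,
    pub record_count: u64,
}

/// Build a path -> row count map without touching any data files.
/// Returns an error if a count does not fit in an i64 (assumed policy).
pub fn row_counts_by_path(files: &[PlannedFile]) -> Result<BTreeMap<String, i64>, String> {
    let mut out = BTreeMap::new();
    for f in files {
        let n = i64::try_from(f.record_count)
            .map_err(|_| format!("record count for {} exceeds i64", f.file_path))?;
        // If the same path appears twice, the later entry replaces the earlier one.
        out.insert(f.file_path.clone(), n);
    }
    Ok(out)
}
```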

Uses boxed slices (not Box<Vec>) so response pointers point directly to array data,
allowing Julia to index them without indirecting through the Vec struct layout.

Labels: dismiss-release-notes, build:benchmark

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>