fix(datafusion): return single row with count 0 for empty inserts #2712

Open
u70b3 wants to merge 1 commit into apache:main from u70b3:fix/datafusion-reject-non-append-inserts



u70b3 commented Jun 25, 2026


Which issue does this PR close?

What changes are included in this PR?

Fix empty inserts in the DataFusion integration.

When an INSERT produces no data files (e.g. INSERT INTO ... SELECT ... WHERE false), IcebergCommitExec previously returned an empty RecordBatch. DataFusion expects a single-row count result for DML statements, so this PR changes the empty-data path to return a batch with count 0.

Additionally, the commit executor now skips creating a transaction/snapshot when no data files are written, avoiding unnecessary table history entries.
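The two behaviours described above (always return one count row, and skip the commit when nothing was written) can be sketched with a simplified, stdlib-only model. This is an illustration, not the actual IcebergCommitExec code: `Table`, `DataFile`, `CountBatch`, and `commit` are hypothetical stand-ins, and the real executor works with Arrow record batches and Iceberg transactions.

```rust
// Simplified model of the commit step. All types and names here are
// hypothetical stand-ins, not the iceberg-datafusion API.

/// Stand-in for a table's snapshot history.
#[derive(Debug, Default)]
struct Table {
    snapshot_ids: Vec<u64>,
}

/// Stand-in for a data file produced by the write stage.
#[derive(Debug, Clone)]
struct DataFile {
    record_count: u64,
}

/// Stand-in for the single-column UInt64 count batch returned for DML.
#[derive(Debug, PartialEq)]
struct CountBatch {
    rows: Vec<u64>,
}

/// Commit the written files and report how many rows were inserted.
fn commit(table: &mut Table, files: &[DataFile]) -> CountBatch {
    if files.is_empty() {
        // Empty insert: skip the transaction so no snapshot is created,
        // but still return exactly one row holding the count 0.
        return CountBatch { rows: vec![0] };
    }

    // Non-empty insert: sum the row counts and append one snapshot.
    let total: u64 = files.iter().map(|f| f.record_count).sum();
    let next_id = table.snapshot_ids.len() as u64 + 1;
    table.snapshot_ids.push(next_id);
    CountBatch { rows: vec![total] }
}
```

In this model, calling `commit` with an empty slice leaves `snapshot_ids` untouched and yields a one-row result, which mirrors the shape DataFusion expects from a DML statement.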

Are these changes tested?

Yes. Added tests verifying that:

  • Empty inserts return a single-row UInt64 count batch with value 0.
  • Empty inserts do not create a new snapshot for unpartitioned tables.
  • Empty inserts do not create a new snapshot for partitioned tables.
  • Existing insert tests continue to pass.

Local verification:

  • cargo test -p iceberg-datafusion
  • cargo clippy -p iceberg-datafusion --all-targets -- -D warnings
  • cargo fmt --all -- --check
  • git diff --check

When an INSERT produces no data files, IcebergCommitExec previously returned an empty RecordBatch. DataFusion expects a single-row count result, so this change returns a batch with count=0 and skips creating a new snapshot.

Includes unit and integration tests for unpartitioned and partitioned tables.
u70b3 force-pushed the fix/datafusion-reject-non-append-inserts branch from 0f90cfc to 21ba002 on June 25, 2026 12:07
u70b3 changed the title from "fix(datafusion): reject non-append insert operations" to "fix(datafusion): return single row with count 0 for empty inserts" on Jun 25, 2026


Development

Successfully merging this pull request may close these issues.

DataFusion empty INSERT returns empty batch and creates unnecessary snapshot
