Skip to content

refactor(rest): localize CreateTableRequest's partition-spec skip to REST#2702

Open
rahulsmahadev wants to merge 1 commit into
apache:mainfrom
rahulsmahadev:rahul/iceberg-rust-2610-followup
Open

refactor(rest): localize CreateTableRequest's partition-spec skip to REST#2702
rahulsmahadev wants to merge 1 commit into
apache:mainfrom
rahulsmahadev:rahul/iceberg-rust-2610-followup

Conversation

@rahulsmahadev

Copy link
Copy Markdown
Contributor

Which issue does this PR close?

Follow-up to #2610. Addresses two review threads on that PR:

  • @laskoviymishka: move the #[serde(skip_serializing_if = "Option::is_none")] off of the core UnboundPartitionField / UnboundPartitionSpec types and apply it only when these types are serialized as part of CreateTableRequest.
  • @viirya (non-blocking): add a positive serialization test for UnboundPartitionField::field_id (set case).

What changes are included in this PR?

#2610 added #[serde(skip_serializing_if = "Option::is_none")] to UnboundPartitionField::field_id and UnboundPartitionSpec::spec_id so that REST CreateTableRequest payloads would omit unset partition ids. The same core types are also serialized as part of TableUpdate::AddSpec on the commit path, where a missing spec-id would hard-fail a Java REST server (PartitionSpecParser.fromJson throws IllegalArgumentException: Cannot parse missing int: spec-id). The breakage is latent today only because TableMetadataBuilder::add_partition_spec always populates spec_id via with_spec_id before pushing the AddSpec change.

This PR moves the skip behavior out of the core spec types and into the REST layer:

  1. Drop #[serde(skip_serializing_if = "Option::is_none")] from UnboundPartitionField::field_id and UnboundPartitionSpec::spec_id. The core types again emit spec-id / field-id as null when unset, matching the pre-fix(rest): skip serializing unset optional fields in CreateTableRequest #2610 wire format and preserving the TableUpdate::AddSpec contract.
  2. Add a serialize_with helper on CreateTableRequest::partition_spec that serializes through private shadow types whose fields skip the ids when None. This preserves the externally visible REST payload shape (partition-spec omitted entirely when unset; inner spec-id / field-id omitted when unset) without coupling the core spec type to it.

The shadow-type approach was chosen over a public newtype wrapper to keep the change scope to the REST crate and avoid touching iceberg::spec::partition's public surface beyond the revert.

Are these changes tested?

Yes. Four new / updated tests:

  • crates/iceberg/src/catalog/mod.rs::test_add_spec_emits_spec_id_key: regression guard that TableUpdate::AddSpec always emits the spec-id key, including when spec_id is None. This is the test that would have caught the latent bug if fix(rest): skip serializing unset optional fields in CreateTableRequest #2610 had been merged without the add_partition_spec always-populates-spec_id invariant.
  • crates/iceberg/src/spec/partition.rs::test_unbound_partition_field_field_id_set_is_serialized: positive test that a set field_id round-trips under the kebab-case field-id key (viirya's follow-up ask).
  • crates/catalog/rest/src/types.rs::test_create_table_request_partition_spec_skips_inner_none_ids: verifies that the REST shadow serializer omits inner spec-id and field-id when unset, and emits them when set.
  • crates/iceberg/src/spec/partition.rs::test_unbound_partition_spec_serialization_emits_none_as_null replaces the previous skip-expectation test and documents the new contract on the core type.

All pre-existing tests in iceberg (1361) and iceberg-catalog-rest (45) still pass. cargo fmt --check and cargo clippy --all-targets --all-features -- -D warnings are clean on both crates.

Test plan

  • cargo test -p iceberg --lib
  • cargo test -p iceberg-catalog-rest --lib
  • cargo fmt --check -p iceberg -p iceberg-catalog-rest
  • cargo clippy -p iceberg -p iceberg-catalog-rest --all-targets --all-features -- -D warnings

…REST

Follow-up to apache#2610. That PR added `#[serde(skip_serializing_if =
"Option::is_none")]` to `UnboundPartitionField::field_id` and
`UnboundPartitionSpec::spec_id` in `iceberg::spec::partition` so the
REST `CreateTableRequest` payload would omit unset partition ids. The
same core types are also serialized as part of `TableUpdate::AddSpec`
on the commit path, where a missing `spec-id` would hard-fail a Java
REST server (`PartitionSpecParser.fromJson` throws
`IllegalArgumentException: Cannot parse missing int: spec-id`). The
breakage is latent today only because `TableMetadataBuilder::add_partition_spec`
always populates `spec_id` before pushing the `AddSpec` change.

This change moves the skip behavior out of the core types and into the
REST layer:

- Drop `#[serde(skip_serializing_if = "Option::is_none")]` from
  `UnboundPartitionField::field_id` and `UnboundPartitionSpec::spec_id`.
  The core types again emit `spec-id` / `field-id` as `null` when
  unset, matching the pre-apache#2610 wire format and preserving the
  `TableUpdate::AddSpec` contract.
- Add a `serialize_with` helper on `CreateTableRequest::partition_spec`
  that serializes through private shadow types whose fields skip the
  ids when `None`. This preserves the externally visible REST payload
  shape (`partition-spec` is omitted entirely when unset; inner
  `spec-id` / `field-id` are omitted when unset) without coupling the
  core spec type to it.

Tests:
- `test_add_spec_emits_spec_id_key`: regression guard that
  `TableUpdate::AddSpec` always emits the `spec-id` key, including
  when `spec_id` is `None`.
- `test_unbound_partition_field_field_id_set_is_serialized`: positive
  test that a set `field_id` round-trips under the kebab-case
  `field-id` key (viirya's follow-up ask on apache#2610).
- `test_create_table_request_partition_spec_skips_inner_none_ids`:
  verifies that the REST shadow serializer omits inner `spec-id` and
  `field-id` when unset, and emits them when set.
- `test_unbound_partition_spec_serialization_emits_none_as_null`
  replaces the previous skip-expectation test and documents the new
  contract.

Signed-off-by: rahulsmahadev <rahul.mahadev@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant