fix(table): reject reserved metadata IDs before reassignment#1126
fix(table): reject reserved metadata IDs before reassignment#1126officialasishkumar wants to merge 1 commit into
Conversation
Signed-off-by: Asish Kumar <officialasishkumar@gmail.com>
laskoviymishka
left a comment
There was a problem hiding this comment.
Thanks for closing this gap, silently reassigning a user-supplied _row_id was a real footgun, and validating before reassignIDs is the right spot.
Two things I'd tidy before merge:
- Validator runs twice. NewMetadataWithUUID calls it pre-reassignment (load-bearing), and checkSchemaCompatibility calls it again on the already-reassigned schema (dead for this path). Pick one home, or leave a comment explaining which gate covers which caller.
- Test gaps. The new test only covers top-level reserved IDs, the nested struct/list/map walk that's the bulk of the new code is untested. And the
TestProjectionV3SchemaAlreadyHasRowIDrename drops the end-to-endScan.Projection()path, which still matters for tables loaded from catalogs.
Once those are in, happy to take another pass.
| const defaultValuesMinFormatVersion = 3 | ||
| problems := make([]IncompatibleField, 0) | ||
|
|
||
| if err := validateNoReservedMetadataColumnIDs(sc); err != nil { |
There was a problem hiding this comment.
I think this call is effectively dead for the NewMetadata path — by the time checkSchemaCompatibility runs, reassignIDs has already remapped every ID to a fresh non-reserved value, so the validator can't find anything. The only meaningful check is the new one in NewMetadataWithUUID before reassignment.
If the intent is to also catch callers that reach checkSchemaCompatibility without going through reassignment (e.g. MetadataBuilder.AddSchema later in a table's life), I'd move this guard into AddSchema directly and leave checkSchemaCompatibility focused on format-version compatibility. Otherwise we end up with two call sites for the same guard and an unclear invariant about which one is the actual gate. wdyt?
| if err := validateNoReservedMetadataColumnID(child, name+"."+child.Name); err != nil { | ||
| return err | ||
| } | ||
| } |
There was a problem hiding this comment.
The three if typ == nil { return nil } arms (StructType / ListType / MapType) aren't reachable — a successful type assertion on field.Type.(type) already discriminates against the nil-interface case, and nothing in the codebase produces a typed-nil *iceberg.StructType here. validateComplexDefault in the same file does the same switch without these guards. I'd drop them.
| } | ||
|
|
||
| return nil | ||
| } |
There was a problem hiding this comment.
Small consistency thing: the neighboring validators (validateUnknownTypes, validateComplexTypeDefaults) wrap their error with fmt.Errorf("failed to validate ...: %w", err) at the call site, while this one returns bare. Not a regression (the inline check it replaced also returned bare), but worth wrapping at the checkSchemaCompatibility call site for symmetry if we keep that call site at all.
| assert.ErrorIs(t, err, iceberg.ErrInvalidSchema) | ||
| assert.Contains(t, err.Error(), "multiple fields for name foo") | ||
| } | ||
|
|
There was a problem hiding this comment.
Two thoughts on the new test:
The PR description says nested struct/list/map fields are validated, but the test only covers top-level reserved IDs. I'd add at least one case with a struct child or a list element carrying RowIDFieldID so the recursive walk is actually exercised.
Also worth adding a NestedField{ID: iceberg.RowIDFieldID, Name: "user_id"} case — that confirms the error message uses the schema's name, not the canonical constant, which is the point of the name argument in validateNoReservedMetadataColumnID.
| // behavior when a schema already includes reserved lineage fields. New table | ||
| // metadata rejects user schemas with these IDs, but projections can still | ||
| // contain them after the scanner adds row-lineage columns. | ||
| func TestAppendMissingLineageFieldsAlreadyHasRowID(t *testing.T) { |
There was a problem hiding this comment.
This rename narrows the test from end-to-end (scan.Projection() over real v3 metadata) to a unit test of appendMissingLineageFields in isolation. The motivation is clear — NewMetadata now rejects the old setup — but it leaves the full read-path projection over a schema-with-reserved-IDs uncovered, and existing tables loaded from catalogs can still have those IDs.
I'd keep this unit test and add a sibling that constructs *Scan with a hand-built metadata (or via a path that bypasses the new validator) and exercises Projection() end-to-end. Otherwise we lose the original regression target.
| } | ||
|
|
||
| if err := validateNoReservedMetadataColumnIDs(sc); err != nil { | ||
| return nil, err |
There was a problem hiding this comment.
Worth a one-line comment here explaining why the check runs before reassignIDs — "user-provided IDs must be validated before reassignment overwrites them with fresh non-reserved IDs". The placement is correct but non-obvious to a future reader.
Also a heads-up: Java's TableMetadata.newTableMetadata doesn't reject reserved IDs here; it silently relies on TypeUtil.assignFreshIds to remap them. We're imposing a stricter precondition than Java. I think that's the right call (silent reassignment of a user-supplied _row_id is a footgun), but worth flagging in case it's intentional vs. accidental divergence.
Summary
Fixes #1107
Testing