Skip to content

fix(table): reject reserved metadata IDs before reassignment#1126

Open
officialasishkumar wants to merge 1 commit into
apache:mainfrom
officialasishkumar:fix/1107-reject-reserved-metadata-ids
Open

fix(table): reject reserved metadata IDs before reassignment#1126
officialasishkumar wants to merge 1 commit into
apache:mainfrom
officialasishkumar:fix/1107-reject-reserved-metadata-ids

Conversation

@officialasishkumar
Copy link
Copy Markdown

Summary

  • reject user schemas that contain reserved metadata column IDs before schema ID reassignment
  • validate nested struct, list, and map fields for reserved metadata IDs
  • keep lineage projection tests focused on appendMissingLineageFields behavior

Fixes #1107

Testing

  • go test ./table -run "TestNewMetadataRejectsReservedMetadataColumnIDsBeforeReassignment|TestAppendMissingLineageFieldsAlreadyHasRowID|TestNewMetadata_DuplicateValues" -count=1
  • go test ./table -count=1

Signed-off-by: Asish Kumar <officialasishkumar@gmail.com>
Copy link
Copy Markdown
Contributor

@laskoviymishka laskoviymishka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for closing this gap, silently reassigning a user-supplied _row_id was a real footgun, and validating before reassignIDs is the right spot.

Two things I'd tidy before merge:

  • Validator runs twice. NewMetadataWithUUID calls it pre-reassignment (load-bearing), and checkSchemaCompatibility calls it again on the already-reassigned schema (dead for this path). Pick one home, or leave a comment explaining which gate covers which caller.
  • Test gaps. The new test only covers top-level reserved IDs, the nested struct/list/map walk that's the bulk of the new code is untested. And the TestProjectionV3SchemaAlreadyHasRowID rename drops the end-to-end Scan.Projection() path, which still matters for tables loaded from catalogs.

Once those are in, happy to take another pass.

const defaultValuesMinFormatVersion = 3
problems := make([]IncompatibleField, 0)

if err := validateNoReservedMetadataColumnIDs(sc); err != nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this call is effectively dead for the NewMetadata path — by the time checkSchemaCompatibility runs, reassignIDs has already remapped every ID to a fresh non-reserved value, so the validator can't find anything. The only meaningful check is the new one in NewMetadataWithUUID before reassignment.

If the intent is to also catch callers that reach checkSchemaCompatibility without going through reassignment (e.g. MetadataBuilder.AddSchema later in a table's life), I'd move this guard into AddSchema directly and leave checkSchemaCompatibility focused on format-version compatibility. Otherwise we end up with two call sites for the same guard and an unclear invariant about which one is the actual gate. wdyt?

if err := validateNoReservedMetadataColumnID(child, name+"."+child.Name); err != nil {
return err
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The three if typ == nil { return nil } arms (StructType / ListType / MapType) aren't reachable — a successful type assertion on field.Type.(type) already discriminates against the nil-interface case, and nothing in the codebase produces a typed-nil *iceberg.StructType here. validateComplexDefault in the same file does the same switch without these guards. I'd drop them.

}

return nil
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small consistency thing: the neighboring validators (validateUnknownTypes, validateComplexTypeDefaults) wrap their error with fmt.Errorf("failed to validate ...: %w", err) at the call site, while this one returns bare. Not a regression (the inline check it replaced also returned bare), but worth wrapping at the checkSchemaCompatibility call site for symmetry if we keep that call site at all.

assert.ErrorIs(t, err, iceberg.ErrInvalidSchema)
assert.Contains(t, err.Error(), "multiple fields for name foo")
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two thoughts on the new test:

The PR description says nested struct/list/map fields are validated, but the test only covers top-level reserved IDs. I'd add at least one case with a struct child or a list element carrying RowIDFieldID so the recursive walk is actually exercised.

Also worth adding a NestedField{ID: iceberg.RowIDFieldID, Name: "user_id"} case — that confirms the error message uses the schema's name, not the canonical constant, which is the point of the name argument in validateNoReservedMetadataColumnID.

// behavior when a schema already includes reserved lineage fields. New table
// metadata rejects user schemas with these IDs, but projections can still
// contain them after the scanner adds row-lineage columns.
func TestAppendMissingLineageFieldsAlreadyHasRowID(t *testing.T) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rename narrows the test from end-to-end (scan.Projection() over real v3 metadata) to a unit test of appendMissingLineageFields in isolation. The motivation is clear — NewMetadata now rejects the old setup — but it leaves the full read-path projection over a schema-with-reserved-IDs uncovered, and existing tables loaded from catalogs can still have those IDs.

I'd keep this unit test and add a sibling that constructs *Scan with a hand-built metadata (or via a path that bypasses the new validator) and exercises Projection() end-to-end. Otherwise we lose the original regression target.

Comment thread table/metadata.go
}

if err := validateNoReservedMetadataColumnIDs(sc); err != nil {
return nil, err
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth a one-line comment here explaining why the check runs before reassignIDs — "user-provided IDs must be validated before reassignment overwrites them with fresh non-reserved IDs". The placement is correct but non-obvious to a future reader.

Also a heads-up: Java's TableMetadata.newTableMetadata doesn't reject reserved IDs here; it silently relies on TypeUtil.assignFreshIds to remap them. We're imposing a stricter precondition than Java. I think that's the right call (silent reassignment of a user-supplied _row_id is a footgun), but worth flagging in case it's intentional vs. accidental divergence.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug(schema): AssignFreshSchemaIDs Java Divergance

2 participants