Skip to content

fix(table): cross-check deletion vector cardinality#1127

Open
officialasishkumar wants to merge 1 commit into
apache:mainfrom
officialasishkumar:fix/1058-dv-cardinality-cross-check
Open

fix(table): cross-check deletion vector cardinality#1127
officialasishkumar wants to merge 1 commit into
apache:mainfrom
officialasishkumar:fix/1058-dv-cardinality-cross-check

Conversation

@officialasishkumar
Copy link
Copy Markdown

Summary

  • compare deletion vector manifest record_count with the Puffin cardinality property when both are available
  • fail fast when manifest and Puffin cardinality metadata disagree
  • keep bitmap cardinality validation active when metadata sources agree

Fixes #1058

Testing

  • go test ./table/dv -run TestReadDVCardinalityValidation -count=1
  • go test ./table/dv -count=1

Signed-off-by: Asish Kumar <officialasishkumar@gmail.com>
Copy link
Copy Markdown
Contributor

@laskoviymishka laskoviymishka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cross-source validation direction looks right, and the disagreement test is exactly the regression pin I’d want here. Almost there, with one blocker before merge:

hasManifestCardinality := manifestCardinality > 0 mixes up “manifest didn’t tell us” with “manifest said zero.” record_count is non-nullable in the spec, and Java writes 0 for empty DVs. So a real blob=5 / manifest=0 mismatch would silently fall back to “blob-only wins”, which is the opposite of what this PR is trying to catch.

Since RecordCount is a non-pointer int64, I’d treat it as always present and let zero go through the normal agree / disagree path.

Related: the “missing cardinality property is tolerated with a warning” test doesn’t hit that branch anymore. writePuffinWithDVBlob still writes "cardinality": "5" and the test passes count=5, so it’s just the agree path. Worth adding a sibling helper that omits the property entirely.

Small cleanup while you’re there:

  • rename the local blobCardinality; it shadows the package-level function
  • include the DV file path in the new disagreement error, like the rest of ReadDV does

}
if !ok {
manifestCardinality := dvFile.Count()
hasManifestCardinality := manifestCardinality > 0
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a subtle issue here. record_count is field 103 in the spec — non-nullable long, no absent state — and Java writes 0 for empty DVs (BaseDVFileWriter via deletes.positions().cardinality()). Treating manifestCardinality > 0 as "manifest didn't tell us" means a real disagreement like blob=5 / manifest=0 falls through to "blob-only" and gets accepted silently, which is exactly the case the PR is trying to catch.

Since RecordCount is a non-pointer int64 with no representable absent state, I'd just treat it as always present — hasManifestCardinality := true — and let the agree/disagree branches handle zero naturally. wdyt?

// Pin the specific error path: DeserializeDV's "cardinality
// mismatch", not the helper's "invalid cardinality" parse error.
assert.Contains(t, err.Error(), "manifest record_count 5 disagrees with puffin cardinality property 99")
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to this PR but in the sibling subtest just below this one: the existing "missing cardinality property is tolerated with a warning" test doesn't reach the default branch anymore — writePuffinWithDVBlob still embeds "cardinality": "5" and the test passes count=5, so it hits the agree-path. The warning branch could be deleted and that test would keep passing.

I'd add a sibling helper (or extend writePuffinWithDVBlobAndProps) that omits the cardinality property entirely, paired with a manifest count that also can't be used, so the test actually pins the default warning.

}

cardinality, ok, err := blobCardinality(reader.Blobs(), offset, size)
blobCardinality, hasBlobCardinality, err := blobCardinality(reader.Blobs(), offset, size)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The local blobCardinality shadows the package-level function of the same name on this line — readability hazard since the identifier resolves to the int64 result for the rest of the function, including the three uses below (manifestCardinality != blobCardinality, the Errorf arg, and expectedCardinality = blobCardinality in the case hasBlobCardinality: arm).

I'd rename the local to puffinCard (mirrors the manifestCardinality source-prefix naming) and touch up all 4 sites — declaration plus the three uses inside the switch.

switch {
case hasManifestCardinality && hasBlobCardinality:
if manifestCardinality != blobCardinality {
return nil, fmt.Errorf("manifest record_count %d disagrees with puffin cardinality property %d",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every other error in ReadDV carries the DV file path (the "DV file %s: %w" wrap above, plus read DV blob at offset %d, etc.); this one's the odd one out. Cheap to include and it makes triage from a server log a lot easier:

return nil, fmt.Errorf("DV file %s: manifest record_count %d disagrees with puffin cardinality property %d",
    dvFile.FilePath(), manifestCardinality, blobCardinality)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(table/dv): cross-validate manifest record_count against blob cardinality property

2 participants