fix(table): cross-check deletion vector cardinality#1127
fix(table): cross-check deletion vector cardinality#1127officialasishkumar wants to merge 1 commit into
Conversation
Signed-off-by: Asish Kumar <officialasishkumar@gmail.com>
laskoviymishka
left a comment
There was a problem hiding this comment.
The cross-source validation direction looks right, and the disagreement test is exactly the regression pin I’d want here. Almost there, with one blocker before merge:
hasManifestCardinality := manifestCardinality > 0 mixes up “manifest didn’t tell us” with “manifest said zero.” record_count is non-nullable in the spec, and Java writes 0 for empty DVs. So a real blob=5 / manifest=0 mismatch would silently fall back to “blob-only wins”, which is the opposite of what this PR is trying to catch.
Since RecordCount is a non-pointer int64, I’d treat it as always present and let zero go through the normal agree / disagree path.
Related: the “missing cardinality property is tolerated with a warning” test doesn’t hit that branch anymore. writePuffinWithDVBlob still writes "cardinality": "5" and the test passes count=5, so it’s just the agree path. Worth adding a sibling helper that omits the property entirely.
Small cleanup while you’re there:
- rename the local
blobCardinality; it shadows the package-level function - include the DV file path in the new disagreement error, like the rest of
ReadDVdoes
| } | ||
| if !ok { | ||
| manifestCardinality := dvFile.Count() | ||
| hasManifestCardinality := manifestCardinality > 0 |
There was a problem hiding this comment.
I think there's a subtle issue here. record_count is field 103 in the spec — non-nullable long, no absent state — and Java writes 0 for empty DVs (BaseDVFileWriter via deletes.positions().cardinality()). Treating manifestCardinality > 0 as "manifest didn't tell us" means a real disagreement like blob=5 / manifest=0 falls through to "blob-only" and gets accepted silently, which is exactly the case the PR is trying to catch.
Since RecordCount is a non-pointer int64 with no representable absent state, I'd just treat it as always present — hasManifestCardinality := true — and let the agree/disagree branches handle zero naturally. wdyt?
| // Pin the specific error path: DeserializeDV's "cardinality | ||
| // mismatch", not the helper's "invalid cardinality" parse error. | ||
| assert.Contains(t, err.Error(), "manifest record_count 5 disagrees with puffin cardinality property 99") | ||
| }) |
There was a problem hiding this comment.
Related to this PR but in the sibling subtest just below this one: the existing "missing cardinality property is tolerated with a warning" test doesn't reach the default branch anymore — writePuffinWithDVBlob still embeds "cardinality": "5" and the test passes count=5, so it hits the agree-path. The warning branch could be deleted and that test would keep passing.
I'd add a sibling helper (or extend writePuffinWithDVBlobAndProps) that omits the cardinality property entirely, paired with a manifest count that also can't be used, so the test actually pins the default warning.
| } | ||
|
|
||
| cardinality, ok, err := blobCardinality(reader.Blobs(), offset, size) | ||
| blobCardinality, hasBlobCardinality, err := blobCardinality(reader.Blobs(), offset, size) |
There was a problem hiding this comment.
The local blobCardinality shadows the package-level function of the same name on this line — readability hazard since the identifier resolves to the int64 result for the rest of the function, including the three uses below (manifestCardinality != blobCardinality, the Errorf arg, and expectedCardinality = blobCardinality in the case hasBlobCardinality: arm).
I'd rename the local to puffinCard (mirrors the manifestCardinality source-prefix naming) and touch up all 4 sites — declaration plus the three uses inside the switch.
| switch { | ||
| case hasManifestCardinality && hasBlobCardinality: | ||
| if manifestCardinality != blobCardinality { | ||
| return nil, fmt.Errorf("manifest record_count %d disagrees with puffin cardinality property %d", |
There was a problem hiding this comment.
Every other error in ReadDV carries the DV file path (the "DV file %s: %w" wrap above, plus read DV blob at offset %d, etc.); this one's the odd one out. Cheap to include and it makes triage from a server log a lot easier:
return nil, fmt.Errorf("DV file %s: manifest record_count %d disagrees with puffin cardinality property %d",
dvFile.FilePath(), manifestCardinality, blobCardinality)
Summary
Fixes #1058
Testing