[kv] Fix non-target columns not being nulled during partial update on first insert#2969
[kv] Fix non-target columns not being nulled during partial update on first insert#2969binary-signal wants to merge 4 commits intoapache:mainfrom
Conversation
|
@binary-signal I will help to review it then |
Thanks for the heads, I am rebasing |
|
@fresh-borzoni it's rebased, no git conflicts. I will let the CI run and then have a look if any further changes are needed |
|
@fresh-borzonit rebase and adapt your PR, I've rebased latest changes from main |
fresh-borzoni
left a comment
There was a problem hiding this comment.
@binary-signal Ty for the PR, overall your approach works, but mergeInsert is unnecessary as PartialUpdater.updateRow(null, v) already handles null oldValue, so you can just call currentMerger.merge(null, currentValue) in the insert path. This way - no interface change needed.
But it requires checking other mergers if they handle null gracefully, do you mind to check?
Purpose
Linked issue: close #2843
This pull request fixes a bug where
partialUpdateon a Primary Key table during a first insert (where no row exists for that key) incorrectly stored all columns. This resulted in non-target columns retaining their values from the client row instead of being properly set to null.Brief change log
The root cause was isolated to
KvTablet.processUpsert(), which bypassed theRowMergerwhenoldValueBytes == null(indicating a first insert) and passed the raw row directly toapplyInsert(). As a result,PartialUpdater.updateRow(null, partialValue)was never invoked.The following changes were implemented to resolve this:
fluss-server/.../rowmerger/RowMerger.java: Added a defaultmergeInsert(BinaryValue newValue)method that returnsnewValueunchanged. This ensures safety and backward compatibility for all existing mergers.fluss-server/.../rowmerger/DefaultRowMerger.java: OverrodemergeInsertwithin the innerPartialUpdateRowMergerclass to callpartialUpdater.updateRow(null, newValue). This guarantees non-target columns are nulled out on the first insert.fluss-server/.../kv/KvTablet.java: UpdatedprocessUpsert()to executecurrentMerger.mergeInsert(currentValue)prior toapplyInsert(). This ensures partial updates are accurately applied throughout the initial insert process.Tests
Added two new unit tests to
fluss-server/.../kv/KvTabletTest.java:testPartialUpdateFirstInsertNullsNonTargetColumns: Verifies that non-target columns are set to null on a first insert, even if the client row contains values for them.testPartialUpdateFirstInsertThenUpdate: Verifies the full lifecycle by ensuring that an initial insert with a partial update nulls non-target columns, and a subsequent partial update correctly retains the previously stored values.API and Format
No. This is an internal data processing logic fix. No public APIs or storage formats are modified.
Documentation
No. This is a bug fix correcting existing behavior. No documentation updates are required.