[ISSUE #10417][BUG] Fix stale minOffset after truncate#10418
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #10418 +/- ##
=============================================
- Coverage 48.03% 47.90% -0.13%
+ Complexity 13286 13245 -41
=============================================
Files 1376 1376
Lines 100558 100562 +4
Branches 12985 12986 +1
=============================================
- Hits 48305 48179 -126
- Misses 46333 46435 +102
- Partials 5920 5948 +28 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Since this bug can surface during HA master-slave switch or CommitLog truncation, would you consider adding a test that deletes all ConsumeQueue files and then verifies correctMinOffset properly resets minLogicOffset and maxPhysicOffset? That would help guard against future regressions in this code path. |
oss-taishan-ai
left a comment
There was a problem hiding this comment.
Review by github-manager-bot
Summary
Fixes a bug where ConsumeQueue#correctMinOffset fails to reset minLogicOffset when commitlog truncation deletes all ConsumeQueue files for a topic queue, leaving an inconsistent state where minOffset > maxOffset.
Findings
-
[Correctness] ✅ The fix is well-structured and addresses the root cause:
-
Reordering checks —
getLastMappedFile() == nullis now checked first. This is the critical case where all CQ files have been deleted, and the queue must be reset to empty state (minLogicOffset = 0,maxPhysicOffset = -1). Previously this was checked second, and the first>=check would short-circuit and return without any reset. -
>=→>change — WhenminLogicOffset == maxOffset, the queue is legitimately empty (min points to the end, which equals the current max). This is not an anomalous state and does not require a reset. The>guard correctly targets only the truly invalid case whereminLogicOffsetexceedsmaxOffset. -
Reset values —
minLogicOffset = 0matches the default initialization (line 78).maxPhysicOffset = -1matches the default initialization (line 73). This is consistent withreset()(line 1087).
-
-
[Performance] ✅ No performance concern. The
getLastMappedFile()call was already present in the original code; it's simply moved earlier in the method. -
[Compatibility] ✅ No API or protocol changes. Internal state management only.
-
[Tests]
⚠️ No unit test added. The scenario (commitlog truncation removing all CQ files) is complex to reproduce in a unit test, but the fix is small enough that a targeted test forcorrectMinOffsetwith a mocked/emptymappedFileQueuewould be valuable to prevent regression. Consider adding:@Test public void testCorrectMinOffsetWhenAllFilesDeleted() { // Setup: create a ConsumeQueue, write some entries, then delete all mapped files // After calling correctMinOffset: // assertEquals(0, consumeQueue.getMinOffsetInQueue() * CQ_STORE_UNIT_SIZE); // assertEquals(-1, consumeQueue.getMaxPhysicOffset()); }
Suggestions
-
Add a unit test — As noted above, a test covering the "all CQ files deleted" path would strengthen this fix.
-
Consider thread safety —
minLogicOffsetisvolatile(good), butmaxPhysicOffsetis not. In the reset path, both are written sequentially. If another thread readsminLogicOffsetafter it's reset to 0 but beforemaxPhysicOffsetis reset to -1, the intermediate state could be briefly inconsistent. This is likely benign (the queue is empty, no consumer should be iterating), but worth a quick look to confirm no reader depends on both values being consistent simultaneously.
CI
✅ All 10 checks pass (CodeQL, bazel-compile, maven-compile × 4, coverage, license, misspell).
Automated review by github-manager-bot
|
@lollipopjin Thanks for the suggestion. I have added a regression test in ConsumeQueueTest for the case where all ConsumeQueue files are deleted, to verify that correctMinOffset properly resets minLogicOffset to 0 and maxPhysicOffset to -1. |
Which Issue(s) This PR Fixes
Summary
minOffsetminOffsetstays consistent withmaxOffsetmaxOffset == 0butminOffset > 0Root cause
When all ConsumeQueue files of a topic queue are deleted during commitlog truncation, the in-memory logical minimum offset may
not be reset. As a result, later
minOffsetqueries can return a stale value even though the queue is already empty.Test Plan
minOffset == 0andmaxOffset == 0after truncation