[ISSUE #10423] Optimize LiteConsumerLagCalculator.getLagCountTopK by deferring getStoreTimestamp to topK results only by f1amingo · Pull Request #10424 · apache/rocketmq

f1amingo · 2026-06-03T12:56:35Z

What is the purpose of the change

Fix [Enhancement] Optimize LiteConsumerLagCalculator.getLagCountTopK by deferring expensive getStoreTimestamp to topK results only #10423

LiteConsumerLagCalculator.getLagCountTopK currently calls getStoreTimestamp() (disk I/O) for every offset entry, creates a LiteLagInfo for each, and uses String.split() for key parsing. This PR optimizes these hotspots.

Brief changelog

Defer getStoreTimestamp to topK results only: accumulate consumerOffset in LiteLagInfo during traversal, compute timestamps for the final topK entries after heap filtering (N → K calls).
Lazy object creation: only instantiate LiteLagInfo when the entry qualifies for heap insertion.
Replace split() with indexOf(): parse topicAtGroup key via indexOf + substring, avoiding array allocation; check lite prefix before string extraction.
Replace AtomicLong with long[]: single-threaded traversal needs no atomic overhead.

Verifying this change

Existing unit tests cover the changed methods.

imzs

a necessary optimization

…pK by deferring getStoreTimestamp to topK results only

codecov-commenter · 2026-06-04T03:35:25Z

Codecov Report

❌ Patch coverage is 51.06383% with 23 lines in your changes missing coverage. Please review.
✅ Project coverage is 47.94%. Comparing base (2a9560c) to head (d46924f).

Files with missing lines	Patch %	Lines
...tools/command/lite/GetLiteGroupInfoSubCommand.java	0.00%	11 Missing ⚠️
...etmq/broker/metrics/LiteConsumerLagCalculator.java	75.00%	3 Missing and 5 partials ⚠️
...a/org/apache/rocketmq/common/lite/LiteLagInfo.java	0.00%	4 Missing ⚠️

Additional details and impacted files

@@              Coverage Diff              @@
##             develop   #10424      +/-   ##
=============================================
- Coverage      48.05%   47.94%   -0.11%     
+ Complexity     13303    13276      -27     
=============================================
  Files           1377     1377              
  Lines         100611   100629      +18     
  Branches       12991    12994       +3     
=============================================
- Hits           48347    48246     -101     
- Misses         46348    46437      +89     
- Partials        5916     5946      +30

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

oss-taishan-ai

Review by github-manager-bot

Summary

Optimizes LiteConsumerLagCalculator.getLagCountTopK by deferring expensive getStoreTimestamp calls to topK results only, lazy object creation, replacing split() with indexOf(), and replacing AtomicLong with long[].

Findings

[Correctness] ✅ Core optimizations are correct:
- AtomicLong → long[]: Correct since offsetTableForEachByGroup is single-threaded
- split() → indexOf() + substring(): Equivalent behavior, avoids array allocation
- isLiteTopicQueue(topicAtGroup) instead of isLiteTopicQueue(topicGroup[0]): Correct because isLiteTopicQueue only checks startsWith prefix
- Lazy LiteLagInfo creation: Only instantiates when qualifying for heap insertion
[Correctness] ⚠️ Post-heap lmqName reconstruction (lines 240-247):
```
String parentTopic = LiteMetadataUtil.getLiteBindTopic(group, brokerController);
for (LiteLagInfo lagInfo : topList) {
    String lmqName = LiteUtil.toLmqName(parentTopic, lagInfo.getLiteTopic());
    lagInfo.setEarliestUnconsumedTimestamp(getStoreTimestamp(lmqName, consumerOffset));
}
```
This reconstructs lmqName from group config instead of using the original lmqName from the offset table. This works in practice because the only caller (LiteManagerProcessor.getLiteGroupInfo) validates the group and liteBindTopic before calling getLagCountTopK. However, the method signature allows group to be null, and if getLiteBindTopic returns null, toLmqName returns null, causing getStoreTimestamp(null, ...) to fail. Consider adding a null guard:
```
if (parentTopic == null) {
    // Fall back to original behavior or skip timestamp lookup
    return Pair.of(topList, totalLagCount[0]);
}
```
[Readability] ⚠️ Operator precedence (line 222):
```
if (minHeap.size() < topK || minHeap.peek() != null && diff > minHeap.peek().getLagCount())
```
Java evaluates && before ||, so this is parsed as size < topK || (peek != null && diff > peek.getLagCount()), which is correct. Adding explicit parentheses would improve readability and prevent future maintenance errors:
```
if (minHeap.size() < topK || (minHeap.peek() != null && diff > minHeap.peek().getLagCount()))
```
[Performance] ✅ Excellent improvements:
- N → K getStoreTimestamp calls (major I/O reduction for groups with thousands of lite topics)
- N → K LiteLagInfo allocations
- No String.split() array allocations per entry
- No AtomicLong CAS overhead
[Tests] ✅ Test testGetLagCountTopK_NormalCase covers the new behavior including timestamp deferral and mocks SubscriptionGroupManager for liteBindTopic lookup.
[CI] ✅ All 10 checks pass (CodeQL, bazel-compile, maven-compile × 4, coverage, license, misspell).

Suggestions

Add null guard for parentTopic in the post-heap loop to handle edge cases defensively
Add explicit parentheses in the heap condition for readability
Consider adding a comment explaining why lmqName reconstruction is safe (i.e., caller guarantees valid group with liteBindTopic)

Automated review by github-manager-bot

f1amingo force-pushed the lite_consumer_lag branch 2 times, most recently from 061b5f2 to 634bca0 Compare June 4, 2026 02:24

imzs approved these changes Jun 4, 2026

View reviewed changes

f1amingo force-pushed the lite_consumer_lag branch 2 times, most recently from 43a1259 to 21e215a Compare June 4, 2026 02:58

[ISSUE apache#10423] Optimize LiteConsumerLagCalculator.getLagCountTo…

d46924f

…pK by deferring getStoreTimestamp to topK results only

f1amingo force-pushed the lite_consumer_lag branch from 21e215a to d46924f Compare June 4, 2026 03:06

oss-taishan-ai mentioned this pull request Jun 4, 2026

[Enhancement] Optimize LiteConsumerLagCalculator.getLagCountTopK by deferring expensive getStoreTimestamp to topK results only #10423

Open

1 task

oss-taishan-ai reviewed Jun 4, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ISSUE #10423] Optimize LiteConsumerLagCalculator.getLagCountTopK by deferring getStoreTimestamp to topK results only#10424

[ISSUE #10423] Optimize LiteConsumerLagCalculator.getLagCountTopK by deferring getStoreTimestamp to topK results only#10424
f1amingo wants to merge 1 commit into
apache:developfrom
f1amingo:lite_consumer_lag

f1amingo commented Jun 3, 2026 •

edited

Loading

Uh oh!

imzs left a comment

Uh oh!

codecov-commenter commented Jun 4, 2026

Uh oh!

oss-taishan-ai left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

f1amingo commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What is the purpose of the change

Brief changelog

Verifying this change

Uh oh!

imzs left a comment

Choose a reason for hiding this comment

Uh oh!

codecov-commenter commented Jun 4, 2026

Codecov Report

Uh oh!

oss-taishan-ai left a comment

Choose a reason for hiding this comment

Review by github-manager-bot

Summary

Findings

Suggestions

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

f1amingo commented Jun 3, 2026 •

edited

Loading