Skip to content

Add batched SegmentZKMetadata iteration API and migrate controller usages#17706

Closed
xiangfu0 wants to merge 1 commit intoapache:masterfrom
xiangfu0:codex/add-batch-segment-metadata-api
Closed

Add batched SegmentZKMetadata iteration API and migrate controller usages#17706
xiangfu0 wants to merge 1 commit intoapache:masterfrom
xiangfu0:codex/add-batch-segment-metadata-api

Conversation

@xiangfu0
Copy link
Contributor

@xiangfu0 xiangfu0 commented Feb 15, 2026

Summary

  • Add batched/streaming SegmentZKMetadata enumeration API in PinotHelixResourceManager via ZKMetadataProvider
  • Add configurable batch size support (default 1000) for retention scanning
  • Migrate heavy controller callsites from list-based getSegmentsZKMetadata(...) to iterative processing where applicable

@codecov-commenter
Copy link

codecov-commenter commented Feb 15, 2026

Codecov Report

❌ Patch coverage is 62.18487% with 45 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.24%. Comparing base (657e7f0) to head (751c271).
⚠️ Report is 11 commits behind head on master.

Files with missing lines Patch % Lines
...troller/helix/core/retention/RetentionManager.java 54.71% 20 Missing and 4 partials ⚠️
...helix/core/minion/generator/BaseTaskGenerator.java 52.63% 3 Missing and 6 partials ⚠️
...ache/pinot/common/metadata/ZKMetadataProvider.java 84.61% 1 Missing and 3 partials ⚠️
...troller/helix/core/minion/ClusterInfoAccessor.java 0.00% 4 Missing ⚠️
...va/org/apache/pinot/controller/ControllerConf.java 33.33% 2 Missing ⚠️
...ntroller/helix/core/PinotHelixResourceManager.java 85.71% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             master   #17706    +/-   ##
==========================================
  Coverage     63.23%   63.24%            
  Complexity     1502     1502            
==========================================
  Files          3179     3181     +2     
  Lines        190710   190875   +165     
  Branches      29153    29171    +18     
==========================================
+ Hits         120597   120720   +123     
  Misses        60746    60746            
- Partials       9367     9409    +42     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 63.16% <62.18%> (-0.04%) ⬇️
java-21 63.20% <62.18%> (+0.03%) ⬆️
temurin 63.24% <62.18%> (+<0.01%) ⬆️
unittests 63.24% <62.18%> (+<0.01%) ⬆️
unittests1 55.58% <84.61%> (-0.05%) ⬇️
unittests2 34.11% <55.46%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@xiangfu0 xiangfu0 force-pushed the codex/add-batch-segment-metadata-api branch 2 times, most recently from 17da5ce to 2005659 Compare February 16, 2026 08:57
_pinotHelixResourceManager.getSegmentsZKMetadata(tableNameWithType);

for (SegmentZKMetadata segmentZKMetadata : segmentZKMetadataList) {
_pinotHelixResourceManager.forEachSegmentsZKMetadata(tableNameWithType, segmentZKMetadata -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this actually be slower due to more ZK point calls rather than earlier getChildren() calls ?

I think if use cases require all data, then earlier getChildren() calls is probably a better choice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For propertyStore, underlying it's also wrapping a for loop on top of zkclient to call each path to fetch the znRecord then construct the segment metadata.

So there is no difference in terms of the zk overhead.

GetChildren for all segment name is always just one zk call to fetch all.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a batched/streaming API for iterating over SegmentZKMetadata to reduce memory pressure when processing large numbers of segments. The changes add a new forEachSegmentZKMetadata method in ZKMetadataProvider that retrieves segment metadata in configurable batches (default 1000) and applies a consumer function to each segment, avoiding the need to load all segment metadata into memory at once.

Changes:

  • Added batched segment metadata iteration API (forEachSegmentZKMetadata) in ZKMetadataProvider with configurable batch size
  • Migrated RetentionManager, OfflineSegmentValidationManager, RealtimeSegmentValidationManager, and other controller components from list-based to consumer-based iteration
  • Added configuration support for retention manager batch size with dynamic updates via cluster config changes

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
pinot-common/src/main/java/org/apache/pinot/common/metadata/ZKMetadataProvider.java Added core forEachSegmentZKMetadata method with batched ZK reads
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/PinotHelixResourceManager.java Added wrapper methods for batched segment metadata iteration and migrated existing usages
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/minion/ClusterInfoAccessor.java Added forwarding methods for batched segment metadata iteration
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/minion/generator/BaseTaskGenerator.java Refactored to use batched iteration with fallback for existing test mocks
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/retention/RetentionManager.java Migrated all segment metadata processing to batched iteration with configurable batch size
pinot-controller/src/main/java/org/apache/pinot/controller/validation/OfflineSegmentValidationManager.java Refactored to use single-pass consumer-based iteration with array-based accumulators
pinot-controller/src/main/java/org/apache/pinot/controller/validation/RealtimeSegmentValidationManager.java Migrated to batched iteration while maintaining existing functionality
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotSegmentRestletResource.java Simplified to use consumer-based iteration for ZK metadata retrieval
pinot-controller/src/main/java/org/apache/pinot/controller/util/TableTierReader.java Migrated to consumer-based iteration for tier details retrieval
pinot-controller/src/main/java/org/apache/pinot/controller/ControllerConf.java Added configuration for segmentsZKMetadataBatchSize with default value of 1000
pinot-controller/src/test/java/org/apache/pinot/controller/helix/core/retention/RetentionManagerTest.java Added mock helper for both old and new API methods to support test cases
pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/test/java/org/apache/pinot/plugin/minion/tasks/realtimetoofflinesegments/RealtimeToOfflineSegmentsTaskGeneratorTest.java Contains debug code that should be removed

@xiangfu0 xiangfu0 force-pushed the codex/add-batch-segment-metadata-api branch 3 times, most recently from 473c632 to 031b0b7 Compare February 17, 2026 09:12
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a lot of places, the code has to cache all SegmentZKMetadata for a table. I don't think the solution in this PR is good enough to reduce memory footprint. We should consider caching them so that we only maintain one copy within memory.
Within each controller, we should make it ZK listener based, and cache the SegmentZKMetadata for the tables managed by the controller


List<SegmentZKMetadata> segmentsZKMetadata = _pinotHelixResourceManager.getSegmentsZKMetadata(realtimeTableName);
List<SegmentZKMetadata> segmentsZKMetadata = new ArrayList<>();
_pinotHelixResourceManager.forEachSegmentsZKMetadata(realtimeTableName, segmentsZKMetadata::add);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not reducing memory footprint

List<SegmentZKMetadata> selectedSegmentZKMetadataList = new ArrayList<>();
for (SegmentZKMetadata segmentZKMetadata : segmentZKMetadataList) {
if (segmentsForTable.contains(segmentZKMetadata.getSegmentName())) {
boolean[] callbackInvoked = new boolean[1];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to track this?

callbackInvoked[0] = true;
if (segmentsForTable.contains(segmentZKMetadata.getSegmentName())
&& (segmentMetadataFilter == null || segmentMetadataFilter.test(segmentZKMetadata))) {
selectedSegmentZKMetadataList.add(segmentZKMetadata);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not reducing memory footprint

@xiangfu0 xiangfu0 force-pushed the codex/add-batch-segment-metadata-api branch from 031b0b7 to 8307d49 Compare February 18, 2026 21:33
@xiangfu0 xiangfu0 force-pushed the codex/add-batch-segment-metadata-api branch from 8307d49 to 751c271 Compare February 18, 2026 21:44
@xiangfu0 xiangfu0 closed this Mar 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants