fix: empty loads for asymmetric cluster-group partial-load matchers#19565
fix: empty loads for asymmetric cluster-group partial-load matchers#19565clintropolis wants to merge 4 commits into
Conversation
FrankChen021
left a comment
There was a problem hiding this comment.
| Severity | Findings |
|---|---|
| P0 | 0 |
| P1 | 1 |
| P2 | 1 |
| P3 | 0 |
| Total | 2 |
Reviewed 7 of 7 changed files.
This is an automated review by Codex GPT-5.5
| return null; | ||
| } | ||
| final String fingerprint = computeFingerprint(resolved); | ||
| final String fingerprint = resolved.isEmpty() ? EMPTY_LOAD_FINGERPRINT : computeFingerprint(resolved); |
There was a problem hiding this comment.
[P1] Empty matches bypass FALL_THROUGH for fully unmatched groups
Returning a non-null empty MatchResult for every core segment means PartialLoadRule.appliesTo returns true even when the rule is configured with onCannotMatch=FALL_THROUGH. If a core shard group has clusterGroups but none of the segments match the configured patterns, RunRules stops on this partial rule for every segment, then EmptyDeferringHandler discards all buffered empties because no positive was seen. Later fallback rules never run, so segments that previously fell through to a full-load rule can be left with no assignment. The group-level discard path needs to preserve the configured cannot-match behavior.
There was a problem hiding this comment.
ah, so this is intentional behavior to not fall-through, however this comment made me realize that it was a bit too aggressive with the empty matches since it was applying to any clustered segments.
I only intended for empty matches to occur when the matcher is 'compatible' with the segment clustering, e.g. there is some overlap between what clustering columns the matcher is looking at and what columns the segment is actually clustered on so that we don't incorrect do an empty load for segments that are clustered completely differently from what the matcher is looking for.
I've updated the check to ensure that empty matches only occur when things are compatible, which was the intended behavior
| ); | ||
| // An empty index list is used when a cluster-group matcher applies to a (clustered) segment but no configured | ||
| // pattern matches its cluster groups. The historical-side partial loader honors this by loading nothing. | ||
| Preconditions.checkNotNull(clusterGroupIndices, "clusterGroupIndices"); |
There was a problem hiding this comment.
[P2] Empty loads still delegate to a full segment load
The constructor now accepts an empty clusterGroupIndices list for the empty-load sentinel, but PartialClusterGroupLoadSpec still inherits PartialLoadSpec.loadSegment(), which delegates directly to the inner load spec. A historical receiving this supposedly empty request will download the full segment and announce a matching full-fallback profile, so sparse matches can full-load every empty sibling instead of loading nothing. This needs an explicit empty-load implementation or another guard before emitting empty cluster-group load specs.
There was a problem hiding this comment.
yea, this will be the case until partial load stuff is fully merged
FrankChen021
left a comment
There was a problem hiding this comment.
Handled follow-up for this PR. Reviewed 8 of 8 changed files. The compatibility update now returns null for incompatible clustering schemes, so the prior FALL_THROUGH concern is addressed. The remaining empty-load loader delegation point was acknowledged as expected until the broader partial-load implementation lands. No inline reply is needed.
This is an automated review by Codex GPT-5.5
Description
This PR fixes a flaw with
ClusterGroupPartialLoadMatcher.match()when it asymmetrically matches clustered segments within a core-partition group, resulting in only some of the core-partition segments in the group being loaded and being 'incomplete' in the timeline.The fix is to allow
PartialLoadMatcherto perform 'empty' loads, where if the rule matcher is processing a clustered segment but none of the cluster groups match. Coupled with this,RunRulesguards against totally empty groups by using a newEmptyDeferringHandlerwrapperSegmentActionHandlerthat buffers these empty requests and 'flushes' them at shard-group borders, only performing empty loads when some part of a core partition group was a non-empty match.