Problem Statement
Upsert tables currently cannot use tiered storage, even when segments beyond the upsert window are effectively immutable and could safely be moved to cold storage.
// Current validation in TableConfigUtils.java:779-780
Preconditions.checkState(tableConfig.getTierConfigsList() == null || tableConfig.getTierConfigsList().isEmpty(),
    "Tiered storage is not supported for Upsert/Dedup tables");
This blanket prohibition prevents cost optimization for upsert tables with time-bounded upsert windows (configured via metadataTTL).
Proposed Solution
Allow tiered storage for upsert tables when metadataTTL < minSegmentAge, ensuring segments only move to cold tier after their validDocIds bitmaps have been frozen.
Example Configuration:
metadataTTL: 3 days (upsert window)
minSegmentAge: 7 days (tier boundary)
Result: Segments 7+ days old can safely move to cold tier, since they have been outside the upsert window (and therefore have received no upserts) for at least 4 days
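The arithmetic behind this example can be written out as a small sketch. `TierBoundaryMargin` and `frozenMargin` are hypothetical names used only for illustration and are not part of Pinot. The sketch also assumes that metadataTTL and minSegmentAge are measured on the same time axis.

```java
import java.time.Duration;

// Hypothetical helper, not part of Pinot: illustrates the timing argument only.
public class TierBoundaryMargin {

    // Minimum time a segment has been outside the upsert window
    // by the moment it becomes eligible for the cold tier.
    static Duration frozenMargin(Duration metadataTtl, Duration minSegmentAge) {
        if (metadataTtl.compareTo(minSegmentAge) >= 0) {
            throw new IllegalArgumentException("metadataTTL must be smaller than minSegmentAge");
        }
        return minSegmentAge.minus(metadataTtl);
    }

    public static void main(String[] args) {
        Duration margin = frozenMargin(Duration.ofDays(3), Duration.ofDays(7));
        // prints "Frozen for at least 4 days before tier move"
        System.out.println("Frozen for at least " + margin.toDays() + " days before tier move");
    }
}
```

If the two values were equal or reversed, a segment could reach the tier boundary while still inside the upsert window, which is the case the proposed constraint rules out.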
Why This Is Safe
Bitmaps freeze after TTL: Once a segment passes metadataTTL, the metadata manager stops tracking it and its validDocIds bitmap never receives updates again (see isOutOfMetadataTTL() in BasePartitionUpsertMetadataManager)
No write conflicts: Expired segments are removed from _trackedSegments and _primaryKeyToRecordLocationMap, so no upsert operations will attempt to modify their bitmaps
Bitmap storage model supports this: Bitmaps are persisted with segments as validdocids.bitmap.snapshot files and move with the segment to cold tier. Queries only need read access.
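Proven pattern: Dedup tables already support this exact approach (added in PR #17154, "Make dedup table use StrictRealtimeSegmentAssignment with support of multi tiers", commit ccc41ea8e5):
// From validateTTLAndTierConfigsForDedupTable()
Preconditions.checkState(ttlInMs < minSegmentAgeInMs,
    "MetadataTTL: %s(ms) must be smaller than the minimum segmentAge: %s(ms)",
    ttlInMs, minSegmentAgeInMs);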
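To make the "bitmaps freeze after TTL" argument concrete, here is a simplified model of a TTL watermark check. It is a sketch under assumptions, not the actual code of isOutOfMetadataTTL(): the class `TtlWatermarkSketch`, its fields, and the exact comparison are hypothetical, and it assumes the TTL is expressed in the same units as the comparison column.

```java
// Simplified, hypothetical model of a TTL watermark check; not Pinot's actual class.
public class TtlWatermarkSketch {
    // TTL in the same units as the comparison column (assumption).
    private final double metadataTtl;
    // Largest comparison value seen so far; starts unset.
    private double largestSeenComparisonValue = Double.NEGATIVE_INFINITY;

    TtlWatermarkSketch(double metadataTtl) {
        this.metadataTtl = metadataTtl;
    }

    // Advance the watermark as records are ingested.
    void observe(double comparisonValue) {
        largestSeenComparisonValue = Math.max(largestSeenComparisonValue, comparisonValue);
    }

    // A segment whose newest record is older than (watermark - TTL) is treated as
    // expired: upserts no longer touch it, so its validDocIds bitmap stays fixed.
    boolean isOutOfMetadataTtl(double segmentMaxComparisonValue) {
        return metadataTtl > 0
            && segmentMaxComparisonValue < largestSeenComparisonValue - metadataTtl;
    }

    public static void main(String[] args) {
        TtlWatermarkSketch sketch = new TtlWatermarkSketch(3.0);
        sketch.observe(10.0);
        System.out.println(sketch.isOutOfMetadataTtl(6.0)); // 6 < 10 - 3, prints true
        System.out.println(sketch.isOutOfMetadataTtl(8.0)); // 8 >= 10 - 3, prints false
    }
}
```

In this model, a segment that has crossed the watermark can never come back inside the window because the watermark only moves forward, which is the property the tier move relies on.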
Implementation Approach
Update validation in TableConfigUtils.validateUpsertAndDedupConfig() to allow tiered configs when TTL constraint is met
Add validateTTLAndTierConfigsForUpsertTable() method similar to existing dedup validation
Update segment assignment policy to use multi-tier assignment when constraint is satisfied
Require metadataTTL > 0 when tiered storage is configured for upsert tables
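The validation steps above could look roughly like the following. This is a minimal sketch, not the proposed patch: `UpsertTierValidationSketch` is a hypothetical stand-in for validateTTLAndTierConfigsForUpsertTable(), it takes plain millisecond values instead of Pinot's TableConfig, and it throws IllegalStateException directly rather than going through Preconditions so that it compiles on its own.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the proposed validateTTLAndTierConfigsForUpsertTable().
public class UpsertTierValidationSketch {

    // metadataTtlMs: upsert metadata TTL in milliseconds (assumed unit).
    // tierSegmentAgesMs: segment age threshold of each configured tier, in milliseconds.
    static void validate(long metadataTtlMs, List<Long> tierSegmentAgesMs) {
        // No tier configs: existing upsert tables are unaffected.
        if (tierSegmentAgesMs == null || tierSegmentAgesMs.isEmpty()) {
            return;
        }
        // Tiered storage requires a bounded upsert window.
        if (metadataTtlMs <= 0) {
            throw new IllegalStateException(
                "metadataTTL must be > 0 when tiered storage is configured for upsert tables");
        }
        // Same constraint as the dedup validation: TTL < minimum segment age.
        long minSegmentAgeMs = Collections.min(tierSegmentAgesMs);
        if (metadataTtlMs >= minSegmentAgeMs) {
            throw new IllegalStateException(String.format(
                "MetadataTTL: %d(ms) must be smaller than the minimum segmentAge: %d(ms)",
                metadataTtlMs, minSegmentAgeMs));
        }
    }

    public static void main(String[] args) {
        long dayMs = 24L * 60 * 60 * 1000;
        validate(3 * dayMs, List.of(7 * dayMs, 30 * dayMs)); // passes
        System.out.println("3d TTL with 7d/30d tiers is valid");
    }
}
```

Checking against the minimum age across all tiers means the constraint holds for the first tier boundary a segment can cross, and therefore for every later one.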
Benefits
Storage cost reduction: Move cold segments to cheaper storage tiers
No correctness impact: Frozen segments guarantee query correctness
Consistent with dedup: Uses same validation pattern already proven in production
Backward compatible: Existing upsert tables without tiered configs are unaffected
References
PR #17154, "Make dedup table use StrictRealtimeSegmentAssignment with support of multi tiers" (commit ccc41ea8e5)
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/TableConfigUtils.java
BasePartitionUpsertMetadataManager.isOutOfMetadataTTL()