[server] Optimize CompletedSnapshotStoreManager memory usage by using relative paths #2486

@LiebingYu

Search before asking

  • I searched in the issues and found nothing similar.

Description

Background

The CompletedSnapshotStoreManager in the Fluss coordinator has significant memory overhead because it repeatedly stores absolute paths in snapshot metadata. In production clusters with many table buckets, this can lead to OOM errors.

Current Implementation Issues

Currently, snapshot-related paths are stored as absolute paths in multiple places:

  1. CompletedSnapshot.snapshotLocation: Each snapshot stores a complete absolute path
  2. KvFileHandle.filePath: Each KV file stores a complete absolute path

This creates severe memory redundancy:

  • For snapshots belonging to the same TableBucket, their snapshotLocation values differ only in the final snapshot ID, while the base path prefix (e.g., hdfs://namenode:8020/fluss/kv/db1/table1-100/0/) is identical across all snapshots
  • Each KvFileHandle within a snapshot also stores the complete absolute file path, including highly repetitive path prefixes
  • In a typical scenario, with a 120-byte base path, 10 retained snapshots per bucket, and 100 files per snapshot, the repeated path prefixes alone consume about 120KB of memory per bucket (120 bytes × 10 snapshots × 100 files)
  • When multiplied across thousands of table buckets in a production cluster, this adds up to hundreds of MB of memory overhead, exceeds 1GB at around 10,000 buckets, and can eventually cause OOM
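
The redundancy described above suggests keeping the shared base path once per bucket and storing only the part of each path that differs. A minimal sketch of that idea follows. The class name `RelativePathSketch`, both helper methods, and the `snap-5/000012.sst` file layout are hypothetical and chosen for illustration only; they are not the actual Fluss classes or on-disk layout.

```java
/** Hypothetical sketch of relative path storage; not the actual Fluss implementation. */
final class RelativePathSketch {

    /** Strips the shared bucket base path and keeps only the suffix that differs per file. */
    static String toRelative(String basePath, String absolutePath) {
        if (!absolutePath.startsWith(basePath)) {
            throw new IllegalArgumentException("Path is not under base path: " + absolutePath);
        }
        // substring() copies the characters, so the result does not retain the long original string.
        return absolutePath.substring(basePath.length());
    }

    /** Rebuilds the absolute path on demand from the shared base path and the stored suffix. */
    static String toAbsolute(String basePath, String relativePath) {
        return basePath + relativePath;
    }

    public static void main(String[] args) {
        // Base path taken from the example in this issue; the file name below is an assumption.
        String base = "hdfs://namenode:8020/fluss/kv/db1/table1-100/0/";
        String absolute = base + "snap-5/000012.sst";

        String relative = toRelative(base, absolute);
        System.out.println("stored:   " + relative);
        System.out.println("resolved: " + toAbsolute(base, relative));
    }
}
```

With this shape, each retained file path costs only the length of its suffix, and the base path is held once per TableBucket instead of once per snapshot and per file.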

Memory Impact Analysis

Before optimization (absolute paths):

Per snapshot:
  - snapshotLocation: ~120 bytes (full path)
  - 100 KvFileHandles × ~150 bytes each = ~15KB
  Total per snapshot: ~15.12KB

Per bucket (10 snapshots): ~151KB
1000 buckets: ~151MB (paths only, excluding other metadata)
10000 buckets: ~1.5GB (paths only)
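
The arithmetic behind these figures can be reproduced with a few lines of Java. This is only a back-of-the-envelope sketch using the numbers from the estimate above and decimal units (1KB = 1000 bytes); the class and method names are hypothetical, and the calculation counts path bytes only, ignoring JVM object headers and string encoding.

```java
/** Hypothetical helper that reproduces the path-only memory estimate from this issue. */
final class SnapshotPathMemoryEstimate {

    /** Path bytes held by one snapshot: its own location plus one full path per KV file. */
    static long bytesPerSnapshot(long snapshotLocationBytes, long filesPerSnapshot, long bytesPerFilePath) {
        return snapshotLocationBytes + filesPerSnapshot * bytesPerFilePath;
    }

    /** Path bytes held by one bucket that retains the given number of snapshots. */
    static long bytesPerBucket(long bytesPerSnapshot, long retainedSnapshots) {
        return bytesPerSnapshot * retainedSnapshots;
    }

    public static void main(String[] args) {
        long perSnapshot = bytesPerSnapshot(120, 100, 150); // 15,120 bytes, about 15.12KB
        long perBucket = bytesPerBucket(perSnapshot, 10);    // 151,200 bytes, about 151KB

        System.out.println("per snapshot:   " + perSnapshot + " bytes");
        System.out.println("per bucket:     " + perBucket + " bytes");
        System.out.println("1000 buckets:   " + perBucket * 1_000 + " bytes");  // about 151MB
        System.out.println("10000 buckets:  " + perBucket * 10_000 + " bytes"); // about 1.5GB
    }
}
```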

Willingness to contribute

  • I'm willing to submit a PR!
