
feat(cache): support manifest file cache #345

Open
gripleaf wants to merge 1 commit into alibaba:main from gripleaf:feat/manifest-cache


gripleaf commented Jun 8, 2026

Purpose

Add an optional, process-local cache of raw manifest file bytes so that scan and planning paths do not re-read the same manifest files.

This change:

  • Adds manifest.cache-max-memory-size, defaulting to 64MB; 0 disables the cache.
  • Reuses the existing CacheManager to cache raw manifest file bytes.
  • Wires the shared manifest cache manager through CoreOptions, ManifestList, ManifestFile, and IndexManifestFile.
  • Reads cached manifest bytes through MemorySegmentInputStream, which keeps the underlying MemorySegment alive without copying bytes.
  • Uses a whole-file cache key to avoid extra file-status lookups on cache hits.
  • Bypasses the cache when IOHook is active so fault-injection tests still exercise real IO.
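
The behaviour listed above (a byte budget where 0 disables caching, a whole-file key, and a bypass while a hook is active) can be sketched roughly as follows. This is a minimal illustration, not the CacheManager used by this PR: `ManifestBytesCache`, `Loader`, and the `bypass` flag are hypothetical names, and LRU eviction is an assumption about the eviction policy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <list>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical illustration only; not the paimon-cpp CacheManager API.
using Bytes = std::vector<uint8_t>;
using Loader = std::function<Bytes(const std::string&)>;

class ManifestBytesCache {
 public:
  // max_bytes == 0 disables caching entirely.
  explicit ManifestBytesCache(size_t max_bytes) : max_bytes_(max_bytes) {}

  // Returns the whole file's bytes, keyed by path alone (whole-file key),
  // so a hit needs no extra file-status lookup.
  // bypass == true models an active IO hook: always perform real IO.
  std::shared_ptr<const Bytes> Get(const std::string& path, const Loader& load,
                                   bool bypass = false) {
    assert(static_cast<bool>(load));
    if (max_bytes_ == 0 || bypass) {
      return std::make_shared<Bytes>(load(path));
    }
    auto it = index_.find(path);
    if (it != index_.end()) {
      // Hit: mark the entry as most recently used and share its bytes.
      lru_.splice(lru_.begin(), lru_, it->second);
      return it->second->second;
    }
    std::shared_ptr<const Bytes> bytes = std::make_shared<Bytes>(load(path));
    if (bytes->size() > max_bytes_) {
      return bytes;  // Larger than the whole budget: serve it uncached.
    }
    used_ += bytes->size();
    lru_.emplace_front(path, bytes);
    index_[path] = lru_.begin();
    // Evict least recently used entries until the budget is respected.
    while (used_ > max_bytes_) {
      const Entry& victim = lru_.back();
      used_ -= victim.second->size();
      index_.erase(victim.first);
      lru_.pop_back();
    }
    return bytes;
  }

  size_t used_bytes() const { return used_; }

 private:
  using Entry = std::pair<std::string, std::shared_ptr<const Bytes>>;
  size_t max_bytes_;
  size_t used_ = 0;
  std::list<Entry> lru_;  // front = most recently used
  std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

Returning `std::shared_ptr<const Bytes>` means readers keep the bytes alive even if the entry is evicted while they are still reading, which is the same property the PR describes for MemorySegmentInputStream.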

Tests

  • git diff --check HEAD~1 HEAD
  • cmake -S . -B /tmp/paimon-cpp-manifest-cache-build -DPAIMON_BUILD_TESTS=ON -DPAIMON_DEPENDENCY_SOURCE=AUTO ...

Added/updated UT coverage:

  • CoreOptionsTest
    • verifies default manifest cache size is 64MB
    • verifies setting manifest.cache-max-memory-size=0 disables the cache
  • ManifestFileTest.TestReadUsesManifestCache
    • verifies repeated reads hit cache and avoid repeated file opens
    • verifies filtering still works after reading from cached bytes
  • ManifestFileTest.TestManifestCacheCanBeDisabled
    • verifies disabled cache preserves original repeated-open behavior
  • ManifestFileTest.TestManifestCacheBypassesWhenIOHookActive
    • verifies cache is bypassed while IOHook is active
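
The tests above read manifest entries back from cached bytes. A reader that shares the cached buffer instead of copying it could look roughly like the sketch below. `SharedBytesInputStream` is a hypothetical stand-in, not the real MemorySegmentInputStream, and a `std::vector<uint8_t>` stands in for a MemorySegment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in; not the real MemorySegmentInputStream.
class SharedBytesInputStream {
 public:
  explicit SharedBytesInputStream(std::shared_ptr<const std::vector<uint8_t>> bytes)
      : bytes_(std::move(bytes)) {
    assert(bytes_ != nullptr);
  }

  // Copies up to n bytes into out and returns how many were copied.
  // The cached buffer itself is never duplicated; only the caller's
  // destination is written.
  size_t Read(uint8_t* out, size_t n) {
    const size_t remaining = bytes_->size() - pos_;
    const size_t count = std::min(n, remaining);
    if (count > 0) {
      std::memcpy(out, bytes_->data() + pos_, count);
    }
    pos_ += count;
    return count;
  }

  // Moves the read position, clamped to the end of the buffer.
  void Seek(size_t pos) { pos_ = std::min(pos, bytes_->size()); }

  size_t Position() const { return pos_; }

 private:
  // Shared ownership keeps the buffer alive for the stream's lifetime,
  // even if the cache drops its own reference.
  std::shared_ptr<const std::vector<uint8_t>> bytes_;
  size_t pos_ = 0;
};
```

Because the stream owns a reference rather than a copy, each repeated read of a cached manifest allocates only the small stream object.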

Note: the local paimon-core-test build was blocked by third-party dependency downloads, because the configured proxy could not connect to GitHub.

API and Format

This change adds a new public table option:

  • manifest.cache-max-memory-size

It does not change storage format or protocol compatibility. The cache stores raw manifest file bytes in process memory only.
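
As a rough sketch of how a caller might resolve this option, the helper below reads it from a string map and falls back to the 64MB default. `ManifestCacheBytes` and `ManifestCacheEnabled` are hypothetical helpers, not part of CoreOptions; the sketch also assumes the value is a plain byte count and that 64MB means 64 * 1024 * 1024 bytes, whereas the real option may accept unit suffixes.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Option key taken from the PR description.
constexpr const char* kManifestCacheMaxMemorySize = "manifest.cache-max-memory-size";
// Assumption: the 64MB default is interpreted as binary megabytes.
constexpr uint64_t kDefaultManifestCacheBytes = 64ULL * 1024 * 1024;

// Hypothetical helper: returns the configured budget in bytes.
// Assumes the value is a plain decimal byte count such as "1048576".
inline uint64_t ManifestCacheBytes(const std::map<std::string, std::string>& options) {
  auto it = options.find(kManifestCacheMaxMemorySize);
  if (it == options.end()) {
    return kDefaultManifestCacheBytes;
  }
  assert(!it->second.empty());
  return std::stoull(it->second);
}

// Hypothetical helper: a budget of 0 disables the cache.
inline bool ManifestCacheEnabled(const std::map<std::string, std::string>& options) {
  return ManifestCacheBytes(options) > 0;
}
```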

Documentation

Yes. Added docs/source/user_guide/manifest_cache.rst and linked it from the user guide.

Generative AI tooling

Generated-by: OpenAI Codex


CLAassistant commented Jun 8, 2026

CLA assistant check
All committers have signed the CLA.

gripleaf force-pushed the feat/manifest-cache branch 2 times, most recently from 6e8e3cd to 7b71e6d (June 8, 2026 06:28)
gripleaf force-pushed the feat/manifest-cache branch from 7b71e6d to 805c2e3 (June 8, 2026 06:30)