Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 16 additions & 0 deletions docs/docs/maintenance/write-performance.md
Original file line number Diff line number Diff line change
Expand Up @@ -128,6 +128,22 @@ here (For example, writing a large number of partitions simultaneously), you can
to use a Flink coordinator to cache the read manifest data to accelerate initialization. The cache memory for coordinator
is `sink.writer-coordinator.cache-memory`, default is 1GB in Job Manager.

The coordinator manifest cache normally holds entries with soft references, so the JVM can reclaim them when the Job
Manager runs low on heap. On a heavily loaded Job Manager this can backfire: the JVM reclaims cached manifests, writers
immediately re-read and decompress them, and that work drives heap back up, triggering more reclamation. The cache
thrashes instead of helping.

If you see this, set `sink.writer-coordinator.cache-soft-values` to `false`. Entries are then held with strong
references, so GC never reclaims them and the thrash loop cannot start.

With soft references off the cache no longer shrinks under GC, but it stays bounded by weight: it occupies up to
`sink.writer-coordinator.cache-memory` and evicts the least-recently-used entries beyond that. Size the Job Manager
total heap memory to at least twice that value so an undersized heap fails fast with an `OutOfMemoryError` instead of
degrading silently. Optionally set `sink.writer-coordinator.cache-expire-after-access` to also release entries that
have been idle for a while.

The same `cache.manifest.soft-values` / `cache.manifest.max-memory` settings apply to the catalog manifest cache.

## Write Memory

There are three main places in Paimon writer that takes up memory:
Expand Down
6 changes: 6 additions & 0 deletions docs/generated/catalog_configuration.html
Original file line number Diff line number Diff line change
Expand Up @@ -68,6 +68,12 @@
<td>MemorySize</td>
<td>Controls the threshold of small manifest file.</td>
</tr>
<tr>
<td><h5>cache.manifest.soft-values</h5></td>
<td style="word-wrap: break-word;">true</td>
<td>Boolean</td>
<td>If true (default), manifest cache entries are held with soft references and may be reclaimed by the GC under memory pressure. This can trigger a cache-thrash spiral where reclaimed entries are refetched, spiking heap and forcing further reclamation. Set to false to hold entries with strong references, breaking the spiral; the cache then stays bounded by weight up to 'cache.manifest.max-memory' (size the total heap memory to at least roughly twice that value).</td>
</tr>
<tr>
<td><h5>cache.partition.max-num</h5></td>
<td style="word-wrap: break-word;">0</td>
Expand Down
18 changes: 18 additions & 0 deletions docs/generated/flink_connector_configuration.html
Original file line number Diff line number Diff line change
Expand Up @@ -302,12 +302,24 @@
<td>Boolean</td>
<td>If true, flink sink will use managed memory for merge tree; otherwise, it will create an independent memory allocator.</td>
</tr>
<tr>
<td><h5>sink.writer-coordinator.cache-expire-after-access</h5></td>
<td style="word-wrap: break-word;">(none)</td>
<td>Duration</td>
<td>Optional idle TTL for writer coordinator manifest cache entries. Disabled by default. When set, an entry that has not been accessed within this duration is evicted, releasing its heap. The cache stays bounded by 'sink.writer-coordinator.cache-memory' regardless of this setting.</td>
</tr>
<tr>
<td><h5>sink.writer-coordinator.cache-memory</h5></td>
<td style="word-wrap: break-word;">2 gb</td>
<td>MemorySize</td>
<td>Controls the cache memory of writer coordinator to cache manifest files in Job Manager.</td>
</tr>
<tr>
<td><h5>sink.writer-coordinator.cache-soft-values</h5></td>
<td style="word-wrap: break-word;">true</td>
<td>Boolean</td>
<td>If true (default), writer coordinator manifest cache entries are held with soft references and may be reclaimed by the GC under memory pressure. This can trigger a cache-thrash spiral where reclaimed entries are refetched, spiking heap and forcing further reclamation. Set to false to hold entries with strong references, breaking the spiral; the cache then stays bounded by weight up to 'sink.writer-coordinator.cache-memory' (size the Job Manager total heap memory to at least roughly twice that value).</td>
</tr>
<tr>
<td><h5>sink.writer-coordinator.enabled</h5></td>
<td style="word-wrap: break-word;">false</td>
Expand All @@ -320,6 +332,12 @@
<td>MemorySize</td>
<td>Controls the page size for one RPC request of writer coordinator.</td>
</tr>
<tr>
<td><h5>sink.writer-coordinator.prefetch-manifests</h5></td>
<td style="word-wrap: break-word;">false</td>
<td>Boolean</td>
<td>If true, the writer coordinator eagerly reads all data manifests of the latest snapshot during refresh to warm the in-Job-Manager manifest cache. This avoids many concurrent cold manifest reads when high-parallelism writers restore at the same time, reducing Job Manager heap pressure at the cost of one full manifest read per refresh.</td>
</tr>
<tr>
<td><h5>sink.writer-cpu</h5></td>
<td style="word-wrap: break-word;">1.0</td>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -130,6 +130,20 @@ public class CatalogOptions {
.noDefaultValue()
.withDescription("Controls the maximum memory to cache manifest content.");

public static final ConfigOption<Boolean> CACHE_MANIFEST_SOFT_VALUES =
key("cache.manifest.soft-values")
.booleanType()
.defaultValue(true)
.withDescription(
"If true (default), manifest cache entries are held with soft references "
+ "and may be reclaimed by the GC under memory pressure. This can "
+ "trigger a cache-thrash spiral where reclaimed entries are "
+ "refetched, spiking heap and forcing further reclamation. Set to "
+ "false to hold entries with strong references, breaking the spiral; "
+ "the cache then stays bounded by weight up to "
+ "'cache.manifest.max-memory' (size the total heap memory to at "
+ "least roughly twice that value).");

public static final ConfigOption<Integer> CACHE_SNAPSHOT_MAX_NUM_PER_TABLE =
key("cache.snapshot.max-num-per-table")
.intType()
Expand Down
Loading
Loading