[CELEBORN-2350] Support chunk level compression to optimize storage by saurabhd336 · Pull Request #3699 · apache/celeborn

saurabhd336 · 2026-05-23T16:44:44Z

What changes were proposed in this pull request?

Adds chunk-level ZSTD compression on the worker write path and streaming decompression on the client read path. Records accumulate in a fixed-size chunk buffer (default 8 MB); when the buffer overflows, its contents are compressed as a single ZSTD frame and written to the disk file. On the read side, CelebornInputStream wraps each fetched chunk in a ZstdInputStream to decompress it.

This is orthogonal to the existing batch-level LZ4/ZSTD codec: both can be active simultaneously, or the batch codec can be NONE.

This change primarily reduces disk usage (~40% lower disk usage seen in tests) and also reduces Celeborn network egress on the read flow.

Impl details

Writer side

Added a new FileChannelWriter interface that supports write and close. BypassFileChannelWriter is the default implementation and preserves the current behaviour (writing flushBuffer directly to the disk file channel).
Added ChunkCompressedFileChannelWriter, which accumulates records in a direct ByteBuffer of chunkSize bytes. On overflow, it ZSTD-compresses the buffer and writes it as a single frame. Records larger than chunkSize stream directly to disk via ZstdOutputStream. On close, it replaces the chunk-boundary offsets in ReduceFileMeta with the compressed offsets and uses bytesFlushed to overwrite the FileInfo length. The buffers that hold chunkSize bytes of data before compression and flush are backed by MmapMemoryManager and ChunkBufferPool, which use mmap'ed temporary files to avoid the memory overhead of buffering chunk-sized data.
The choice between ChunkCompressedFileChannelWriter and the default BypassFileChannelWriter is based on the new config set by the client during ReserveSlots (conf.isChunkCompressionEnabled).
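
The writer-side flow above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: it writes to an in-memory sink instead of a file channel, uses a heap ByteBuffer instead of an mmap-backed one, and swaps in the JDK's DEFLATE (`java.util.zip`) for ZSTD so that it runs without extra dependencies. The class names, the `create` factory and the `chunkOffsets` / `chunkCompressed` lists are hypothetical. Large records are written raw here, following the later commit message in this PR rather than the ZstdOutputStream path mentioned in the description.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.DeflaterOutputStream;

// Hypothetical sketch: names echo the PR description, bodies are illustrative only.
final class ChunkWriterSketch {

  /** Stand-in for the FileChannelWriter interface (write / close). */
  interface FileChannelWriter {
    void write(byte[] record) throws IOException;
    void close() throws IOException;
  }

  /** Default path: every record goes straight to the sink, unchanged. */
  static final class BypassWriter implements FileChannelWriter {
    private final ByteArrayOutputStream sink;
    BypassWriter(ByteArrayOutputStream sink) { this.sink = sink; }
    public void write(byte[] record) { sink.write(record, 0, record.length); }
    public void close() {}
  }

  /** Buffers records into a fixed-size chunk and compresses the chunk on overflow. */
  static final class ChunkCompressedWriter implements FileChannelWriter {
    private final ByteArrayOutputStream sink;
    private final ByteBuffer chunk;                                  // heap buffer; the PR uses mmap-backed buffers
    final List<Long> chunkOffsets = new ArrayList<>();               // start offset of each chunk in the sink
    final List<Boolean> chunkCompressed = new ArrayList<>();         // whether each chunk was compressed

    ChunkCompressedWriter(ByteArrayOutputStream sink, int chunkSize) {
      this.sink = sink;
      this.chunk = ByteBuffer.allocate(chunkSize);
    }

    public void write(byte[] record) throws IOException {
      if (record.length > chunk.capacity()) {
        flushChunk();                 // preserve record order before the large record
        appendChunk(record, false);   // assumption: large records are written raw
        return;
      }
      if (record.length > chunk.remaining()) {
        flushChunk();                 // buffer would overflow: compress and flush it first
      }
      chunk.put(record);
    }

    public void close() throws IOException { flushChunk(); }

    private void flushChunk() throws IOException {
      if (chunk.position() == 0) return;
      ByteArrayOutputStream compressed = new ByteArrayOutputStream();
      // DEFLATE stands in for a single ZSTD frame here.
      try (DeflaterOutputStream out = new DeflaterOutputStream(compressed)) {
        out.write(chunk.array(), 0, chunk.position());
      }
      appendChunk(compressed.toByteArray(), true);
      chunk.clear();
    }

    private void appendChunk(byte[] bytes, boolean isCompressed) {
      chunkOffsets.add((long) sink.size());
      chunkCompressed.add(isCompressed);
      sink.write(bytes, 0, bytes.length);
    }
  }

  /** Picks the writer from a boolean flag, mirroring the config-driven choice. */
  static FileChannelWriter create(boolean enabled, ByteArrayOutputStream sink, int chunkSize) {
    return enabled ? new ChunkCompressedWriter(sink, chunkSize) : new BypassWriter(sink);
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    ChunkCompressedWriter writer = new ChunkCompressedWriter(sink, 64);
    writer.write(new byte[40]);
    writer.write(new byte[40]);   // does not fit: first chunk is compressed and flushed
    writer.write(new byte[100]);  // larger than the chunk: flushed raw as its own chunk
    writer.close();
    System.out.println(writer.chunkCompressed); // prints [true, true, false]
  }
}
```

Recording a per-chunk compressed flag next to the offsets is what lets a reader skip decompression for the raw large-record chunks.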

Read side

No changes in the worker (yet to implement: partition sorting during the AQE flow).
CelebornInputStream: when reading chunk-compressed chunks, wraps the fetched ByteBuf in a ZstdInputStream to decompress it in place while reading.
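
As a rough illustration of the read side, the sketch below opens each fetched chunk either through a decompressing stream or as-is, depending on a per-chunk flag (a later commit in this PR mentions `ReduceFileMeta.getChunkCompressed()` for this purpose). It is not the actual CelebornInputStream code: chunks are plain byte arrays rather than Netty ByteBufs, the JDK's `InflaterInputStream` stands in for `ZstdInputStream`, and `openChunk`, `readAll` and `compress` are hypothetical helper names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical sketch of per-chunk streaming decompression on the client.
final class ChunkReaderSketch {

  /** Test helper: compresses bytes the way the writer sketch would (DEFLATE stands in for ZSTD). */
  static byte[] compress(byte[] raw) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (DeflaterOutputStream out = new DeflaterOutputStream(bos)) {
      out.write(raw, 0, raw.length);
    }
    return bos.toByteArray();
  }

  /** Wraps one fetched chunk; only chunks flagged as compressed get a decompressing stream. */
  static InputStream openChunk(byte[] chunkBytes, boolean compressed) {
    InputStream in = new ByteArrayInputStream(chunkBytes);
    return compressed ? new InflaterInputStream(in) : in;
  }

  /** Reads all chunks in order and returns the concatenated, decompressed records. */
  static byte[] readAll(byte[][] chunks, boolean[] compressedFlags) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    for (int i = 0; i < chunks.length; i++) {
      try (InputStream in = openChunk(chunks[i], compressedFlags[i])) {
        int n;
        while ((n = in.read(buf)) != -1) {
          out.write(buf, 0, n);
        }
      }
    }
    return out.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] small = "small-records;".getBytes(StandardCharsets.UTF_8);
    byte[] large = "large-record-written-raw".getBytes(StandardCharsets.UTF_8);
    byte[][] chunks = {compress(small), large};
    boolean[] flags = {true, false};
    System.out.println(new String(readAll(chunks, flags), StandardCharsets.UTF_8));
    // prints small-records;large-record-written-raw
  }
}
```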

Configs added

| Key | Default | Meaning |
| --- | --- | --- |
| `celeborn.chunk.compression.enabled` | `false` | Client side config. Enables chunk-level ZSTD compression on the worker write path and transparent decompression in `CelebornInputStream`. |
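
To make the default concrete, here is a small sketch of how a boolean flag with this key and a `false` default could be resolved. It uses `java.util.Properties` purely as a stand-in; the real accessor lives on the Celeborn conf object (`conf.isChunkCompressionEnabled`), and the class and method below are hypothetical.

```java
import java.util.Properties;

// Hypothetical stand-in for the conf accessor; not Celeborn's actual config class.
final class ChunkCompressionConfSketch {
  static final String KEY = "celeborn.chunk.compression.enabled";

  /** Returns the flag value, falling back to false when the key is not set. */
  static boolean isChunkCompressionEnabled(Properties conf) {
    return Boolean.parseBoolean(conf.getProperty(KEY, "false"));
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    System.out.println(isChunkCompressionEnabled(conf)); // prints false (default)
    conf.setProperty(KEY, "true");
    System.out.println(isChunkCompressionEnabled(conf)); // prints true
  }
}
```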

Does this PR resolve a correctness bug?

No

Does this PR introduce any user-facing change?

Adds the celeborn.chunk.compression.enabled config to enable or disable chunk-level compression (disabled by default).

How was this patch tested?

Unit tests and integration tests.

saurabhd336 · 2026-05-25T11:32:22Z

Hi team @SteNicholas / @s0nskar / @zaynt4606 / others

I wanted to start an early discussion for these proposed changes (can share a design doc too).

At a high level, our Celeborn infra costs are largely driven by the need for large locally attached SSDs. We are also looking for ways to reduce Celeborn network ingress / egress. This change helps in reducing both.

I wanted to start a discussion on this change. Thanks!

codecov · 2026-05-26T16:22:45Z

Codecov Report

❌ Patch coverage is 95.00000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 67.28%. Comparing base (b4cb5a0) to head (39595da).
⚠️ Report is 54 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| .../org/apache/celeborn/common/meta/DiskFileInfo.java | 85.72% | 1 Missing ⚠️ |
| ...rg/apache/celeborn/common/meta/ReduceFileMeta.java | 88.89% | 0 Missing and 1 partial ⚠️ |
| ...leborn/common/network/buffer/FileChunkBuffers.java | 50.00% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3699      +/-   ##
==========================================
+ Coverage   66.91%   67.28%   +0.37%     
==========================================
  Files         358      360       +2     
  Lines       21986    22337     +351     
  Branches     1946     1982      +36     
==========================================
+ Hits        14710    15027     +317     
- Misses       6262     6285      +23     
- Partials     1014     1025      +11


…ge-record chunks

readChunks() was unconditionally decompressing every chunk, but large records are written raw (uncompressed) by flushLargeRecord(). The fix consults ReduceFileMeta.getChunkCompressed() per chunk and only calls ZstdInputStream on chunks that were actually compressed. Also exposes compressAndFlush() as public so the test can call it directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

saurabhd336 · 2026-06-05T10:36:51Z

Hi team
@SteNicholas / @s0nskar / @zaynt4606 / others

Ping on this again!

So far, I've been able to get a ~40% reduction in Celeborn worker disk usage with a roughly 80% increase in CPU usage. The CPUs in our Celeborn fleets are idle much of the time, so this tradeoff feels reasonable and worth testing at scale for us. Can you please take a look at this PR?

saurabhd336 added 2 commits May 23, 2026 22:13

Support chunk compressed data writer

6e53791

Add read support

53a4e81

github-actions Bot added module:client kind:build module:common module:worker labels May 23, 2026

saurabhd336 added 3 commits May 24, 2026 09:17

Add tests

5520acc

lint fix

f917a57

Add e2e test

cde5467

saurabhd336 changed the title ~~[FEATURE] [WIP] Support chunk level compression to optimize storage~~ [CELEBORN-XXXX] [FEATURE] [WIP] Support chunk level compression to optimize storage May 25, 2026

Fix lint

4f76be5

saurabhd336 added 3 commits May 26, 2026 13:45

Fix sbt build

c3ed330

Fix compression level compilation

485dd23

Fix client.md

3d1daa4

github-actions Bot added the kind:documentation label May 26, 2026

saurabhd336 added 2 commits May 26, 2026 20:25

Fix test

c3d4bc8

Fix test

7c2cbff

SteNicholas force-pushed the main branch 2 times, most recently from 1d92a40 to cf8d472 Compare May 27, 2026 02:11

saurabhd336 and others added 9 commits June 1, 2026 12:48

Fix chunk compressed writer

78611fd

Avoid chunk compression during large records

187906a

Move to chunk compression context message

3533e52

Fix compression

174db3d

Don't compress large records

91a49e7

Lint fix

39595da

Fix compilation

6d8a640

Fix test

3b1ea63

Fix lint

c9b4785

saurabhd336 changed the title ~~[CELEBORN-XXXX] [FEATURE] [WIP] Support chunk level compression to optimize storage~~ [CELEBORN-XXXX] Support chunk level compression to optimize storage Jun 5, 2026

saurabhd336 changed the title ~~[CELEBORN-XXXX] Support chunk level compression to optimize storage~~ [CELEBORN-2350] Support chunk level compression to optimize storage Jun 5, 2026

saurabhd336 wants to merge 21 commits into apache:main from saurabhd336:chunkCompressedWriter
