feat: Migrate block compression infrastructure with LZ4, ZSTD, and none backends #52
Thank you @ChaomingZhangCN for the contributions to the block compression infrastructure (the core framework, LZ4/ZSTD/NONE backends, and tests) migrated as part of this batch. 🎉
leaves12138
left a comment
I found a blocking issue in the LZ4 block decompressor: it reads the 8-byte header before validating that src_length is at least HEADER_LENGTH. If a truncated/corrupted block shorter than 8 bytes is passed in, Decompress will read past the supplied buffer instead of returning an Invalid status. Please add a src_length >= HEADER_LENGTH guard before ReadIntLE and cover it with a corruption test. No other blocking issue found in this pass.
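A minimal sketch of the requested guard, for illustration only. It assumes the 8-byte header holds two little-endian 32-bit integers (compressed length, then original length); that layout, the `BlockHeader` struct, the `ParseHeader` helper, and the `bool` return standing in for an Invalid status are all hypothetical and not taken from the PR. Only `HEADER_LENGTH`, `ReadIntLE`, and `src_length` are names from the comment above.

```cpp
#include <cstddef>
#include <cstdint>

// From the review comment: the header is 8 bytes long.
constexpr std::size_t HEADER_LENGTH = 8;

// Reads a 32-bit little-endian integer from exactly four bytes at p.
// The caller must guarantee that p[0..3] are readable.
inline std::int32_t ReadIntLE(const unsigned char* p) {
    const std::uint32_t v = static_cast<std::uint32_t>(p[0]) |
                            (static_cast<std::uint32_t>(p[1]) << 8) |
                            (static_cast<std::uint32_t>(p[2]) << 16) |
                            (static_cast<std::uint32_t>(p[3]) << 24);
    return static_cast<std::int32_t>(v);
}

// Hypothetical header layout (an assumption, not confirmed by the PR).
struct BlockHeader {
    std::int32_t compressed_length;
    std::int32_t original_length;
};

// Returns false (standing in for an Invalid status) rather than reading past
// the supplied buffer when the block is truncated or the header is corrupt.
inline bool ParseHeader(const unsigned char* src, std::size_t src_length,
                        BlockHeader* out) {
    // The guard the review asks for: validate the length BEFORE ReadIntLE.
    if (src == nullptr || out == nullptr || src_length < HEADER_LENGTH) {
        return false;
    }
    BlockHeader header;
    header.compressed_length = ReadIntLE(src);
    header.original_length = ReadIntLE(src + 4);
    // Reject negative lengths and payloads larger than what was supplied.
    if (header.compressed_length < 0 || header.original_length < 0) {
        return false;
    }
    if (static_cast<std::size_t>(header.compressed_length) >
        src_length - HEADER_LENGTH) {
        return false;
    }
    *out = header;
    return true;
}
```

A corruption test along these lines would pass a buffer shorter than 8 bytes and expect the failure path, without the decompressor touching any byte beyond `src_length`.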
leaves12138
left a comment
Re-reviewed the latest revision. The previous blocking issue is fixed: the LZ4 decompressor now validates that the input is at least the header length before reading the header, and the added regression coverage exercises the truncated-header and insufficient-output-buffer cases. I did not find any remaining blocking issue within the staged migration scope. Approving.
Purpose
No linked issue.
Migrate block compression framework for SST file I/O:
Block compression core (src/paimon/common/compression/):
- BlockCompressionType: compression type enum (LZ4, ZSTD, NONE) (block_compression_type.h)
- BlockCompressor: abstract compressor interface (block_compressor.h/cpp)
- BlockDecompressor: abstract decompressor interface (block_decompressor.h/cpp)
- BlockCompressionFactory: factory for creating compressor/decompressor by type (block_compression_factory.h/cpp)
- NoneBlockCompressionFactory: passthrough (no-op) compression factory (none_block_compression_factory.h)
LZ4 backend (src/paimon/common/compression/lz4/):
- Lz4BlockCompressionFactory: LZ4 compression factory (lz4_block_compression_factory.h)
- Lz4BlockCompressor: LZ4 compressor implementation (lz4_block_compressor.h)
- Lz4BlockDecompressor: LZ4 decompressor implementation (lz4_block_decompressor.h)
ZSTD backend (src/paimon/common/compression/zstd/):
- ZstdBlockCompressionFactory: ZSTD compression factory (zstd_block_compression_factory.h)
- ZstdBlockCompressor: ZSTD compressor implementation (zstd_block_compressor.h)
- ZstdBlockDecompressor: ZSTD decompressor implementation (zstd_block_decompressor.h)
Tests
- block_compression_factory_test.cpp: factory creation, round-trip compress/decompress for LZ4, ZSTD, and NONE
API and Format
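For orientation, the layering listed under Purpose (type enum, abstract compressor and decompressor, a factory selected by type, and a passthrough backend) might look roughly like the sketch below. The class names match the PR description, but every method name and signature here (`Compress`, `Decompress`, `GetCompressor`, `GetDecompressor`, `GetCompressionType`, `Create`), as well as `NoneBlockCompressor` and `NoneBlockDecompressor`, is an assumption made for illustration. The LZ4 and ZSTD branches are left out because they need external libraries.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

enum class BlockCompressionType { NONE, LZ4, ZSTD };

// Abstract compressor: returns bytes written to dst, or -1 if dst is too small.
class BlockCompressor {
public:
    virtual ~BlockCompressor() = default;
    virtual long Compress(const char* src, std::size_t src_length,
                          char* dst, std::size_t dst_length) = 0;
};

// Abstract decompressor: returns bytes written to dst, or -1 on failure.
class BlockDecompressor {
public:
    virtual ~BlockDecompressor() = default;
    virtual long Decompress(const char* src, std::size_t src_length,
                            char* dst, std::size_t dst_length) = 0;
};

// Shared passthrough copy used by the hypothetical NONE backend.
inline long CopyBlock(const char* src, std::size_t src_length,
                      char* dst, std::size_t dst_length) {
    if (src_length > dst_length) return -1;  // insufficient output buffer
    if (src_length > 0) std::memcpy(dst, src, src_length);
    return static_cast<long>(src_length);
}

class NoneBlockCompressor : public BlockCompressor {
public:
    long Compress(const char* src, std::size_t src_length,
                  char* dst, std::size_t dst_length) override {
        return CopyBlock(src, src_length, dst, dst_length);
    }
};

class NoneBlockDecompressor : public BlockDecompressor {
public:
    long Decompress(const char* src, std::size_t src_length,
                    char* dst, std::size_t dst_length) override {
        return CopyBlock(src, src_length, dst, dst_length);
    }
};

// Factory interface plus a static entry point that selects a backend by type.
class BlockCompressionFactory {
public:
    virtual ~BlockCompressionFactory() = default;
    virtual BlockCompressionType GetCompressionType() const = 0;
    virtual std::unique_ptr<BlockCompressor> GetCompressor() const = 0;
    virtual std::unique_ptr<BlockDecompressor> GetDecompressor() const = 0;
    static std::unique_ptr<BlockCompressionFactory> Create(BlockCompressionType type);
};

class NoneBlockCompressionFactory : public BlockCompressionFactory {
public:
    BlockCompressionType GetCompressionType() const override {
        return BlockCompressionType::NONE;
    }
    std::unique_ptr<BlockCompressor> GetCompressor() const override {
        return std::unique_ptr<BlockCompressor>(new NoneBlockCompressor());
    }
    std::unique_ptr<BlockDecompressor> GetDecompressor() const override {
        return std::unique_ptr<BlockDecompressor>(new NoneBlockDecompressor());
    }
};

inline std::unique_ptr<BlockCompressionFactory> BlockCompressionFactory::Create(
    BlockCompressionType type) {
    switch (type) {
        case BlockCompressionType::NONE:
            return std::unique_ptr<BlockCompressionFactory>(
                new NoneBlockCompressionFactory());
        default:
            // LZ4 and ZSTD backends are omitted from this sketch.
            return nullptr;
    }
}
```

Keeping callers on the factory and the two abstract interfaces means SST read and write paths never name a concrete codec, so a new backend only needs another case in the selection step.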
Documentation
Generative AI tooling
Migrate-by: Aone Copilot (Claude)