Write vortex-compact with Zstdbuffers instead of Zstd with unstable_encodings #8542
myrrc wants to merge 2 commits into
Merging this PR will not alter performance
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | slice_empty_vortex | 310 ns | 368.3 ns | -15.84% |
| ⚡ | WallTime | cuda/bitpacked_u8/unpack/3bw[100M] | 353.7 µs | 299.8 µs | +17.97% |
| ⚡ | Simulation | encode_varbin[(1000, 8)] | 159.3 µs | 140.4 µs | +13.46% |
| ⚡ | Simulation | encode_varbin[(1000, 4)] | 158.3 µs | 139.8 µs | +13.27% |
| ⚡ | Simulation | encode_varbin[(1000, 2)] | 157.6 µs | 140.7 µs | +12.02% |
| ⚡ | Simulation | encode_varbin[(1000, 32)] | 160.1 µs | 144.8 µs | +10.58% |
Comparing myrrc/bench-zstdbuffers-byte-length (6cd01e8) with develop (5e95e75) [2]
Footnotes
1. 4 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.
2. No successful run was found on develop (51752c8) during the generation of this report, so 5e95e75 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.
|
I've experimented with implementing ByteLength and CompareKernels, but this didn't yield any performance gains. |
Signed-off-by: Mikhail Kot <mikhail@spiraldb.com>
Benchmarks: PolarSignals Profiling (base)
Vortex (geomean): 0.958x ➖
datafusion / vortex-file-compressed (0.958x ➖, 1↑ 0↓)
No file size changes detected. |
Benchmarks: FineWeb NVMe (base)
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.005x ➖, 1↑ 0↓)
datafusion / parquet (0.996x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.008x ➖, 0↑ 0↓)
duckdb / parquet (1.014x ➖, 0↑ 1↓)
File Size Changes (3 files changed, -46.3% overall, 1↑ 2↓)
|
Benchmarks: TPC-H SF=1 on NVME (base)
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.983x ➖, 0↑ 0↓)
datafusion / parquet (0.982x ➖, 1↑ 0↓)
datafusion / arrow (0.984x ➖, 1↑ 1↓)
duckdb / vortex-file-compressed (0.955x ➖, 2↑ 0↓)
duckdb / parquet (0.991x ➖, 2↑ 2↓)
File Size Changes (17 files changed, -44.3% overall, 5↑ 12↓)
|
Benchmarks: TPC-DS SF=1 on NVME (base)
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.905x ➖, 44↑ 0↓)
datafusion / parquet (0.920x ➖, 27↑ 0↓)
duckdb / vortex-file-compressed (0.913x ➖, 36↑ 1↓)
duckdb / parquet (0.946x ➖, 8↑ 0↓)
File Size Changes (30 files changed, -43.4% overall, 0↑ 30↓)
|
Benchmarks: FineWeb S3 (base)
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.915x ➖, 1↑ 0↓)
datafusion / parquet (1.116x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.103x ➖, 0↑ 1↓)
duckdb / parquet (1.026x ➖, 0↑ 0↓)
|
Benchmarks: Statistical and Population Genetics (base)
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (0.998x ➖, 0↑ 0↓)
duckdb / parquet (1.001x ➖, 0↑ 0↓)
File Size Changes (3 files changed, -32.3% overall, 1↑ 2↓)
|
Benchmarks: TPC-H SF=10 on NVME (base)
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.995x ➖, 0↑ 0↓)
datafusion / parquet (0.990x ➖, 0↑ 0↓)
datafusion / arrow (0.985x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (0.990x ➖, 0↑ 0↓)
duckdb / parquet (0.991x ➖, 0↑ 0↓)
File Size Changes (47 files changed, -44.4% overall, 10↑ 37↓)
|
Benchmarks: Clickbench on NVME (base)
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.009x ➖, 0↑ 0↓)
datafusion / parquet (0.995x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.934x ➖, 12↑ 0↓)
duckdb / parquet (0.979x ➖, 2↑ 1↓)
File Size Changes (201 files changed, -39.1% overall, 50↑ 151↓)
|
Benchmarks: TPC-H SF=1 on S3 (base)
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.071x ➖, 1↑ 4↓)
datafusion / parquet (1.008x ➖, 5↑ 6↓)
duckdb / vortex-file-compressed (1.039x ➖, 0↑ 1↓)
duckdb / parquet (1.132x ➖, 0↑ 2↓)
|
Benchmarks: PolarSignals Profiling
Vortex (geomean): 1.032x ➖
datafusion / vortex-file-compressed (1.032x ➖, 1↑ 0↓)
No file size changes detected. |
Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.925x ➖, 14↑ 0↓)
datafusion / vortex-compact (0.968x ➖, 10↑ 0↓)
datafusion / parquet (0.985x ➖, 0↑ 3↓)
duckdb / vortex-file-compressed (0.997x ➖, 4↑ 0↓)
duckdb / vortex-compact (0.999x ➖, 0↑ 0↓)
duckdb / parquet (0.997x ➖, 0↑ 0↓)
duckdb / duckdb (1.006x ➖, 0↑ 1↓)
File Size Changes (21 files changed, +0.1% overall, 19↑ 2↓)
|
Benchmarks: FineWeb NVMe
Verdict: Likely improvement (medium confidence)
datafusion / vortex-file-compressed (0.988x ➖, 1↑ 0↓)
datafusion / vortex-compact (0.446x ✅, 7↑ 0↓)
datafusion / parquet (0.997x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.015x ➖, 0↑ 1↓)
duckdb / vortex-compact (0.470x ✅, 7↑ 1↓)
duckdb / parquet (0.996x ➖, 0↑ 0↓)
File Size Changes (2 files changed, +6.6% overall, 1↑ 1↓)
|
Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.999x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.984x ➖, 1↑ 0↓)
datafusion / parquet (0.984x ➖, 1↑ 0↓)
datafusion / arrow (0.997x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.991x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.989x ➖, 0↑ 0↓)
duckdb / parquet (1.023x ➖, 0↑ 3↓)
duckdb / duckdb (0.999x ➖, 0↑ 0↓)
File Size Changes (16 files changed, -1.5% overall, 10↑ 6↓)
|
Benchmarks: FineWeb S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.977x ➖, 0↑ 1↓)
datafusion / vortex-compact (0.957x ➖, 0↑ 1↓)
datafusion / parquet (0.971x ➖, 1↑ 1↓)
duckdb / vortex-file-compressed (0.954x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.845x ➖, 1↑ 0↓)
duckdb / parquet (0.988x ➖, 0↑ 0↓)
|
Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (1.003x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.998x ➖, 0↑ 0↓)
duckdb / parquet (1.002x ➖, 0↑ 0↓)
File Size Changes (2 files changed, +0.1% overall, 2↑ 0↓)
|
Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.020x ➖, 0↑ 0↓)
datafusion / vortex-compact (1.006x ➖, 1↑ 0↓)
datafusion / parquet (1.019x ➖, 0↑ 0↓)
datafusion / arrow (1.032x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.013x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.010x ➖, 1↑ 0↓)
duckdb / parquet (1.004x ➖, 0↑ 0↓)
duckdb / duckdb (1.015x ➖, 0↑ 0↓)
File Size Changes (46 files changed, -1.3% overall, 17↑ 29↓)
|
Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.893x ➖, 4↑ 1↓)
datafusion / vortex-compact (0.952x ➖, 6↑ 5↓)
datafusion / parquet (1.020x ➖, 3↑ 7↓)
duckdb / vortex-file-compressed (0.994x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.019x ➖, 0↑ 0↓)
duckdb / parquet (1.001x ➖, 0↑ 0↓)
|
Benchmarks: Clickbench on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.706x ✅, 43↑ 0↓)
datafusion / parquet (0.906x ➖, 14↑ 0↓)
duckdb / vortex-file-compressed (1.006x ➖, 1↑ 3↓)
duckdb / parquet (0.980x ➖, 0↑ 0↓)
duckdb / duckdb (0.985x ➖, 0↑ 0↓)
File Size Changes (201 files changed, +6.1% overall, 149↑ 52↓)
|
Benchmarks: Appian on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.931x ➖, 0↑ 0↓)
datafusion / parquet (0.910x ➖, 3↑ 0↓)
duckdb / vortex-file-compressed (0.953x ➖, 0↑ 0↓)
duckdb / parquet (0.964x ➖, 0↑ 0↓)
duckdb / duckdb (0.953x ➖, 0↑ 0↓)
File Size Changes (12 files changed, +0.4% overall, 8↑ 4↓)
|
Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.043x ➖, 0↑ 2↓)
datafusion / vortex-compact (0.964x ➖, 1↑ 1↓)
datafusion / parquet (0.921x ➖, 6↑ 2↓)
duckdb / vortex-file-compressed (1.074x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.092x ➖, 0↑ 1↓)
duckdb / parquet (0.926x ➖, 0↑ 0↓)
|
|
Wins: FineWeb: performance improvements on all queries, up to 4x (q2). Losses:
cc @onursatici |
Benchmarks: Clickbench Sorted on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.981x ➖, 2↑ 1↓)
datafusion / parquet (0.989x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.966x ➖, 2↑ 0↓)
duckdb / parquet (0.976x ➖, 0↑ 0↓)
duckdb / duckdb (0.998x ➖, 0↑ 0↓)
File Size Changes (201 files changed, +4.6% overall, 149↑ 52↓)
|
Benchmarks: Random Access
Vortex (geomean): 0.974x ➖
unknown / unknown (0.989x ➖, 2↑ 1↓)
|
Benchmarks: Compression
Vortex (geomean): 1.002x ➖
unknown / unknown (1.024x ➖, 1↑ 16↓)
|
|
We really shouldn't add one more config here; we should settle on some stable set of encodings. For example, maybe we can make ZstdBuffers the preferred Zstd encoding. |
If the unstable_encodings feature is set (as it is in CI, for example), register ZstdBuffers rather than Zstd as the default write strategy. This allows computing byte_length without decompressing the Zstd-compressed data.
This brings a local Clickbench Q27 run down from 450 ms to 190 ms.
Resolves: #8541
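
For readers unfamiliar with why the buffer layout matters for byte_length, here is a toy sketch. It is not the Vortex implementation: `Frame`, `WholeArray`, and `PerBuffer` are hypothetical names, the "compression" just copies bytes and counts how many get decompressed, and it assumes (this is not stated in the PR) that the gain comes from compressing the offsets buffer separately from the string bytes.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a compressed frame. The payload is stored
/// verbatim; real code would call a zstd library here.
struct Frame {
    payload: Vec<u8>,
}

impl Frame {
    fn compress(bytes: &[u8]) -> Self {
        Frame { payload: bytes.to_vec() }
    }

    /// Returns the payload and adds its size to `cost`, so callers can
    /// observe how many bytes each layout has to decompress.
    fn decompress(&self, cost: &Cell<usize>) -> Vec<u8> {
        cost.set(cost.get() + self.payload.len());
        self.payload.clone()
    }
}

/// Serializes cumulative end offsets (n + 1 little-endian u32 values).
fn encode_offsets(strings: &[&str]) -> Vec<u8> {
    let mut out = Vec::with_capacity((strings.len() + 1) * 4);
    let mut end = 0u32;
    out.extend_from_slice(&end.to_le_bytes());
    for s in strings {
        end += s.len() as u32;
        out.extend_from_slice(&end.to_le_bytes());
    }
    out
}

/// Per-string byte lengths are the differences of adjacent offsets.
fn lengths_from_offsets(bytes: &[u8]) -> Vec<u32> {
    let offsets: Vec<u32> = bytes
        .chunks_exact(4)
        .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect();
    offsets.windows(2).map(|w| w[1] - w[0]).collect()
}

/// Layout A: offsets and string bytes share one frame.
struct WholeArray {
    frame: Frame,
    offsets_len: usize,
}

impl WholeArray {
    fn new(strings: &[&str]) -> Self {
        let mut bytes = encode_offsets(strings);
        let offsets_len = bytes.len();
        for s in strings {
            bytes.extend_from_slice(s.as_bytes());
        }
        WholeArray { frame: Frame::compress(&bytes), offsets_len }
    }

    /// Has to decompress everything, string bytes included.
    fn byte_length(&self, cost: &Cell<usize>) -> Vec<u32> {
        let bytes = self.frame.decompress(cost);
        lengths_from_offsets(&bytes[..self.offsets_len])
    }
}

/// Layout B: each buffer gets its own frame.
struct PerBuffer {
    offsets: Frame,
    #[allow(dead_code)] // only needed when the string values are read
    data: Frame,
}

impl PerBuffer {
    fn new(strings: &[&str]) -> Self {
        let data: Vec<u8> = strings.iter().flat_map(|s| s.bytes()).collect();
        PerBuffer {
            offsets: Frame::compress(&encode_offsets(strings)),
            data: Frame::compress(&data),
        }
    }

    /// Touches only the offsets frame; the string bytes stay compressed.
    fn byte_length(&self, cost: &Cell<usize>) -> Vec<u32> {
        lengths_from_offsets(&self.offsets.decompress(cost))
    }
}
```

Calling `byte_length` on both layouts with a fresh `Cell::new(0)` returns the same lengths, but the per-buffer layout only pays for the offsets. In this model the gap grows with the total size of the string data, which would be consistent with the larger wins on string-heavy queries.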