Remove data generation code from engine-specific bench binaries #8586
AdamGS wants to merge 1 commit
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
Merging this PR will not alter performance
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | slice_empty_vortex | 310 ns | 368.3 ns | -15.84% |
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[128] | 273.6 ns | 244.4 ns | +11.93% |
Comparing adamg/split-benchmarks-data-gen (0a53378) with develop (15cec3b)
Footnote: 4 benchmarks were skipped, so the baseline results were used instead.

Polar Signals Profiling Results (latest run)

Benchmarks: PolarSignals Profiling
Vortex (geomean): 1.135x ❌
datafusion / vortex-file-compressed (1.135x ❌, 0↑ 7↓)
No file size changes detected.

Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.012x ➖, 0↑ 0↓)
datafusion / vortex-compact (1.009x ➖, 0↑ 0↓)
datafusion / parquet (1.034x ➖, 0↑ 4↓)
datafusion / arrow (1.002x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (1.012x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.007x ➖, 0↑ 0↓)
duckdb / parquet (1.040x ➖, 0↑ 4↓)
duckdb / duckdb (1.012x ➖, 0↑ 0↓)
File Size Changes (10 files changed, -0.2% overall, 4↑ 6↓)

Benchmarks: FineWeb NVMe
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.026x ➖, 1↑ 1↓)
datafusion / vortex-compact (1.020x ➖, 0↑ 0↓)
datafusion / parquet (0.957x ➖, 3↑ 1↓)
duckdb / vortex-file-compressed (1.018x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.864x ✅, 8↑ 0↓)
duckdb / parquet (1.026x ➖, 0↑ 0↓)
File Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)

Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.959x ➖, 2↑ 0↓)
datafusion / vortex-compact (0.970x ➖, 3↑ 0↓)
datafusion / parquet (0.955x ➖, 9↑ 2↓)
duckdb / vortex-file-compressed (0.969x ➖, 4↑ 2↓)
duckdb / vortex-compact (0.990x ➖, 2↑ 2↓)
duckdb / parquet (0.978x ➖, 0↑ 0↓)
duckdb / duckdb (0.983x ➖, 4↑ 4↓)
File Size Changes (12 files changed, -0.0% overall, 3↑ 9↓)

Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (1.007x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.016x ➖, 0↑ 0↓)
duckdb / parquet (1.028x ➖, 0↑ 0↓)
File Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)

Benchmarks: FineWeb S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.957x ➖, 0↑ 1↓)
datafusion / vortex-compact (0.773x ➖, 3↑ 0↓)
datafusion / parquet (0.910x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.032x ➖, 0↑ 1↓)
duckdb / vortex-compact (0.998x ➖, 0↑ 0↓)
duckdb / parquet (0.929x ➖, 0↑ 0↓)

Benchmarks: Clickbench Sorted on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.978x ➖, 2↑ 1↓)
datafusion / parquet (1.016x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.962x ➖, 1↑ 0↓)
duckdb / parquet (0.997x ➖, 0↑ 0↓)
duckdb / duckdb (0.995x ➖, 0↑ 0↓)
File Size Changes (133 files changed, -0.0% overall, 59↑ 74↓)

Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.994x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.992x ➖, 0↑ 0↓)
datafusion / parquet (1.002x ➖, 0↑ 0↓)
datafusion / arrow (1.006x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.998x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.998x ➖, 0↑ 0↓)
duckdb / parquet (0.995x ➖, 0↑ 0↓)
duckdb / duckdb (0.998x ➖, 0↑ 0↓)
File Size Changes (28 files changed, -0.0% overall, 13↑ 15↓)

Benchmarks: Clickbench on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.945x ➖, 4↑ 1↓)
datafusion / parquet (0.958x ➖, 2↑ 0↓)
duckdb / vortex-file-compressed (0.944x ➖, 7↑ 0↓)
duckdb / parquet (0.979x ➖, 1↑ 0↓)
duckdb / duckdb (0.985x ➖, 0↑ 0↓)
File Size Changes (103 files changed, -0.0% overall, 46↑ 57↓)

Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.888x ➖, 6↑ 2↓)
datafusion / vortex-compact (0.833x ➖, 8↑ 3↓)
datafusion / parquet (0.720x ➖, 12↑ 0↓)
duckdb / vortex-file-compressed (0.933x ➖, 0↑ 1↓)
duckdb / vortex-compact (0.992x ➖, 0↑ 0↓)
duckdb / parquet (0.912x ➖, 0↑ 0↓)

Benchmarks: Appian on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.900x ➖, 3↑ 0↓)
datafusion / parquet (0.911x ➖, 2↑ 0↓)
duckdb / vortex-file-compressed (0.939x ➖, 0↑ 0↓)
duckdb / parquet (0.938x ➖, 0↑ 0↓)
duckdb / duckdb (0.943x ➖, 0↑ 0↓)
File Size Changes (6 files changed, -0.1% overall, 0↑ 6↓)

Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.848x ➖, 1↑ 0↓)
datafusion / vortex-compact (0.905x ➖, 3↑ 0↓)
datafusion / parquet (0.828x ➖, 4↑ 1↓)
duckdb / vortex-file-compressed (1.117x ➖, 0↑ 2↓)
duckdb / vortex-compact (1.167x ➖, 0↑ 2↓)
duckdb / parquet (1.126x ➖, 0↑ 5↓)

    anyhow::bail!(
        "prepared data is missing for {}: {missing_data}. Generate it first with \
         `vx-bench prepare-data {} --formats-json '[{requested_formats}]'` or \

In the bail message we print --formats-json '["vortex-file-compressed"]', but we don't accept --formats-json '["vortex-file-compressed"]' on the CLI?

We do! This is the Python tool; it has its own CLI args.

Rationale for this change
This is a step towards making the benchmark binaries smaller by shifting as much functionality as possible into the Python tool.
What changes are included in this PR?
Removes data generation from the engine-specific binaries, replacing it with an informative error that tells the user how to generate the missing data.
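
As a rough illustration of the "check, then fail with instructions" pattern described above, here is a minimal sketch using only the standard library. It is not the code in this PR: the `MissingPreparedData` type, the `ensure_prepared` function, the one-directory-per-format layout, and the use of the benchmark name for both `{}` placeholders are all assumptions made for the example. Only the wording of the message follows the excerpt quoted in the review thread.

```rust
use std::fmt;
use std::path::{Path, PathBuf};

/// Hypothetical error type: prepared data that the binary expected but did not find.
#[derive(Debug)]
struct MissingPreparedData {
    benchmark: String,
    missing: Vec<PathBuf>,
    requested_formats: Vec<String>,
}

impl fmt::Display for MissingPreparedData {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Render the missing paths as a comma-separated list.
        let missing_data = self
            .missing
            .iter()
            .map(|p| p.display().to_string())
            .collect::<Vec<_>>()
            .join(", ");
        // Render the requested formats as the inside of a JSON array of strings.
        let requested_formats = self
            .requested_formats
            .iter()
            .map(|name| format!("\"{name}\""))
            .collect::<Vec<_>>()
            .join(",");
        // Assumption: both positional placeholders are the benchmark name.
        write!(
            f,
            "prepared data is missing for {}: {missing_data}. Generate it first with \
             `vx-bench prepare-data {} --formats-json '[{requested_formats}]'`",
            self.benchmark, self.benchmark
        )
    }
}

impl std::error::Error for MissingPreparedData {}

/// Checks that every requested format has a directory under `data_dir`
/// (an assumed layout) and returns an informative error instead of
/// generating the data.
fn ensure_prepared(
    benchmark: &str,
    data_dir: &Path,
    formats: &[&str],
) -> Result<(), MissingPreparedData> {
    let missing: Vec<PathBuf> = formats
        .iter()
        .map(|format| data_dir.join(format))
        .filter(|path| !path.exists())
        .collect();

    if missing.is_empty() {
        return Ok(());
    }

    Err(MissingPreparedData {
        benchmark: benchmark.to_string(),
        missing,
        requested_formats: formats.iter().map(|s| s.to_string()).collect(),
    })
}
```

A caller would run `ensure_prepared(name, &data_dir, &formats)?` before starting a benchmark, so a missing dataset surfaces as a single actionable message rather than triggering generation inside the engine binary.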
What APIs are changed? Are there any user-facing changes?
None