Add basic sql benchmark runner for running sql benchmarks #23052

Draft
Omega359 wants to merge 2 commits into apache:main from Omega359:benchmark_runner_v2

@Omega359 Omega359 commented Jun 19, 2026

Contributor

Which issue does this PR close?

Rationale for this change

Running sql benchmarks using environment variables for configuration is awkward and error prone. Strictly using criterion, while statistically much better, is also quite slow compared to using simple iterations.

This PR is the first version of a benchmark runner for sql benchmarks that will eventually use arguments for all benchmark configuration options.

What changes are included in this PR?

A simple benchmark runner that can list the sql benchmarks and run a benchmark using either iterations or criterion, optionally restricted to a single query.
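To illustrate what "using iterations" means in contrast to criterion's statistical sampling, here is a minimal sketch of a fixed-count timing loop. It is not the code in this PR: `IterationSummary` and `run_iterations` are hypothetical names, and the closure stands in for whatever executes a single sql query.

```rust
use std::time::{Duration, Instant};

/// Summary of a simple iteration-based run (hypothetical type, not from the PR).
#[derive(Debug)]
struct IterationSummary {
    iterations: usize,
    min: Duration,
    max: Duration,
    mean: Duration,
}

/// Runs `work` exactly `iterations` times and records the wall-clock time of each run.
/// Returns `None` when `iterations` is zero, since there is nothing to summarize.
fn run_iterations<F: FnMut()>(iterations: usize, mut work: F) -> Option<IterationSummary> {
    if iterations == 0 {
        return None;
    }
    let mut samples = Vec::with_capacity(iterations);
    for _ in 0..iterations {
        let start = Instant::now();
        work();
        samples.push(start.elapsed());
    }
    let total: Duration = samples.iter().sum();
    Some(IterationSummary {
        iterations,
        min: *samples.iter().min()?,
        max: *samples.iter().max()?,
        mean: total / iterations as u32,
    })
}
```

A caller would pass the query execution as the closure, for example `run_iterations(5, || execute_query())` with a hypothetical `execute_query`. Unlike criterion, this loop does no warm-up, outlier detection, or confidence-interval estimation, which is why it finishes sooner and says less.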

Future enhancements will use arguments for benchmark configuration instead of only environment variables, provide help output, and tie this into bench.sh.

Are these changes tested?

Yes. I have a script that tests all current sql benchmarks both with and without criterion. Here is a portion of it for the single clickbench benchmark:

# clickbench single basic long flags
env DATA_DIR=data CLICKBENCH_TYPE=single cargo run -p datafusion-benchmarks --bin benchmark_runner -- clickbench --query 0 --iterations 5 --output results/benchmark_runner/clickbench_single_long.json

# clickbench single basic short flags
env DATA_DIR=data CLICKBENCH_TYPE=single cargo run -p datafusion-benchmarks --bin benchmark_runner -- clickbench --query 0 -i 5 -o results/benchmark_runner/clickbench_single_short.json

# clickbench single basic env iterations
env DATA_DIR=data CLICKBENCH_TYPE=single ITERATIONS=5 cargo run -p datafusion-benchmarks --bin benchmark_runner -- clickbench --query 0 --output results/benchmark_runner/clickbench_single_env_iterations.json

# clickbench single criterion with baseline
env DATA_DIR=data CLICKBENCH_TYPE=single cargo run -p datafusion-benchmarks --bin benchmark_runner -- clickbench --query 0 --criterion --save-baseline benchmark_runner_acceptance

# clickbench single criterion without baseline
env DATA_DIR=data CLICKBENCH_TYPE=single cargo run -p datafusion-benchmarks --bin benchmark_runner -- clickbench --query 0 --criterion
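The commands above supply the iteration count either as a flag (`--iterations` / `-i`) or through the `ITERATIONS` environment variable. One way such a fallback could be resolved is sketched below. The precedence (flag over environment variable), the default of 3, and the `resolve_iterations` name are all assumptions made for illustration and are not taken from the PR.

```rust
/// Default used when neither the flag nor the variable is set (assumed value).
const DEFAULT_ITERATIONS: usize = 3;

/// Resolves the iteration count: the flag wins, then the environment value,
/// then the default. The environment value is passed in as a string so the
/// function stays free of global state and is easy to test.
fn resolve_iterations(flag: Option<usize>, env_value: Option<&str>) -> Result<usize, String> {
    if let Some(n) = flag {
        return Ok(n);
    }
    match env_value {
        Some(raw) => raw
            .trim()
            .parse::<usize>()
            .map_err(|e| format!("invalid ITERATIONS value {raw:?}: {e}")),
        None => Ok(DEFAULT_ITERATIONS),
    }
}
```

In a binary this might be called as `resolve_iterations(cli_flag, std::env::var("ITERATIONS").ok().as_deref())`, where `cli_flag` is whatever the argument parser produced. Reporting a malformed variable as an error, rather than silently falling back to the default, addresses the "error prone" concern raised in the rationale.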

The existing cargo bench approach still works the same (criterion only):

env DATA_DIR=data CLICKBENCH_TYPE=single BENCH_NAME=clickbench BENCH_QUERY=0 cargo bench -p datafusion-benchmarks --bench sql

Are there any user-facing changes?

No.
