feat: add --run-sequentially to interleave benchmark runs (fixes #822) #889
leno23 wants to merge 1 commit
When benchmarking multiple commands, hyperfine normally completes all timing runs for one command before starting the next. The new flag runs timing iterations in rounds across commands, which helps pipeline-style workloads and reduces temporal load bias.
Fixes sharkdp#822
Co-authored-by: Cursor <cursoragent@cursor.com>
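The difference between the two execution orders can be sketched as follows. This is an illustration only, not the code from this PR: the function names `grouped_order` and `interleaved_order` are hypothetical, commands are represented by their index, and the handling of unequal run counts (a command simply drops out of later rounds once it has used up its runs) is an assumption.

```rust
/// Default order: all runs of command 0, then all runs of command 1, ...
/// `runs[i]` is the number of timing runs for command `i`.
/// (Hypothetical helper, not part of hyperfine.)
fn grouped_order(runs: &[usize]) -> Vec<usize> {
    runs.iter()
        .enumerate()
        .flat_map(|(cmd, &n)| std::iter::repeat(cmd).take(n))
        .collect()
}

/// Interleaved order: one run of each command per round.
/// Assumption: a command that has used up its runs is skipped
/// in the remaining rounds.
fn interleaved_order(runs: &[usize]) -> Vec<usize> {
    let max_runs = runs.iter().copied().max().unwrap_or(0);
    let mut order = Vec::with_capacity(runs.iter().sum::<usize>());
    for round in 0..max_runs {
        for (cmd, &n) in runs.iter().enumerate() {
            if round < n {
                order.push(cmd);
            }
        }
    }
    order
}
```

For two commands with three runs each, `grouped_order(&[3, 3])` yields `[0, 0, 0, 1, 1, 1]`, while `interleaved_order(&[3, 3])` yields `[0, 1, 0, 1, 0, 1]`.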
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5950cc1c23
    for (benchmark, state) in benchmarks.iter().zip(states) {
        results.push(benchmark.finalize(state)?);
Finalize each benchmark as soon as its last run completes
In sequential mode, finalize() (which runs --cleanup) is deferred until all commands finish, so a command that reaches its run count early is left uncleaned while later commands continue running. This contradicts the documented cleanup behavior (“after the completion of all benchmarking runs for each individual command” in src/cli.rs) and can change results when cleanup removes artifacts that affect subsequent rounds; it also means early-finished commands never get cleanup if a later command errors before the final loop.
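One way to address this is to finalize a command inside the round loop, immediately after its last timing run. The sketch below is a simplified model under stated assumptions, not hyperfine's implementation: `State` and `run_interleaved` are hypothetical names, a "run" and a "finalize" are only recorded in a log, and error handling is omitted.

```rust
/// Hypothetical stand-in for per-command benchmark state
/// (not hyperfine's real type).
struct State {
    name: &'static str,
    remaining: usize, // timing runs still to execute
}

/// Runs one iteration per command per round and finalizes each command
/// right after its last run, instead of in a loop at the very end.
/// Returns the sequence of events for inspection.
fn run_interleaved(states: &mut [State]) -> Vec<String> {
    let mut log = Vec::new();
    while states.iter().any(|s| s.remaining > 0) {
        for s in states.iter_mut() {
            if s.remaining == 0 {
                continue; // already finished and finalized
            }
            log.push(format!("run {}", s.name));
            s.remaining -= 1;
            if s.remaining == 0 {
                // Eager finalization: cleanup would happen here, before
                // any later round of the other commands starts.
                log.push(format!("finalize {}", s.name));
            }
        }
    }
    log
}
```

With one run for `a` and two runs for `b`, the log is `run a`, `finalize a`, `run b`, `run b`, `finalize b`: `a` is finalized before `b` executes any run, so `a` would already be cleaned up if a later run of `b` failed.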
Summary
- Adds --run-sequentially to interleave timing runs across multiple benchmark commands instead of completing all runs for one command before starting the next.
- Runs execute in the order cmd1, cmd2, cmd1, cmd2, …
- Reference commands (--reference) still run as a separate full benchmark before the interleaved group, matching existing behavior.
Motivation
Fixes #822. Useful for pipeline-style workloads and for reducing temporal load bias.
Example
Executes:
Instead of the default grouped order (step1 ×3, then step2 ×3).
Test plan
- cargo test (unit + integration + execution order tests)
- benchmarks_are_executed_sequentially_with_run_sequentially_flag
- benchmark::sequential