feat: add --threads-all option to llama-bench by xiaobai0529 · Pull Request #25261 · ggml-org/llama.cpp

xiaobai0529 · 2026-07-03T06:48:38Z

Summary

Add a new command-line option -ta / --threads-all to llama-bench that uses all logical cores (including SMT/hyperthreading) instead of only physical cores.

Problem

As discussed in issue #17611, llama-bench currently picks its default thread count with cpu_get_num_math(), which returns the number of physical cores. On systems with SMT/hyperthreading enabled (e.g., dual CPU with 36 cores / 72 threads), a default run therefore shows only about 50% CPU utilization.

While using physical cores is generally optimal for llama.cpp workloads, users who want to benchmark their system's full capabilities need to manually specify thread count with -t 72.

Solution

Add a convenience flag, -ta / --threads-all, that sets the thread count from std::thread::hardware_concurrency() so that all logical cores are used.
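
The selection logic described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: wants_all_threads and resolve_thread_count are hypothetical helper names, and physical_cores stands in for whatever cpu_get_num_math() returns. The one detail worth noting is that std::thread::hardware_concurrency() is allowed to return 0 when the value cannot be determined, so the sketch falls back to the physical-core default in that case.

```cpp
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper (not from the PR): true if -ta or --threads-all
// appears anywhere in the argument list.
bool wants_all_threads(const std::vector<std::string> & args) {
    for (const auto & arg : args) {
        if (arg == "-ta" || arg == "--threads-all") {
            return true;
        }
    }
    return false;
}

// Hypothetical helper (not from the PR): pick the thread count.
// physical_cores is assumed to be the existing default, i.e. the value
// that cpu_get_num_math() would return.
int resolve_thread_count(bool threads_all, int physical_cores) {
    if (!threads_all) {
        return physical_cores;
    }
    // hardware_concurrency() is only a hint and may return 0 if unknown.
    const unsigned int logical = std::thread::hardware_concurrency();
    return logical > 0 ? static_cast<int>(logical) : physical_cores;
}
```

On the 36-core / 72-thread machine from the example, resolve_thread_count(false, 36) would keep the default of 36, while resolve_thread_count(true, 36) would be expected to return 72, assuming the platform reports its logical core count.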

Usage

# Use all logical cores (including SMT)
./llama-bench -ta
./llama-bench --threads-all

feat: add --threads-all option to llama-bench

1ab317b

github-actions Bot added the examples label Jul 3, 2026

xiaobai0529 wants to merge 1 commit into ggml-org:master from xiaobai0529:feature/bench-threads-all
