hyper, hyper-parquet: persistent server so hot runs are actually hot by caetanosauer · Pull Request #955 · ClickHouse/ClickBench

caetanosauer · 2026-06-23T15:04:38Z

Fixes #936.

Problem

The shared driver (lib/benchmark-common.sh) calls ./query once per try (BENCH_TRIES) and, for daemon-backed systems, keeps the server alive across tries so tries 2..N measure hot execution. Hyper's ./query instead opened a brand-new HyperProcess on every call, so each "hot" try hit an empty buffer pool against a just-cache-dropped file — every reported hot time was actually cold. (Reported in #936; the same per-query restart was introduced for Hyper in the benchmark.sh split refactor.)
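The effect can be illustrated with a toy model. This is only a sketch: `ToyServer` and both helper functions are hypothetical names, the buffer pool is modelled as a plain set of page IDs, and the OS page cache is ignored entirely. It shows why a fresh process per try can never produce a hot run, while a persistent one does from try 2 onward.

```python
class ToyServer:
    """Hypothetical stand-in for a database server with an in-memory buffer pool."""

    def __init__(self):
        self.buffer_pool = set()  # page IDs currently cached in memory

    def run_query(self, pages):
        """Return 'cold' if any page was missing from the buffer pool, else 'hot'."""
        missing = set(pages) - self.buffer_pool
        self.buffer_pool |= missing  # pages read from disk are now cached
        return "cold" if missing else "hot"


def tries_with_fresh_server(pages, tries=3):
    # Old behaviour: every try starts a new server with an empty buffer pool.
    return [ToyServer().run_query(pages) for _ in range(tries)]


def tries_with_persistent_server(pages, tries=3):
    # New behaviour: one server (and its buffer pool) survives across all tries.
    server = ToyServer()
    return [server.run_query(pages) for _ in range(tries)]


if __name__ == "__main__":
    print(tries_with_fresh_server([1, 2, 3]))       # ['cold', 'cold', 'cold']
    print(tries_with_persistent_server([1, 2, 3]))  # ['cold', 'hot', 'hot']
```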

Fix

Convert both hyper/ and hyper-parquet/ to the client-server model the framework expects, mirroring umbra/:

- start backgrounds a supervisor that opens one long-lived hyperd and publishes its connection descriptor to server.endpoint. In hyper/ it also holds a keep-alive connection to hits.hyper so the buffer pool isn't torn down when each per-try ./query process exits (Hyper detaches a .hyper DB when its last connection closes).
- stop SIGTERMs the supervisor (cleanly shutting down hyperd) and waits for it to fully exit so drop_caches isn't defeated by pinned mmap pages.
- check / query / load reconnect to the persistent server via its descriptor instead of spawning their own HyperProcess. Loading through the same server also avoids briefly running two hyperd instances (each claiming ~80% RAM) during the heavy COPY.
- benchmark.sh: BENCH_RESTARTABLE=yes (there is now a real daemon whose lifecycle matters) and drop the BENCH_CONCURRENT_DURATION=0 override, re-enabling the concurrent-QPS test (a single shared server makes it meaningful again).
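The start/stop lifecycle described above can be sketched with the standard library alone. This is not the PR's code: a sleeping child process stands in for the supervisor and hyperd, the descriptor is a placeholder string, and the write-then-rename step is an assumption about how one might publish the file safely. Only the file name server.endpoint and the SIGTERM-then-wait behaviour come from the description. All function names are hypothetical, and the sketch assumes a POSIX system.

```python
import os
import signal
import subprocess
import sys
import tempfile

ENDPOINT_FILE = "server.endpoint"  # file name taken from the PR description


def publish_endpoint(directory, descriptor):
    """Write the descriptor via a temp file + rename so readers never see a partial file."""
    final_path = os.path.join(directory, ENDPOINT_FILE)
    tmp_path = final_path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write(descriptor)
    os.replace(tmp_path, final_path)  # atomic on POSIX within one filesystem
    return final_path


def read_endpoint(directory):
    """What a per-try client would do before connecting to the running server."""
    with open(os.path.join(directory, ENDPOINT_FILE)) as f:
        return f.read().strip()


def start_placeholder_server():
    # Placeholder for the supervisor: a child process that simply sleeps.
    return subprocess.Popen([sys.executable, "-c", "import time; time.sleep(600)"])


def stop_and_wait(proc, timeout=30.0):
    """Send SIGTERM, then block until the process has fully exited."""
    proc.send_signal(signal.SIGTERM)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # last resort if the process does not exit in time
        return proc.wait()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        server = start_placeholder_server()
        publish_endpoint(workdir, f"placeholder-descriptor-pid-{server.pid}")
        print("client sees:", read_endpoint(workdir))
        print("exit status:", stop_and_wait(server))
```

Blocking in `stop_and_wait` until the process is gone is the point of the design: the cache drop should only run once nothing holds the data file's pages any longer.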

Net effect: the driver's cold cycle (stop → wait → drop_caches → start) gives an honest cold try 1, and tries 2..N hit the warm server = genuinely hot.
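As a rough illustration of that ordering (not the actual logic in lib/benchmark-common.sh, which is a shell script), the sketch below runs the cold cycle once and then issues all tries against the same server. The hook names mirror the steps named above; `run_tries` and `make_recording_hooks` are hypothetical helpers, and the hooks only record the order in which they are called.

```python
def run_tries(hooks, tries=3):
    """Run one query `tries` times: one cold cycle, then back-to-back tries."""
    # Cold cycle, in the order given in the PR text.
    for step in ("stop", "wait", "drop_caches", "start"):
        hooks[step]()
    # No restart between tries: try 1 is cold, tries 2..N reuse the warm server.
    return [hooks["query"]() for _ in range(tries)]


def make_recording_hooks(log):
    """Build fake hooks that only record the order in which they are called."""
    def recorder(name):
        def hook():
            log.append(name)
            return name
        return hook
    return {name: recorder(name)
            for name in ("stop", "wait", "drop_caches", "start", "query")}


if __name__ == "__main__":
    calls = []
    run_tries(make_recording_hooks(calls))
    print(" -> ".join(calls))
```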

Validation

Ran the full 43-query sweep on a c7a-class x86_64 box for both directories:

| Benchmark | Rows | cold (try 1) ≥ hot (min of tries 2, 3) | Median speedup | Cold → hot total |
|---|---|---|---|---|
| `hyper` (native) | 43/43 | 43/43 | 9.71× | 34.4s → 4.1s |
| `hyper-parquet` | 43/43 | 43/43 | 2.35× | 64.4s → 22.2s |

Every query on both paths shows try 1 (cold) ≥ tries 2..3 (hot), with zero anomalies; before the fix all three tries were cold and roughly equal. Native data size (~19 GB) and load time (~351s) match prior committed results. The re-enabled concurrent-QPS test on hyper-parquet produced 5.583 QPS at a 0.3% error ratio. No leaked hyperd processes after stop.
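A check of this kind can be reproduced from per-query timings with a few lines of Python. The sketch below assumes that "hot" means the minimum of tries 2..N and that the median speedup is the median of the per-query cold/hot ratios; the PR does not spell out either definition. The sample timings are made up for illustration and are not results from this run.

```python
from statistics import median


def summarize(timings):
    """Summarize per-query timings given as (try1, try2, ..., tryN) tuples in seconds."""
    rows_ok = 0
    speedups = []
    cold_total = 0.0
    hot_total = 0.0
    for tries in timings:
        cold = tries[0]        # try 1 follows the cold cycle
        hot = min(tries[1:])   # best of the remaining tries (assumed definition)
        if cold >= hot:
            rows_ok += 1
        speedups.append(cold / hot)  # assumes hot > 0
        cold_total += cold
        hot_total += hot
    return {
        "rows_ok": rows_ok,
        "rows": len(timings),
        "median_speedup": median(speedups),
        "cold_total": cold_total,
        "hot_total": hot_total,
    }


if __name__ == "__main__":
    # Illustrative timings only, not measurements from the PR.
    sample = [(2.0, 0.5, 0.4), (1.0, 0.5, 0.6), (3.0, 1.0, 1.2)]
    print(summarize(sample))
```

A row where cold is smaller than hot would lower `rows_ok` below `rows`, which is the anomaly the validation above reports as absent.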

🤖 Generated with Claude Code

Fixes ClickHouse#936.

The shared driver (lib/benchmark-common.sh) calls ./query once per try and, for daemon-backed systems, keeps the server alive across tries so tries 2..N measure hot execution. Hyper's ./query instead opened a brand-new HyperProcess on every call, so each "hot" try hit an empty buffer pool against a just-cache-dropped file: every reported hot time was actually cold.

Convert both hyper/ and hyper-parquet/ to the client-server model the framework expects (mirroring umbra/):

- start: background a supervisor that opens one long-lived hyperd and publishes its connection descriptor to server.endpoint. In hyper/ it also holds a keep-alive connection to hits.hyper so the buffer pool isn't torn down when each per-try ./query process exits (Hyper detaches a .hyper DB when its last connection closes).
- stop: SIGTERM the supervisor (cleanly shutting down hyperd) and wait for it to fully exit so drop_caches isn't defeated by pinned mmap pages.
- check / query / load: reconnect to the persistent server via its descriptor instead of spawning their own HyperProcess. Loading through the same server also avoids briefly running two hyperd instances (each claiming ~80% RAM) during the heavy COPY.
- benchmark.sh: BENCH_RESTARTABLE=yes (there is now a real daemon whose lifecycle matters) and drop the BENCH_CONCURRENT_DURATION=0 override, re-enabling the concurrent-QPS test.

Net effect: the driver's cold cycle (stop -> wait -> drop_caches -> start) gives an honest cold try 1, and tries 2..N hit the warm server = genuinely hot.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

caetanosauer wants to merge 1 commit into ClickHouse:main from caetanosauer:fix-hyper-hot-runs-936
