Use userspace page cache for datalake benchmarks #818
Draft
alexey-milovidov wants to merge 15 commits into main from
Conversation
…chmarks Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This ensures the userspace page cache persists across tries. A fresh process per query group means try 1 is naturally cold (empty page cache) and tries 2-3 are hot, without needing drop_caches. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
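The cold/hot pattern described above can be modelled in a few lines. This is a sketch only: `UserspacePageCache` and `run_query_group` are hypothetical illustrations, not the actual benchmark harness, which launches a fresh `clickhouse-local` process per query group.

```python
# Toy model of the try-1-cold / tries-2-3-hot pattern. Hypothetical names:
# the real benchmark starts a fresh clickhouse-local process per query group,
# and ClickHouse's userspace page cache lives inside that process.

class UserspacePageCache:
    """In-process cache: empty at process start, persists across tries."""
    def __init__(self):
        self.pages = {}

    def read(self, key, fetch):
        if key in self.pages:
            return self.pages[key], "hot"   # served from the page cache
        data = fetch(key)                   # e.g. an object-storage GET on a miss
        self.pages[key] = data
        return data, "cold"

def run_query_group(tries=3):
    cache = UserspacePageCache()  # fresh process => empty page cache
    temps = []
    for _ in range(tries):
        _, temp = cache.read("hits.parquet", lambda k: b"<parquet bytes>")
        temps.append(temp)
    return temps

print(run_query_group())  # ['cold', 'hot', 'hot']
```

Because the cache is scoped to the process, no `drop_caches` step is needed between query groups: a new process starts cold by construction.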
…lickBench into use-page-cache-for-datalake
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lickBench into use-page-cache-for-datalake
…-cache-for-datalake
clickgapai pushed a commit to clickgapai/ClickHouse that referenced this pull request on May 6, 2026:
`CachedInMemoryReadBufferFromFile::populateBlockRange` previously issued one `in->readBigAt` per missing 1 MiB block. On object storage, each call is a separate HTTP request, so a cold scan of a 14 GB Parquet file through the userspace page cache made ~15k requests, each paying the TCP/TLS round-trip: measurably slower than the filesystem cache, which fetches in larger segments.

Coalescing was previously implemented in commit 682b070 and reverted in c178d2a to avoid transient memory spikes from huge temporary buffers under parallel cold reads.

Re-introduce coalescing with a hard cap on the temporary buffer (`max_coalesced_bytes` = 16 MiB). Long miss runs are split into multiple fetches, bounding peak transient memory per call. Single-block misses still read directly into the cache cell, avoiding the buffer and the extra `memcpy`.

Measured locally on c8g.24xlarge against the ClickBench `clickhouse-datalake` queries (43 queries, single 14.7 GB Parquet on S3, totals over all queries):

cold runs: filesystem cache 62.28s -> page cache (default) 56.58s
hot runs: filesystem cache 18.57s -> page cache (default) 13.59s

The page cache is now strictly faster than the filesystem cache on both cold and hot, with no benchmark-script tuning required.

Context: ClickHouse/ClickBench#818

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
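The capped coalescing described in the commit message can be sketched as follows. This is a Python model for illustration only; the real logic is C++ inside `populateBlockRange`, and `plan_fetches` is a hypothetical helper name.

```python
BLOCK = 1 << 20            # 1 MiB cache block, as in the commit message
MAX_COALESCED = 16 << 20   # hard cap on the temporary fetch buffer (16 MiB)

def plan_fetches(missing_blocks):
    """Group consecutive missing block indices into coalesced reads,
    splitting any run that would exceed MAX_COALESCED bytes.
    Returns a list of (offset, length) fetches."""
    fetches = []
    run_start = prev = None
    for b in sorted(missing_blocks):
        if run_start is None:
            run_start = prev = b
        elif b == prev + 1 and (b - run_start + 1) * BLOCK <= MAX_COALESCED:
            prev = b  # extend the current run: still consecutive and under cap
        else:
            # Gap in the run, or the cap would be exceeded: emit one fetch.
            fetches.append((run_start * BLOCK, (prev - run_start + 1) * BLOCK))
            run_start = prev = b
    if run_start is not None:
        fetches.append((run_start * BLOCK, (prev - run_start + 1) * BLOCK))
    return fetches

# A 40-block cold run becomes three fetches (16 MiB + 16 MiB + 8 MiB)
# instead of 40 separate HTTP requests:
print(plan_fetches(range(40)))
```

Note that a single-block miss yields a one-block fetch, which in the real implementation reads directly into the cache cell, skipping the temporary buffer and the extra `memcpy`.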
Summary

- Switch `clickhouse-datalake` and `clickhouse-datalake-partitioned` from the filesystem cache (`/dev/shm/`) to the userspace page cache
- Replace the `filesystem_caches` config with `page_cache_size: auto` in `clickhouse-local.yaml`
- Replace `--filesystem_cache_name cache` with `--use_page_cache_for_object_storage 1` in query invocations

Test plan

- Run the `clickhouse-datalake` benchmark and verify hot runs use the page cache
- Run the `clickhouse-datalake-partitioned` benchmark and verify hot runs use the page cache

🤖 Generated with Claude Code
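Concretely, the config side of the change amounts to something like the fragment below (a sketch based on the summary; the exact surrounding structure of `clickhouse-local.yaml` is not shown in this excerpt):

```yaml
# clickhouse-local.yaml: the filesystem_caches section is removed,
# and the userspace page cache is enabled instead.
page_cache_size: auto
```

On the invocation side, `--filesystem_cache_name cache` is dropped from each query and `--use_page_cache_for_object_storage 1` is passed instead.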