
metric: replace histogram backing with goodhistogram #168347

Draft
angles-n-daemons wants to merge 2 commits into cockroachdb:master from angles-n-daemons:bdillmann/goodhistogram-migration



@angles-n-daemons
Contributor

@angles-n-daemons angles-n-daemons commented Apr 14, 2026

Summary

Replace all histogram implementations with
cockroachdb/goodhistogram,
a lock-free exponential histogram that is faster (~20ns/op vs ~130ns/op),
uses less memory (~0.85KB vs ~19KB), and natively produces Prometheus
native histogram output.
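
As a rough illustration of what "lock-free" recording can look like, here is a minimal Go sketch. It assumes fixed power-of-two buckets backed by atomic counters; the `lockFreeHist` type, its fields, and `recordConcurrently` are hypothetical names for this example and are not taken from goodhistogram.

```go
package main

import (
	"fmt"
	"math/bits"
	"sync"
	"sync/atomic"
)

// lockFreeHist is a hypothetical, simplified histogram. Bucket i counts
// values in (2^(i-1), 2^i]; values <= 1 land in bucket 0. This is not
// goodhistogram's actual layout.
type lockFreeHist struct {
	buckets [64]atomic.Uint64
	count   atomic.Uint64
	sum     atomic.Int64
}

// Record adds one observation using only atomic adds, so writers never
// block each other on a mutex.
func (h *lockFreeHist) Record(v int64) {
	idx := 0
	if v > 1 {
		// bits.Len64(v-1) equals ceil(log2(v)) for v >= 2.
		idx = bits.Len64(uint64(v - 1))
	}
	h.buckets[idx].Add(1)
	h.sum.Add(v)
	h.count.Add(1)
}

// recordConcurrently records the values 1..perWorker from each worker.
func recordConcurrently(h *lockFreeHist, workers, perWorker int) {
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 1; i <= perWorker; i++ {
				h.Record(int64(i))
			}
		}()
	}
	wg.Wait()
}

func main() {
	var h lockFreeHist
	recordConcurrently(&h, 8, 1000)
	fmt.Println("count:", h.count.Load(), "sum:", h.sum.Load())
}
```

One trade-off of this style: a reader that loads `count`, `sum`, and the buckets while writers are active may see values that are slightly out of step with each other, since the three updates are not one atomic unit.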

  • Remove the legacy HdrHistogram implementation entirely (completes the TODO in #96357, "pkg/util/metric: remove HdrHistogram once Prometheus histograms proven in production")
  • Rewrite Histogram to use *goodhistogram.Histogram as backing
  • Rewrite ManualWindowHistogram RecordValue+Rotate mode to use
    goodhistogram (Update mode retains prometheus backing for pebble
    compatibility)
  • Remove HistogramMode, MaxVal, SigFigs from HistogramOptions
  • Remove the unused RaftQuotaPoolPercentUsed histogram
  • Remove native histogram env vars — goodhistogram always produces
    native histogram fields

An alternative approach using prometheus native histograms directly
(without goodhistogram) is explored in #168352.

Fixes: #168274
Epic: CRDB-52498

Release note: None

@angles-n-daemons angles-n-daemons requested review from a team as code owners April 14, 2026 18:00
@angles-n-daemons angles-n-daemons requested review from jasonlmfong and removed request for a team April 14, 2026 18:00
@trunk-io
Contributor

trunk-io bot commented Apr 14, 2026

Merging to master in this repository is managed by Trunk.

  • To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here.

@angles-n-daemons angles-n-daemons requested review from aa-joshi and removed request for a team April 14, 2026 18:00
@blathers-crl

blathers-crl bot commented Apr 14, 2026

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@angles-n-daemons angles-n-daemons requested review from KeithCh, ZhouXing19 and msbutler and removed request for a team April 14, 2026 18:01
@cockroach-teamcity
Member

This change is Reviewable

@angles-n-daemons angles-n-daemons force-pushed the bdillmann/goodhistogram-migration branch from 2c154b4 to 13bd321 Compare April 14, 2026 18:17
@angles-n-daemons angles-n-daemons marked this pull request as draft April 14, 2026 18:21
@angles-n-daemons angles-n-daemons removed request for a team April 14, 2026 18:21
@angles-n-daemons angles-n-daemons force-pushed the bdillmann/goodhistogram-migration branch from bfc8439 to bd2d363 Compare April 14, 2026 18:49
@cockroach-teamcity
Member

🟢 Sysbench [SQL, 3node, oltp_read_write]
| Metric | Old Commit | New Commit | Delta | Note |
| --- | --- | --- | --- | --- |
| sec/op | 10.43m ±5% | 10.57m ±7% | ~ | p=0.653 n=15 |
| 🟢 allocs/op | 8.091k ±1% | 7.922k ±1% | -2.09% | p=0.000 n=15 |
Reproduce

benchdiff binaries:

mkdir -p benchdiff/bd2d363/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/bd2d3631bf84bfea7b83738a6bbb55603fe062af/bin/pkg_sql_tests benchdiff/bd2d363/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/bd2d363/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/d7f43f1/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/d7f43f10758cc0c772a67a863cad69935f44d96c/bin/pkg_sql_tests benchdiff/d7f43f1/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/d7f43f1/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/SQL/3node/oltp_read_write$ --old=d7f43f1 --new=bd2d363 --memprofile ./pkg/sql/tests
🟢 Sysbench [KV, 3node, oltp_read_only]
| Metric | Old Commit | New Commit | Delta | Note |
| --- | --- | --- | --- | --- |
| sec/op | 3.159m ±1% | 3.157m ±1% | ~ | p=0.806 n=15 |
| 🟢 allocs/op | 2.100k ±0% | 2.005k ±0% | -4.52% | p=0.000 n=15 |
Reproduce

benchdiff binaries:

mkdir -p benchdiff/bd2d363/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/bd2d3631bf84bfea7b83738a6bbb55603fe062af/bin/pkg_sql_tests benchdiff/bd2d363/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/bd2d363/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/d7f43f1/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/d7f43f10758cc0c772a67a863cad69935f44d96c/bin/pkg_sql_tests benchdiff/d7f43f1/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/d7f43f1/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_read_only$ --old=d7f43f1 --new=bd2d363 --memprofile ./pkg/sql/tests
🟢 Sysbench [KV, 3node, oltp_write_only]
| Metric | Old Commit | New Commit | Delta | Note |
| --- | --- | --- | --- | --- |
| sec/op | 3.018m ±1% | 2.970m ±1% | ~ | p=0.037 n=15 |
| 🟢 allocs/op | 4.209k ±0% | 4.109k ±0% | -2.38% | p=0.000 n=15 |
Reproduce

benchdiff binaries:

mkdir -p benchdiff/bd2d363/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/bd2d3631bf84bfea7b83738a6bbb55603fe062af/bin/pkg_sql_tests benchdiff/bd2d363/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/bd2d363/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/d7f43f1/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/d7f43f10758cc0c772a67a863cad69935f44d96c/bin/pkg_sql_tests benchdiff/d7f43f1/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/d7f43f1/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_write_only$ --old=d7f43f1 --new=bd2d363 --memprofile ./pkg/sql/tests
Artifacts

download:

mkdir -p new
gcloud storage cp gs://cockroach-microbench-ci/artifacts/bd2d3631bf84bfea7b83738a6bbb55603fe062af/24417019886-1/\* new/
mkdir -p old
gcloud storage cp gs://cockroach-microbench-ci/artifacts/d7f43f10758cc0c772a67a863cad69935f44d96c/24417019886-1/\* old/

built with commit: bd2d3631bf84bfea7b83738a6bbb55603fe062af

@cockroach-teamcity cockroach-teamcity added the X-perf-gain Microbenchmarks CI: Added if a performance gain is detected label Apr 14, 2026
@blathers-crl

blathers-crl bot commented Apr 14, 2026

Detected infrastructure failure (matched: ). Automatically rerunning failed jobs. (run link)

@angles-n-daemons angles-n-daemons force-pushed the bdillmann/goodhistogram-migration branch from aabcc9c to 37ce9ba Compare April 14, 2026 20:00
@blathers-crl

blathers-crl bot commented Apr 14, 2026

Detected infrastructure failure (matched: self-hosted runner lost communication with the server). Automatically rerunning failed jobs. (run link)

@blathers-crl

blathers-crl bot commented Apr 14, 2026

Detected infrastructure failure (matched: ). Automatically rerunning failed jobs. (run link)

@angles-n-daemons angles-n-daemons force-pushed the bdillmann/goodhistogram-migration branch from 37ce9ba to dff278d Compare April 15, 2026 15:39
@blathers-crl

blathers-crl bot commented Apr 15, 2026

Detected infrastructure failure (matched: ). Automatically rerunning failed jobs. (run link)

@pav-kv
Collaborator

pav-kv commented Apr 15, 2026

> Remove the unused RaftQuotaPoolPercentUsed histogram

How did you conclude it's unused? Should it just be converted like the other histograms, or did that become impossible with this change?

@blathers-crl

blathers-crl bot commented Apr 15, 2026

Detected infrastructure failure (matched: ). Automatically rerunning failed jobs. (run link)

@angles-n-daemons angles-n-daemons force-pushed the bdillmann/goodhistogram-migration branch from 9c12626 to 1b8e172 Compare April 15, 2026 17:04
@blathers-crl

blathers-crl bot commented Apr 15, 2026

Detected infrastructure failure (matched: self-hosted runner lost communication with the server). Automatically rerunning failed jobs. (run link)

@cockroach-teamcity
Member

🟢 Sysbench [SQL, 3node, oltp_read_write]
| Metric | Old Commit | New Commit | Delta | Note |
| --- | --- | --- | --- | --- |
| sec/op | 10.89m ±2% | 10.82m ±2% | ~ | p=0.902 n=15 |
| 🟢 allocs/op | 8.112k ±1% | 7.911k ±0% | -2.48% | p=0.000 n=15 |
Reproduce

benchdiff binaries:

mkdir -p benchdiff/1b8e172/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/1b8e1726eb58458c208c88e8565d9b4afccca4a5/bin/pkg_sql_tests benchdiff/1b8e172/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/1b8e172/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/d7f43f1/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/d7f43f10758cc0c772a67a863cad69935f44d96c/bin/pkg_sql_tests benchdiff/d7f43f1/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/d7f43f1/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/SQL/3node/oltp_read_write$ --old=d7f43f1 --new=1b8e172 --memprofile ./pkg/sql/tests
🟢 Sysbench [KV, 3node, oltp_read_only]
| Metric | Old Commit | New Commit | Delta | Note |
| --- | --- | --- | --- | --- |
| sec/op | 3.154m ±2% | 3.137m ±3% | ~ | p=0.389 n=15 |
| 🟢 allocs/op | 2.100k ±0% | 2.005k ±0% | -4.52% | p=0.000 n=15 |
Reproduce

benchdiff binaries:

mkdir -p benchdiff/1b8e172/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/1b8e1726eb58458c208c88e8565d9b4afccca4a5/bin/pkg_sql_tests benchdiff/1b8e172/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/1b8e172/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/d7f43f1/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/d7f43f10758cc0c772a67a863cad69935f44d96c/bin/pkg_sql_tests benchdiff/d7f43f1/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/d7f43f1/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_read_only$ --old=d7f43f1 --new=1b8e172 --memprofile ./pkg/sql/tests
🟢 Sysbench [KV, 3node, oltp_write_only]
| Metric | Old Commit | New Commit | Delta | Note |
| --- | --- | --- | --- | --- |
| sec/op | 3.049m ±2% | 3.023m ±2% | ~ | p=0.217 n=15 |
| 🟢 allocs/op | 4.210k ±0% | 4.112k ±0% | -2.33% | p=0.000 n=15 |
Reproduce

benchdiff binaries:

mkdir -p benchdiff/1b8e172/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/1b8e1726eb58458c208c88e8565d9b4afccca4a5/bin/pkg_sql_tests benchdiff/1b8e172/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/1b8e172/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/d7f43f1/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/d7f43f10758cc0c772a67a863cad69935f44d96c/bin/pkg_sql_tests benchdiff/d7f43f1/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/d7f43f1/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_write_only$ --old=d7f43f1 --new=1b8e172 --memprofile ./pkg/sql/tests
Artifacts

download:

mkdir -p new
gcloud storage cp gs://cockroach-microbench-ci/artifacts/1b8e1726eb58458c208c88e8565d9b4afccca4a5/24467588522-1/\* new/
mkdir -p old
gcloud storage cp gs://cockroach-microbench-ci/artifacts/d7f43f10758cc0c772a67a863cad69935f44d96c/24467588522-1/\* old/

built with commit: 1b8e1726eb58458c208c88e8565d9b4afccca4a5

@blathers-crl

blathers-crl bot commented Apr 15, 2026

Detected infrastructure failure (matched: ). Automatically rerunning failed jobs. (run link)

@angles-n-daemons angles-n-daemons force-pushed the bdillmann/goodhistogram-migration branch from ddd1198 to 1ca9b50 Compare April 15, 2026 19:43
@blathers-crl

blathers-crl bot commented Apr 15, 2026

Detected infrastructure failure (matched: self-hosted runner lost communication with the server). Automatically rerunning failed jobs. (run link)

@cockroach-teamcity
Member

🟢 Sysbench [SQL, 3node, oltp_read_write]
| Metric | Old Commit | New Commit | Delta | Note |
| --- | --- | --- | --- | --- |
| sec/op | 10.92m ±2% | 10.87m ±2% | ~ | p=0.567 n=15 |
| 🟢 allocs/op | 8.126k ±1% | 7.931k ±0% | -2.40% | p=0.000 n=15 |
Reproduce

benchdiff binaries:

mkdir -p benchdiff/1ca9b50/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/1ca9b50b8158f1e4bc14a228ddfc3081e29914de/bin/pkg_sql_tests benchdiff/1ca9b50/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/1ca9b50/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/d7f43f1/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/d7f43f10758cc0c772a67a863cad69935f44d96c/bin/pkg_sql_tests benchdiff/d7f43f1/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/d7f43f1/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/SQL/3node/oltp_read_write$ --old=d7f43f1 --new=1ca9b50 --memprofile ./pkg/sql/tests
🟢 Sysbench [KV, 3node, oltp_read_only]
| Metric | Old Commit | New Commit | Delta | Note |
| --- | --- | --- | --- | --- |
| sec/op | 3.192m ±1% | 3.185m ±1% | ~ | p=0.870 n=15 |
| 🟢 allocs/op | 2.101k ±0% | 2.005k ±0% | -4.57% | p=0.000 n=15 |
Reproduce

benchdiff binaries:

mkdir -p benchdiff/1ca9b50/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/1ca9b50b8158f1e4bc14a228ddfc3081e29914de/bin/pkg_sql_tests benchdiff/1ca9b50/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/1ca9b50/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/d7f43f1/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/d7f43f10758cc0c772a67a863cad69935f44d96c/bin/pkg_sql_tests benchdiff/d7f43f1/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/d7f43f1/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_read_only$ --old=d7f43f1 --new=1ca9b50 --memprofile ./pkg/sql/tests
🟢 Sysbench [KV, 3node, oltp_write_only]
| Metric | Old Commit | New Commit | Delta | Note |
| --- | --- | --- | --- | --- |
| sec/op | 2.984m ±1% | 2.959m ±1% | -0.85% | p=0.008 n=15 |
| 🟢 allocs/op | 4.206k ±0% | 4.110k ±0% | -2.28% | p=0.000 n=15 |
Reproduce

benchdiff binaries:

mkdir -p benchdiff/1ca9b50/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/1ca9b50b8158f1e4bc14a228ddfc3081e29914de/bin/pkg_sql_tests benchdiff/1ca9b50/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/1ca9b50/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/d7f43f1/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/d7f43f10758cc0c772a67a863cad69935f44d96c/bin/pkg_sql_tests benchdiff/d7f43f1/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/d7f43f1/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_write_only$ --old=d7f43f1 --new=1ca9b50 --memprofile ./pkg/sql/tests
Artifacts

download:

mkdir -p new
gcloud storage cp gs://cockroach-microbench-ci/artifacts/1ca9b50b8158f1e4bc14a228ddfc3081e29914de/24474690715-1/\* new/
mkdir -p old
gcloud storage cp gs://cockroach-microbench-ci/artifacts/d7f43f10758cc0c772a67a863cad69935f44d96c/24474690715-1/\* old/

built with commit: 1ca9b50b8158f1e4bc14a228ddfc3081e29914de

@angles-n-daemons angles-n-daemons force-pushed the bdillmann/goodhistogram-migration branch 5 times, most recently from 73d2ff9 to 2512dee Compare April 16, 2026 14:38
@cockroach-teamcity
Member

🟢 Sysbench [SQL, 3node, oltp_read_write]
| Metric | Old Commit | New Commit | Delta | Note |
| --- | --- | --- | --- | --- |
| sec/op | 10.23m ±1% | 10.28m ±3% | ~ | p=0.935 n=15 |
| 🟢 allocs/op | 8.135k ±1% | 7.909k ±1% | -2.78% | p=0.000 n=15 |
Reproduce

benchdiff binaries:

mkdir -p benchdiff/2512dee/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/2512deeec446ce110af25feeef1a64823a4cae40/bin/pkg_sql_tests benchdiff/2512dee/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/2512dee/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/7b5db8f/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/7b5db8f7a08cbd09f42e0638dc84c2e50160be01/bin/pkg_sql_tests benchdiff/7b5db8f/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/7b5db8f/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/SQL/3node/oltp_read_write$ --old=7b5db8f --new=2512dee --memprofile ./pkg/sql/tests
🟢 Sysbench [KV, 3node, oltp_read_only]
| Metric | Old Commit | New Commit | Delta | Note |
| --- | --- | --- | --- | --- |
| 🟢 sec/op | 3.157m ±1% | 3.109m ±1% | -1.54% | p=0.000 n=15 |
| 🟢 allocs/op | 2.101k ±0% | 2.004k ±0% | -4.62% | p=0.000 n=15 |
Reproduce

benchdiff binaries:

mkdir -p benchdiff/2512dee/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/2512deeec446ce110af25feeef1a64823a4cae40/bin/pkg_sql_tests benchdiff/2512dee/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/2512dee/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/7b5db8f/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/7b5db8f7a08cbd09f42e0638dc84c2e50160be01/bin/pkg_sql_tests benchdiff/7b5db8f/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/7b5db8f/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_read_only$ --old=7b5db8f --new=2512dee --memprofile ./pkg/sql/tests
🟢 Sysbench [KV, 3node, oltp_write_only]
| Metric | Old Commit | New Commit | Delta | Note |
| --- | --- | --- | --- | --- |
| 🟢 sec/op | 2.967m ±1% | 2.940m ±1% | -0.91% | p=0.000 n=15 |
| 🟢 allocs/op | 4.204k ±0% | 4.105k ±0% | -2.35% | p=0.000 n=15 |
Reproduce

benchdiff binaries:

mkdir -p benchdiff/2512dee/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/2512deeec446ce110af25feeef1a64823a4cae40/bin/pkg_sql_tests benchdiff/2512dee/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/2512dee/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/7b5db8f/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/7b5db8f7a08cbd09f42e0638dc84c2e50160be01/bin/pkg_sql_tests benchdiff/7b5db8f/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/7b5db8f/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_write_only$ --old=7b5db8f --new=2512dee --memprofile ./pkg/sql/tests
Artifacts

download:

mkdir -p new
gcloud storage cp gs://cockroach-microbench-ci/artifacts/2512deeec446ce110af25feeef1a64823a4cae40/24516444639-1/\* new/
mkdir -p old
gcloud storage cp gs://cockroach-microbench-ci/artifacts/7b5db8f7a08cbd09f42e0638dc84c2e50160be01/24516444639-1/\* old/

built with commit: 2512deeec446ce110af25feeef1a64823a4cae40

Replace the prometheus-backed and HDR histogram implementations with
cockroachdb/goodhistogram, a lock-free exponential histogram library
that is faster (~20ns/op vs ~130ns/op), more memory-efficient (~0.85KB
vs ~19KB), and natively produces Prometheus-compatible output (both
conventional and native histogram formats).

Key changes:

- Remove HdrHistogram entirely, completing TODO cockroachdb#96357. Delete
  hdrhistogram.go, remove HistogramModePreferHdrLatency, MaxVal,
  SigFigs from HistogramOptions, and the COCKROACH_ENABLE_HDR_HISTOGRAMS
  env var.

- Rewrite the Histogram struct to use *goodhistogram.Histogram as its
  backing store. RecordValue is now lock-free. The 2-window rotation
  scheme is preserved but uses goodhistogram instances instead of
  prometheus.HistogramInternal.

- Rewrite ManualWindowHistogram to use goodhistogram in RecordValue+Rotate
  mode. The Update mode (used by pebble metrics) retains its prometheus
  backing since pebble provides data in prometheus format.

- Rewrite HistogramSnapshot to support dual backing: goodhistogram
  snapshots (primary) and prometheusgo.Histogram (compat for Update
  mode and schedulerlatency).

- Remove the HistogramMode type and Mode field from HistogramOptions
  since there is now only one histogram implementation.

- Remove the unused RaftQuotaPoolPercentUsed histogram.

- Remove native histogram env vars (COCKROACH_ENABLE_PROMETHEUS_NATIVE_HISTOGRAMS
  etc.) since goodhistogram always produces native histogram fields.

- Add COCKROACH_ENABLE_PROMETHEUS_NATIVE_HISTOGRAMS env var (default
  true) as a safety valve to revert to the legacy prometheus-backed
  histogram implementation if issues are discovered.
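
For readers unfamiliar with a two-window rotation, the following Go sketch shows one way such a scheme can be arranged. It assumes that the current window becomes the previous one on rotation and that a windowed read merges both; the `window` and `windowedHist` types are hypothetical stand-ins and do not reflect the actual `Histogram` struct in this change.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// window is a hypothetical stand-in for one histogram instance.
type window struct {
	count atomic.Uint64
	sum   atomic.Int64
}

// windowedHist keeps the current window and the one before it.
type windowedHist struct {
	cur  atomic.Pointer[window]
	prev atomic.Pointer[window]
}

func newWindowedHist() *windowedHist {
	h := &windowedHist{}
	h.cur.Store(&window{})
	h.prev.Store(&window{})
	return h
}

// Record writes into whichever window is current, without a mutex.
func (h *windowedHist) Record(v int64) {
	w := h.cur.Load()
	w.count.Add(1)
	w.sum.Add(v)
}

// Rotate makes the current window the previous one and starts a fresh
// current window. This sketch assumes a single goroutine calls Rotate.
func (h *windowedHist) Rotate() {
	h.prev.Store(h.cur.Load())
	h.cur.Store(&window{})
}

// WindowedCount merges the previous and current windows.
func (h *windowedHist) WindowedCount() uint64 {
	return h.prev.Load().count.Load() + h.cur.Load().count.Load()
}

func main() {
	h := newWindowedHist()
	for i := 0; i < 3; i++ {
		h.Record(10)
	}
	h.Rotate()
	h.Record(20)
	h.Record(30)
	fmt.Println("windowed count:", h.WindowedCount())
	h.Rotate()
	fmt.Println("after second rotation:", h.WindowedCount())
}
```

Keeping the previous window around means a windowed read right after a rotation still covers recent data rather than starting from empty.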

The prometheus dependency upgrades (client_model v0.3.0 -> v0.6.2)
required fixing copylocks issues in a few places where the newer
protobuf types contain sync.Mutex in MessageState.
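
The copylocks class of issue can be sketched in Go as follows. The `message` type below is a hypothetical stand-in for a generated type whose state contains a mutex; it is not the client_model type, and the fix shown (passing pointers) is only one common pattern, not necessarily the one used in this change.

```go
package main

import (
	"fmt"
	"sync"
)

// message is a hypothetical stand-in for a generated protobuf type whose
// internal state contains a mutex.
type message struct {
	mu    sync.Mutex
	value uint64
}

// sumValues takes pointers, so no message (and therefore no mutex) is
// copied while iterating.
func sumValues(msgs []*message) uint64 {
	var total uint64
	for _, m := range msgs {
		m.mu.Lock()
		total += m.value
		m.mu.Unlock()
	}
	return total
}

func main() {
	msgs := []*message{{value: 1}, {value: 2}, {value: 3}}
	// Ranging over a []message by value, or passing a message by value,
	// would copy the mutex; the copylocks check in `go vet` reports that.
	fmt.Println(sumValues(msgs))
}
```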

Histogram bucket boundaries change from hand-tuned values to
goodhistogram's exponential scheme, and quantile estimation uses
trapezoidal interpolation instead of linear. Both changes improve
accuracy.
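
To make "exponential scheme" concrete, here is a small Go sketch of exponential bucket boundaries. It assumes the convention used by Prometheus native histograms, where a schema `s` gives base `2^(2^-s)` and bucket `i` covers `(base^(i-1), base^i]`; whether goodhistogram uses exactly this indexing is an assumption, and `bucketIndex` and `upperBound` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"math"
)

// bucketIndex returns the index i such that v falls in (base^(i-1), base^i],
// where base = 2^(2^-schema). v must be positive. Values sitting exactly on
// a non-power-of-two boundary may be off by one due to floating-point
// rounding; this sketch does not guard against that.
func bucketIndex(v float64, schema int) int {
	return int(math.Ceil(math.Log2(v) * math.Ldexp(1, schema)))
}

// upperBound returns base^idx, the inclusive upper edge of bucket idx.
func upperBound(idx, schema int) float64 {
	return math.Exp2(float64(idx) * math.Ldexp(1, -schema))
}

func main() {
	for _, v := range []float64{0.5, 1, 3, 100} {
		i := bucketIndex(v, 3)
		fmt.Printf("v=%g schema=3 index=%d bucket=(%g, %g]\n",
			v, i, upperBound(i-1, 3), upperBound(i, 3))
	}
}
```

A higher schema means more buckets per power of two, so the relative width of each bucket shrinks; that is what ties bucket layout to estimation accuracy.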

An alternative approach using prometheus native histograms directly
(without goodhistogram) is explored in cockroachdb#168352.

Fixes: cockroachdb#168274
Epic: CRDB-52498

Release note: None
@angles-n-daemons angles-n-daemons force-pushed the bdillmann/goodhistogram-migration branch from 2512dee to 982a968 Compare April 16, 2026 15:51
@cockroach-teamcity
Member

🟢 Sysbench [SQL, 3node, oltp_read_write]
| Metric | Old Commit | New Commit | Delta | Note |
| --- | --- | --- | --- | --- |
| sec/op | 10.29m ±3% | 10.07m ±3% | ~ | p=0.367 n=15 |
| 🟢 allocs/op | 8.124k ±1% | 7.940k ±0% | -2.26% | p=0.000 n=15 |
Reproduce

benchdiff binaries:

mkdir -p benchdiff/982a968/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/982a968c06d38bf6321b3ab8860bc650a38b8656/bin/pkg_sql_tests benchdiff/982a968/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/982a968/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/7b5db8f/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/7b5db8f7a08cbd09f42e0638dc84c2e50160be01/bin/pkg_sql_tests benchdiff/7b5db8f/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/7b5db8f/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/SQL/3node/oltp_read_write$ --old=7b5db8f --new=982a968 --memprofile ./pkg/sql/tests
🟢 Sysbench [KV, 3node, oltp_read_only]
| Metric | Old Commit | New Commit | Delta | Note |
| --- | --- | --- | --- | --- |
| sec/op | 3.085m ±1% | 3.092m ±1% | ~ | p=0.595 n=15 |
| 🟢 allocs/op | 2.100k ±0% | 2.004k ±0% | -4.57% | p=0.000 n=15 |
Reproduce

benchdiff binaries:

mkdir -p benchdiff/982a968/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/982a968c06d38bf6321b3ab8860bc650a38b8656/bin/pkg_sql_tests benchdiff/982a968/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/982a968/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/7b5db8f/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/7b5db8f7a08cbd09f42e0638dc84c2e50160be01/bin/pkg_sql_tests benchdiff/7b5db8f/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/7b5db8f/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_read_only$ --old=7b5db8f --new=982a968 --memprofile ./pkg/sql/tests
🟢 Sysbench [KV, 3node, oltp_write_only]
| Metric | Old Commit | New Commit | Delta | Note |
| --- | --- | --- | --- | --- |
| sec/op | 2.941m ±1% | 2.929m ±1% | -0.40% | p=0.021 n=15 |
| 🟢 allocs/op | 4.202k ±0% | 4.106k ±0% | -2.28% | p=0.000 n=15 |
Reproduce

benchdiff binaries:

mkdir -p benchdiff/982a968/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/982a968c06d38bf6321b3ab8860bc650a38b8656/bin/pkg_sql_tests benchdiff/982a968/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/982a968/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/7b5db8f/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/7b5db8f7a08cbd09f42e0638dc84c2e50160be01/bin/pkg_sql_tests benchdiff/7b5db8f/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/7b5db8f/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_write_only$ --old=7b5db8f --new=982a968 --memprofile ./pkg/sql/tests
Artifacts

download:

mkdir -p new
gcloud storage cp gs://cockroach-microbench-ci/artifacts/982a968c06d38bf6321b3ab8860bc650a38b8656/24520062130-1/\* new/
mkdir -p old
gcloud storage cp gs://cockroach-microbench-ci/artifacts/7b5db8f7a08cbd09f42e0638dc84c2e50160be01/24520062130-1/\* old/

built with commit: 982a968c06d38bf6321b3ab8860bc650a38b8656

Empty commit to ensure CI rebuilds from scratch and picks up the
prototext normalization fix for cross-platform compatibility.

Epic: none
Release note: None
@blathers-crl

blathers-crl bot commented Apr 16, 2026

Detected infrastructure failure (matched: self-hosted runner lost communication with the server). Automatically rerunning failed jobs. (run link)

@cockroach-teamcity
Member

🟢 Sysbench [SQL, 3node, oltp_read_write]
| Metric | Old Commit | New Commit | Delta | Note |
|---|---|---|---|---|
| sec/op | 10.52m ±1% | 10.51m ±2% | ~ | p=0.486 n=15 |
| 🟢 allocs/op | 8.132k ±1% | 7.925k ±0% | -2.55% | p=0.000 n=15 |
Reproduce

benchdiff binaries:

mkdir -p benchdiff/79b639a/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/79b639aee12c096f017809efe4bab5e865bd6f6b/bin/pkg_sql_tests benchdiff/79b639a/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/79b639a/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/7b5db8f/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/7b5db8f7a08cbd09f42e0638dc84c2e50160be01/bin/pkg_sql_tests benchdiff/7b5db8f/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/7b5db8f/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/SQL/3node/oltp_read_write$ --old=7b5db8f --new=79b639a --memprofile ./pkg/sql/tests
🟢 Sysbench [KV, 3node, oltp_read_only]
| Metric | Old Commit | New Commit | Delta | Note |
|---|---|---|---|---|
| 🟢 sec/op | 3.060m ±1% | 3.019m ±1% | -1.33% | p=0.008 n=15 |
| 🟢 allocs/op | 2.100k ±0% | 2.004k ±0% | -4.57% | p=0.000 n=15 |
Reproduce

benchdiff binaries:

mkdir -p benchdiff/79b639a/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/79b639aee12c096f017809efe4bab5e865bd6f6b/bin/pkg_sql_tests benchdiff/79b639a/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/79b639a/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/7b5db8f/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/7b5db8f7a08cbd09f42e0638dc84c2e50160be01/bin/pkg_sql_tests benchdiff/7b5db8f/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/7b5db8f/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_read_only$ --old=7b5db8f --new=79b639a --memprofile ./pkg/sql/tests
🟢 Sysbench [KV, 3node, oltp_write_only]
| Metric | Old Commit | New Commit | Delta | Note |
|---|---|---|---|---|
| sec/op | 2.908m ±2% | 2.876m ±2% | ~ | p=0.187 n=15 |
| 🟢 allocs/op | 4.201k ±0% | 4.102k ±0% | -2.36% | p=0.000 n=15 |
Reproduce

benchdiff binaries:

mkdir -p benchdiff/79b639a/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/79b639aee12c096f017809efe4bab5e865bd6f6b/bin/pkg_sql_tests benchdiff/79b639a/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/79b639a/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/7b5db8f/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/7b5db8f7a08cbd09f42e0638dc84c2e50160be01/bin/pkg_sql_tests benchdiff/7b5db8f/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/7b5db8f/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

# NB: for best (most stable) results, also add a suitable `--benchtime` that
# results in ~1s to ~5s of benchmark runs. For example, if ops average ~3ms, a
# benchtime of `1000x` is appropriate.
#
# Some benchmarks (in particular BenchmarkSysbench) output additional memory
# profiles covering only the execution (excluding the setup/teardown) - those
# should be preferred for analysis since they more closely correspond to what's
# reported as B/op and alloc/op.
benchdiff --run=^BenchmarkSysbench/KV/3node/oltp_write_only$ --old=7b5db8f --new=79b639a --memprofile ./pkg/sql/tests
Artifacts

download:

mkdir -p new
gcloud storage cp gs://cockroach-microbench-ci/artifacts/79b639aee12c096f017809efe4bab5e865bd6f6b/24522780182-1/\* new/
mkdir -p old
gcloud storage cp gs://cockroach-microbench-ci/artifacts/7b5db8f7a08cbd09f42e0638dc84c2e50160be01/24522780182-1/\* old/

built with commit: 79b639aee12c096f017809efe4bab5e865bd6f6b

Labels

X-perf-check: Microbenchmarks CI: Added to a PR if a performance regression is detected and should be checked
X-perf-gain: Microbenchmarks CI: Added if a performance gain is detected

Development

Successfully merging this pull request may close these issues.

metric: use goodhistogram as the default histogram implementation

3 participants