Rework and publish metric benchmarks #8000
Conversation
```diff
       with:
         tool: 'jmh'
-        output-file-path: sdk/trace/build/jmh-result.json
+        output-file-path: sdk/all/build/jmh-result.json
```
@tylerbenson this benchmark-action only allows a single output file path. This means we need all the published benchmarks to be in a single module, so that we can run them with a single `java -jar *-jmh.jar ...` command. I think the opentelemetry-sdk artifact is a good spot for this.
This turns out to be a useful constraint, as I think it will be nice to have all the public benchmarks colocated.
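For context, the publishing step ends up looking roughly like the following. This is a sketch based on `benchmark-action/github-action-benchmark`'s documented inputs, not a copy of this PR's workflow; only `tool` and `output-file-path` come from the diff above, and the step name and remaining options are assumptions:

```yaml
# Hypothetical sketch of the benchmark publishing step.
- name: Publish benchmark results
  uses: benchmark-action/github-action-benchmark@v1
  with:
    tool: 'jmh'
    # The action accepts a single output-file-path, so every published
    # benchmark must be runnable from one module and emit into one JSON file.
    output-file-path: sdk/all/build/jmh-result.json
    github-token: ${{ secrets.GITHUB_TOKEN }}
    auto-push: true
```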
LGTM!
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@             Coverage Diff              @@
##               main    #8000      +/-   ##
============================================
+ Coverage     90.13%   90.14%   +0.01%
- Complexity     7469     7476       +7
============================================
  Files           833      834       +1
  Lines         22523    22540      +17
  Branches       2234     2236       +2
============================================
+ Hits          20301    20319      +18
+ Misses         1517     1516       -1
  Partials        705      705
```
tylerbenson left a comment
While you're at it, you might consider squashing the commit history for the results branch, as it continues to grow with each run, especially since this change will invalidate prior results.
Pull request overview
This PR refactors the metrics benchmarking infrastructure to focus on high-level, user-relevant performance characteristics. The old micro-benchmarks in MetricsBenchmarks are replaced with a new comprehensive MetricRecordBenchmark that measures metric recording performance across multiple dimensions including instrument type/aggregation, temporality, cardinality, and thread count. The GitHub Actions workflow is updated to run and publish the new benchmarks to https://open-telemetry.github.io/opentelemetry-java/benchmarks/.
Changes:
- Removed old benchmark classes (`TestSdk`, `MetricsTestOperationBuilder`, `MetricsBenchmarks`) that were too granular and less meaningful for end users
- Added new `MetricRecordBenchmark` that comprehensively tests metric recording performance across 40 meaningful test cases
- Updated the GitHub Actions workflow to run the new benchmark from the sdk/all module instead of the trace benchmarks
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| sdk/metrics/src/jmh/java/io/opentelemetry/sdk/metrics/TestSdk.java | Deleted old SDK configuration enum used by previous benchmarks |
| sdk/metrics/src/jmh/java/io/opentelemetry/sdk/metrics/MetricsTestOperationBuilder.java | Deleted old operation builder enum for micro-benchmarks |
| sdk/metrics/src/jmh/java/io/opentelemetry/sdk/metrics/MetricsBenchmarks.java | Deleted old micro-benchmark suite |
| sdk/all/src/jmh/java/io/opentelemetry/sdk/MetricRecordBenchmark.java | Added comprehensive benchmark measuring metric recording across instrument types, temporalities, cardinalities, and thread counts |
| sdk/all/build.gradle.kts | Added JMH dependency on testing module |
| .github/workflows/benchmark.yml | Updated to run new MetricRecordBenchmark from sdk/all instead of trace benchmarks |
As mentioned in #7986, I've been working through some ideas to improve the performance of the metric SDK under high contention.

To illustrate the impact of these changes, I've reworked `MetricsBenchmark` to include the dimensions that impact record performance. The full set of dimensions that play some role forms 2 × 2 × 2 × 2 × 2 × 2 × 5 = 320 unique test cases, which is just impractical. So I narrowed it down to the most meaningful dimensions. With the rest eliminated, we're down to 2 × 2 × 2 × 5 = 40 test cases, which is more reasonable.
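Two of the dimensions called out in the overview above, cardinality and thread count, interact directly: each distinct attribute set maps to its own accumulator, so higher cardinality spreads writers across more accumulators while more threads concentrate contention on each one. A minimal stdlib stand-in illustrating that shape (this is not the SDK's actual storage; `RecordContentionSketch` and all of its names are hypothetical):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for attribute-keyed aggregation storage: one
// accumulator per distinct attribute set, shared across recording threads.
public class RecordContentionSketch {
  private final ConcurrentHashMap<String, LongAdder> storage = new ConcurrentHashMap<>();

  void record(String attributes, long value) {
    // computeIfAbsent gives lock-free reads on the hot path once the
    // accumulator exists; LongAdder absorbs write contention per series.
    storage.computeIfAbsent(attributes, k -> new LongAdder()).add(value);
  }

  public static void main(String[] args) throws InterruptedException {
    RecordContentionSketch sketch = new RecordContentionSketch();
    int threads = 4, cardinality = 10, recordsPerThread = 1000;
    Thread[] workers = new Thread[threads];
    for (int t = 0; t < threads; t++) {
      workers[t] = new Thread(() -> {
        for (int i = 0; i < recordsPerThread; i++) {
          sketch.record("series-" + (i % cardinality), 1);
        }
      });
      workers[t].start();
    }
    for (Thread w : workers) w.join();
    long total = sketch.storage.values().stream().mapToLong(LongAdder::sum).sum();
    System.out.println(total);
  }
}
```

Running it prints 4000 (4 threads × 1000 records), regardless of how the writes were interleaved across the 10 series.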
I'm also using this as an opportunity to finish what @tylerbenson started: getting into the routine of running benchmarks on each change on dedicated hardware, and publishing the results at https://open-telemetry.github.io/opentelemetry-java/benchmarks/.
The unfinished problem was that the benchmarks in this repo are micro-benchmarks. They're not very meaningful for end users and may even do more harm than good. What we need is a curated set of somewhat high-level benchmarks, intentionally built to demonstrate and report on the types of performance characteristics that matter to end users.
This revamped `MetricRecordBenchmark` is the first of these. I will follow up with dedicated benchmarks for other areas.

For reference, here are the results of the revamped `MetricRecordBenchmark` on my machine: