Add bursty-load JMH benchmark for BatchSpanProcessor #8178

Open
adp2201 wants to merge 1 commit into open-telemetry:main from adp2201:codex/bursty-bsp-benchmark-7508

adp2201 commented Mar 11, 2026

Summary

  • add a new JMH benchmark: BatchSpanProcessorBurstyLoadBenchmark
  • simulate bursty span production with cooldown periods to make burst/load behavior reproducible
  • report dropRatio, droppedSpans, and exportedSpans via BatchSpanProcessorMetrics aux counters
  • keep production behavior unchanged (benchmark-only addition in sdk/trace JMH sources)
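
For readers who want a feel for the burst-then-cooldown pattern without running JMH, here is a minimal plain-Java sketch. It is not the benchmark added in this PR: the class `BurstyQueueModel`, its `simulate` method, and the tick-based exporter are all hypothetical, and the model is single-threaded and deterministic, whereas the real BatchSpanProcessor drains its queue on a worker thread.

```java
import java.util.ArrayDeque;

// Hypothetical, simplified model of bursty production against a bounded queue.
// It only illustrates how dropRatio, droppedSpans and exportedSpans relate.
final class BurstyQueueModel {

  /** Totals collected over one simulated run. */
  static final class Result {
    final long exported;
    final long dropped;

    Result(long exported, long dropped) {
      this.exported = exported;
      this.dropped = dropped;
    }

    /** Fraction of produced spans that were dropped. */
    double dropRatio() {
      long total = exported + dropped;
      return total == 0 ? 0.0 : (double) dropped / total;
    }
  }

  static Result simulate(
      int bursts, int burstSize, int cooldownTicks, int maxQueueSize, int batchPerTick) {
    ArrayDeque<Integer> queue = new ArrayDeque<>();
    long exported = 0;
    long dropped = 0;
    for (int b = 0; b < bursts; b++) {
      // Burst phase: all spans arrive at once; anything beyond capacity is dropped.
      for (int i = 0; i < burstSize; i++) {
        if (queue.size() < maxQueueSize) {
          queue.add(i);
        } else {
          dropped++;
        }
      }
      // Cooldown phase: the exporter drains up to batchPerTick spans per tick.
      for (int t = 0; t < cooldownTicks; t++) {
        for (int k = 0; k < batchPerTick && !queue.isEmpty(); k++) {
          queue.poll();
          exported++;
        }
      }
    }
    // Spans still queued at the end are counted as exported (a final flush).
    exported += queue.size();
    return new Result(exported, dropped);
  }

  public static void main(String[] args) {
    Result r = simulate(2, 100, 1, 64, 64);
    System.out.println(
        "exported=" + r.exported + " dropped=" + r.dropped + " dropRatio=" + r.dropRatio());
  }
}
```

In this model, drops appear as soon as `burstSize` exceeds `maxQueueSize`, and a longer cooldown only helps if the queue has not fully drained before the next burst.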

Motivation

This is a small, low-risk first increment for #7508: establish a reproducible benchmark harness before discussing any default semantic changes to BatchSpanProcessor.

Tests Run

Using Java 21:

  • ./gradlew check
  • ./gradlew :sdk:trace:compileJmhJava :sdk:trace:checkstyleJmh
  • ./gradlew --no-configuration-cache -PjmhIncludeSingleClass=BatchSpanProcessorBurstyLoadBenchmark :sdk:trace:jmh

Next Steps

  1. Add scenario(s) that intentionally saturate queue/export path to produce measurable drop rates.
  2. Add comparative runs against opt-in alternative processor/export strategies (likely in contrib first).
  3. Use data from these benchmarks to guide follow-up discussion in #7508 (Optimize data loss issues for high-resource single-instance deployments that generate a large number of spans).

Part of #7508.

@adp2201 adp2201 requested a review from a team as a code owner March 11, 2026 21:15

linux-foundation-easycla bot commented Mar 11, 2026

CLA Signed
The committers listed above are authorized under a signed CLA.

  • ✅ login: adp2201 / name: Amol Patil (93a34bb)

codecov bot commented Mar 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.29%. Comparing base (839f235) to head (93a34bb).
⚠️ Report is 15 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##               main    #8178   +/-   ##
=========================================
  Coverage     90.29%   90.29%           
  Complexity     7650     7650           
=========================================
  Files           843      843           
  Lines         23059    23059           
  Branches       2309     2309           
=========================================
  Hits          20822    20822           
  Misses         1519     1519           
  Partials        718      718           

@jack-berg
Member

Output on my machine doesn't show results for dropRatio, droppedSpans, exportedSpans figures:

Benchmark                                                              (burstSize)  (cooldownMs)  (exporterDelayMs)  (maxExportBatchSize)  (maxQueueSize)  (scheduleDelayMs)   Mode  Cnt        Score     Error   Units
BatchSpanProcessorBurstyLoadBenchmark.exportBursty                           20000             5                 20                    64              64                200  thrpt    5      122.324 ±   7.039   ops/s
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:dropRatio                 20000             5                 20                    64              64                200  thrpt    5          ≈ 0                 #
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:droppedSpans              20000             5                 20                    64              64                200  thrpt    5          ≈ 0                 #
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:exportedSpans             20000             5                 20                    64              64                200  thrpt    5          ≈ 0                 #
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:gc.alloc.rate             20000             5                 20                    64              64                200  thrpt    5      945.857 ±  52.240  MB/sec
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:gc.alloc.rate.norm        20000             5                 20                    64              64                200  thrpt    5  8161696.618 ±  76.383    B/op
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:gc.count                  20000             5                 20                    64              64                200  thrpt    5       54.000            counts
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:gc.time                   20000             5                 20                    64              64                200  thrpt    5       27.000                ms
BatchSpanProcessorBurstyLoadBenchmark.exportBursty                           20000             5                 20                    64            2048                200  thrpt    5      113.924 ±   4.475   ops/s
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:dropRatio                 20000             5                 20                    64            2048                200  thrpt    5          ≈ 0                 #
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:droppedSpans              20000             5                 20                    64            2048                200  thrpt    5          ≈ 0                 #
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:exportedSpans             20000             5                 20                    64            2048                200  thrpt    5          ≈ 0                 #
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:gc.alloc.rate             20000             5                 20                    64            2048                200  thrpt    5      771.276 ±  32.336  MB/sec
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:gc.alloc.rate.norm        20000             5                 20                    64            2048                200  thrpt    5  8162079.491 ±  67.104    B/op
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:gc.count                  20000             5                 20                    64            2048                200  thrpt    5       49.000            counts
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:gc.time                   20000             5                 20                    64            2048                200  thrpt    5       35.000                ms
BatchSpanProcessorBurstyLoadBenchmark.exportBursty                           20000            25                 20                    64              64                200  thrpt    5       30.480 ±   2.047   ops/s
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:dropRatio                 20000            25                 20                    64              64                200  thrpt    5          ≈ 0                 #
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:droppedSpans              20000            25                 20                    64              64                200  thrpt    5          ≈ 0                 #
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:exportedSpans             20000            25                 20                    64              64                200  thrpt    5          ≈ 0                 #
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:gc.alloc.rate             20000            25                 20                    64              64                200  thrpt    5      237.012 ±  17.006  MB/sec
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:gc.alloc.rate.norm        20000            25                 20                    64              64                200  thrpt    5  8166722.877 ± 493.784    B/op
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:gc.count                  20000            25                 20                    64              64                200  thrpt    5       13.000            counts
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:gc.time                   20000            25                 20                    64              64                200  thrpt    5       11.000                ms
BatchSpanProcessorBurstyLoadBenchmark.exportBursty                           20000            25                 20                    64            2048                200  thrpt    5       30.428 ±   1.131   ops/s
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:dropRatio                 20000            25                 20                    64            2048                200  thrpt    5          ≈ 0                 #
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:droppedSpans              20000            25                 20                    64            2048                200  thrpt    5          ≈ 0                 #
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:exportedSpans             20000            25                 20                    64            2048                200  thrpt    5          ≈ 0                 #
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:gc.alloc.rate             20000            25                 20                    64            2048                200  thrpt    5      206.760 ±   7.237  MB/sec
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:gc.alloc.rate.norm        20000            25                 20                    64            2048                200  thrpt    5  8167788.481 ± 392.257    B/op
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:gc.count                  20000            25                 20                    64            2048                200  thrpt    5       13.000            counts
BatchSpanProcessorBurstyLoadBenchmark.exportBursty:gc.time                   20000            25                 20                    64            2048                200  thrpt    5       13.000                ms

Author

adp2201 commented Mar 12, 2026

Thanks for flagging this. I reproduced the same output (≈ 0 for dropRatio, droppedSpans, exportedSpans) and investigated.

What I found:

  • This is not unique to BatchSpanProcessorBurstyLoadBenchmark; existing BatchSpanProcessorDroppedSpansBenchmark shows the same pattern.
  • In both cases, these values are assigned in @TearDown(Level.Iteration).
  • JMH aux counters are reported as rates relative to benchmark operations, so values populated only at iteration teardown end up appearing as near-zero.

What we can do about it:

  1. Rework the benchmark so aux counters are updated during measured benchmark operations (incremental deltas), not only in teardown.
  2. If we want strict totals instead of rates, add explicit iteration-total reporting (separate from aux counters).
  3. Tune params to intentionally saturate queue/export so drop behavior becomes clearly visible and comparable across runs.
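
Option 1 could look roughly like the sketch below: on each measured operation, read a cumulative total and add only the difference since the last read to the reported counter. The `DeltaCounter` class and the `LongSupplier` source are hypothetical stand-ins. In the real benchmark the counter would live in the JMH aux-counter state and the source would be whatever exposes BatchSpanProcessor's exported or dropped totals.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical helper: turns a cumulative total into per-operation deltas.
final class DeltaCounter {
  private final LongSupplier cumulativeSource;
  private long lastSeen;
  private long total; // value a benchmark-visible counter would accumulate

  DeltaCounter(LongSupplier cumulativeSource) {
    this.cumulativeSource = cumulativeSource;
    // Start from the current value so earlier activity (e.g. warmup) is not counted.
    this.lastSeen = cumulativeSource.getAsLong();
  }

  /** Call once per measured operation; returns the increase since the previous call. */
  long recordDelta() {
    long now = cumulativeSource.getAsLong();
    long delta = now - lastSeen;
    lastSeen = now;
    total += delta;
    return delta;
  }

  long total() {
    return total;
  }

  public static void main(String[] args) {
    AtomicLong droppedSoFar = new AtomicLong(5); // pretend 5 spans were dropped during warmup
    DeltaCounter dropped = new DeltaCounter(droppedSoFar::get);
    droppedSoFar.addAndGet(3); // drops that happen during the first measured operation
    System.out.println("delta=" + dropped.recordDelta() + " total=" + dropped.total());
  }
}
```

Because the counter grows inside the measured operation rather than being assigned once in teardown, it does not depend on when teardown runs relative to result collection.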

Thoughts on this approach?

@jack-berg
Member

> Thoughts on this approach?

I'm new to JMH aux counters so the onus will be on you to figure out how to use them appropriately to demonstrate the problem, and eventually show improvement. 🙂

Not sure if it's useful to you, but BatchSpanProcessor already collects internal telemetry capturing some (all?) of these concepts. The metrics conform to the otel.sdk.processor.span.* metrics from semantic conventions: https://github.com/open-telemetry/semantic-conventions/blob/main/docs/otel/sdk-metrics.md

Consider whether it's possible / helpful to expose the metrics you want by setting up internal telemetry and reading it out in the JMH aux counter accessors.

Author

adp2201 commented Mar 12, 2026

Thanks, this is really helpful context.

I’d like to keep this PR focused and incorporate your suggestion directly:

  1. keep this PR scoped to the benchmark harness only (no BSP behavior/API changes),
  2. rework aux-counter reporting to use incremental deltas during measured execution (instead of teardown-only snapshots),
  3. use BSP internal telemetry as the source for exported/dropped metrics,
  4. add one drop-prone parameter set so output shows non-zero drop behavior clearly.
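
For item 4, a back-of-the-envelope estimate can help pick parameters before running anything. The sketch below is a rough model, not BatchSpanProcessor's actual behavior: it assumes the exporter completes one batch of `maxExportBatchSize` spans every `exporterDelayMs`, and that a burst arrives faster than the exporter drains anything. The class and method names are hypothetical.

```java
// Hypothetical estimate helpers; the model is deliberately simplistic.
final class DropEstimate {

  /** Spans per second the exporter can sustain if each batch takes exporterDelayMs. */
  static long sustainedExportRatePerSecond(int maxExportBatchSize, int exporterDelayMs) {
    return (long) maxExportBatchSize * 1000L / exporterDelayMs;
  }

  /** Spans that do not fit in the queue when a whole burst arrives at once. */
  static long overflowPerInstantBurst(int burstSize, int maxQueueSize) {
    return Math.max(0L, (long) burstSize - maxQueueSize);
  }

  public static void main(String[] args) {
    // Parameter values taken from the benchmark output earlier in this thread.
    System.out.println("rate=" + sustainedExportRatePerSecond(64, 20) + " spans/s");
    System.out.println("overflow(queue=2048)=" + overflowPerInstantBurst(20000, 2048));
    System.out.println("overflow(queue=64)=" + overflowPerInstantBurst(20000, 64));
  }
}
```

Under these assumptions, a burst of 20000 spans overflows both queue sizes from the earlier run by a wide margin, so the `≈ 0` values in that output look more like a reporting issue than evidence that nothing was dropped.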

And yes, I think a follow-up issue makes sense for anything broader we uncover (e.g., aligning aux-counter/reporting patterns across other BSP JMH benchmarks) so we can keep this PR small and still track the improvements.

What do you think?
