Inject APM trace headers into SQS _datadog message attribute on send #10843
…attribute Documents the missing send-side feature: Java never writes x-datadog-trace-id or x-datadog-parent-id into _datadog, breaking Java→other-tracer propagation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously, the Java SQS instrumentation never wrote APM trace context (x-datadog-trace-id, x-datadog-parent-id) into the _datadog message attribute, breaking distributed tracing from Java to other language tracers (Python, Node, etc.) that rely on that attribute.

Changes:
- SqsInterceptor (v1 + v2): call defaultPropagator().inject() before DSM inject so APM headers are written into _datadog on SendMessage and SendMessageBatch
- MessageAttributeInjector (v1 + v2): accumulate multiple key-value pairs into the _datadog JSON blob instead of short-circuiting on the second key
- SqsClientInstrumentation (v1 + v2): register SqsInterceptor when isSqsInjectDatadogAttributeEnabled() is true, not only when DSM is enabled, so APM injection works even without Data Streams
- Tests: add APM trace context injection test for both v1 and v2, covering DSM-on and DSM-off variants

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
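The accumulation described for MessageAttributeInjector can be illustrated with a minimal sketch. It is not the code from this PR: the class name `DatadogAttributeAccumulator` is hypothetical, and a plain `Map<String, String>` stands in for the SDK's message attribute map so the example stays self-contained.

```java
import java.util.Map;

// Hypothetical stand-in for the injector; a plain String models the attribute's string value.
final class DatadogAttributeAccumulator {
  static final String ATTRIBUTE = "_datadog";

  // Adds one "key":"value" pair to the JSON object stored under _datadog.
  static void set(Map<String, String> attributes, String key, String value) {
    String pair = quote(key) + ":" + quote(value);
    String existing = attributes.get(ATTRIBUTE);
    if (existing == null || existing.equals("{}")) {
      // First key: start a new JSON object.
      attributes.put(ATTRIBUTE, "{" + pair + "}");
    } else if (existing.endsWith("}")) {
      // Later keys: drop the closing brace, append the new pair, close again.
      attributes.put(
          ATTRIBUTE, existing.substring(0, existing.length() - 1) + "," + pair + "}");
    }
    // Anything else is not a JSON object this sketch wrote; leave it untouched.
  }

  // Minimal JSON string quoting: handles backslash and double quote only.
  static String quote(String s) {
    return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
  }
}
```

Calling `set` once with `x-datadog-trace-id` and once with `x-datadog-parent-id` leaves both pairs in the same JSON object, rather than keeping only the first.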
When a message is sent with an existing _datadog attribute of Binary type (e.g. from the SQS-JMS library), getStringValue() returns null. Skip injection in that case rather than NPE. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
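A sketch of the null guard, under the assumption that a Binary-typed attribute exposes no string value. `BinaryAttributeGuard` and its nested `Attribute` class are hypothetical stand-ins for the SDK types, and keys and values are assumed to need no JSON escaping here.

```java
import java.util.Map;

final class BinaryAttributeGuard {
  // Hypothetical stand-in for an SQS message attribute value.
  static final class Attribute {
    final String dataType;
    final String stringValue; // null when dataType is "Binary"

    Attribute(String dataType, String stringValue) {
      this.dataType = dataType;
      this.stringValue = stringValue;
    }
  }

  // Returns true if the pair was written, false if injection was skipped.
  static boolean set(Map<String, Attribute> attributes, String key, String value) {
    String pair = "\"" + key + "\":\"" + value + "\"";
    Attribute existing = attributes.get("_datadog");
    if (existing == null) {
      attributes.put("_datadog", new Attribute("String", "{" + pair + "}"));
      return true;
    }
    String json = existing.stringValue;
    if (json == null || !json.endsWith("}")) {
      // Binary-typed (or otherwise unexpected) attribute: skip instead of dereferencing null.
      return false;
    }
    attributes.put(
        "_datadog",
        new Attribute("String", json.substring(0, json.length() - 1) + "," + pair + "}"));
    return true;
  }
}
```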
I heard some suggestions that this was deliberate. Or at least, it was deliberate for baggage.
DSM_CONCERN is registered with usedAsDefault=true, so defaultPropagator() already includes DSM injection when Data Streams is enabled. The separate dsmPropagator.inject() call was therefore always redundant: a no-op when DSM is off, a duplicate when DSM is on. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
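The redundancy can be illustrated with a toy registry. This is only a sketch: `DefaultPropagatorSketch`, its `register` and `forConcern` methods, and the list-based carrier are hypothetical and do not reproduce the tracer's propagator API. It models only the behavior the commit describes, where a concern registered with usedAsDefault=true is already part of the default propagator.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DefaultPropagatorSketch {
  // Hypothetical propagator: records the header names it would write.
  interface Propagator {
    void inject(List<String> carrier);
  }

  private final Map<String, Propagator> all = new LinkedHashMap<>();
  private final Map<String, Propagator> defaults = new LinkedHashMap<>();

  void register(String concern, Propagator propagator, boolean usedAsDefault) {
    all.put(concern, propagator);
    if (usedAsDefault) {
      defaults.put(concern, propagator);
    }
  }

  Propagator forConcern(String concern) {
    return all.get(concern);
  }

  // Composite of every propagator registered with usedAsDefault=true.
  Propagator defaultPropagator() {
    return carrier -> {
      for (Propagator propagator : defaults.values()) {
        propagator.inject(carrier);
      }
    };
  }
}
```

With both a tracing and a DSM propagator registered as defaults, one `defaultPropagator().inject(carrier)` call writes both sets of headers; a further `forConcern("dsm").inject(carrier)` call writes the DSM header a second time.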
SqsInterceptor: skip injection if _datadog is already present in the message attributes. Appending to a caller-provided attribute could produce duplicate JSON keys or corrupt propagation context set intentionally upstream. MessageAttributeInjector: add comments explaining why set() accumulates into the existing _datadog value rather than overwriting — a single propagator.inject() call fires set() once per header key, so all keys (x-datadog-trace-id, x-datadog-parent-id, dd-pathway-ctx-base64, ...) must land in the same JSON attribute. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
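A minimal sketch of the send-side guard combined with the once-per-header callback pattern. The names `SendSideInjectionGuard`, `Setter`, and `injectIfAbsent` are hypothetical, and plain string maps stand in for the message attributes and for the headers a propagator would emit.

```java
import java.util.Map;

final class SendSideInjectionGuard {
  // Hypothetical callback with the same shape as a carrier setter: invoked once per header.
  interface Setter {
    void set(Map<String, String> carrier, String key, String value);
  }

  // Returns false, and writes nothing, when the caller already supplied _datadog.
  static boolean injectIfAbsent(
      Map<String, String> attributes, Map<String, String> headers, Setter setter) {
    if (attributes.containsKey("_datadog")) {
      return false;
    }
    // One inject pass fires the setter once per header key.
    for (Map.Entry<String, String> header : headers.entrySet()) {
      setter.set(attributes, header.getKey(), header.getValue());
    }
    return true;
  }
}
```

Because the setter runs once per header, it has to accumulate into the existing value; overwriting would keep only the last header written.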
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ad0d115b56
.../aws-java-sqs-1.0/src/main/java/datadog/trace/instrumentation/aws/v1/sqs/SqsInterceptor.java
Benchmarks
candidate=1.60.0-SNAPSHOT~8f7f04ca71, baseline=1.60.0-SNAPSHOT~7f5d7043a9

Startup
Summary: Found 0 performance improvements and 0 performance regressions! Performance is the same for 59 metrics, 12 unstable metrics.
[Gantt charts: startup time reports for insecure-bank and petclinic, global startup overhead and breakdown per module]

Load
Summary: Found 5 performance improvements and 1 performance regression! Performance is the same for 14 metrics, 16 unstable metrics.
[Gantt charts: request duration reports for petclinic and insecure-bank, CI 0.99]

Dacapo
Summary: Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metric.
[Gantt charts: execution time for biojava and tomcat, CI 0.99]
Moving newContext/getContext outside the per-entry loop prevents creating a new span per entry where each overwrites the same contextStore slot, leaking all but the last span and injecting mismatched parent IDs into earlier batch messages. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
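The hoisting can be sketched as follows. This is an illustration rather than the interceptor code: `BatchInjectionSketch`, `Context`, and the `Supplier` parameter are hypothetical, and a plain map per batch entry stands in for its message attributes.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

final class BatchInjectionSketch {
  // Hypothetical span context: just the two IDs carried by the headers.
  static final class Context {
    final String traceId;
    final String spanId;

    Context(String traceId, String spanId) {
      this.traceId = traceId;
      this.spanId = spanId;
    }
  }

  // Resolves the context once per batch and writes the same IDs into every entry.
  static void injectBatch(List<Map<String, String>> entries, Supplier<Context> newContext) {
    Context context = newContext.get(); // hoisted: one context per batch, not one per entry
    for (Map<String, String> entry : entries) {
      entry.put("x-datadog-trace-id", context.traceId);
      entry.put("x-datadog-parent-id", context.spanId);
    }
  }
}
```

If `newContext.get()` sat inside the loop, each entry would get its own span ID, which is the mismatch the commit describes.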
What Does This Do

Java's SQS instrumentation never wrote APM trace context (x-datadog-trace-id, x-datadog-parent-id) into the _datadog SQS message attribute on send. The receive side already knows how to read them, but the send side only injected the DSM pathway context, and only when Data Streams was enabled.

Motivation

This broke distributed tracing from Java to other language tracers (Python, Node.js, etc.) that rely on the _datadog attribute for context propagation.

Additional Notes

- SqsInterceptor (v1 + v2): Call defaultPropagator().inject() before the DSM inject on SendMessage and SendMessageBatch, so APM trace headers are written into _datadog.
- MessageAttributeInjector (v1 + v2): Accumulate multiple key-value pairs into the _datadog JSON blob instead of short-circuiting on the second call. Also guard against a null stringValue (Binary-typed _datadog attributes from SQS-JMS).
- SqsClientInstrumentation (v1 + v2): Register SqsInterceptor when isSqsInjectDatadogAttributeEnabled() is true, not only when Data Streams is enabled.
- Tests: "APM trace context is injected into _datadog message attribute on send" for both AWS SDK v1 and v2, covering both DSM-off (V0Test) and DSM-on (V0DataStreamsTest) variants.

DD context extraction was added in #5920
Contributor Checklist

- Assign type: and (comp: or inst:) labels in addition to any other useful labels
- Avoid close, fix, or any linking keywords when referencing an issue. Use solves instead, and assign the PR milestone to the issue

Jira ticket: APMS-18847

Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.