
fix: observe baggage limits on extraction #313

Merged
zacharycmontoya merged 5 commits into main from zach.montoya/baggage-improvements
May 4, 2026

Conversation

@zacharycmontoya
Contributor

Description

Implements the baggage limits on context extraction. Previously, the limits applied only during context injection. This PR also applies context-extraction limits to the dd tracestate information.

Motivation

We want to implement a reasonable upper-bound on the size of incoming propagation headers.

Additional Notes

N/A

@zacharycmontoya requested review from a team as code owners on May 4, 2026 15:12

@chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6c50e01910

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread include/datadog/baggage.h
@zacharycmontoya requested a review from dubloom on May 4, 2026 16:48
Comment thread examples/http-server/common/httplib.h
Comment thread src/datadog/baggage.cpp
Comment thread src/datadog/baggage.cpp
}

// If the "dd" vendor entry's value exceeds 512 bytes, drop it and record a propagation error tag.
if (datadog_value.size() > 512) {
Contributor Author

@zacharycmontoya commented May 4, 2026


FYI: we only parse and manipulate our own section of the W3C tracestate header. Entries from other vendors are always passed through unchanged; our portion is parsed, or discarded if it exceeds the configured limits.
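To illustrate the pass-through behavior described above, here is a hedged, simplified sketch (not the actual dd-trace-cpp code; the function name, the comma-splitting via getline, and the parameterized limit are all stand-ins, and the diff excerpt above hardcodes 512):

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split tracestate on commas, keep every non-"dd" vendor entry verbatim,
// and keep the "dd" entry only if its value fits within the limit.
std::vector<std::string> filter_tracestate(const std::string& tracestate,
                                           std::size_t max_dd_value_size) {
  std::vector<std::string> kept;
  std::istringstream entries(tracestate);
  std::string entry;
  while (std::getline(entries, entry, ',')) {
    const bool is_dd = entry.rfind("dd=", 0) == 0;  // starts with "dd="
    if (is_dd && entry.size() - 3 > max_dd_value_size) {
      continue;  // drop only our own oversized entry; others pass through
    }
    kept.push_back(entry);
  }
  return kept;
}
```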


@awforsythe left a comment


(Apologies for my verbosity, but I like to talk through the thought process when I'm reviewing urgent changes in codebases I don't know well)

baggage.h / .cpp

We're introducing two separate limits, max_bytes (total payload size) and max_items (total number of key/value pairs parsed). These limits are now enforced in parse_baggage.

max_bytes gets checked up-front with a simple length check on the input string: if the header value exceeds the maximum size, we reject it outright and perform no parsing. Simple enough.

max_items gets checked any time we're about to push a newly-parsed value into the result map. There are three existing places where the function calls result.emplace (and it doesn't appear to mutate result in any other way, so this list oughta be exhaustive):

  1. In the consume_value branch of the state machine logic, after we've reached the end of a value in the string
  2. In the post-state-machine code that handles the last value, if we've concluded parsing in the value state, meaning we have a final value to add to the map
  3. In the same code, but in the trailing_spaces_value || properties state, which has slightly different arithmetic for grabbing the portion of the string that represents the value but is otherwise duplicated (duplication that is extremely minor and would be silly to refactor in an urgent bug-fix change)

...so in each of those three places, we now have preemptive checks for result.size() >= max_items. This will correctly allow the case where we have max_items values, and abort with an error in any conceivable case where we could otherwise accumulate more than max_items.
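For the record, the two limits interact roughly like this hedged sketch (a stand-in, not the real state-machine parser in src/datadog/baggage.cpp; the getline-based splitting and the nullopt-on-error convention are assumptions for illustration):

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <sstream>
#include <string>

std::optional<std::map<std::string, std::string>> parse_baggage(
    const std::string& header, std::size_t max_bytes, std::size_t max_items) {
  // Limit 1 (max_bytes): reject oversized payloads up front, before parsing.
  if (header.size() > max_bytes) return std::nullopt;

  std::map<std::string, std::string> result;
  std::istringstream entries(header);
  std::string entry;
  while (std::getline(entries, entry, ',')) {
    // Limit 2 (max_items): checked before every insertion, so exactly
    // max_items entries are allowed and one more aborts with an error.
    if (result.size() >= max_items) return std::nullopt;
    const auto eq = entry.find('=');
    if (eq == std::string::npos) return std::nullopt;  // malformed pair
    result.emplace(entry.substr(0, eq), entry.substr(eq + 1));
  }
  return result;
}
```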

Validation logic looks correct to me.

parse_baggage is internal to baggage.cpp, so the extra parameter is fine. Baggage::extract is public, but the newly-added parameter is declared with a default so it won't break usage. The Codex review note re: this nevertheless breaking linkage seems worth considering, but I don't know what level of binary compatibility we're concerned with here, so I'll leave that to you.
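The source-vs-binary compatibility distinction above can be illustrated with a toy example (names are hypothetical stand-ins, not the real dd-trace-cpp types): a defaulted parameter keeps old call sites compiling unchanged, but it changes the function's mangled symbol, so binaries linked against the old signature would fail to resolve it.

```cpp
#include <cstddef>
#include <string>

struct BaggageOpts {  // stand-in for the real options type
  std::size_t max_items = 64;
};

// Old declaration:  std::size_t extract(const std::string& header);
// New declaration, source-compatible thanks to the default argument,
// but with a different mangled name (so not link-compatible):
std::size_t extract(const std::string& header,
                    const BaggageOpts& opts = BaggageOpts{}) {
  // Placeholder body: count comma-separated entries, capped at max_items.
  std::size_t items = header.empty() ? 0 : 1;
  for (char c : header) items += (c == ',');
  return items < opts.max_items ? items : opts.max_items;
}

// extract("a=1,b=2");                  // old call sites still compile
// extract("a=1,b=2", BaggageOpts{1});  // new call sites can pass options
```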

extraction_util.cpp

This looks like an entirely separate limit, and presumably X-Datadog-Tags is something that the library sets internally, hence the hardcoded, non-configurable limit. I'm guessing that exceeding 512 bytes would be unusual, and would indicate either a bug in the library or (more likely) the user configuring excessively long values.

Anyway, yup, logic is straightforward and looks good.
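A minimal sketch of that early-out shape, assuming a drop-and-tag policy (the tag name "_dd.propagation_error" and value "extract_max_size" are my assumptions here, not copied from the diff):

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

constexpr std::size_t kMaxTagsHeaderSize = 512;  // hardcoded upper bound

void extract_tags(const std::string& header_value,
                  std::unordered_map<std::string, std::string>& span_tags) {
  if (header_value.size() > kMaxTagsHeaderSize) {
    // Drop the header entirely and record why, rather than truncating.
    span_tags["_dd.propagation_error"] = "extract_max_size";
    return;
  }
  // ... normal parsing of the comma-separated tag pairs would go here ...
}
```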

tracer.cpp

We're just passing baggage_opts_ to Baggage::extract. baggage_opts_ is preexisting. OK.

w3c_propagation.cpp

extract_tracestate (internal) gets a new parameter so that we can mutate span_tags if validation fails.

We have another separate, hardcoded 512-byte check, this time for datadog_value. Again, presumably internal-only and so a reasonable upper-bound-of-sanity to hardcode. Again, very simple logic to early-out if we exceed the max length.

test/

Tests look sane; test coverage on Baggage::extract looks good. It might be nice to ensure that there are enough test cases to cover both the value and trailing_spaces_value || properties branches for accumulation of the final value, but that's a nitpick.

Other tests look good.

(Sidenote: I've been doing the GENERATE(values<TestCase>) pattern but in a stupid way that uses DYNAMIC_SECTION; I knew there had to be a better way to do that. Thanks for the good example!)
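(For anyone unfamiliar with the pattern in that sidenote: Catch2's GENERATE(values<TestCase>) re-runs a test body once per element of a test-case table. Outside Catch2, the same table-driven idea looks roughly like this sketch; within_limit is a stand-in function, not code from the PR.)

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the function under test.
bool within_limit(const std::string& header, std::size_t max_bytes) {
  return header.size() <= max_bytes;
}

struct TestCase {
  std::string name;
  std::string input;
  std::size_t max_bytes;
  bool expected;
};

// Runs every case in the table; returns true only if all pass.
bool run_cases() {
  const std::vector<TestCase> cases = {
      {"empty header", "", 8, true},
      {"at the limit", "k=aaaaaa", 8, true},
      {"over the limit", "k=aaaaaaa", 8, false},
  };
  for (const auto& tc : cases) {
    if (within_limit(tc.input, tc.max_bytes) != tc.expected) return false;
  }
  return true;
}
```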


As long as all of my stated assumptions line up with reality, LGTM.

@zacharycmontoya
Contributor Author

Thanks, Alex! Yes, all of the assumptions and reasoning are accurate; thanks for the review. I'll re-run CI (the failures look like flaky GitHub Actions runners / Docker build times), and then it should be good to merge.

@pr-commenter

pr-commenter Bot commented May 4, 2026

Benchmarks

Benchmark execution time: 2026-05-04 21:12:44

Comparing candidate commit d4c0de6 in PR branch zach.montoya/baggage-improvements with baseline commit 5dfaf40 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 1 metric; 0 unstable metrics.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because often changes are not that big:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'

@datadog-prod-us1-5

datadog-prod-us1-5 Bot commented May 4, 2026

🎯 Code Coverage (details)
Patch Coverage: 96.15%
Overall Coverage: 90.82%

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: d4c0de6 | Docs | Datadog PR Page | Give us feedback!

Comment thread src/datadog/baggage.cpp Outdated
Comment thread src/datadog/baggage.cpp
Comment thread src/datadog/extraction_util.cpp Outdated
Comment thread src/datadog/extraction_util.cpp Outdated
Comment thread src/datadog/w3c_propagation.cpp
Comment thread test/test_trace_segment.cpp
Comment thread test/test_baggage.cpp Outdated
Comment thread test/test_trace_segment.cpp
Comment thread test/test_tracer.cpp Outdated
@zacharycmontoya merged commit 422dc91 into main on May 4, 2026
124 of 133 checks passed
@zacharycmontoya deleted the zach.montoya/baggage-improvements branch on May 4, 2026 22:11


3 participants