
fix: observe baggage limits on extraction #313

Merged
zacharycmontoya merged 5 commits into main from zach.montoya/baggage-improvements
May 4, 2026

Conversation

@zacharycmontoya
Contributor

Description

Implements the baggage limits on context extraction. Previously, the limits applied only during context injection. This PR also applies context-extraction limits to the dd tracestate information.

Motivation

We want to implement a reasonable upper-bound on the size of incoming propagation headers.

Additional Notes

N/A

@zacharycmontoya requested review from a team as code owners on May 4, 2026 15:12

@chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6c50e01910

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread include/datadog/baggage.h
@zacharycmontoya requested a review from dubloom on May 4, 2026 16:48
Comment thread examples/http-server/common/httplib.h
Comment thread src/datadog/baggage.cpp
Comment thread src/datadog/baggage.cpp
}

// If the "dd" vendor entry's value exceeds 512 bytes, drop it and record a propagation error tag.
if (datadog_value.size() > 512) {
Contributor Author

@zacharycmontoya commented May 4, 2026


FYI: we only parse and manipulate our own section of the W3C tracestate header. Entries from other vendors are always passed through unchanged; our portion is parsed, or discarded if it exceeds the configured limits.
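To illustrate the pass-through behavior described above, here is a hedged, simplified sketch (not the actual dd-trace-cpp code; the function name, the comma-splitting via getline, and the parameterized limit are all stand-ins, and the diff excerpt above hardcodes 512):

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split tracestate on commas, keep every non-"dd" vendor entry verbatim,
// and keep the "dd" entry only if its value fits within the limit.
std::vector<std::string> filter_tracestate(const std::string& tracestate,
                                           std::size_t max_dd_value_size) {
  std::vector<std::string> kept;
  std::istringstream entries(tracestate);
  std::string entry;
  while (std::getline(entries, entry, ',')) {
    const bool is_dd = entry.rfind("dd=", 0) == 0;  // starts with "dd="
    if (is_dd && entry.size() - 3 > max_dd_value_size) {
      continue;  // drop only our own oversized entry; others pass through
    }
    kept.push_back(entry);
  }
  return kept;
}
```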


@awforsythe left a comment


(Apologies for my verbosity, but I like to talk through the thought process when I'm reviewing urgent changes in codebases I don't know well)

baggage.h / .cpp

We're introducing two separate limits, max_bytes (total payload size) and max_items (total number of key/value pairs parsed). These limits are now enforced in parse_baggage.

max_bytes gets checked up-front with a simple length check on the input string: if the header value exceeds the maximum size, we reject it outright and perform no parsing. Simple enough.

max_items gets checked any time we're about to push a newly-parsed value into the result map. There are three existing places where the function calls result.emplace (and it doesn't appear to mutate result in any other way, so this list oughta be exhaustive):

  1. In the consume_value branch of the state machine logic, after we've reached the end of a value in the string
  2. In the post-state-machine code that handles the last value, if we've concluded parsing in the value state, meaning we have a final value to add to the map
  3. In the same code, but in the trailing_spaces_value || properties state, which has slightly different arithmetic for grabbing the portion of the string that represents the value but is otherwise duplicated (duplication that is extremely minor and would be silly to refactor in an urgent bug-fix change)

...so in each of those three places, we now have preemptive checks for result.size() >= max_items. This will correctly allow the case where we have max_items values, and abort with an error in any conceivable case where we could otherwise accumulate more than max_items.
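For the record, the two limits interact roughly like this hedged sketch (a stand-in, not the real state-machine parser in src/datadog/baggage.cpp; the getline-based splitting and the nullopt-on-error convention are assumptions for illustration):

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <sstream>
#include <string>

std::optional<std::map<std::string, std::string>> parse_baggage(
    const std::string& header, std::size_t max_bytes, std::size_t max_items) {
  // Limit 1 (max_bytes): reject oversized payloads up front, before parsing.
  if (header.size() > max_bytes) return std::nullopt;

  std::map<std::string, std::string> result;
  std::istringstream entries(header);
  std::string entry;
  while (std::getline(entries, entry, ',')) {
    // Limit 2 (max_items): checked before every insertion, so exactly
    // max_items entries are allowed and one more aborts with an error.
    if (result.size() >= max_items) return std::nullopt;
    const auto eq = entry.find('=');
    if (eq == std::string::npos) return std::nullopt;  // malformed pair
    result.emplace(entry.substr(0, eq), entry.substr(eq + 1));
  }
  return result;
}
```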

Validation logic looks correct to me.

parse_baggage is internal to baggage.cpp, so the extra parameter is fine. Baggage::extract is public, but the newly-added parameter is declared with a default so it won't break usage. The Codex review note re: this nevertheless breaking linkage seems worth considering, but I don't know what level of binary compatibility we're concerned with here, so I'll leave that to you.
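The source-vs-binary compatibility distinction above can be illustrated with a toy example (names are hypothetical stand-ins, not the real dd-trace-cpp types): a defaulted parameter keeps old call sites compiling unchanged, but it changes the function's mangled symbol, so binaries linked against the old signature would fail to resolve it.

```cpp
#include <cstddef>
#include <string>

struct BaggageOpts {  // stand-in for the real options type
  std::size_t max_items = 64;
};

// Old declaration:  std::size_t extract(const std::string& header);
// New declaration, source-compatible thanks to the default argument,
// but with a different mangled name (so not link-compatible):
std::size_t extract(const std::string& header,
                    const BaggageOpts& opts = BaggageOpts{}) {
  // Placeholder body: count comma-separated entries, capped at max_items.
  std::size_t items = header.empty() ? 0 : 1;
  for (char c : header) items += (c == ',');
  return items < opts.max_items ? items : opts.max_items;
}

// extract("a=1,b=2");                  // old call sites still compile
// extract("a=1,b=2", BaggageOpts{1});  // new call sites can pass options
```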

extraction_util.cpp

This looks like an entirely separate limit, and presumably X-Datadog-Tags is something that the library sets internally, hence the hardcoded, non-configurable limit. I'm guessing that exceeding 512 bytes would be unusual, and would indicate either a bug in the library or (more likely) the user configuring excessively long values.

Anyway, yup, logic is straightforward and looks good.
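A minimal sketch of that early-out shape, assuming a drop-and-tag policy (the tag name "_dd.propagation_error" and value "extract_max_size" are my assumptions here, not copied from the diff):

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

constexpr std::size_t kMaxTagsHeaderSize = 512;  // hardcoded upper bound

void extract_tags(const std::string& header_value,
                  std::unordered_map<std::string, std::string>& span_tags) {
  if (header_value.size() > kMaxTagsHeaderSize) {
    // Drop the header entirely and record why, rather than truncating.
    span_tags["_dd.propagation_error"] = "extract_max_size";
    return;
  }
  // ... normal parsing of the comma-separated tag pairs would go here ...
}
```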

tracer.cpp

We're just passing baggage_opts_ to Baggage::extract. baggage_opts_ is preexisting. OK.

w3c_propagation.cpp

extract_tracestate (internal) gets a new parameter so that we can mutate span_tags if validation fails.

We have another separate, hardcoded 512-byte check, this time for datadog_value. Again, presumably internal-only and so a reasonable upper-bound-of-sanity to hardcode. Again, very simple logic to early-out if we exceed the max length.

test/

Tests look sane; test coverage on Baggage::extract looks good. It might be nice to ensure that there are enough test cases to cover both the value and trailing_spaces_value || properties branches for accumulation of the final value, but that's a nitpick.

Other tests look good.

(Sidenote: I've been doing the GENERATE(values<TestCase>) pattern but in a stupid way that uses DYNAMIC_SECTION; I knew there had to be a better way to do that. Thanks for the good example!)
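(For anyone unfamiliar with the pattern in that sidenote: Catch2's GENERATE(values<TestCase>) re-runs a test body once per element of a test-case table. Outside Catch2, the same table-driven idea looks roughly like this sketch; within_limit is a stand-in function, not code from the PR.)

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the function under test.
bool within_limit(const std::string& header, std::size_t max_bytes) {
  return header.size() <= max_bytes;
}

struct TestCase {
  std::string name;
  std::string input;
  std::size_t max_bytes;
  bool expected;
};

// Runs every case in the table; returns true only if all pass.
bool run_cases() {
  const std::vector<TestCase> cases = {
      {"empty header", "", 8, true},
      {"at the limit", "k=aaaaaa", 8, true},
      {"over the limit", "k=aaaaaaa", 8, false},
  };
  for (const auto& tc : cases) {
    if (within_limit(tc.input, tc.max_bytes) != tc.expected) return false;
  }
  return true;
}
```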


As long as all of my stated assumptions line up with reality, LGTM.

@zacharycmontoya
Contributor Author

Thanks, Alex! Yes, all of the assumptions and reasoning are accurate; thanks for the review. I'll re-run CI (the failures look like flaky GitHub Actions runners / Docker build times), and then it should be good to merge.

@pr-commenter

pr-commenter Bot commented May 4, 2026

Benchmarks

Benchmark execution time: 2026-05-04 21:12:44

Comparing candidate commit d4c0de6 in PR branch zach.montoya/baggage-improvements with baseline commit 5dfaf40 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 1 metric; 0 unstable metrics.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because often changes are not that big:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'

@datadog-prod-us1-5

datadog-prod-us1-5 Bot commented May 4, 2026

🎯 Code Coverage (details)
Patch Coverage: 96.15%
Overall Coverage: 90.82%

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: d4c0de6 | Docs | Datadog PR Page | Give us feedback!

Comment thread src/datadog/baggage.cpp Outdated
Comment thread src/datadog/baggage.cpp
Comment thread src/datadog/extraction_util.cpp Outdated
Comment thread src/datadog/extraction_util.cpp Outdated
Comment thread src/datadog/w3c_propagation.cpp
Comment thread test/test_trace_segment.cpp
Comment thread test/test_baggage.cpp Outdated
Comment thread test/test_trace_segment.cpp
Comment thread test/test_tracer.cpp Outdated
@zacharycmontoya merged commit 422dc91 into main on May 4, 2026
124 of 133 checks passed
@zacharycmontoya deleted the zach.montoya/baggage-improvements branch on May 4, 2026 22:11


3 participants