fix(traces): adjust stale span timestamps for SnapStart restores #1114
Draft
jchrostek-dd wants to merge 5 commits into main
- Remove redundant comments that just restate what code does
- Extract magic number into named constant SIXTY_SECONDS_NS
- Consolidate multi-line comments into clearer explanations

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add test infrastructure to validate that traces have reasonable durations after SnapStart restore.

The test:
- Creates a Java Lambda that makes HTTP requests during static init
- Waits 2 minutes after snapshot creation for timestamps to become stale
- Verifies trace duration is < 1 minute (not 24+ hours)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
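The duration check described in this commit could look roughly like the sketch below. It is written in Python for illustration only; the names `trace_duration_ns`, `has_reasonable_duration`, and `ONE_MINUTE_NS` are hypothetical, and computing trace duration as "latest span end minus earliest span start" is an assumption, since the commit message does not say how the test measures it.

```python
from typing import Iterable, Tuple

# One-minute limit taken from the commit message ("< 1 minute").
ONE_MINUTE_NS = 60 * 1_000_000_000


def trace_duration_ns(spans: Iterable[Tuple[int, int]]) -> int:
    """Return latest span end minus earliest span start.

    Each span is a (start_ns, duration_ns) pair. This definition of
    trace duration is an assumption made for the sketch.
    """
    spans = list(spans)
    if not spans:
        raise ValueError("trace has no spans")
    earliest_start = min(start for start, _ in spans)
    latest_end = max(start + duration for start, duration in spans)
    return latest_end - earliest_start


def has_reasonable_duration(spans: Iterable[Tuple[int, int]],
                            limit_ns: int = ONE_MINUTE_NS) -> bool:
    """True when the whole trace fits inside the limit."""
    return trace_duration_ns(spans) < limit_ns
```

With a span left over from snapshot time two minutes before the invocation, the check fails; once that span is moved close to the invocation, it passes.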
- Switch from java.net.http.HttpClient to OkHttp for better dd-trace-java instrumentation coverage
- Add test assertion to verify OkHttp spans appear in the invocation trace
- Add diagnostic function to search all spans from service
- Add detailed span logging for debugging

Test now validates:
- OkHttp spans are created by Java tracer
- OkHttp spans are correctly linked to the Lambda invocation trace
- Trace structure includes extension and tracer spans

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Previously, only spans with request_id metadata were adjusted for stale SnapStart timestamps. This missed tracer spans like OkHttp requests that don't have request_id.

Now the fix:
1. Finds request_id from any span in the trace chunk
2. Looks up the restore_time for that invocation
3. Adjusts ALL spans with timestamps before the threshold

Integration test verified: OkHttp span timestamp went from 195 seconds before invocation (stale) to 2.4 seconds before (at restore time).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
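The three steps in this commit message can be sketched as follows. This is an illustrative Python model, not the code in the PR: the `Span` class, `find_request_id`, `adjust_chunk`, and the `restore_times` mapping are hypothetical. Two further assumptions are made: the threshold is `restore_time - SIXTY_SECONDS_NS` (the constant name appears in an earlier commit, but its exact role is not stated), and a stale span's start is moved to the restore time, which matches the "at restore time" note in the integration test result.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Constant name comes from the PR; using it as the staleness window is assumed.
SIXTY_SECONDS_NS = 60 * 1_000_000_000


@dataclass
class Span:
    """Hypothetical minimal span model for this sketch."""
    name: str
    start: int       # nanoseconds since the Unix epoch
    duration: int    # nanoseconds
    meta: Dict[str, str] = field(default_factory=dict)


def find_request_id(chunk: List[Span]) -> Optional[str]:
    """Step 1: take request_id from any span in the chunk that carries it."""
    for span in chunk:
        if "request_id" in span.meta:
            return span.meta["request_id"]
    return None


def adjust_chunk(chunk: List[Span], restore_times: Dict[str, int]) -> int:
    """Move stale span starts to the restore time; return how many were moved."""
    request_id = find_request_id(chunk)
    if request_id is None:
        return 0
    # Step 2: look up the restore time recorded for that invocation.
    restore_time = restore_times.get(request_id)
    if restore_time is None:
        return 0
    # Step 3: adjust every span that starts before the (assumed) threshold,
    # whether or not it carries request_id itself.
    threshold = restore_time - SIXTY_SECONDS_NS
    adjusted = 0
    for span in chunk:
        if span.start < threshold:
            span.start = restore_time
            adjusted += 1
    return adjusted
```

Under these assumptions, a chunk holding one invocation span with `request_id` and one OkHttp span without it is handled as a unit: the OkHttp span that started 195 seconds before the invocation ends up at the restore time, 2.4 seconds before it.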
Summary
Fixes APMS-18793 - SnapStart Span Duration Bug
When a SnapStart-enabled Lambda function is restored from a snapshot, tracer spans (like Java Netty HTTP client spans) may have timestamps from when the snapshot was created, not when the restore happened. This caused traces to appear to span 24+ hours.
Changes:
- snapstart_restore_time on invocation context when PlatformRestoreStart is received

Tags added to adjusted spans:
- _dd.snapstart_adjusted=true - Indicates the span was adjusted
- _dd.snapstart_original_start=<timestamp> - Preserves the original start time for debugging

Test plan
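When inspecting an adjusted span by hand, the two tags listed above are enough to recover how far its start time was moved. The helper below is a hypothetical Python sketch. It assumes the `_dd.snapstart_original_start` value is an integer timestamp stored as a string, in the same unit (nanoseconds here) as the span's current start; the PR description does not state the format.

```python
from typing import Mapping, Optional

# Tag names as given in the PR description.
ADJUSTED_TAG = "_dd.snapstart_adjusted"
ORIGINAL_START_TAG = "_dd.snapstart_original_start"


def snapstart_shift_ns(meta: Mapping[str, str], start_ns: int) -> Optional[int]:
    """Return how far an adjusted span's start was moved forward.

    Returns None when the span is not marked as adjusted or the original
    start tag is missing. The tag value format is an assumption.
    """
    if meta.get(ADJUSTED_TAG) != "true":
        return None
    original = meta.get(ORIGINAL_START_TAG)
    if original is None:
        return None
    return start_ns - int(original)
```

A large positive result indicates the span was created at snapshot time and then pulled forward; `None` means the span was left untouched.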
🤖 Generated with Claude Code