Description
Summary
While investigating detached spans in our Next.js + Slack app, Codex diagnosed a trace propagation break in the chat SDK async webhook processing pattern. The result is Redis DB spans showing up as root transactions (is_transaction:true) instead of children of the HTTP webhook/request span.
This appears to be a context propagation issue around waitUntil usage with eagerly-started async tasks.
What we observed
- Sentry trace sample: de7433a2d4c4f70153604eb328d6fd5e
- In that trace, DB spans are root spans with is_transaction:true.
- Querying the trace:
  - DB spans with missing parent span id: 6
  - DB spans with non-empty parent span id: 0
- In the last 24h, top db "transactions" include Redis commands like:
  - GET chat-sdk:cache:...
  - SISMEMBER chat-sdk:subscriptions ...
  - SET chat-sdk:lock:...
That indicates the DB spans are detached from the intended request/workflow parent span.
Why this seems to happen
1) Request span exists in webhook handler
Our route wraps webhook handling in a request span and passes a background function:
    withSpan("http.server.request", "http.server", ..., async () => {
      const response = await handler(request, {
        waitUntil: (task) => after(() => task),
      });
      return response;
    });

2) chat SDK eagerly starts async work before scheduling
In chat's processMessage (and similar methods), the task is started immediately:
    const task = (async () => {
      const message = ...
      await this.handleIncomingMessage(...)
    })().catch(...)
    options?.waitUntil?.(task)

By the time waitUntil/after runs, the work is already in flight and may no longer be bound to the originating active span context.
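
To illustrate the general mechanism (not the chat SDK's or Sentry's actual internals), here is a minimal sketch using Node's AsyncLocalStorage as a stand-in for active-span storage. The names activeSpan, observed, and startWork are hypothetical. It only shows that the context a task observes is fixed when the task starts, not when its promise is later handed to a scheduler:

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical stand-in for a tracing SDK's active-span storage.
const activeSpan = new AsyncLocalStorage<string>();

// Records which "span" each piece of work saw when it began executing.
const observed: Record<string, string | undefined> = {};

function startWork(label: string): Promise<void> {
  return (async () => {
    // An async function body runs synchronously up to its first await,
    // so this reads the context active at the moment the work is started.
    observed[label] = activeSpan.getStore();
  })();
}

// Eager: the work starts here, outside any span context.
const eagerTask = startWork("eager");
// Wrapping the already-started promise in a context afterwards changes nothing.
activeSpan.run("http.server.request", () => eagerTask);

// Lazy: the thunk is invoked inside the context, so the work starts there.
const lazyThunk = () => startWork("lazy");
activeSpan.run("http.server.request", () => lazyThunk());

console.log(observed);
```

In this sketch the eager task records no context while the lazy thunk records the request context. Whether the same thing happens in the real handler depends on where the SDK's async IIFE is created relative to the span, which is what this issue is asking about.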
3) Redis spans are emitted inside that async pipeline
handleIncomingMessage performs state adapter operations (get/set/lock/isSubscribed/...), so those DB spans can become root spans if parent context is absent.
Expected behavior
DB spans from webhook message processing should remain parented under the request span (or at least under the same trace root), not become detached root transactions.
Suggested direction
A lazy background scheduling API in chat could avoid eager task startup and preserve context:
- Add something like runInBackground?: (run: () => Promise<unknown>) => void
- If provided, the SDK should pass a thunk and let the caller execute it in the desired context.
- Keep the existing waitUntil(task) for backward compatibility.
This allows frameworks/runtimes to run deferred work inside a captured active span context (e.g. withActiveSpan).
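
A rough sketch of how the SDK side could support both options, assuming the option shape proposed above. WebhookOptions and scheduleBackground are hypothetical names, and runInBackground does not exist in chat today:

```typescript
// Hypothetical option shape: waitUntil mirrors the existing option,
// runInBackground is the proposed addition from this issue.
interface WebhookOptions {
  waitUntil?: (task: Promise<unknown>) => void;
  runInBackground?: (run: () => Promise<unknown>) => void;
}

// Hypothetical helper the SDK could call instead of starting work inline.
function scheduleBackground(
  options: WebhookOptions | undefined,
  run: () => Promise<unknown>,
): void {
  if (options?.runInBackground) {
    // Lazy path: hand over the thunk without calling it, so the caller
    // decides when, and in which span context, the work starts.
    options.runInBackground(run);
    return;
  }
  // Legacy path: start eagerly and pass the in-flight promise along,
  // matching the current waitUntil(task) behavior.
  const task = run();
  options?.waitUntil?.(task);
}
```

On the caller side, a route could then capture the current span and pass something like runInBackground: (run) => after(() => withActiveSpan(span, run)), assuming that combination behaves as expected in the target runtime.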
Environment
- @sentry/nextjs: 10.40.0
- next: 16.1.6
- chat: 4.14.0
- @chat-adapter/slack: 4.14.0
- @chat-adapter/state-redis: 4.14.0
Additional note
This diagnosis and root-cause mapping were performed by Codex during a deep source + trace analysis.