Detached Redis DB spans become root transactions with chat-sdk async webhook processing #19529

@dcramer

Summary

While investigating detached spans in our Next.js + Slack app, Codex diagnosed a trace propagation break in the chat SDK's async webhook processing pattern. As a result, Redis DB spans show up as root transactions (is_transaction:true) instead of as children of the HTTP webhook/request span.

This appears to be a context propagation issue around waitUntil usage with eagerly-started async tasks.

What we observed

  • Sentry trace sample: de7433a2d4c4f70153604eb328d6fd5e
  • In that trace, DB spans are root spans with is_transaction:true.
  • Querying the trace:
    • DB spans with missing parent span id: 6
    • DB spans with non-empty parent span id: 0
  • In the last 24h, top db “transactions” include Redis commands like:
    • GET chat-sdk:cache:...
    • SISMEMBER chat-sdk:subscriptions ...
    • SET chat-sdk:lock:...

That indicates the DB spans are detached from the intended request/workflow parent span.
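
For anyone reproducing the check, the parent/no-parent count above can be sketched roughly as follows. The `SpanRecord` shape and its field names are assumptions for illustration only, not Sentry's export schema, and the sample records are made up.

```typescript
// Hypothetical span shape; field names are illustrative, not Sentry's schema.
interface SpanRecord {
  spanId: string;
  parentSpanId?: string;
  op: string;
}

// Count spans whose op starts with `opPrefix`, split by whether they have a parent.
function countByParentage(
  spans: SpanRecord[],
  opPrefix = "db",
): { detached: number; parented: number } {
  let detached = 0;
  let parented = 0;
  for (const span of spans) {
    if (!span.op.startsWith(opPrefix)) continue;
    if (span.parentSpanId) {
      parented += 1;
    } else {
      detached += 1;
    }
  }
  return { detached, parented };
}

// Made-up sample: two detached DB spans and one HTTP span that is ignored.
const sampleSpans: SpanRecord[] = [
  { spanId: "a", op: "db" },
  { spanId: "b", op: "db.redis", parentSpanId: "" },
  { spanId: "c", op: "http.server" },
];
const sampleCounts = countByParentage(sampleSpans);
```

An empty-string parent id is treated the same as a missing one, which matches how "missing parent span id" is counted above.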

Why this seems to happen

1) Request span exists in webhook handler

Our route wraps webhook handling in a request span and passes a background function:

withSpan("http.server.request", "http.server", ..., async () => {
  const response = await handler(request, {
    // `task` is an already-running promise; after() only defers awaiting it.
    waitUntil: (task) => after(() => task),
  });
  return response;
});

2) chat SDK eagerly starts async work before scheduling

In chat's processMessage (and similar methods), the task starts executing as soon as it is created, before it is handed to waitUntil:

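// The async IIFE begins running here, not when waitUntil is called.
const task = (async () => {
  const message = ...
  await this.handleIncomingMessage(...)
})().catch(...);

// waitUntil only receives the in-flight promise.
options?.waitUntil?.(task);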

By the time waitUntil/after runs, work is already in flight and may no longer be bound to the originating active span context.
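
The ordering difference can be seen in a minimal sketch that involves no SDK at all. `processEager` and `processLazy` are hypothetical stand-ins for the two scheduling styles, not functions from chat; the `events` array just records what runs when.

```typescript
// Records the order in which each step runs.
const events: string[] = [];

type WaitUntil = (task: Promise<unknown>) => void;
type RunInBackground = (run: () => Promise<unknown>) => void;

// Eager style: the async IIFE runs up to its first await as soon as it is created.
function processEager(waitUntil: WaitUntil): void {
  const task = (async () => {
    events.push("eager: work started");
    await Promise.resolve();
  })();
  events.push("eager: calling waitUntil");
  waitUntil(task);
}

// Lazy style: nothing runs until the caller decides to invoke the thunk.
function processLazy(runInBackground: RunInBackground): void {
  const run = async (): Promise<void> => {
    events.push("lazy: work started");
    await Promise.resolve();
  };
  events.push("lazy: calling runInBackground");
  runInBackground(run);
}

processEager(() => {
  events.push("eager: waitUntil received promise");
});
processLazy((run) => {
  events.push("lazy: caller invokes thunk");
  void run();
});
```

In the eager case, "work started" is recorded before waitUntil is even called, so the scheduler has no chance to choose the context the work runs in. In the lazy case the caller's hook runs first and controls when, and under which context, the work begins.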

3) Redis spans are emitted inside that async pipeline

handleIncomingMessage performs state adapter operations (get/set/lock/isSubscribed/...), so the DB spans those operations emit become root spans whenever the parent context is absent.

Expected behavior

DB spans from webhook message processing should remain parented under the request span (or at least under the same trace root), not become detached root transactions.

Suggested direction

A lazy background scheduling API in chat could avoid eager task startup and preserve context:

  • Add an option such as runInBackground?: (run: () => Promise<unknown>) => void
  • If it is provided, the SDK should pass a thunk and let the caller execute it in the desired context.
  • Keep the existing waitUntil(task) for backward compatibility.

This allows frameworks/runtimes to run deferred work inside a captured active span context (e.g. withActiveSpan).
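
A rough sketch of how this could fit together, under stated assumptions: `scheduleBackground`, `handleWebhook`, `drainDeferred`, and the `deferred` queue are hypothetical names, and a plain Node `AsyncLocalStorage<string>` stands in for the tracing library's active-span storage. In a real app the re-binding step would presumably use something like withActiveSpan instead.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Stand-in for the tracing library's active-span storage (assumption).
const activeSpan = new AsyncLocalStorage<string>();

interface WebhookOptions {
  waitUntil?: (task: Promise<unknown>) => void;
  runInBackground?: (run: () => Promise<unknown>) => void;
}

// Hypothetical SDK side: prefer the lazy hook, otherwise keep the eager behavior.
function scheduleBackground(work: () => Promise<unknown>, options?: WebhookOptions): void {
  if (options?.runInBackground) {
    options.runInBackground(work);
    return;
  }
  const task = work(); // starts immediately, as today
  options?.waitUntil?.(task);
}

// Jobs that the runtime executes later, outside the request's async context.
const deferred: Array<() => Promise<unknown>> = [];

// Hypothetical caller side: optionally re-bind the captured span before deferring.
function handleWebhook(seen: string[], rebind: boolean): void {
  activeSpan.run("request-span", () => {
    const captured = activeSpan.getStore();
    scheduleBackground(
      async () => {
        // A Redis call here would be parented under whatever span is active.
        seen.push(activeSpan.getStore() ?? "none");
      },
      {
        runInBackground: (run) => {
          if (rebind && captured !== undefined) {
            const span: string = captured;
            deferred.push(() => activeSpan.run(span, run));
          } else {
            deferred.push(run);
          }
        },
      },
    );
  });
}

// Simulates the runtime draining deferred work after the response is sent.
async function drainDeferred(): Promise<void> {
  for (const job of deferred.splice(0)) {
    await job();
  }
}
```

Calling `handleWebhook(seen, true)` followed by `drainDeferred()` from outside any active context records the captured span, while `rebind = false` records none. That is the distinction the lazy API is meant to give callers control over.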

Environment

  • @sentry/nextjs: 10.40.0
  • next: 16.1.6
  • chat: 4.14.0
  • @chat-adapter/slack: 4.14.0
  • @chat-adapter/state-redis: 4.14.0

Additional note

This diagnosis and root-cause mapping were performed by Codex during a deep source + trace analysis.
