feat: OpenTelemetry distributed tracing for multi-agent request flows (RELIABILITY-002) #305

@vybe

Summary

When Agent A calls Agent B which calls Agent C, there is no way to trace the request across hops. Debugging multi-agent failures requires manually correlating timestamps across separate log streams. The OTel Collector is already running in docker-compose but the backend isn't instrumented.

Solution

Add OpenTelemetry auto-instrumentation for FastAPI, httpx, and Redis using the official opentelemetry-instrumentation-* packages. This gives automatic trace propagation: the FastAPI instrumentation reads the W3C traceparent header from each incoming request, and the httpx instrumentation injects it into each outgoing one, so every inter-agent call joins the same connected trace.
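
To make the propagation mechanism concrete, here is a minimal standard-library sketch of what a W3C traceparent header carries and how the trace ID survives a hop. The instrumentation packages do this automatically; the names `TraceParent`, `parse_traceparent`, and `child_traceparent` are hypothetical and exist only for illustration.

```python
import re
import secrets
from typing import NamedTuple, Optional

# W3C Trace Context, version 00: version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT_RE = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class TraceParent(NamedTuple):
    trace_id: str  # 32 hex chars, shared by every hop of one request
    span_id: str   # 16 hex chars, identifies the calling span
    sampled: bool  # low bit of the trace-flags byte


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is malformed."""
    match = _TRACEPARENT_RE.fullmatch(header.strip())
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero IDs are invalid per the specification.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    return TraceParent(trace_id, span_id, bool(int(flags, 16) & 0x01))


def child_traceparent(parent: TraceParent) -> str:
    """Header for the next hop: same trace ID, fresh span ID."""
    new_span_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex chars
    flags = "01" if parent.sampled else "00"
    return f"00-{parent.trace_id}-{new_span_id}-{flags}"
```

Each hop replaces only the span ID, which is why Agent A, the backend, and Agent B can all log the same trace ID while still producing distinct spans.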

Scope

  • Add opentelemetry-instrumentation-fastapi, opentelemetry-instrumentation-httpx, opentelemetry-instrumentation-redis to backend dependencies
  • Initialize OTel SDK in main.py startup (exporter pointed at existing collector on :4317)
  • Auto-instrument FastAPI app, httpx client, and Redis connections
  • Propagate trace context through agent-to-agent calls (traceparent header via httpx instrumentation)
  • Add trace ID to structured log output (logging_config.py) for log-trace correlation
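
The initialization described in this list could look roughly like the sketch below. It is not the final implementation: `TracingSettings`, `init_tracing`, and the service name `backend` are assumptions, and the code additionally assumes that `opentelemetry-sdk` and `opentelemetry-exporter-otlp-proto-grpc` are installed alongside the three instrumentation packages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TracingSettings:
    """Hypothetical settings holder; names and defaults are assumptions."""

    service_name: str = "backend"                      # assumed service name
    otlp_endpoint: str = "http://otel-collector:4317"  # collector named in this issue
    sample_ratio: float = 0.10                         # 10% of new (root) traces


def init_tracing(app, settings: TracingSettings = TracingSettings()) -> None:
    """Configure the OTel SDK and auto-instrument FastAPI, httpx, and Redis.

    Imports are local so this module can still be imported when the
    OpenTelemetry packages are absent (for example in unit tests).
    """
    from opentelemetry import trace
    from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
    from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
    from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor
    from opentelemetry.instrumentation.redis import RedisInstrumentor
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor
    from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased

    resource = Resource.create({"service.name": settings.service_name})
    # ParentBased keeps the caller's sampling decision when a traceparent
    # header arrives; only traces that start here are ratio-sampled.
    sampler = ParentBased(root=TraceIdRatioBased(settings.sample_ratio))
    provider = TracerProvider(resource=resource, sampler=sampler)

    # Plaintext gRPC to the in-cluster collector; batching keeps export
    # off the request path.
    exporter = OTLPSpanExporter(endpoint=settings.otlp_endpoint, insecure=True)
    provider.add_span_processor(BatchSpanProcessor(exporter))
    trace.set_tracer_provider(provider)

    FastAPIInstrumentor.instrument_app(app)  # server spans + context extraction
    HTTPXClientInstrumentor().instrument()   # client spans + traceparent injection
    RedisInstrumentor().instrument()         # spans for Redis commands
```

In `main.py` this would be called once after the FastAPI app is created, for example `init_tracing(app)`. Note that the sampler in this sketch applies one ratio to all root traces; sampling only high-volume endpoints at 10% would need a custom sampler, which is not shown here.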

Acceptance Criteria

  • Every API request gets a trace ID automatically
  • Agent A → backend → Agent B calls share the same trace ID
  • Trace ID appears in structured JSON logs
  • Traces export to OTel Collector (already running on otel-collector:4317)
  • No performance regression (sampling at 10% for high-volume endpoints)
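  • ~30 lines in main.py, 3 new instrumentation packages (plus opentelemetry-sdk and an OTLP exporter package if they are not already installed), zero new infrastructure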

Key Files

  • src/backend/main.py — OTel initialization
  • src/backend/logging_config.py — trace ID in log format
  • src/backend/services/agent_client.py — httpx auto-instrumented (no code change needed)
  • docker-compose.yml — collector already configured
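
For the logging_config.py change, one possible shape is a logging filter that stamps each record with the active trace ID, paired with a JSON formatter that emits it. This is a sketch under assumptions: `otel_trace_id`, `TraceIdFilter`, and `JsonFormatter` are hypothetical names, and the real structured formatter in logging_config.py may look different.

```python
import json
import logging
from typing import Callable, Optional


def otel_trace_id() -> Optional[str]:
    """Current trace ID as 32 hex chars, or None outside a valid span."""
    from opentelemetry import trace  # local import: optional dependency

    ctx = trace.get_current_span().get_span_context()
    return trace.format_trace_id(ctx.trace_id) if ctx.is_valid else None


class TraceIdFilter(logging.Filter):
    """Attach a trace_id attribute to every record passing through."""

    def __init__(self, get_trace_id: Callable[[], Optional[str]] = otel_trace_id) -> None:
        super().__init__()
        self._get_trace_id = get_trace_id

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = self._get_trace_id()
        return True  # never drop the record, only annotate it


class JsonFormatter(logging.Formatter):
    """Stand-in for the existing structured formatter (an assumption)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "trace_id": getattr(record, "trace_id", None),
        }
        return json.dumps(payload)
```

Attaching the filter to the handler that carries the JSON formatter (via `handler.addFilter(TraceIdFilter())`) makes every log line emitted during a request searchable by the same trace ID that appears in the collector. The getter is injectable so the filter can be unit-tested without the OpenTelemetry packages.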

Dependencies

None — collector infrastructure already exists.
