---
title: Telemetry Processor
sidebar_order: 5
---
For feedback or suggestions, open an issue or PR and tag @philipphofmann, @isaacs, or @giortzisg for review. We welcome input as we validate this approach. This document uses key words such as "MUST", "SHOULD", and "MAY" as defined in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt) to indicate requirement levels.
The telemetry processor ensures data is delivered efficiently to Sentry. It receives data from the client and forwards it to the transport. Its key responsibilities include buffering, rate limiting, client reports, priority-based sending, and, on some platforms, offline caching.
```mermaid
flowchart LR
    Client -- send Data --> TelemetryProcessor
    TelemetryProcessor -- sendEnvelope --> Transport
```
SDKs SHOULD only add the telemetry processor for high-volume data (spans, logs, metrics). SDKs without these features MAY omit it. Once added, SDK clients SHOULD forward all data to the processor, not the transport. During migration, SDKs MAY temporarily send only some telemetry data through the processor.
The telemetry processor consists of two major components:
- Telemetry Buffer: Focuses on buffering high-volume data, such as spans and logs, into batches to minimize HTTP requests.
- Telemetry Scheduler: Takes buffered data from the TelemetryBuffer and manages prioritized sending, including potential offline caching and sending of client reports.
Here's a more detailed overview of the most important components of the telemetry processor:
```mermaid
flowchart
    Client -- add telemetry data --> TelemetryBuffer
    subgraph TelemetryProcessorSubgraph["TelemetryProcessor"]
        subgraph TelemetryBufferSubgraph["TelemetryBuffer"]
            TelemetryBuffer["**TelemetryBuffer**<br/>Knows all buffers and forwards telemetry data"]
            SpanBuffer["**SpanBuffer**<br/>Buckets per trace"]
            TelemetryBuffer -- add Span --> SpanBuffer
            TelemetryBuffer -- add Log or Metric --> GenericBuffer
            SpanBuffer --> SendData
            GenericBuffer --> SendData
            SendData@{ shape: fork, label: "Fork or Join" }
        end
        TelemetryBuffer -- "add low-volume data (Error, Check-In, Session, etc.)" --> TelemetryScheduler
        SendData -- "when full or time out, flush buffered data" --> TelemetryScheduler
        subgraph TelemetrySchedulerSubGraph["TelemetryScheduler"]
            TelemetryQueue["**TelemetryQueue**<br/>LIFO or RoundRobin<br/>InMemory or Disk"]
            TelemetryQueue -- nextEnvelope --> TelemetryScheduler
            TelemetryScheduler -- add or remove envelope --> TelemetryQueue
        end
    end
    Transport
    TelemetryScheduler -- sendEnvelope --> Transport
```
Here's a simplified example of a log traveling through the telemetry processor:
- The client adds the log to the telemetry processor.
- The telemetry processor adds the log to the telemetry buffer.
- When the telemetry buffer is full or its timeout fires, it forwards a batch of logs to the telemetry scheduler.
- The telemetry scheduler builds the envelope containing the batch of logs and adds it to the telemetry queue.
- Depending on the platform, the telemetry queue can use either in-memory or disk storage. Each platform can also choose a priority concept. It can be, for example, LIFO or RoundRobin.
- The telemetry scheduler asks the telemetry queue for the next envelope to send and passes it to the transport.
- The transport sends the envelope to Sentry.
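The steps above can be sketched in a few lines. This is an illustrative Python sketch with hypothetical class and method names (`TelemetryBuffer`, `TelemetryScheduler`, a size limit of 2, timeout omitted), not any SDK's actual API:

```python
from typing import Callable, List


class TelemetryBuffer:
    """Batches high-volume items and flushes them when the size limit is hit."""

    def __init__(self, size_limit: int, on_flush: Callable[[List[dict]], None]):
        self.size_limit = size_limit
        self.on_flush = on_flush
        self.items: List[dict] = []

    def add(self, item: dict) -> None:
        self.items.append(item)
        if len(self.items) >= self.size_limit:
            self.flush()

    def flush(self) -> None:
        if self.items:
            self.on_flush(self.items)
            self.items = []


class TelemetryScheduler:
    """Builds envelopes from batches and queues them for the transport."""

    def __init__(self):
        self.queue: List[dict] = []  # in-memory LIFO queue for this sketch

    def enqueue_batch(self, items: List[dict]) -> None:
        self.queue.append({"items": items})  # envelope containing the batch

    def next_envelope(self) -> dict:
        return self.queue.pop()  # LIFO: most recent envelope first


# A log travels: client -> buffer -> (flush) -> scheduler -> transport.
scheduler = TelemetryScheduler()
buffer = TelemetryBuffer(size_limit=2, on_flush=scheduler.enqueue_batch)
buffer.add({"type": "log", "message": "first"})
buffer.add({"type": "log", "message": "second"})  # hits the size limit, flushes
envelope = scheduler.next_envelope()
```

A real transport would then serialize and send `envelope`; here the scheduler simply hands it over.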
This exact flow is not enforced for all SDKs. SDKs MAY adapt the architecture as needed, such as having the telemetry scheduler pull from the telemetry buffer instead of receiving pushed data.
We aim to standardize requirements so SDKs share consistent logic across platforms. This benefits multi-platform SDKs like Java. However, requirements MAY differ where necessary. This page outlines common requirements; platform-specific requirements are listed below.
For rules on which items can be combined in a single envelope, see Envelope Item Constraints. These constraints impact the design — for example, the span buffer organizes spans per traceID because each envelope can only contain spans from a single trace, while logs and metrics can be batched across traces.
The telemetry buffer batches high-volume data and forwards it to the telemetry scheduler. This section covers the common requirements for all platforms:
- Before adding an item to a specific buffer, the telemetry buffer SHOULD drop rate-limited items to avoid overhead. If doing so, it MUST record client reports.
- When the telemetry buffer overflows and drops data, it MUST record client reports.
- The telemetry buffer MUST forward low-volume data, such as normal events, session replays, or user feedback, directly to the telemetry scheduler.
- The telemetry buffer MUST start a timeout of 5 seconds when the first item is added. When the timeout expires, the telemetry buffer MUST forward all items to the telemetry scheduler.
- The telemetry buffer MUST define a size limit of `x` items. See Size Limit Recommendations below for more details.
- When the size limit is reached, the telemetry buffer MUST forward all items to the telemetry scheduler. The buffer MAY forward items in batches.
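One way to implement the timeout and size-limit requirements is sketched below. The names are hypothetical, and `threading.Timer` stands in for whatever scheduling primitive the platform provides; a real SDK would use its own concurrency tools:

```python
import threading
from typing import Callable, List, Optional


class TimedBuffer:
    """Flushes when the size limit is hit, or when the timeout fires after the first item."""

    def __init__(self, size_limit: int, timeout: float,
                 on_flush: Callable[[List[dict]], None]):
        self.size_limit = size_limit
        self.timeout = timeout  # the spec requires 5 seconds
        self.on_flush = on_flush
        self.items: List[dict] = []
        self.timer: Optional[threading.Timer] = None
        self.lock = threading.Lock()

    def add(self, item: dict) -> None:
        with self.lock:
            if not self.items:
                # The first item starts the timeout.
                self.timer = threading.Timer(self.timeout, self.flush)
                self.timer.start()
            self.items.append(item)
            full = len(self.items) >= self.size_limit
        if full:
            self.flush()

    def flush(self) -> None:
        with self.lock:
            if self.timer is not None:
                self.timer.cancel()
                self.timer = None
            items, self.items = self.items, []
        if items:
            self.on_flush(items)  # forward to the telemetry scheduler
```

Note that the timer is canceled on a size-limit flush so a later, empty timeout does not fire needlessly; the next added item starts a fresh timeout.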
SDKs SHOULD use the following size limits for the telemetry buffer. SDKs MAY use lower values, but they MUST NOT exceed the following size limits:
- 100 items for logs
- 100 items for metrics
- 1000 items for spans
While the envelope size limits would allow higher limits for specific categories, these values are optimized for Relay, and exceeding them is strongly discouraged.
The telemetry buffer MUST forward all data in memory to the telemetry scheduler to avoid data loss in the following scenarios:
- When the user calls `SentrySDK.flush()`, the telemetry buffer MUST forward all data in memory to the telemetry scheduler, and only then SHOULD the telemetry scheduler flush the data.
- When the user calls `SentrySDK.close()`, the telemetry buffer MUST forward all data in memory to the telemetry scheduler. SDKs SHOULD keep their existing closing behavior.
- When the application shuts down gracefully, the telemetry buffer MUST forward all data in memory to the telemetry scheduler.
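The flush ordering above can be sketched as follows. The class and field names are hypothetical; the point is only that `flush()` drains the buffer into the scheduler before the scheduler sends anything:

```python
from typing import List


class FlushOrdering:
    """On flush(), drain the buffer into the scheduler first, then send."""

    def __init__(self):
        self.buffered: List[dict] = []  # telemetry buffer contents
        self.queued: List[dict] = []    # telemetry scheduler queue
        self.sent: List[dict] = []      # what reached the transport

    def add(self, item: dict) -> None:
        self.buffered.append(item)

    def flush(self) -> None:
        # Step 1: the buffer MUST forward everything in memory first ...
        if self.buffered:
            self.queued.append({"items": self.buffered})
            self.buffered = []
        # Step 2: ... and only then does the scheduler flush to the transport.
        while self.queued:
            self.sent.append(self.queued.pop(0))
```

Reversing the two steps would lose whatever was still sitting in the buffer when the user called flush.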
Spans require a dedicated buffer because we can only send spans from a single trace per envelope. Still, the span buffer MUST respect the common telemetry buffer requirements.
For the telemetry span buffer, we recommend using a map of traceID to span buckets per trace. SDKs MAY also use alternatives, such as a fixed-size ring buffer with tombstones. Regardless of the data structure, your SDK MUST meet these requirements:
- When forwarding spans, the span buffer MUST forward only spans from a single trace, since each envelope supports only one traceID. This means that the span buffer might forward spans multiple times.
- When the span buffer overflows and drops data, it SHOULD drop all spans from a single traceID rather than individual spans. Dropping individual spans creates incomplete traces that are harder to debug, whereas dropping a complete trace preserves the integrity of the remaining traces.
The span buffer is a map where each key is a traceID and each value is a list of spans:

```
spanBuffer = {
    "trace-a": [span1, span2, span3],
    "trace-b": [span4],
    "trace-c": [span5, span6]
}
```
Here are the requirements for the buckets per traceID:
- When the span buffer adds a span, it MUST add it to the bucket with the traceID of the span.
- When the span buffer doesn't have a bucket for the traceID, it MUST create a new bucket for this traceID.
- After forwarding the spans, the span buffer MUST remove all spans from the bucket and delete the bucket.
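Under the bucket-map representation above, these requirements can be sketched as a small Python class. The names are hypothetical, and the drop-largest-trace overflow policy is one possible choice, not something the spec mandates:

```python
from collections import defaultdict
from typing import Dict, List


class SpanBuffer:
    """Buckets spans per traceID; each flush forwards a single trace's spans."""

    def __init__(self):
        # traceID -> list of spans; defaultdict creates a bucket on first use.
        self.buckets: Dict[str, List[dict]] = defaultdict(list)

    def add(self, span: dict) -> None:
        # Add the span to the bucket matching its traceID.
        self.buckets[span["trace_id"]].append(span)

    def flush_trace(self, trace_id: str) -> List[dict]:
        # Forward all spans of one trace; pop() also deletes the bucket.
        return self.buckets.pop(trace_id, [])

    def drop_largest_trace(self) -> List[dict]:
        """On overflow, drop one whole trace rather than individual spans."""
        if not self.buckets:
            return []
        victim = max(self.buckets, key=lambda t: len(self.buckets[t]))
        return self.buckets.pop(victim)
```

Because each envelope supports only one traceID, a flush of the whole buffer would call `flush_trace` once per bucket, forwarding spans to the scheduler multiple times.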
The telemetry scheduler receives buffered data from the telemetry buffer, prioritizes critical low-volume data (like errors and crashes) over high-volume data (like spans), builds envelopes, and forwards them to the transport. Before sending, it enforces rate limits and records client reports for dropped envelopes and envelope items. Finally, it also forwards client reports to the transport.
The now-deprecated unified API suggested handling rate limits and offline caching in the transport, forcing multi-platform SDKs (like Java) to maintain separate transports. To simplify this, transports SHOULD now only send envelopes. The telemetry scheduler SHOULD handle rate limiting, offline caching, and prioritization, ensuring consistent logic across platforms. This also keeps custom user-defined transports simple. SDKs need only adjust the scheduler for platform-specific needs.
We recommend LIFO or round-robin as priority strategies, but SDKs MAY use other methods. Envelopes can be stored in memory, on disk, or with other solutions as needed.
Here are the common requirements for all platforms:
- The telemetry scheduler MUST drop rate-limited envelope items before forwarding them to the transport. When doing so, it MUST record respective client reports.
- The telemetry scheduler SHOULD attach the client reports to the envelopes it forwards to the transport.
- When the telemetry queue overflows and drops envelopes, it MUST record client reports.
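A minimal sketch of dropping rate-limited envelope items and recording client reports for them. The `limited_categories` set is a hypothetical stand-in for the active rate limits, and the `ratelimit_backoff` reason string is illustrative:

```python
from typing import Dict, List, Tuple


class SchedulerRateLimiting:
    """Drops rate-limited envelope items before sending; records client reports."""

    def __init__(self, limited_categories: set):
        self.limited = limited_categories  # e.g. {"log"} while logs are rate-limited
        # (reason, category) -> number of discarded items
        self.client_reports: Dict[Tuple[str, str], int] = {}

    def prepare_envelope(self, items: List[dict]) -> List[dict]:
        kept = []
        for item in items:
            if item["category"] in self.limited:
                # Record a discarded-item report instead of sending the item.
                key = ("ratelimit_backoff", item["category"])
                self.client_reports[key] = self.client_reports.get(key, 0) + 1
            else:
                kept.append(item)
        return kept
```

The kept items would then be serialized into the envelope handed to the transport, with the accumulated client reports attached.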
To be defined: how exactly the telemetry scheduler interacts with the transport to correctly remove items from the telemetry queue based on the received HTTP response information, and how it adheres to the dealing-with-network-failures requirements.
The telemetry scheduler SHOULD add client reports to outgoing envelopes or periodically send standalone client report envelopes to the transport, following the SDK side recommendations.
Client report tracking SHOULD NOT be implemented in the telemetry scheduler. Instead, the SDK SHOULD have ClientReports accessible throughout the SDK, since data can be dropped in multiple places, as shown in the diagram below. The telemetry scheduler SHOULD request the current client reports from ClientReports.
Here is a diagram showing how ClientReports interacts with the telemetry scheduler.
```mermaid
flowchart LR
    ClientReports["ClientReports"]
    Client -- "records: sample_rate, before_send, ..." --> ClientReports
    subgraph TelemetryProcessor
        TelemetryBuffer
        TelemetryScheduler
    end
    TelemetryBuffer -- "records: buffer_overflow, ..." --> ClientReports
    TelemetryScheduler -- "records: queue_overflow, ..." --> ClientReports
    Transport -- "records: send_error, network_error" --> ClientReports
    ClientReports -- "aggregated reports" --> TelemetryScheduler
```
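One way to make `ClientReports` accessible throughout the SDK is a small thread-safe aggregator that any component can record into and that the scheduler drains when building envelopes. This is a hypothetical sketch; the reason and category strings are illustrative:

```python
import threading
from collections import Counter
from typing import List


class ClientReports:
    """SDK-wide, thread-safe recorder for discarded items by (reason, category)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts: Counter = Counter()

    def record(self, reason: str, category: str, quantity: int = 1) -> None:
        # Callable from the client, buffer, scheduler, or transport.
        with self._lock:
            self._counts[(reason, category)] += quantity

    def flush(self) -> List[dict]:
        """Return and reset the aggregated reports (called by the scheduler)."""
        with self._lock:
            counts, self._counts = self._counts, Counter()
        return [{"reason": r, "category": c, "quantity": q}
                for (r, c), q in counts.items()]
```

Because `flush()` swaps the counter under the lock, reports recorded concurrently during a flush land in the next aggregation rather than being lost.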
As mentioned above, the telemetry buffer SHOULD drop rate-limited data to reduce overhead, and the telemetry scheduler MUST drop rate-limited envelopes before forwarding them to the transport. To do this, the telemetry processor MUST have access to active rate limits. If your SDK keeps track of active rate limits in the transport, we recommend extracting them from the transport and making them available to the telemetry processor. Ideally, the transport is unaware of rate limits and only passes update information to the rate limits class, which manages updates.
Note that whatever bookkeeping you use, rate limits MUST still be tracked per DSN.
The following diagram shows how the telemetry processor interacts with the transport and rate limits. SDKs can choose to implement RateLimits inside or outside the telemetry processor, based on their needs.
```mermaid
flowchart LR
    RateLimits["RateLimits"]
    subgraph TelemetryProcessor
        TelemetryBuffer
        TelemetryScheduler
    end
    TelemetryBuffer -- "checks active rate limits" --> RateLimits
    TelemetryScheduler -- "checks active rate limits" --> RateLimits
    Transport -- "reports HTTP responses" --> RateLimits
```
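A sketch of a `RateLimits` class fed by the transport and queried by the processor, tracked per DSN as required above. The names are hypothetical and header parsing is omitted; the transport would derive `retry_after_seconds` from the HTTP response (e.g. a 429 with rate limit headers):

```python
import time
from typing import Dict, Tuple


class RateLimits:
    """Tracks active rate limits per DSN and category."""

    def __init__(self):
        # (dsn, category) -> monotonic timestamp when the limit expires
        self._until: Dict[Tuple[str, str], float] = {}

    def update(self, dsn: str, category: str, retry_after_seconds: float) -> None:
        # Called by the transport after it receives a rate-limiting response.
        self._until[(dsn, category)] = time.monotonic() + retry_after_seconds

    def is_limited(self, dsn: str, category: str) -> bool:
        # Called by the telemetry buffer and scheduler before sending.
        deadline = self._until.get((dsn, category))
        return deadline is not None and time.monotonic() < deadline
```

Keying the map on `(dsn, category)` keeps the per-DSN requirement satisfied even when one SDK instance sends to multiple DSNs.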
For the platform-specific requirements, please see:
- Backend Telemetry Processor: Detailed backend design
- Mobile Telemetry Processor: Spec under validation.
The batch processor is deprecated, so we moved it to the batch-processor page. The telemetry buffer will include parts of the batch processor functionality.