diff --git a/.github/actions/spelling/allow.txt b/.github/actions/spelling/allow.txt index 233302a2c..5bf198896 100644 --- a/.github/actions/spelling/allow.txt +++ b/.github/actions/spelling/allow.txt @@ -2,6 +2,7 @@ aarch APPENDLIMIT bitmask bitmasks +clickable configurations DESTADDR DESTPORT @@ -10,12 +11,14 @@ ede fcf fuzzer github +Grafana hardenings https iostream iostreams istream istreams +Jaeger jvm JVM JVMs @@ -30,13 +33,18 @@ mharden mmaped ond onlinedocs +opentelemetry ostream ostreams +otel +otlp +permille plugin plugins postgre Postgre postgres +protobuf retpoline retuns rfill diff --git a/data/settings.js b/data/settings.js index 90589b14b..3ba08829f 100644 --- a/data/settings.js +++ b/data/settings.js @@ -5394,7 +5394,7 @@ Creates a new event exporter. The filter name refers to the event_exporter_driver: { tags: [ 'event-export' ], values: setting_types.ENUM, - values_enum: [ 'log', 'file', 'unix', 'http-post', 'drop' ], + values_enum: [ 'log', 'file', 'unix', 'http-post', 'drop', 'opentelemetry' ], default: 'log', seealso: [ '[[link,event_export_drivers]]' ], text: ` @@ -5466,6 +5466,105 @@ Timeout when connecting to unix socket with Path to event unix socket with [[setting,event_exporter_driver,unix]].` }, + event_exporter_opentelemetry_endpoint_url: { + added: { + settings_event_exporter_opentelemetry_added: false, + }, + tags: [ 'event-export' ], + values: setting_types.STRING, + default: 'http://localhost:4318', + seealso: [ '[[link,event_export_drivers]]', + 'event_exporter_opentelemetry_trace_id_field', + 'event_exporter_opentelemetry_emit_logs' ], + text: ` +Base URL of the OTLP/HTTP collector when using +[[setting,event_exporter_driver,opentelemetry]]. The exporter POSTs +spans to \`/v1/traces\` and (when +[[setting,event_exporter_opentelemetry_emit_logs]] is enabled) log +records to \`/v1/logs\`. 
The wire format and +\`Content-Type\` are selected by [[setting,event_exporter_format]]: + +* \`protobuf\` → \`application/x-protobuf\` (OTLP/HTTP+protobuf binary) +* \`json\` → \`application/json\` (OTLP/HTTP+JSON, proto3 ProtoJSON + canonical encoding) + +Both formats carry identical trace/span content and the same +deterministic \`trace_id\` for a given session; choose based on the +collector's preference and operational convenience (JSON is easier to +inspect with \`curl\` / \`jq\`). Log records are emitted as JSON only. + +Only OTLP/HTTP is implemented. OTLP/gRPC is not supported; point this +setting at a collector that accepts OTLP/HTTP (the OpenTelemetry +Collector, Jaeger v2, Grafana Tempo, etc.).` + }, + + event_exporter_opentelemetry_trace_id_field: { + added: { + settings_event_exporter_opentelemetry_added: false, + }, + tags: [ 'event-export' ], + values: setting_types.STRING, + default: 'session', + seealso: [ '[[link,event_export_drivers]]' ], + text: ` +Name of the event field whose value is hashed (SHA-1) to derive the +[OTLP trace_id](https://opentelemetry.io/docs/concepts/signals/traces/) +for every emitted span. The default \`session\` correlates every event +of a single mail session under the same trace. + +If the field is missing from an event, that event is not exported. + +Child sessions of the form \`:\` (e.g. \`session:2\`, +\`session:indexer-worker\`, \`doveadm:\`) get their own trace_id +derived from the full value, plus a Span.Link back to the parent's +trace_id derived from \`\`. 
This lets collectors render the +indexer / doveadm / sub-session traffic as references off the parent +session.` + }, + + event_exporter_opentelemetry_emit_spans: { + added: { + settings_event_exporter_opentelemetry_added: false, + }, + tags: [ 'event-export' ], + values: setting_types.BOOLEAN, + default: 'yes', + seealso: [ '[[link,event_export_drivers]]', + 'event_exporter_opentelemetry_emit_logs' ], + text: ` +When enabled (the default), the exporter emits one OTLP Span per +matched event to \`/v1/traces\`. Disable only when you +want the exporter to ship log records exclusively (set together with +[[setting,event_exporter_opentelemetry_emit_logs,yes]]).` + }, + + event_exporter_opentelemetry_emit_logs: { + added: { + settings_event_exporter_opentelemetry_added: false, + }, + tags: [ 'event-export' ], + values: setting_types.BOOLEAN, + default: 'no', + seealso: [ '[[link,event_export_drivers]]', + 'event_exporter_opentelemetry_emit_spans' ], + text: ` +When enabled, the exporter emits an OTLP LogRecord to +\`/v1/logs\` for every matched event that carries a +formatted log message (\`e_info()\`, \`e_warning()\`, \`e_error()\`, +\`e_debug()\` call sites). The LogRecord shares its \`trace_id\` and +\`span_id\` with the corresponding Span, so collectors link logs and +traces for the same session. + +Only JSON output is supported for log records, regardless of +[[setting,event_exporter_format]]. Severity is mapped to the OTel +severity model (DEBUG=5, INFO=9, WARN=13, ERROR=17, FATAL=21, +PANIC→FATAL4=24). + +Enabling this setting opts the stats process into receiving the +formatted message text from each dovecot service; the wire cost is +zero when the corresponding metric's filter does not match an event.` + }, + execute: { tags: [ 'execute' ], values: setting_types.NAMED_LIST_FILTER, @@ -9292,6 +9391,61 @@ Human-readable description of the metric. 
This is included in the HELP text sent to OpenMetrics.` }, + metric_export_sample_by: { + added: { + settings_metric_export_sample_by_added: false, + }, + tags: [ 'metrics' ], + values: setting_types.NAMED_LIST_FILTER, + seealso: [ '[[link,stats_sample_by]]' ], + text: ` +Defines a [[link,stats_sample_by,sampling rule]] that decides whether an +event matching this metric is forwarded to the +[[setting,metric_exporter]]. The filter name is an arbitrary identifier +for the rule; the actual sampled field is the +[[setting,metric_export_sample_by_field]] setting inside the block. + +Multiple [[setting,metric_export_sample_by]] blocks combine with AND +semantics: an event is exported only if every rule passes. Sampling is +deterministic: events that share the same field value are always sampled +in or out together. Statistics counters are unaffected; sampling applies +only to exported events.` + }, + + metric_export_sample_by_field: { + added: { + settings_metric_export_sample_by_added: false, + }, + tags: [ 'metrics' ], + values: setting_types.STRING, + seealso: [ '[[link,stats_sample_by]]' ], + text: ` +Name of the event field whose value is hashed to make the sampling +decision. A common choice is \`session\`, which causes every event in +the same session to be sampled in or out together. + +If the configured field is missing from the event, the event is dropped +from the export.` + }, + + metric_export_sample_by_permille: { + added: { + settings_metric_export_sample_by_added: false, + }, + tags: [ 'metrics' ], + values: setting_types.UINT, + default: 1000, + seealso: [ '[[link,stats_sample_by]]' ], + text: ` +Fraction of distinct field values that are exported, expressed in +permille (parts per thousand). The valid range is \`1\`..\`1000\`; +\`0\` is rejected. The default \`1000\` exports every event, which +makes the [[setting,metric_export_sample_by]] block a no-op. 
+ +For example, \`permille = 100\` exports approximately 10% of distinct +field values, and \`permille = 500\` exports approximately 50%.` + }, + oauth2: { tags: [ 'oauth2' ], values: setting_types.NAMED_FILTER, diff --git a/data/updates.js b/data/updates.js index 355eda9ec..00c0ecefc 100644 --- a/data/updates.js +++ b/data/updates.js @@ -166,6 +166,8 @@ export const updates = { settings_mailbox_defaults_english_changed: '2.4.3', settings_mailbox_directory_name_legacy_deprecated: '2.4.2', settings_mailbox_special_use_changed: '2.4.0', + settings_event_exporter_opentelemetry_added: '2.4.6', + settings_metric_export_sample_by_added: '2.4.6', settings_metric_fields_changed: '2.4.0', settings_passdb_default_password_scheme_changed: '2.4.3', settings_passdb_mechanisms_filter_added: '2.4.0', diff --git a/docs/core/config/events/export.md b/docs/core/config/events/export.md index 2dc147338..849c35543 100644 --- a/docs/core/config/events/export.md +++ b/docs/core/config/events/export.md @@ -12,6 +12,15 @@ dovecotlinks: event_export_drivers: hash: drivers text: "Event Export: Drivers" + event_export_opentelemetry: + hash: opentelemetry + text: "Event Export: OpenTelemetry" + event_export_opentelemetry_logs: + hash: log-records + text: "Event Export: OpenTelemetry LogRecords" + stats_sample_by: + hash: sampling + text: "Event Export: Sampling" --- # Event Export @@ -58,8 +67,9 @@ Supported Formats: | Formats | Description | | ------- | ----------- | -| `json` | JSON output | +| `json` | JSON output. With the [[setting,event_exporter_driver,opentelemetry]] driver this produces an OTLP/HTTP+JSON `TracesData` payload (proto3 canonical encoding); with all other drivers, the legacy event-envelope JSON shown below. | | `tab-text` | TAB-separated text fields | +| `protobuf` | OTLP/HTTP+protobuf `TracesData` payload. Only valid with the [[setting,event_exporter_driver,opentelemetry]] driver.
[[added,settings_event_exporter_opentelemetry_added]] | #### Example: JSON @@ -112,6 +122,7 @@ Supported drivers: | `http-post` | Send the serialized event as a HTTP POST payload to [[setting,event_exporter_http_post_url]]. The driver defaults to [[setting,http_client_request_absolute_timeout,250 milliseconds]]. | | `file` | Send serialized events to a file specified in [[setting,event_exporter_file_path]]
[[added,event_export_drivers_file_unix_added]] | | `unix` | Send serialised events to a unix socket specified in [[setting,event_exporter_unix_path]]. The [[setting,event_exporter_unix_connect_timeout]] setting is used to specify how long the unix socket connection can take. Default is `250 milliseconds`.
[[added,event_export_drivers_file_unix_added]] | +| `opentelemetry` | Send events as OTLP/HTTP spans to an OpenTelemetry collector at [[setting,event_exporter_opentelemetry_endpoint_url]]. Wire format is selected by [[setting,event_exporter_format]] (`protobuf` or `json`). See [[link,event_export_opentelemetry]].
[[added,settings_event_exporter_opentelemetry_added]] | The `drop` driver is useful when one wants to disable the event exporter temporarily. Note that serialization still occurs, but the resulting @@ -144,6 +155,247 @@ One uses the `metric` block settings documented in [[link,stats]] to select and filter the event to be exported. See [[setting,metric_exporter]] and [[setting,metric_exporter_include]] settings. +## OpenTelemetry + +[[added,settings_event_exporter_opentelemetry_added]] + +The `opentelemetry` driver POSTs every emitted event to an OTLP/HTTP +collector (Jaeger v2, Grafana Tempo, the OpenTelemetry Collector, etc.) +as a serialized +[OTLP `TracesData`](https://opentelemetry.io/docs/concepts/signals/traces/) +payload. One Span is emitted per event. + +The `opentelemetry` driver requires [[setting,event_exporter_format]] +to be set to one of: + +| Format | Wire format | `Content-Type` | +| ------ | ----------- | -------------- | +| `protobuf` | OTLP/HTTP+protobuf (binary `TracesData`) | `application/x-protobuf` | +| `json` | OTLP/HTTP+JSON (proto3 ProtoJSON canonical encoding) | `application/json` | + +The `protobuf` format produces a compact binary payload; the `json` +format is easier to inspect with `curl` / `jq` and has no extra build +dependency on the receiving side. Both POST to the same +`/v1/traces` path and carry identical trace/span content +(same `trace_id` for the same session, same attributes, etc.); they +differ only in serialization. + +Any other format value is rejected at config load time. `protobuf` may +only be paired with this driver. + +::: tip +Only OTLP/HTTP is implemented. OTLP/gRPC is **not** supported; point +[[setting,event_exporter_opentelemetry_endpoint_url]] at a collector that +accepts OTLP/HTTP (most collectors expose both endpoints on different +ports — use the HTTP one, typically port `4318`). 
+::: + +Minimal configuration (protobuf): + +```doveconf[dovecot.conf] +event_exporter otlp { + driver = opentelemetry + format = protobuf + event_exporter_opentelemetry_endpoint_url = http://collector.example.com:4318 +} + +metric imap_traces { + filter = event=* + exporter = otlp +} +``` + +To export as JSON instead, set `format = json`: + +```doveconf[dovecot.conf] +event_exporter otlp { + driver = opentelemetry + format = json + event_exporter_opentelemetry_endpoint_url = http://collector.example.com:4318 +} +``` + +The driver shares the standard [[link,http_client]] settings; tune the +per-request timeout (default `10s`) and other knobs directly under the +exporter block: + +```doveconf[dovecot.conf] +event_exporter otlp { + driver = opentelemetry + event_exporter_opentelemetry_endpoint_url = http://collector.example.com:4318 + http_client_request_timeout = 2s +} +``` + +The JSON output follows the +[OTLP/HTTP+JSON encoding](https://opentelemetry.io/docs/specs/otlp/#json-protobuf-encoding): +`trace_id` / `span_id` are lowercase hex strings (OTLP deviates from +proto3 ProtoJSON here, which would use base64), `int64` values including +`startTimeUnixNano` / `endTimeUnixNano` / `intValue` are JSON strings, +and enums (`SpanKind`, `Status.code`) are name strings (e.g. +`"SPAN_KIND_SERVER"`, `"STATUS_CODE_OK"`). + +### Trace correlation + +Every span's `trace_id` is `SHA-1()[:16]` where the session +id is taken from the event field named by +[[setting,event_exporter_opentelemetry_trace_id_field]] (default `session`). +This ties all events of one mail session into one trace, deterministic +across restarts. + +Each span's `span_id` is deterministically derived from +`SHA-1( + + )[:8]`. The +same logical event produces the same `span_id` across re-emissions and +across processes, and (when log records are enabled) lets every +LogRecord reference the Span it belongs to via the shared +`span_id`. See [[link,event_export_opentelemetry_logs]]. 
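As a rough illustration of the `trace_id` derivation described above, the following Python sketch hashes a session id with SHA-1 and keeps the first 16 bytes, rendered as the lowercase hex string used in the JSON output. It assumes that `[:16]` refers to bytes of the digest and that the field value is hashed as UTF-8; the function name `trace_id_for` is hypothetical and not part of Dovecot. The `span_id` derivation is not reproduced here.

```python
import hashlib


def trace_id_for(session_id: str) -> str:
    """Return a 32-character lowercase hex trace_id for a session id.

    Assumption: the value is hashed as UTF-8 bytes and "[:16]" means the
    first 16 bytes of the 20-byte SHA-1 digest.
    """
    digest = hashlib.sha1(session_id.encode("utf-8")).digest()
    return digest[:16].hex()


if __name__ == "__main__":
    # The same session id always maps to the same trace_id.
    print(trace_id_for("example-session"))
    # A sub-session value hashes to a different trace_id than its parent.
    print(trace_id_for("example-session:2"))
```

Because the result is a pure function of the field value, any process that sees the same session id would compute the same trace_id without coordination.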
+ +Dovecot internally creates sub-sessions of the form `:` +(per-user mail_storage retry counters, `indexer-worker`, +`doveadm:`, etc.). Each sub-session is emitted under its own +`trace_id`, and a [Span.Link](https://opentelemetry.io/docs/concepts/signals/traces/#span-links) +back to the parent's trace_id is attached. Collectors render this as a +clickable reference from the child trace to the parent trace. + +If the configured trace_id_field is missing from an event, that event +is not exported. + +### Span timing + +Span `start_time` is the per-event creation timestamp (the moment the +event was created in dovecot, not the parent operation's start), and +`end_time` is when the event was sent. For passthrough events this +gives accurate per-operation timings rather than collapsing every event +to the parent's start time. + +### Resource attributes + +Each batch carries an OTel +[Resource](https://opentelemetry.io/docs/concepts/resources/) with +`service.name = "Dovecot"`, `service.instance.id` set to the hostname +and `service.version` set to the dovecot version. Sub-system context +(`imap`, `imap-login`, `auth`, `indexer-worker`, …) lands on the span's +InstrumentationScope rather than on `service.name`, so all traces show +up under a single `Dovecot` service in collector UIs. + +### Log records + +When [[setting,event_exporter_opentelemetry_emit_logs,yes]] is set, the +exporter additionally POSTs an OTLP +[LogRecord](https://opentelemetry.io/docs/specs/otel/logs/data-model/) +to `/v1/logs` for every matched event that carries a +formatted log message (i.e. events produced by `e_info()`, +`e_warning()`, `e_error()`, `e_debug()` and friends). + +Log records always use the OTLP/HTTP+JSON wire format regardless of +[[setting,event_exporter_format]]; the protobuf encoding is not +emitted for logs. 
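To make the shape of such a record more concrete, here is a minimal Python sketch that builds one LogRecord as a plain dictionary using OTLP/JSON field names (`severityNumber`, `body.stringValue`, `attributes`, `traceId`, `spanId`). The severity numbers are the ones listed in this section; the helper name `make_log_record`, the severity key spelling, the choice of `severityText`, and the string-only attribute values are assumptions for illustration, not a description of what Dovecot emits byte for byte.

```python
import json

# Severity numbers as listed in this section; the key spelling is an assumption.
SEVERITY_NUMBER = {
    "DEBUG": 5,
    "INFO": 9,
    "WARN": 13,
    "ERROR": 17,
    "FATAL": 21,
    "PANIC": 24,
}


def make_log_record(log_type, message, trace_id, span_id, fields):
    """Build one OTLP/JSON-style LogRecord as a plain dict (illustrative only)."""
    return {
        "severityNumber": SEVERITY_NUMBER[log_type],
        # Assumption: the severity text simply repeats the log type label.
        "severityText": log_type,
        "body": {"stringValue": message},
        # Attribute values are rendered as strings in this sketch.
        "attributes": [
            {"key": key, "value": {"stringValue": str(value)}}
            for key, value in sorted(fields.items())
        ],
        "traceId": trace_id,
        "spanId": span_id,
    }


if __name__ == "__main__":
    record = make_log_record(
        "WARN",
        "example warning text",
        trace_id="a9993e364706816aba3e25717850c26c",
        span_id="0123456789abcdef",
        fields={"user": "alice", "session": "example-session"},
    )
    print(json.dumps(record, indent=2))
```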
+ +Each LogRecord carries: + +* the same `trace_id` and `span_id` as its corresponding Span, so the + collector can navigate from a log line to its Span and vice versa, +* a `severityNumber` and `severityText` derived from the dovecot + log type (DEBUG=5, INFO=9, WARN=13, ERROR=17, FATAL=21, PANIC=24), +* `body.stringValue` with the formatted message text from the + `e_*()` call site, +* the event's fields and category list as `attributes`, identical to + what the matching Span carries. + +The exporter ships log records only for events whose corresponding +metric's `filter` matches. Use the metric filter to scope log +forwarding to the sessions you want to trace - typically a session +predicate plus a category restriction: + +```doveconf[dovecot.conf] +event_exporter otlp_logs { + driver = opentelemetry + format = json + event_exporter_opentelemetry_endpoint_url = http://collector.example.com:4318 + event_exporter_opentelemetry_emit_logs = yes +} + +metric traced_session_logs { + exporter = otlp_logs + # Pick whichever events should ship as both spans and log records. + filter = event=imap_command_finished or event=mail_user_session_finished +} +``` + +Set [[setting,event_exporter_opentelemetry_emit_spans,no]] +together with `emit_logs = yes` to ship log records only. + +When the metric filter does not match an event, no log message is +formatted or transmitted - the wire cost of enabling log records on a +heavy-traffic service is therefore proportional to the matched +volume, not the total event volume. + +The corresponding ResourceLogs carries the same `service.name`, +`service.instance.id`, and `service.version` attributes as +ResourceSpans, so collectors group traces and logs under the same +service. + +### Sampling + +Use [[link,stats_sample_by]] to keep the export volume bounded - +particularly important when shipping all events from a busy mail +server. 
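The sampling rule itself (hash the configured field with SHA-1, fold the digest to a 64-bit integer, export when `hash % 1000 < permille`, combine several rules with AND, and drop events that lack the field) can be sketched in Python as follows. How Dovecot folds the digest is not spelled out in this document, so the sketch assumes the first 8 digest bytes read big-endian; the function names `sampled_in` and `should_export` are hypothetical. The sketch shows the deterministic behaviour but will not necessarily pick the same sessions as Dovecot.

```python
import hashlib


def sampled_in(value: str, permille: int) -> bool:
    """Decide deterministically whether a field value is exported."""
    if not 1 <= permille <= 1000:
        raise ValueError("permille must be in the range 1..1000")
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    # Assumption: "fold to 64 bits" is modelled as the first 8 digest bytes,
    # read big-endian. Dovecot's actual folding may differ.
    folded = int.from_bytes(digest[:8], "big")
    return folded % 1000 < permille


def should_export(event: dict, rules: list) -> bool:
    """Apply every (field, permille) rule with AND semantics."""
    for field, permille in rules:
        value = event.get(field)
        if value is None:
            return False  # a missing field drops the event from the export
        if not sampled_in(str(value), permille):
            return False
    return True


if __name__ == "__main__":
    sessions = [f"session-{i}" for i in range(10000)]
    kept = sum(sampled_in(s, 100) for s in sessions)
    print(f"kept {kept} of {len(sessions)} sessions")
```

Since the decision depends only on the field value, every event of a kept session passes the rule and every event of a dropped session fails it, which is what keeps sampled traces complete.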
+ +## Sampling + +[[added,settings_metric_export_sample_by_added]] + +When a metric's exporter would emit too many events (for example, every +IMAP command from every session), the +[[setting,metric_export_sample_by]] block selects a deterministic subset +to forward to the exporter. + +For each block, Dovecot hashes the configured event field with SHA-1, +folds the digest to a 64-bit integer, and exports the event when +`hash % 1000 < permille`. Because the decision is a pure function of the +field value, every event that shares the same value (e.g. every event +in the same IMAP session, when sampling by `session`) is either +exported in full or dropped in full. This is the property that makes +the resulting export usable for tracing: a sampled session has all of +its events, not a random scattering. + +::: tip +Sampling only affects what reaches the [[setting,metric_exporter]]. +The metric's in-process counters (`doveadm stats dump`, +OpenMetrics) still count every event. +::: + +Multiple [[setting,metric_export_sample_by]] blocks combine with AND +semantics: an event is exported only if every rule passes. This lets a +metric require, say, both a sampled session *and* a sampled user. + +If the configured [[setting,metric_export_sample_by_field]] is missing +from the event, the event is dropped from the export. + +### Example: sample 10% of IMAP sessions + +```doveconf[dovecot.conf] +event_exporter datalake { + driver = http-post + http_post_url = https://datalake.example.com/api/endpoint + format = json + time_format = rfc3339 +} + +metric imap_traced { + exporter = datalake + filter = event=imap_command_finished + + metric_export_sample_by sess { + field = session + permille = 100 + } +} +``` + +Roughly 10% of distinct session IDs are forwarded to the exporter, and +the same session always produces the same decision across restarts. + ## Settings