Evaluate OpenTelemetry semantic conventions for feature flag observability #275

@Starefossen

Description

Context

Enterprise teams at NAV use Unleash with complex flag configurations (gradual rollouts, A/B testing, user targeting). The Unleash SDK already sends aggregate evaluation metrics (yes/no counts, variant distribution) to the Unleash server, but these are disconnected from the application's observability stack (traces, logs, metrics in Grafana).

Proposal

Evaluate and prototype the OpenTelemetry semantic conventions for feature flags (currently Release Candidate status). The spec defines a standard feature_flag.evaluation event with structured attributes:

| Attribute | Description |
|---|---|
| `feature_flag.key` | Flag name (e.g., `quotes.submit`) |
| `feature_flag.result.variant` | Variant name |
| `feature_flag.result.value` | Evaluated value |
| `feature_flag.result.reason` | `default`, `targeting_match`, `split`, `error`, etc. |
| `feature_flag.provider.name` | `Unleash` |
| `feature_flag.context.id` | Targeting key (user/session ID) |

Why This Matters for Enterprise Teams

  1. Trace correlation — Flag evaluations as span events let you see "this request got variant X and returned a 500." Aggregate Unleash metrics can't do this.
  2. Cross-service impact — A flag change in service A may affect latency in service B. OTel traces already connect the dots; adding flag events makes causation visible.
  3. Grafana-native — Teams already use Tempo and Loki via NAIS. Flag events flow through the existing OTel pipeline with no new infrastructure.
  4. A/B testing — Correlate variant assignment with business metrics (error rates, p95 latency) per variant per user cohort.
  5. Rollout safety — Compare error rates between users hitting the new code path (targeting_match) vs. old (default) during gradual rollouts.

Implementation Approach

Add a thin wrapper around Unleash SDK calls that emits OTel log events following the semantic conventions. This is lightweight and complements (rather than replaces) NAIS auto-instrumentation.

Kotlin (backend):

```kotlin
import io.opentelemetry.api.GlobalOpenTelemetry
import io.opentelemetry.api.common.AttributeKey

// Emit a feature_flag.evaluation event via the OTel Logs API
val logger = GlobalOpenTelemetry.get().logsBridge.loggerBuilder("feature-flags").build()
logger.logRecordBuilder()
    .setBody("feature_flag.evaluation")
    .setAttribute(AttributeKey.stringKey("feature_flag.key"), flagName)
    .setAttribute(AttributeKey.stringKey("feature_flag.result.variant"), variant.name)
    .setAttribute(AttributeKey.booleanKey("feature_flag.result.value"), enabled)
    .setAttribute(AttributeKey.stringKey("feature_flag.provider.name"), "Unleash")
    .emit()
```
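
To avoid repeating the emission at every call site, the evaluation and the event can be combined in one wrapper. The sketch below is illustrative, not existing code: the class name `InstrumentedUnleash` is hypothetical, and it assumes the Unleash Java SDK's `isEnabled`/`getVariant` methods.

```kotlin
import io.getunleash.Unleash
import io.opentelemetry.api.GlobalOpenTelemetry
import io.opentelemetry.api.common.AttributeKey

// Hypothetical wrapper: evaluates a flag via the Unleash SDK and emits a
// feature_flag.evaluation log event for every call that goes through it.
class InstrumentedUnleash(private val unleash: Unleash) {
    private val logger =
        GlobalOpenTelemetry.get().logsBridge.loggerBuilder("feature-flags").build()

    fun isEnabled(flagName: String): Boolean {
        val enabled = unleash.isEnabled(flagName)
        val variant = unleash.getVariant(flagName)
        logger.logRecordBuilder()
            .setBody("feature_flag.evaluation")
            .setAttribute(AttributeKey.stringKey("feature_flag.key"), flagName)
            .setAttribute(AttributeKey.stringKey("feature_flag.result.variant"), variant.name)
            .setAttribute(AttributeKey.booleanKey("feature_flag.result.value"), enabled)
            .setAttribute(AttributeKey.stringKey("feature_flag.provider.name"), "Unleash")
            .emit()
        return enabled
    }
}
```

Whether this belongs in each service or in a shared library is one of the open questions below.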

TypeScript (frontend API routes):

```typescript
// Emit via the @opentelemetry/api-logs package
// (the Logs API lives there, not in @opentelemetry/api)
import { logs } from '@opentelemetry/api-logs';

const logger = logs.getLogger('feature-flags');
logger.emit({
  body: 'feature_flag.evaluation',
  attributes: {
    'feature_flag.key': flagName,
    'feature_flag.result.variant': variant?.name,
    'feature_flag.result.value': enabled,
    'feature_flag.provider.name': 'Unleash',
  },
});
```

Suggested Grafana Dashboard Panels

| Panel | Query basis |
|---|---|
| Flag evaluation rate by flag | Count of `feature_flag.evaluation` events grouped by `feature_flag.key` |
| Variant distribution over time | Group by `feature_flag.result.variant` |
| Error rate by flag variant | Correlate variant with HTTP 5xx spans |
| Flag evaluation reasons | Distribution of `feature_flag.result.reason` |
| P95 latency per variant | Filter spans by variant, compute duration percentiles |

Open Questions

  • Does NAIS auto-instrumentation already pick up OTel log events, or do we need explicit exporter config?
  • Should this live as a shared library/pattern, or inline per-service?
  • Is the OTel Logs API mature enough in the Java and Node.js SDKs for production use?
  • Should we also evaluate Unleash's Impact Metrics (counters/gauges/histograms sent to Unleash server) as a complementary approach?

Acceptance Criteria

  • Prototype OTel feature_flag.evaluation events in quotes-backend (Kotlin)
  • Prototype OTel feature_flag.evaluation events in quotes-frontend (TypeScript)
  • Verify events appear in Tempo/Loki via NAIS auto-instrumentation pipeline
  • Create example Grafana dashboard panels querying flag evaluation data
  • Document findings and recommendation for enterprise teams
