
feat: Add OTEL#16

Open
alex-rufo wants to merge 1 commit into master from add_otel

Conversation


@alex-rufo alex-rufo commented May 5, 2026

Summary by CodeRabbit

  • New Features

    • OpenTelemetry support added as an available metrics backend option.
  • Chores

    • Updated Go toolchain to version 1.25.0 and upgraded dependencies including OpenTelemetry libraries, testing framework, and protocol buffers.


coderabbitai Bot commented May 5, 2026

📝 Walkthrough


OpenTelemetry metrics and tracing support is added via a new otelStatter implementation with cached instruments, OTEL-specific configuration fields in StatterConfig, integration in the Init() method's agent selection, and corresponding test/dependency updates. Function timing metrics switch from histogram-based milliseconds to duration-based timing calls.

Changes

OpenTelemetry Metrics & Tracing Integration

Layer / File(s) Summary
Configuration & Agent Type
client.go (lines 21–25, 39–56)
OTELAgent constant added; StatterConfig gains OTELInsecure and OTELHeaders fields for OTEL-specific configuration.
Core OTEL Implementation
otel.go (new file, ~232 lines)
otelStatter type implements Statter interface with cached metric instruments (Int64UpDownCounter, Float64Gauge, Float64Histogram). Provides Count, Incr, Decr, Gauge, Timing (records seconds), and Histogram methods. Helper constructors newOTELStatter and newOTELTracerProvider configure OTLP exporters with optional insecure mode and custom headers.
Base Tag Formatting
client.go (lines 59–87)
BaseTags() now formats tags as env=... / machine=... with = separators for OTELAgent, distinct from other agent types using : separators.
Agent Initialization & Wiring
client.go (lines 124–194)
Init() adds OTELAgent case to instantiate metrics via newOTELStatter. Tracing setup refactored to store provider shutdown in traceProviderShutdownFn and support OTEL tracing via newOTELTracerProvider, setting global tracer provider.
Shutdown Logic
client.go (lines 111–122)
Close() now invokes traceProviderShutdownFn when set, ensuring proper cleanup of tracer providers.
Metrics API Update
metrics.go (lines 193–236)
Function timing emission switches from Histogram("func.time", milliseconds) to Timing("func.timing", duration) in both reportTiming and ReportClosureFuncTiming.
Test & Dependency Updates
metrics_test.go (lines 389–465), go.mod (lines 1–144)
Test assertions adjusted to reflect fewer recorded metrics (histogram removed). Go toolchain bumped to 1.25.0; OpenTelemetry upgraded to v1.43.0 with full OTLP support; indirect dependencies updated for OTEL, gRPC, protobuf, and logging.
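The BaseTags() change described above gives the OTEL agent key=value tags (parsed into attributes) while statsd-style agents keep key:value. A minimal, hypothetical sketch of that separator split; the function name and agent strings are illustrative, not the library's actual API:

```go
package main

import "fmt"

// baseTags formats the env/machine base tags with the separator the given
// backend expects: "=" for an OTEL-style agent, ":" for statsd-style agents.
func baseTags(agent, env, machine string) []string {
	sep := ":"
	if agent == "otel" {
		sep = "="
	}
	return []string{
		"env" + sep + env,
		"machine" + sep + machine,
	}
}

func main() {
	fmt.Println(baseTags("otel", "prod", "host-1"))    // [env=prod machine=host-1]
	fmt.Println(baseTags("datadog", "prod", "host-1")) // [env:prod machine:host-1]
}
```

The "=" form matters downstream: otel.go's tag parser splits on '=' to build attribute.KeyValue pairs, so ":"-separated tags would be silently dropped.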

Sequence Diagram

sequenceDiagram
    participant Client
    participant OTELInit
    participant OTELStatter
    participant MeterProvider
    participant OTLPExporter
    
    Client->>OTELInit: Init(cfg with OTELAgent)
    OTELInit->>OTELInit: newOTELStatter(endpoint, headers, baseTags)
    OTELInit->>OTELInit: newOTELResource(baseTags)
    OTELInit->>OTLPExporter: Create OTLP metric exporter<br/>(endpoint, insecure, headers)
    OTELInit->>MeterProvider: Create OTel MeterProvider<br/>with exporter & periodic reader
    OTELInit->>OTELInit: Set global MeterProvider
    OTELInit->>OTELStatter: Initialize with<br/>cached instruments
    OTELInit->>Client: Return OTELStatter instance
    
    Client->>OTELStatter: Timing("func.timing", duration, tags)
    OTELStatter->>OTELStatter: tagsToAttrs(tags)
    OTELStatter->>OTELStatter: getHistogram("func.timing")
    OTELStatter->>MeterProvider: Record(duration in seconds, attributes)
    MeterProvider->>OTLPExporter: Batch & export metrics

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Poem

🐰 Metrics hop through OTEL's bright gleam,
Instruments cached, a systematic dream,
From Datadog hops to OpenTel's grace,
Timing now dances at seconds' own pace,
Tags dressed in equals, shutdown complete—
Observability made rabbit-neat! 🥕✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 0.00%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
  • Title check (❓ Inconclusive): the title 'feat: Add OTEL' is vague and generic, lacking specificity about what OpenTelemetry functionality was added or how it integrates with the existing metrics system. Resolution: use a more descriptive title such as 'feat: Add OpenTelemetry metrics backend support'.
✅ Passed checks (3 passed)
  • Description Check (✅ Passed): check skipped; CodeRabbit's high-level summary is enabled.
  • Linked Issues check (✅ Passed): check skipped; no linked issues were found for this pull request.
  • Out of Scope Changes check (✅ Passed): check skipped; no linked issues were found for this pull request.


Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (1)
otel.go (1)

29-41: 💤 Low value

Deduplicate the key=value tag parser.

newOTELResource and (*otelStatter).tagsToAttrs implement the same key=value → attribute.KeyValue conversion. The receiver on tagsToAttrs is also unused. Consider extracting a single helper and dropping the receiver.

♻️ Proposed refactor
-func newOTELResource(baseTags []string) *resource.Resource {
-	attrs := []attribute.KeyValue{}
-	for _, tag := range baseTags {
-		if idx := strings.IndexByte(tag, '='); idx > 0 {
-			attrs = append(attrs, attribute.String(tag[:idx], tag[idx+1:]))
-		}
-	}
-	res, err := resource.New(context.Background(), resource.WithAttributes(attrs...))
+func parseKVTagsToAttrs(tags []string) []attribute.KeyValue {
+	attrs := make([]attribute.KeyValue, 0, len(tags))
+	for _, tag := range tags {
+		if idx := strings.IndexByte(tag, '='); idx > 0 {
+			attrs = append(attrs, attribute.String(tag[:idx], tag[idx+1:]))
+		}
+	}
+	return attrs
+}
+
+func newOTELResource(baseTags []string) *resource.Resource {
+	res, err := resource.New(context.Background(), resource.WithAttributes(parseKVTagsToAttrs(baseTags)...))
 	if err != nil {
 		return resource.Default()
 	}
 	return res
 }
@@
-// tagsToAttrs parses "key=value" tag strings into OTel attributes.
-func (s *otelStatter) tagsToAttrs(tags []string) []attribute.KeyValue {
-	attrs := make([]attribute.KeyValue, 0, len(tags))
-	for _, tag := range tags {
-		idx := strings.IndexByte(tag, '=')
-		if idx <= 0 {
-			continue
-		}
-		attrs = append(attrs, attribute.String(tag[:idx], tag[idx+1:]))
-	}
-	return attrs
-}
+// (use parseKVTagsToAttrs at the call sites)

…and replace s.tagsToAttrs(tags) with parseKVTagsToAttrs(tags) in the metric methods.

Also applies to: 102-113

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@otel.go` around lines 29 - 41, newOTELResource duplicates the key=value →
attribute.KeyValue parsing logic found in (*otelStatter).tagsToAttrs and the
otelStatter receiver is unused; extract a single helper (e.g.,
parseKVTagsToAttrs) that accepts []string and returns []attribute.KeyValue,
replace the parsing code in newOTELResource with a call to that helper, update
all uses of s.tagsToAttrs(tags) to parseKVTagsToAttrs(tags), and remove the
tagsToAttrs method (and its receiver) to avoid duplication.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 872426a4-b516-454b-b3df-39ec005bf587

📥 Commits

Reviewing files that changed from the base of the PR and between 723e058 and 0472c13.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (5)
  • client.go
  • go.mod
  • metrics.go
  • metrics_test.go
  • otel.go
💤 Files with no reviewable changes (1)
  • metrics.go

Comment thread client.go
Comment on lines 111 to 122
 func Close() {
 	clientMux.RLock()
 	defer clientMux.RUnlock()
-	if client == nil {
-		return
-	}
-	client.Close()
+
+	if client != nil {
+		client.Close()
+	}
+
+	if traceProviderShutdownFn != nil {
+		traceProviderShutdownFn()
+	}
 }

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Shutdown errors are silently dropped (errcheck) and Close() itself returns nothing.

Static analysis flags both client.Close() and traceProviderShutdownFn() as unchecked. Beyond the linter, swallowing these errors hides exporter shutdown failures (e.g., backed‑up OTLP queue), which is exactly when operators need visibility. Consider logging at minimum, and ideally surfacing an error to callers.

🛠️ Proposed fix
-func Close() {
+func Close() {
 	clientMux.RLock()
 	defer clientMux.RUnlock()

 	if client != nil {
-		client.Close()
+		if err := client.Close(); err != nil {
+			log.WithError(err).Errorln("metrics client close failed")
+		}
 	}

 	if traceProviderShutdownFn != nil {
-		traceProviderShutdownFn()
+		if err := traceProviderShutdownFn(); err != nil {
+			log.WithError(err).Errorln("trace provider shutdown failed")
+		}
 	}
 }

If you'd rather keep the void signature unchanged, a follow-up could change Close() to return error and have callers decide.
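For the variant that surfaces an error to callers, the two failures could be combined with the standard library's errors.Join. A minimal, self-contained sketch: the stub types and the error messages here are hypothetical stand-ins for the package globals discussed in this review, not the PR's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// stubClient stands in for the package-level metrics client.
type stubClient struct{}

func (stubClient) Close() error { return errors.New("otlp exporter: queue not drained") }

var (
	client                  = stubClient{}
	traceProviderShutdownFn = func() error { return nil }
)

// Close collects both shutdown errors and returns them joined,
// instead of silently dropping them.
func Close() error {
	var errs []error
	if err := client.Close(); err != nil {
		errs = append(errs, fmt.Errorf("metrics client close failed: %w", err))
	}
	if traceProviderShutdownFn != nil {
		if err := traceProviderShutdownFn(); err != nil {
			errs = append(errs, fmt.Errorf("trace provider shutdown failed: %w", err))
		}
	}
	return errors.Join(errs...) // nil when errs is empty
}

func main() {
	fmt.Println(Close())
}
```

errors.Join is available since Go 1.20, so it fits the toolchain bump to 1.25.0 in this PR.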

🧰 Tools
🪛 golangci-lint (2.12.1)

[error] 116-116: Error return value of client.Close is not checked

(errcheck)


[error] 120-120: Error return value is not checked

(errcheck)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@client.go` around lines 111 - 122, The Close function currently ignores
errors from client.Close() and traceProviderShutdownFn(), so capture both return
errors and either return a combined error from Close (change signature to
Close() error and update callers) or, if keeping void, at minimum log failures
via the existing logger; specifically modify the Close function (guarded by
clientMux) to check the error from client.Close() and the error from
traceProviderShutdownFn() (if it returns error), aggregate them (e.g., wrap or
join messages) and return that error, or log each error with clear context like
"otel client close failed" / "trace provider shutdown failed" if you opt not to
change the signature.

Comment thread client.go
Comment on lines +184 to 194
} else if cfg.Agent == OTELAgent && cfg.TracingEnabled {
	traceProvider, err := newOTELTracerProvider(addr, cfg.OTELInsecure, cfg.OTELHeaders, config.BaseTags())
	if err != nil {
		return errors.Wrap(err, "otel tracer provider init failed")
	}
	otel.SetTracerProvider(traceProvider)
	tracer = otel.Tracer(prefix)
	traceProviderShutdownFn = func() error {
		return traceProvider.Shutdown(context.Background())
	}
}

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Tracer shutdown closure should use a bounded context.

traceProvider.Shutdown(context.Background()) will block process exit indefinitely if the OTLP collector is unreachable or slow. Wrap with a timeout, mirroring what should be done in otelStatter.Close() as well.

🛡️ Proposed fix
 		traceProviderShutdownFn = func() error {
-			return traceProvider.Shutdown(context.Background())
+			ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+			defer cancel()
+			return traceProvider.Shutdown(ctx)
 		}

Also worth verifying: addr is reused as the OTLP trace endpoint (line 185) while it was originally the metrics endpoint. In typical collector setups they coincide on :4317, but if a deployment ever splits metrics and traces this becomes a footgun. Consider an explicit OTELTraceAddr (or document the expectation).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@client.go` around lines 184 - 194, The traceProvider shutdown closure uses an
unbounded context which can hang process exit; update the
traceProviderShutdownFn (set where newOTELTracerProvider is used and where
traceProvider.Shutdown is called) to create a context with a finite timeout
(mirror the pattern used in otelStatter.Close()), call traceProvider.Shutdown
with that timed context and ensure the cancel function is deferred/called;
additionally, review the use of addr passed into newOTELTracerProvider (it
currently reuses the metrics endpoint) and either document that traces and
metrics must share addr or add a separate OTELTraceAddr configuration to avoid
mixing endpoints.

Comment thread otel.go
Comment on lines +211 to +219
func (s *otelStatter) Timing(name string, value time.Duration, tags []string, rate float64) error {
	h, err := s.getHistogram(name)
	if err != nil {
		return err
	}
	// OTel convention: duration histograms use seconds
	h.Record(context.Background(), value.Seconds(), otelmetric.WithAttributes(s.tagsToAttrs(tags)...))
	return nil
}

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

OpenTelemetry Go SDK default explicit bucket boundaries for Float64Histogram v1.43.0

💡 Result:

Default explicit bucket boundaries used by the OpenTelemetry Go SDK for Float64Histogram v1.43.0 are 0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000. The SDK keeps these defaults unless you explicitly override boundaries via APIs like WithExplicitBucketBoundaries (or equivalent mechanisms such as metric views / exporter configuration); a maintainer issue documents these defaults and explains they are oriented around milliseconds, not seconds.

🏁 Script executed:

# First, let's find the relevant files in the codebase
find . -name "otel.go" -o -name "*dogstatsd*" -o -name "*statsd*" -o -name "metrics.go" | head -20

Repository: InjectiveLabs/metrics

Length of output: 88


🏁 Script executed:

# Look at the otel.go file to confirm the Timing implementation
fd "otel\.go" -x cat -n {}

Repository: InjectiveLabs/metrics

Length of output: 7808


🏁 Script executed:

# Search for Datadog/Telegraf Statter implementations to verify they use milliseconds
fd -e "go" | xargs rg -l "Timing.*time\.Duration" | head -10

Repository: InjectiveLabs/metrics

Length of output: 111


🏁 Script executed:

# Check for any metric configuration or default bucket boundaries in the codebase
rg "Histogram|bucket|Boundary" --type go -i | grep -i "default\|bucket" | head -15

Repository: InjectiveLabs/metrics

Length of output: 1186


🏁 Script executed:

# Look at the imports and dependencies to understand which OTEL version is being used
rg "go\.opentelemetry\.io" | head -20

Repository: InjectiveLabs/metrics

Length of output: 1660


🏁 Script executed:

# Check telegraf.go Timing implementation
cat -n telegraf.go | grep -A 5 "func (t \*telegrafStatter) Timing"

Repository: InjectiveLabs/metrics

Length of output: 322


🏁 Script executed:

# Check metrics.go to see the Statter interface and any dogstatsd implementation
cat -n metrics.go | head -100

Repository: InjectiveLabs/metrics

Length of output: 3625


🏁 Script executed:

# Search for dogstatsd or statsd implementation
fd -e "go" | xargs rg -l "dogstatsd|statsd" | head -10

Repository: InjectiveLabs/metrics

Length of output: 87


🏁 Script executed:

# Look for any documentation or code about default histogram buckets in otel or client.go
cat -n client.go | head -150

Repository: InjectiveLabs/metrics

Length of output: 5398


🏁 Script executed:

# Check dogstatsd implementation - look at how the client is used
cat -n client.go | grep -A 30 "case DatadogAgent:"

Repository: InjectiveLabs/metrics

Length of output: 1987


🏁 Script executed:

# Verify dogstatsd library uses milliseconds by searching documentation or checking if there's any explicit conversion
rg "dogstatsd|statsd" -A 5 -B 5 client.go | head -40

Repository: InjectiveLabs/metrics

Length of output: 976




🌐 Web query:

OpenTelemetry Go SDK v1.43.0 Float64Histogram default bucket boundaries milliseconds

💡 Result:

Default Float64Histogram bucket boundaries in the OpenTelemetry Go SDK (v1.43.0) are 0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000, which are oriented around milliseconds when used for duration/latency metrics. The Go SDK's default histogram buckets are explicitly described as millisecond-oriented, with the specific boundary list above.


Timing records in seconds but default histogram buckets are millisecond-scaled, causing unit mismatch with other backends and loss of resolution.

Timing records value.Seconds(), while the Telegraf backend converts to milliseconds (int(value/time.Millisecond)) and Datadog's dogstatsd library uses the standard statsd protocol which expects milliseconds. The OpenTelemetry Go SDK v1.43.0 default histogram bucket boundaries (0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000) are explicitly oriented around milliseconds. Recording seconds into these millisecond-scaled buckets collapses nearly all typical latencies into the lowest bucket, destroying histogram resolution and breaking consistency with other agents.

Either:

  • Record float64(value.Milliseconds()) to match Telegraf and Datadog, or
  • Keep seconds and register a View with custom bucket boundaries appropriate for sub-second durations.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@otel.go` around lines 211 - 219, The Timing method records durations in
seconds but OTel's default histogram buckets are millisecond-scaled; update
otelStatter.Timing so it records milliseconds (use value.Milliseconds() as a
float64) instead of value.Seconds() to match Telegraf/Datadog and preserve
histogram resolution, and keep the existing use of s.getHistogram(name) and
s.tagsToAttrs(tags) unchanged; alternatively, if you prefer seconds, register a
corresponding View with custom sub-second bucket boundaries for the histogram
returned by getHistogram.

Comment thread otel.go
Comment on lines +230 to +232
func (s *otelStatter) Close() error {
	return s.meterProvider.Shutdown(context.Background())
}

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Close can hang indefinitely on a stalled OTLP exporter.

meterProvider.Shutdown is invoked with context.Background(), so a misbehaving collector or hung gRPC connection blocks shutdown forever. Bound this with a timeout so process exit is deterministic.

🛡️ Proposed fix
 func (s *otelStatter) Close() error {
-	return s.meterProvider.Shutdown(context.Background())
+	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+	defer cancel()
+	return s.meterProvider.Shutdown(ctx)
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@otel.go` around lines 230 - 232, The Close method currently calls
s.meterProvider.Shutdown with context.Background(), which can hang indefinitely;
replace this with a context that has a bounded timeout (e.g.,
context.WithTimeout) and pass that cancellable context to
s.meterProvider.Shutdown in otelStatter.Close, deferring cancel() so Shutdown
returns promptly even if the OTLP exporter/collector stalls.

Comment thread otel.go
Comment on lines +156 to +166
s.mu.RLock()
h, ok := s.histograms[name]
s.mu.RUnlock()
if ok {
	return h, nil
}

s.mu.Lock()
defer s.mu.Unlock()
if h, ok = s.histograms[name]; ok {
	return h, nil
}
dang this is smart, almost thought it was racy...
