Skip to content
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 11 additions & 1 deletion docs/specs/om/open_metrics_spec_2_0.md
Original file line number Diff line number Diff line change
Expand Up @@ -142,7 +142,17 @@ Metrics are defined by a unique LabelSet within a MetricFamily. Metrics MUST con

MetricPoints SHOULD NOT have explicit timestamps.

If more than one MetricPoint is exposed for a Metric, then its MetricPoints MUST have monotonically increasing timestamps.
If more than one MetricPoint is exposed for a Metric,
then unless it is a special case exempted below,
its MetricPoints MUST have monotonically increasing timestamps.

Special case exemptions:

* Two Histogram MetricPoints where one has Classic Buckets and the other has Native Buckets.
* Two GaugeHistogram MetricPoints where one has Classic Buckets and the other has Native Buckets.

Note: these exemptions allow exposing both Classic and Native Buckets for the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if this is a can of worms...

Did we already discuss having classic + native on the same line?

If we keep them on separate lines, I think we need more updates to the spec. E.g.

MetricFamily
A MetricFamily MAY have zero or more Metrics. A MetricFamily MUST have a name, HELP, TYPE, and UNIT metadata. Every Metric within a MetricFamily MUST have a unique LabelSet.

We need to carve out an exception from "MUST have a unique LabelSet" for this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if this is a can of worms...

Did we already discuss having classic + native on the same line?

No, we have not. I think the fallout from that would be worse than having two lines.

  • Readability would be worse.
  • Native and classic bucketing has different exemplars so where do you put those?
  • One composite value per line no longer true.
  • probably more...

If we keep them on separate lines, I think we need more updates to the spec. E.g.

MetricFamily
A MetricFamily MAY have zero or more Metrics. A MetricFamily MUST have a name, HELP, TYPE, and UNIT metadata. Every Metric within a MetricFamily MUST have a unique LabelSet.

We need to carve out an exception from "MUST have a unique LabelSet" for this.

True.

Copy link
Contributor

@dashpole dashpole Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Readability would be worse.

Agreed. This is also my biggest concern.

Native and classic bucketing has different exemplars so where do you put those?

They aren't required to have different exemplars (both just need to be evenly distributed now), so I think they could use either set of exemplars, and be compliant.

One composite value per line no longer true.

I guess it depends on how you define composite value, but you could consider it a single composite value. The fact that we currently store it as two separate things (or keep one or the other) seems like an implementation detail of the current storage mechanism.

If we put both in a single line, it would actually match the protobuf format more closely. https://github.com/prometheus/client_golang/blob/4b86739a4e19a9d33b78ab94fc9d0c54bfc6ebf6/prometheus/histogram.go#L810. The protobuf format sends both classic and native buckets in a single composite value.

Example text format (just to make the proposal clear):

# TYPE acme_http_request_seconds histogram
# UNIT acme_http_request_seconds seconds
# HELP acme_http_request_seconds Latency histogram of all of ACME's HTTP requests.
-acme_http_request_seconds{path="/api/v1",method="GET"} {count:2,sum:1.2e2,schema:0,zero_threshold:1e-4,zero_count:0,positive_spans:[1:2],positive_buckets:[1,1]}
-acme_http_request_seconds{path="/api/v1",method="GET"} {count:2,sum:1.2e2,bucket:[0.5:1,1:2,+Inf:2]}
+acme_http_request_seconds{path="/api/v1",method="GET"} {count:2,sum:1.2e2,bucket:[0.5:1,1:2,+Inf:2],schema:0,zero_threshold:1e-4,zero_count:0,positive_spans:[1:2],positive_buckets:[1,1]}

I'll try to lay out benefits for the sake of discussion. I'm not sold that it is the right direction, but would like us to have the discussion.

  • It probably makes parsing more efficient and simpler.
    • E.g. If we want to drop classic buckets only if the native buckets exist, we need to keep track of the series we've gotten native buckets for. We also can't easily implement the inverse: "Keep native buckets only if the classic buckets don't exist" because we require the native buckets to come first.
  • It keeps label uniqueness within a metric.
    • It might be confusing for the average user to see two lines with the same name and labels if they aren't familiar with the protocol.
    • The number of lines in the text format is always equal to the number of unique label sets (cardinality).
  • Makes the spec more consistent, and requires fewer "carve-outs". Maybe less of a practical issue, but "MUST do X, except when A, B, C" ... makes it harder to know if something is allowed or not and tends to make the spec more complex to implement.
  • Eliminates "invalid" states: Sum, count, start timestamps need to be the same (I assume?) between classic and native buckets. Having them on the same line enforces this.

Copy link
Member

@bwplotka bwplotka Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EDIT: Lol, @dashpole comment didn't load for me so I duplicate a bit.

Oh yes, it would be nice to discuss this case, which I assume exists only for migration purposes?

I assume SDK and Prometheus protobuf allows mixed bucketing on the same histogram for migration purposes. This assumed on both text and ingestion there will be classic and native histograms with different names. This assumption breaks with both composite values AND full NHCB mode.

I think the fallout from that would be worse than having two lines.

For the text, I wonder if two lines is actually not more complicated than combined line, so the opposite being true 🤔

One composite value per line no longer true.

I wonder if you took it wrongly. I think what @dashpole proposes is NOT

mixed-histogram {<native sample>} {<classic sample>}

I think what I would do is combined sample that has both schema etc and buckets fields.

mixed-histogram {count:59,sum:1.2e2, bucket:[0.0:0,1e-05:0,0.0001:5,0.1:8,1.0:10,10.0:11,100000.0:11,1e+06:15,1e+23:16,1.1e+23:17,+Inf:17],schema:7,zero_threshold:1e-4,zero_count:0,negative_spans:[1:2],negative_buckets:[5,7],positive_spans:[-1:2,3:4],positive_buckets:[5,7,10,9,8,8]} st@1520430000.123

Benefits of combining sample (single line):

  • No need for non trivial exceptions on unique MetricFamily Metric labels per line. This use case feels like a temporary migration case, so impacting the whole spec with non-trivial exceptions might be not worth it.
  • No need to specify ordering of duplicate labels lines.
  • Strong, format enforced, requirements for sample to be a single histogram (!!!), similar to Prometheus Protobuf and our SDKs logic. Separate line would allow different sum/count etc unless we add many, easy to miss, MUST words.
  • Consistency with PrometheusProto?

nits:

  • Less data
  • Parsing might be actually simpler

Downsides:

  • Native and classic bucketing has different exemplars so where do you put those?
    • That's the biggest challenge. Perhaps it's fine to mix those for this temporary, migration case?
  • Readability could be worse.
  • Arguable as long JSON like lines are just a bit longer and they might be already less readable
  • Also it's relatively rare case

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, it is tempting.

For the exemplars, there's an easy solution, which is to only expose the exemplars for the classic buckets. Since we think this is a migration case and exemplars aren't 100% guaranteed anyway (they are inherently sampled).

Another upside would be that we can get rid of the rule to have native histograms first.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cons:

  • this is a format that's only really needed for a temporary situation when someone is migrating from one system to another.
  • It requires SDK maintainers to implement yet another format

pros:

  • This fits the philosophy we've been moving towards which is removing the need for multiline state -- one line says everything you need to know
  • This removes ambiguity

I think preserving the one-line/stateless approach is worth the extra effort, especially if the format is not too onerous / verbose.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CONSENSUS: Try single line for the mixed case

same MetricPoint and letting the ingestor decide which one or both to process.

#### MetricFamily

Expand Down