Update histogram best practices and metric types documentation for native histograms #2868
Conversation
85fcce3 to
6ec3723
Compare
> histograms (currently this is the case for Go and Java), you should probably
> prefer native histograms over classic histograms.
I think instead of saying "probably" we should point to the spec where it discusses the pros and cons, although I can't seem to find that bit :( What do you think about adding something like https://grafana.com/docs/mimir/latest/send/native-histograms/_exponential_buckets/#advantages-and-disadvantages to the spec?
I can take this as a follow-up if you'd like?
> exception of native histograms, these are currently only differentiated in the
> instrumentation libraries (to enable APIs tailored to the usage of the specific
> types) and in the exposition protocols. The Prometheus server does not yet make
I feel that the parenthesis makes this sentence too long and a bit hard to understand. Let's simplify to something like:
Original:
> exception of native histograms, these are currently only differentiated in the
> instrumentation libraries (to enable APIs tailored to the usage of the specific
> types) and in the exposition protocols. The Prometheus server does not yet make

Suggested:
> exception of native histograms, these are currently only differentiated in the
> API of instrumentation libraries and in the exposition protocols.
> The Prometheus server does not yet make
> @@ -51,37 +57,78 @@ Client library usage documentation for gauges:
> A _histogram_ samples observations (usually things like request durations or
Hmm, "samples observations" sounds like we don't take all observations into account – too similar to sampling traces. So maybe say "measures".
> `<basename>_count` (identical to `<basename>_bucket{le="+Inf"}` above)
> Native histograms are generally much more efficient than classic histograms,
> allow much higher resolution, and do not require explicit configuration of
Maybe mention atomic transfer over network? Here and for NHCBs as well?
> boundary provided as a label. With native histograms, use the
> [`histogram_fraction()`
> function](/docs/prometheus/latest/querying/functions/#histogram_fraction) to
> calculate fractions of observations within given boundaries.
Mention trim as a new (experimental?) way of doing the same.
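For readers of this thread, a sketch of the difference the quoted passage describes, using the `http_request_duration_seconds` metric name from the page (the 0.3s boundary is illustrative):

```promql
# Classic histogram: the fraction of requests served within 300ms over the
# last 5m only works if a bucket with boundary le="0.3" was configured.
  sum(rate(http_request_duration_seconds_bucket{le="0.3"}[5m]))
/
  sum(rate(http_request_duration_seconds_count[5m]))

# Native histogram: histogram_fraction() estimates the fraction of
# observations between any two boundaries, no pre-configured buckets needed.
histogram_fraction(0, 0.3, sum(rate(http_request_duration_seconds[5m])))
```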
> To calculate the average request duration during the last 5 minutes
> from a histogram or summary called `http_request_duration_seconds`,
> use the following expression:

> Histograms and summaries both sample observations, typically request durations
I'd prefer "measure" over "sample".
>     histogram_sum(rate(http_request_duration_seconds[5m]))
>   /
>     histogram_count(rate(http_request_duration_seconds[5m]))
Mention the shorthand for this use case: `histogram_avg`.
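The shorthand referred to here, next to the spelled-out form from the quoted diff (metric name as used throughout the page; whether `histogram_avg()` is available depends on the Prometheus version):

```promql
# Average request duration over the last 5m, spelled out:
  histogram_sum(rate(http_request_duration_seconds[5m]))
/
  histogram_count(rate(http_request_duration_seconds[5m]))

# The same, using the histogram_avg() shorthand:
histogram_avg(rate(http_request_duration_seconds[5m]))
```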
> | Required configuration during instrumentation | Pick a desired resolution and maybe a strategy to limit the bucket count. | Pick buckets suitable for the expected range of observed values and the desired queries. | Pick desired φ-quantiles and sliding window. Other φ-quantiles and sliding windows cannot be calculated later. |
> | Instrumentation cost | Observations are cheap as they only need to increment counters. | Observations are cheap as they only need to increment counters. | Observations are relatively expensive due to the streaming quantile calculation. |
> | Query performance | The server has to calculate quantiles from complex histogram samples. You can use [recording rules](/docs/prometheus/latest/configuration/recording_rules/#recording-rules) should the ad-hoc calculation take too long (e.g. in a large dashboard). | The server has to calculate quantiles from a large number of bucket series. You can use [recording rules](/docs/prometheus/latest/configuration/recording_rules/#recording-rules) should the ad-hoc calculation take too long (e.g. in a large dashboard). | Fast (no quantile calculations on the server, and aggregations are impossible anyway, see below). |
> | Number of time series per histogram/summary | One (with a composite sample type). | `_sum`, `_count`, one per configured bucket. | `_sum`, `_count`, one per configured quantile. |
Original:
> | Number of time series per histogram/summary | One (with a composite sample type). | `_sum`, `_count`, one per configured bucket. | `_sum`, `_count`, one per configured quantile. |

Suggested:
> | Number of time series per histogram/summary | One (with a composite sample type). | `_sum`, `_count`, and one for each configured bucket. | `_sum`, `_count`, and one for each configured quantile. |
Fixes #2803.
With this update, the best practices page about histograms and summaries and the concepts page about metric types finally take native histograms into account.
While working on this, it occurred to me that the best practices page about histograms and summaries would probably benefit from a more fundamental rewrite, based on the user experience of the last decade. Some of the focal points of the document seem outdated from today's perspective, while other topics might be missing. (I find the focus on the Apdex score a bit weird by now, and the detailed error analysis might not appeal to a broad audience either…) However, a complete rewrite would have taken a lot of time, and I did not want to let our users wait even longer. So I went for this incremental update – which shouldn't prevent anybody from doing a thorough rewrite in the future.
Note that I used this opportunity to replace the term "client library" with "instrumentation library". I always thought that "client library" is confusing as it is not implementing a client in any way. (Technically, it implements a server, of which the Prometheus "server" is the client… 🤯) Even if we accept that "Prometheus client library" just means "a library to do something that has to do with Prometheus", the title "client library" still doesn't tell us what the library is actually for. (Note that the client_golang repository not only contains an instrumentation library, but also includes an actual client library that helps you to implement clients that talk to the Prometheus HTTP API.)