Oximeter: add a metric to record samples dropped in the database batc…#10683

Open
jmcarp wants to merge 1 commit into main from jmcarp/oximeter-dropped-samples-metric

Conversation

@jmcarp jmcarp commented Jun 29, 2026

Contributor

…her.

Oximeter sends samples from all collection tasks to a shared database batcher, which then inserts samples into ClickHouse. The database batcher uses a bounded queue that drops old samples when adding a new sample would overflow the queue. We currently log a warning when dropping an old sample, but operators must proactively check those logs to notice data loss via this queue. To make dropped samples more visible, this patch introduces a new oximeter metric that counts the number of dropped samples in the database batcher.
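For readers unfamiliar with the pattern, here is a minimal sketch of a bounded queue that evicts its oldest entry on overflow and counts each eviction. `BoundedQueue`, `push`, and `dropped` are hypothetical names chosen for illustration; they are not taken from the oximeter code, and the real batcher may be structured differently.

```rust
use std::collections::VecDeque;

/// Hypothetical bounded queue, for illustration only. This is not the
/// actual oximeter batcher type.
pub struct BoundedQueue<T> {
    items: VecDeque<T>,
    capacity: usize,
    /// Cumulative number of items evicted to make room for new ones.
    dropped: u64,
}

impl<T> BoundedQueue<T> {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be nonzero");
        Self { items: VecDeque::with_capacity(capacity), capacity, dropped: 0 }
    }

    /// Push a new item. If the queue is already full, evict the oldest
    /// item first, bump the drop counter, and return the evicted item.
    pub fn push(&mut self, item: T) -> Option<T> {
        let evicted = if self.items.len() == self.capacity {
            self.dropped += 1;
            self.items.pop_front()
        } else {
            None
        };
        self.items.push_back(item);
        evicted
    }

    /// Total number of items dropped since the queue was created.
    pub fn dropped(&self) -> u64 {
        self.dropped
    }

    /// Number of items currently queued.
    pub fn len(&self) -> usize {
        self.items.len()
    }
}
```

In this sketch the counter lives next to the queue, so exposing it as a metric only requires reading `dropped()` whenever samples are produced; a warning log can still be emitted at the same point where the counter is incremented.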

Note: if the batcher isn't able to push samples to the database at all, we won't be able to record the new metric! However, we write the new metric at the head of the queue, and dropped sample counts persist for the lifetime of the oximeter agent, so we'll be able to push metrics unless the queue is wildly oversaturated.
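To illustrate why a count that persists for the lifetime of the agent helps here, the sketch below simulates a series of flush attempts, some of which fail. It assumes the metric is reported as a running total rather than a per-flush delta; `DropTracker` and `simulate` are hypothetical names, and the sketch does not model where in the queue the metric sample is placed.

```rust
/// Hypothetical tracker holding a cumulative drop count. Illustrative
/// only; not the type used in oximeter.
pub struct DropTracker {
    total_dropped: u64,
}

impl DropTracker {
    pub fn new() -> Self {
        Self { total_dropped: 0 }
    }

    /// Record `n` newly dropped samples.
    pub fn record_drops(&mut self, n: u64) {
        self.total_dropped += n;
    }

    /// Value to report on each flush attempt: the running total.
    pub fn snapshot(&self) -> u64 {
        self.total_dropped
    }
}

/// Simulate flush attempts. Each element is
/// `(drops since the previous attempt, whether the insert succeeded)`.
/// Returns the metric values that actually reached the database.
pub fn simulate(attempts: &[(u64, bool)]) -> Vec<u64> {
    let mut tracker = DropTracker::new();
    let mut stored = Vec::new();
    for &(drops, insert_ok) in attempts {
        tracker.record_drops(drops);
        if insert_ok {
            // Only successful inserts leave a record in the database.
            stored.push(tracker.snapshot());
        }
    }
    stored
}
```

Under these assumptions, drops that occur while inserts are failing are not lost: the next successful insert carries the full running total, so the gap shows up as a jump in the stored counter.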

Part of #10552.

@jmcarp jmcarp requested a review from bnaecker June 29, 2026 18:41
@jmcarp jmcarp force-pushed the jmcarp/oximeter-dropped-samples-metric branch from 9d99a2e to 57f5802 Compare June 29, 2026 20:58