Skip to content

Don't manipulate time in oximeter database usage tests#10405

Open
bnaecker wants to merge 3 commits into
mainfrom
ben/fix-db-usage-flake
Open

Don't manipulate time in oximeter database usage tests#10405
bnaecker wants to merge 3 commits into
mainfrom
ben/fix-db-usage-flake

Conversation

@bnaecker
Copy link
Copy Markdown
Collaborator

@bnaecker bnaecker commented May 7, 2026

  • The database usage tests paused and advanced time to ensure we trigger another query against the DB to compute usage. This interacts poorly with a timeout in the connection type for talking to ClickHouse. Tokio's test-utils allow pausing time, but when there is literally no other work to do, the runtime "auto-advances" time to the next pending timer. That happens to be our connection timeout, and so the tests fail. Instead, just lower the polling interval for this one test and live with it taking a few seconds.
  • Fix test failed in CI: omicron-clickhouse-admin::context::tests::test_database_usage #10398

- The database usage tests paused and advanced time to ensure we trigger
  another query against the DB to compute usage. This interacts poorly
  with a timeout in the connection type for talking to ClickHouse.
  Tokio's test-utils allow pausing time, but when there is literally no
  other work to do, the runtime "auto-advances" time to the next pending
  timer. That happens to be our connection timeout, and so the tests
  fail. Instead, just lower the polling interval for this one test and
  live with it taking a few seconds.
- Fix #10398
@bnaecker bnaecker requested a review from hawkw May 7, 2026 18:24
Copy link
Copy Markdown
Member

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this seems like an unfortunate but reasonable solution. I don't think any of us like writing tests that will always spend multiple seconds waiting for stuff, but I don't think that this test is going to be the longest-running test and we run them in parallel, so it's probably not the absolute worst thing.

There is an alternative solution 13 might consider, though: the Tokio docs suggest that a spawn_blocking task can be used to temporarily suppress the auto-advance behavior with paused time. I think that we could potentially do something like spawning a blocking task that waits on a blocking channel that we don't unblock until the test has completed, or something. Then, we could continue using paused time. On the other hand, this feels like a bit of a hack to me, and we might be better off just avoiding the paused timer entirely. Ultimately, it's up to you whether you'd rather try that out or continue with the current approach.

Comment thread clickhouse-admin/src/context.rs
Comment thread clickhouse-admin/src/context.rs Outdated
@bnaecker bnaecker enabled auto-merge (squash) May 12, 2026 17:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

test failed in CI: omicron-clickhouse-admin::context::tests::test_database_usage

2 participants