Skip to content

Further query perf updates for oximeter field lookups.#10110

Open
jmcarp wants to merge 1 commit intomainfrom
jmcarp/oximeter-field-union
Open

Further query perf updates for oximeter field lookups.#10110
jmcarp wants to merge 1 commit intomainfrom
jmcarp/oximeter-field-union

Conversation

@jmcarp
Copy link
Copy Markdown
Contributor

@jmcarp jmcarp commented Mar 20, 2026

Joins are expensive in clickhouse. In #9262, we dropped the number of joins in the field lookup query from the number of fields to the number of relevant field tables. However, this still leaves us with up to six field tables to join. In this patch, we do away with joins for the field lookup query entirely. Instead, we select the relevant rows from each field table, then combine them with UNION ALL, and use anyIf and GROUP BY to pivot from long format to wide. This speeds up the field query with indentical results, with greater speedups for timeseries that use more field tables. A timeseries whose labels are all strings will see no difference in performance; a series that has string, uuid, u8, u16, u32, and i16 labels (like the switch_port_control_data_link metrics) will return results much faster.

@jmcarp jmcarp requested a review from bnaecker March 20, 2026 14:25
Copy link
Copy Markdown
Collaborator

@bnaecker bnaecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is clever! I'd like to see some performance numbers if possible. Could we look at the returned query summary from some representative OxQL queries, and see what the performance impact is? If possible, looking at memory consumption would be nice too, but we don't have an easy way to access that per-query. You would have to take the query ID and go back to the system.query_log table, I think.

Comment on lines +961 to +962
.into_iter()
.collect();
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this actually need to be a Vec<_>? It looks like we iterate over it below, which could happen just as well with the set itself.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is just noise from Claude that I didn't catch earlier. Updated to use a set instead.

} else {
query.push_str(" WHERE ");
}
query.push_str(" HAVING ");
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it helpful to have a test checking this new syntax element? I always find it hard to build a SQL query programmatically without seeing the final output.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the existing test to use a predicate so that our expectorated test file includes the HAVING clause.

@@ -0,0 +1,44 @@
SELECT
assumeNotNull(anyIf(fields_i32, field_name = 'foo')) AS foo,
assumeNotNull(anyIf(fields_u32, field_name = 'index')) AS INDEX,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is INDEX capitalized here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we're using sqlformat to format the generated sql, so that it's legible to humans, and INDEX is a sql keyword. Updated to quote identifiers so that we get AS "index" rather than AS INDEX.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool thanks! I think quoting an identifier is more obvious than the capitalization. I definitely thought this was case-insensitive! ClickHouse SQL is weird in lots of ways though :)

let union_parts: Vec<String> = field_types
.iter()
.map(|&this_type| {
let value_cols: Vec<String> = field_types
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This nested-loop strikes me as inefficient, given that we know that all but one of the elements is going to be NULL AS <some_table_name>. I wonder if we can:

  • build an array of all the possible NULL AS fields_{type} entries we'll need
  • on L995, call enumerate().iter() instead of just .iter()
  • inside the .map() on L996, clone the array and overwrite the one element at that index with the string like field_value AS fields_{type} instead

I'm not positive that will be noticeably better, given the small sizes, but it's definitely asymptotically better.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I don't think we'll see much of a performance difference, since the number of field types is small and query latency is dominated by clickhouse, but the new version is also more readable.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks. I agree it's unlikely to be noticeable.

@jmcarp jmcarp force-pushed the jmcarp/oximeter-field-union branch from 4160734 to 32ab0a5 Compare March 23, 2026 19:03
@jmcarp
Copy link
Copy Markdown
Contributor Author

jmcarp commented Mar 23, 2026

Now with benchmarks. I'm running this on an omnios vm on dogfood, running clickhouse v23. I copied over the field tables from dogfood, but no measurement tables. To benchmark, check out main, then collect some observations:

cargo bench --package oximeter-db --bench oxql -- --save-baseline main

Then check out this branch and run again:

cargo bench --package oximeter-db --bench oxql -- --save-baseline union

Then compare with critcmp. The output is a little hard to read, but critcmp picks the faster query, indexes it to 1.0, and then shows the slower query as a multiple of that reference latency. To summarize, lookups over one field table are the same (very slightly faster some runs, very slightly slower others), and then relative performance gets better for metrics that use increasing numbers of fields. There's an exception for virtual_machine:vcpu_usage_tables, which only shows a slight improvement, because the cardinality is relatively high there, which outweighs any gains from avoiding JOINs.

omnios@omni-builder:~/code/omicron$ critcmp main union
group                                                                 main                                   union
-----                                                                 ----                                   -----
oxql/field_lookup/1/crucible_upstairs:flush_tables                    1.00     34.8±2.81ms        ? ?/sec    1.02     35.4±3.00ms        ? ?/sec
oxql/field_lookup/2/ddm_session:advertisements_received_tables        1.22     14.1±0.38ms        ? ?/sec    1.00     11.6±0.34ms        ? ?/sec
oxql/field_lookup/3/virtual_machine:vcpu_usage_tables                 1.03  1245.0±38.38ms        ? ?/sec    1.00  1205.3±38.25ms        ? ?/sec
oxql/field_lookup/4/bgp_session:active_connections_accepted_tables    2.09     32.8±1.02ms        ? ?/sec    1.00     15.7±0.50ms        ? ?/sec
oxql/field_lookup/6/switch_data_link:bytes_sent_tables                3.31    102.6±5.87ms        ? ?/sec    1.00     31.0±1.27ms        ? ?/sec

Overall, my qualitative summary is that changing from JOIN to UNION helps more as the number of field tables per metric increases, and less as timeseries cardinality increases, but never makes queries slower than the status quo.

Joins are expensive in clickhouse. In #9262, we dropped the number of joins in
the field lookup query from the number of fields to the number of relevant
field tables. However, this still leaves us with up to six field tables to
join. In this patch, we do away with joins for the field lookup query entirely.
Instead, we select the relevant rows from each field table, then combine them
with UNION ALL, and use anyIf and GROUP BY to pivot from long format to wide.
This speeds up the field query with indentical results, with greater speedups
for timeseries that use more field tables. A timeseries whose labels are all
strings will see no difference in performance; a series that has string, uuid,
u8, u16, u32, and i16 labels (like the switch_port_control_data_link metrics)
will return results much faster.
@jmcarp jmcarp force-pushed the jmcarp/oximeter-field-union branch from 32ab0a5 to a435519 Compare March 23, 2026 20:16
@bnaecker
Copy link
Copy Markdown
Collaborator

This is great! Thanks for adding the benchmarking data. I agree that this seems no worse, and often much better than main. Seems like an unambiguous win.

We don't have to do it now, but it would be great to eventually add a sweep over the number of unique timeseries in the result. I'm specifically interested the case when the cardinality is high, but the query is highly selective, so that we should be pulling out a very small amount of data from the joined / unioned field tables.

This looks great, thanks!

jmcarp added a commit that referenced this pull request Mar 24, 2026
Add oxql field lookup benchmark. To avoid the complication of generating
realistic synthetic data, we provide scripts for fetching and loading
fields from a running ClickHouse instance.

Checking in to benchmark changes in #10110.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants