run benchmarks
🤖: Benchmark completed
Hmm, the results look a bit mixed (and different from what I see on my machine), but I think the DuckDB approach might give more improvement.

run benchmarks
Introduce PartitionAggState to support multiple internal hash tables in partial aggregation.

When enabled via AggregateExec::with_num_agg_partitions(), input rows are hashed by group keys (using the same hash as RepartitionExec) and routed to separate smaller hash tables for better cache locality. Defaults to 1 partition (no behavior change). The optimizer can set higher values when a hash repartition follows the partial aggregate.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
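As a rough illustration of the routing described in this commit message, the sketch below hashes each group key and sends the row to one of several small hash tables. It is not the PR's code: `PartitionedCounts`, `partition_of`, and `update` are hypothetical names, the aggregate is a plain sum, and `DefaultHasher` stands in for the hash shared with RepartitionExec.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a partitioned aggregation state:
/// one small hash table per internal partition.
struct PartitionedCounts {
    tables: Vec<HashMap<String, u64>>,
}

impl PartitionedCounts {
    /// With `num_partitions == 1` this degenerates to a single hash table.
    fn new(num_partitions: usize) -> Self {
        assert!(num_partitions > 0, "need at least one partition");
        Self {
            tables: (0..num_partitions).map(|_| HashMap::new()).collect(),
        }
    }

    /// Pick the internal table for a group key by hashing the key.
    /// (Assumption: any hash works for the sketch; the PR reuses the repartition hash.)
    fn partition_of(&self, key: &str) -> usize {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        (hasher.finish() % self.tables.len() as u64) as usize
    }

    /// Route one input row to its table and fold the value into the group's sum.
    fn update(&mut self, key: &str, value: u64) {
        let p = self.partition_of(key);
        *self.tables[p].entry(key.to_string()).or_insert(0) += value;
    }
}
```

Because a given key always hashes to the same table, each group lives in exactly one table, and each table stays smaller than a single combined one would be.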
ac59538 to b793a93
…tial aggregation

When num_agg_partitions > 1, the partial aggregate now acts as a repartitioning operator, producing T output partitions directly via channels. Each input task runs a GroupedHashAggregateStream with T internal hash tables, then routes emitted batches to the correct output channel using last_emitted_partition (no re-hashing needed since internal tables use the same REPARTITION_RANDOM_STATE hash).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
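The channel routing described above can be sketched roughly as follows. This is a simplified illustration, not the PR's implementation: `EmittedBatch`, `make_channels`, and `route` are hypothetical names, and synchronous `std::sync::mpsc` channels stand in for whatever async channels the operator actually uses. The point is that the emitting side already knows which internal table a batch came from, so it can index the output channel directly instead of hashing the rows again.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical row batch: (group key, partial aggregate value) pairs.
type Rows = Vec<(String, u64)>;

/// Hypothetical emitted batch tagged with the internal table it was drained from.
struct EmittedBatch {
    partition: usize,
    rows: Rows,
}

/// Create one channel per output partition.
fn make_channels(n: usize) -> (Vec<Sender<Rows>>, Vec<Receiver<Rows>>) {
    (0..n).map(|_| channel()).unzip()
}

/// Forward each batch to the channel matching its partition tag.
/// No re-hashing: the tag was fixed when the rows were first routed to a table.
fn route(batches: Vec<EmittedBatch>, senders: &[Sender<Rows>]) {
    for batch in batches {
        // A send error only means the consumer for that partition has gone away.
        let _ = senders[batch.partition].send(batch.rows);
    }
}
```

This only works if the internal tables and the downstream partitioning agree on the hash, which is why the commit message stresses the shared REPARTITION_RANDOM_STATE.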
🤖: Benchmark completed

run benchmarks
- Fix ProducingOutput to carry partition index alongside batch, ensuring correct routing when emit_next_partition eagerly advances the index
- Add use_channels() method to centralize the decision of when to use the channel-based multi-output path
- Add update_cache_partitioning() to keep output partitioning in sync when limit_options or num_agg_partitions change
- Fix with_new_limit_options to recalculate output partitioning (prevents TopK aggregation from claiming Hash partitioning when channels won't be used)
- Guard CombinePartialFinalAggregate from combining when Partial has num_agg_partitions > 1
- Set num_agg_partitions in physical planner when repartitioning is enabled
- Keep spawned task references alive via Arc in output streams to prevent abort-on-drop

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
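To illustrate the first fix in the list above, here is a minimal sketch of why the output state might carry the partition index with the batch. The types are hypothetical simplifications (`ExecState`, `Stream`, and `poll` are not the PR's definitions, and batches are plain `Vec<u64>`); only the names ProducingOutput and emit_next_partition come from the commit message. The assumption is that emit_next_partition advances a shared cursor as soon as it picks a table, so reading the cursor later would name the wrong partition.

```rust
/// Hypothetical, simplified stream state.
enum ExecState {
    ReadingInput,
    /// The batch travels with the partition it was drained from.
    ProducingOutput { partition: usize, batch: Vec<u64> },
    Done,
}

struct Stream {
    /// Pending output per internal partition.
    tables: Vec<Vec<u64>>,
    /// Cursor that `emit_next_partition` advances eagerly.
    next_partition: usize,
    state: ExecState,
}

impl Stream {
    /// Drain the next non-empty table into the state, skipping empty ones.
    fn emit_next_partition(&mut self) {
        while self.next_partition < self.tables.len() {
            let p = self.next_partition;
            self.next_partition += 1; // eager advance: cursor no longer equals `p`
            let batch = std::mem::take(&mut self.tables[p]);
            if !batch.is_empty() {
                self.state = ExecState::ProducingOutput { partition: p, batch };
                return;
            }
        }
        self.state = ExecState::Done;
    }

    /// Return the next (partition, batch) pair, or None when finished.
    fn poll(&mut self) -> Option<(usize, Vec<u64>)> {
        loop {
            match std::mem::replace(&mut self.state, ExecState::Done) {
                ExecState::ReadingInput => self.emit_next_partition(),
                ExecState::ProducingOutput { partition, batch } => {
                    // Prepare the following batch; this moves the cursor again.
                    self.emit_next_partition();
                    // Routing uses the index stored in the state, not the cursor.
                    return Some((partition, batch));
                }
                ExecState::Done => return None,
            }
        }
    }
}
```

In this sketch, by the time the first batch is returned the cursor may already point past a later table, so a router that consulted `next_partition` instead of the stored index would send the batch to the wrong output.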
d7e99a6 to 8a69b5a
🤖: Benchmark completed
Which issue does this PR close?
Rationale for this change
Trying morsel-paper-like aggregations (not really yet)
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?