feat: prune unused dimension joins from queries #228
Merged
hussainsultan merged 4 commits into main on Apr 14, 2026
When a query only references a subset of the joined tables (via dimension keys or measure names), joins to unreferenced tables are now skipped. This avoids expensive joins to dimension tables that contribute nothing to the result — e.g. a pure SUM() on a fact table no longer joins 4 dimension tables. The implementation extracts table prefixes from the query's keys and aggs, passes them as parent_requirements to SemanticJoinOp.to_untagged(), which recursively prunes right-side leaf tables not in the needed set. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
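The prefix extraction and recursive pruning described in this commit can be sketched roughly as follows. This is a simplified illustration, not the code from the PR: `Leaf`, `Join`, `table_prefixes`, and `prune` are hypothetical names, the join tree is reduced to bare table names, and join type and cardinality are ignored here.

```python
from dataclasses import dataclass
from typing import Optional, Set, Union


@dataclass
class Leaf:
    """Hypothetical stand-in for a single table in the join tree."""
    name: str


@dataclass
class Join:
    """Hypothetical stand-in for a join node (left input joined to right input)."""
    left: "Node"
    right: "Node"


Node = Union[Leaf, Join]


def table_prefixes(names):
    """Collect table prefixes from dotted names, e.g. 'stores.store_name' -> 'stores'."""
    return {name.split(".", 1)[0] for name in names if "." in name}


def prune(node: Node, needed: Optional[Set[str]]) -> Node:
    """Drop right-side leaf tables that are not in `needed`.

    Passing needed=None disables pruning and returns the tree unchanged.
    """
    if needed is None or isinstance(node, Leaf):
        return node
    left = prune(node.left, needed)
    if isinstance(node.right, Leaf) and node.right.name not in needed:
        return left  # skip the join entirely
    return Join(left, prune(node.right, needed))


# facts joined to stores, then to products; the query only touches facts and stores.
tree = Join(Join(Leaf("facts"), Leaf("stores")), Leaf("products"))
needed = table_prefixes(["stores.store_name", "facts.total_sales"])
pruned = prune(tree, needed)
```

In this sketch `pruned` keeps the `facts` to `stores` join and drops `products`, while `prune(tree, None)` leaves the tree untouched.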
The pre-agg path needs all tables for dimension bridges — passing parent_requirements=None ensures no pruning happens there. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Guard pruning by cardinality and join type: only prune join_one with how="left". Inner joins act as row filters and must not be removed. join_many/join_cross are never pruned (join_many is intercepted by the pre-agg path; join_cross changes row counts).

Adds edge case tests:
- Inner join with orphan rows: pruning correctly preserves inner join
- Filter on dimension table: pruning correctly disabled
- SQL verification: pruned tables absent from compiled SQL

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
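The guard can be read as a small predicate. The following is a minimal sketch under assumptions: `can_prune` and its parameters are hypothetical, and the cardinality labels `"one"`, `"many"`, and `"cross"` are illustrative stand-ins for join_one, join_many, and join_cross. The filter-on-dimension case from the tests is not modelled.

```python
from typing import Set


def can_prune(cardinality: str, how: str, right_table: str, needed: Set[str]) -> bool:
    """Return True only when dropping the join cannot change the result rows.

    Hypothetical helper: the names and string labels are assumptions, not the PR's API.
    """
    if cardinality != "one":
        # join_many / join_cross are never pruned.
        return False
    if how != "left":
        # An inner join filters out unmatched rows, so it must be kept.
        return False
    # A left join_one to an unreferenced table contributes nothing to the result.
    return right_table not in needed


needed = {"facts"}
left_one = can_prune("one", "left", "stores", needed)
inner_one = can_prune("one", "inner", "stores", needed)
cross = can_prune("cross", "left", "stores", needed)
referenced = can_prune("one", "left", "facts", needed)
```

Only the first call returns `True`; the inner join, the cross join, and the referenced table are all kept.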
Summary
- model.aggregate("facts.total_sales") no longer joins unused dimension tables
- Closes #227
How it works
- SemanticAggregateOp.to_untagged() extracts table prefixes from query keys/aggs (e.g. stores.store_name → stores)
- Passes them as parent_requirements to SemanticJoinOp.to_untagged()
- to_untagged() checks if the right-side leaf table is in the needed set; if not, it skips the join entirely
Test plan
🤖 Generated with Claude Code