[R2 SQL] Add support for UNION, INTERSECT, EXCEPT, and SELECT DISTINCT #31252
Open
sejoker wants to merge 1 commit into production from r2-sql/union-changelog
+132
−5
57 changes: 57 additions & 0 deletions
src/content/changelog/r2-sql/2026-06-05-union-intersect-except-select-distinct.mdx
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,57 @@ | ||
| --- | ||
| title: R2 SQL now supports UNION, INTERSECT, EXCEPT, and SELECT DISTINCT | ||
| description: Combine query results with set operations and eliminate duplicates with SELECT DISTINCT. | ||
| products: | ||
| - r2-sql | ||
| date: 2026-06-05 | ||
| --- | ||
|
|
||
| [R2 SQL](/r2-sql/) now supports set operations (`UNION`, `INTERSECT`, `EXCEPT`) and `SELECT DISTINCT`, expanding the range of analytical queries you can run directly on [Apache Iceberg](https://iceberg.apache.org/) tables in [R2 Data Catalog](/r2/data-catalog/). | ||
|
|
||
| ## Set operations | ||
|
|
||
| Combine the results of multiple `SELECT` statements: | ||
|
|
||
| - **`UNION`** — returns all rows from both queries, removing duplicates | ||
| - **`UNION ALL`** — returns all rows from both queries, including duplicates | ||
| - **`INTERSECT`** — returns only rows that appear in both queries | ||
| - **`EXCEPT`** — returns rows from the first query that do not appear in the second | ||
|
|
||
| ```sql | ||
| -- Find zones that had either firewall blocks OR high-risk requests | ||
| SELECT zone_id FROM my_namespace.firewall_events WHERE action = 'block' | ||
| UNION | ||
| SELECT zone_id FROM my_namespace.http_requests WHERE risk_score > 0.8 | ||
| ``` | ||
|
|
||
| ```sql | ||
| -- Find zones with both firewall blocks AND high traffic | ||
| SELECT zone_id FROM my_namespace.firewall_events WHERE action = 'block' | ||
| INTERSECT | ||
| SELECT zone_id FROM my_namespace.http_requests | ||
| GROUP BY zone_id | ||
| HAVING COUNT(*) > 10000 | ||
| ``` | ||
|
|
||
| ```sql | ||
| -- Find enterprise zones that have not been compacted | ||
| SELECT zone_id FROM my_namespace.zones WHERE plan = 'enterprise' | ||
| EXCEPT | ||
| SELECT zone_id FROM my_namespace.compaction_history | ||
| ``` | ||
|
|
||
| ## SELECT DISTINCT | ||
|
|
||
| Eliminate duplicate rows from query results: | ||
|
|
||
| ```sql | ||
| SELECT DISTINCT region, department | ||
| FROM my_namespace.sales_data | ||
| WHERE total_amount > 1000 | ||
| ORDER BY region, department | ||
| LIMIT 100 | ||
| ``` | ||
|
|
||
| For large datasets where approximate results are acceptable, `approx_distinct()` remains a faster alternative for counting unique values. | ||
|
|
||
| For the full syntax reference, refer to the [SQL reference](/r2-sql/sql-reference/). For performance guidance, refer to [Limitations and best practices](/r2-sql/reference/limitations-best-practices/). | ||
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -76,7 +76,7 @@ DESCRIBE namespace_name.table_name; | |||||
| ### Syntax | ||||||
|
|
||||||
| ```sql | ||||||
| -SELECT column_specification [, column_specification, ...] | ||||||
| +SELECT [DISTINCT] column_specification [, column_specification, ...] | ||||||
| ``` | ||||||
|
|
||||||
| ### Column specification | ||||||
|
|
@@ -87,6 +87,20 @@ SELECT column_specification [, column_specification, ...] | |||||
| - **Column alias**: `column_name AS alias` | ||||||
| - **Expressions**: arithmetic, function calls, CASE expressions, and casts | ||||||
|
|
||||||
| ### SELECT DISTINCT | ||||||
|
Contributor review comment:
Use sentence case per the style guide:
Suggested change
|
||||||
|
|
||||||
| Use `DISTINCT` to eliminate duplicate rows from the result set: | ||||||
|
|
||||||
| ```sql | ||||||
| SELECT DISTINCT region, department | ||||||
| FROM my_namespace.sales_data | ||||||
| WHERE total_amount > 1000 | ||||||
| ORDER BY region, department | ||||||
| LIMIT 100 | ||||||
| ``` | ||||||
|
|
||||||
| For large datasets where approximate results are acceptable, `approx_distinct()` is a faster alternative for counting unique values. | ||||||
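A small sketch of what `DISTINCT` does to the example query above, run against an in-memory SQLite database rather than R2 SQL. SQLite is only a stand-in so that the snippet is runnable; the `sales_data` rows are invented for illustration, and R2 SQL behaviour should be checked against the SQL reference. The point it shows is that `DISTINCT` deduplicates on the whole `(region, department)` tuple, not on each column separately.

```python
import sqlite3

# Assumption: SQLite stands in for R2 SQL; the rows below are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_data (region TEXT, department TEXT, total_amount REAL);
    INSERT INTO sales_data VALUES
        ('EU', 'toys', 1500),
        ('EU', 'toys', 2500),
        ('EU', 'books', 1200),
        ('US', 'toys', 900),
        ('US', 'toys', 3000);
""")

# Same query shape as the documented example, with DISTINCT toggled on and off.
query = """
    SELECT {distinct} region, department
    FROM sales_data
    WHERE total_amount > 1000
    ORDER BY region, department
"""
all_rows = conn.execute(query.format(distinct="")).fetchall()
distinct_rows = conn.execute(query.format(distinct="DISTINCT")).fetchall()

print("without DISTINCT:", all_rows)
print("with DISTINCT:   ", distinct_rows)
```

Without `DISTINCT`, the `('EU', 'toys')` pair appears twice because two qualifying rows share it; with `DISTINCT`, it appears once.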
|
|
||||||
| ### Examples | ||||||
|
|
||||||
| ```sql | ||||||
|
|
@@ -581,6 +595,64 @@ SELECT * FROM my_namespace.sales_data LIMIT 100 | |||||
|
|
||||||
| --- | ||||||
|
|
||||||
| ## Set operations | ||||||
|
|
||||||
| Set operations combine the results of two or more `SELECT` statements. | ||||||
|
|
||||||
| ### Syntax | ||||||
|
|
||||||
| ```sql | ||||||
| SELECT ... FROM table1 | ||||||
| { UNION | UNION ALL | INTERSECT | EXCEPT } | ||||||
| SELECT ... FROM table2 | ||||||
| ``` | ||||||
|
|
||||||
| ### Supported operations | ||||||
|
|
||||||
| | Operation | Description | | ||||||
| | :------------ | :--------------------------------------------------------------- | | ||||||
| | `UNION` | Returns all rows from both queries, removing duplicates | | ||||||
| | `UNION ALL` | Returns all rows from both queries, including duplicates | | ||||||
| | `INTERSECT` | Returns only rows that appear in both query results | | ||||||
| | `EXCEPT` | Returns rows from the first query that do not appear in the second | | ||||||
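To make the duplicate handling in the table concrete, here is a minimal sketch that runs all four operators over two tiny tables in an in-memory SQLite database. SQLite is an assumption used only to keep the example runnable; it is not R2 SQL, and the tables `a` and `b` are hypothetical. The semantics shown are the standard SQL ones described in the table.

```python
import sqlite3

# Assumption: SQLite stands in for R2 SQL; tables a and b are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (zone_id TEXT);
    CREATE TABLE b (zone_id TEXT);
    INSERT INTO a VALUES ('z1'), ('z1'), ('z2');
    INSERT INTO b VALUES ('z2'), ('z3');
""")

def run(operator):
    """Apply one set operator between the two tables and return sorted zone_ids."""
    rows = conn.execute(
        f"SELECT zone_id FROM a {operator} SELECT zone_id FROM b"
    ).fetchall()
    return sorted(row[0] for row in rows)

results = {op: run(op) for op in ("UNION", "UNION ALL", "INTERSECT", "EXCEPT")}
for op, zone_ids in results.items():
    print(op, zone_ids)
```

Only `UNION ALL` keeps the repeated `z1` and the `z2` that occurs in both tables. `EXCEPT` depends on operand order: swapping the two `SELECT` statements would return `z3` instead of `z1`.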
|
|
||||||
| ### Examples | ||||||
|
|
||||||
| #### UNION | ||||||
|
|
||||||
| ```sql | ||||||
| -- Find zones that had either firewall blocks OR high-risk requests | ||||||
| SELECT zone_id FROM my_namespace.firewall_events WHERE action = 'block' | ||||||
| UNION | ||||||
| SELECT zone_id FROM my_namespace.http_requests WHERE risk_score > 0.8 | ||||||
| ``` | ||||||
|
|
||||||
| #### INTERSECT | ||||||
|
|
||||||
| ```sql | ||||||
| -- Find enterprise zones that also had firewall blocks | ||||||
| SELECT zone_id FROM my_namespace.firewall_events WHERE action = 'block' | ||||||
| INTERSECT | ||||||
| SELECT zone_id FROM my_namespace.zones WHERE plan = 'enterprise' | ||||||
| ``` | ||||||
|
|
||||||
| #### EXCEPT | ||||||
|
|
||||||
| ```sql | ||||||
| -- Find enterprise zones that have no firewall events | ||||||
| SELECT zone_id FROM my_namespace.zones WHERE plan = 'enterprise' | ||||||
| EXCEPT | ||||||
| SELECT zone_id FROM my_namespace.firewall_events | ||||||
| ``` | ||||||
|
|
||||||
| ### Requirements | ||||||
|
|
||||||
| - All queries in a set operation must return the same number of columns. | ||||||
| - Corresponding columns must have compatible data types. | ||||||
| - Column names in the result are taken from the first query. | ||||||
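Two of these requirements can be sketched with an in-memory SQLite database. This is an illustration under assumptions, not R2 SQL itself: SQLite stands in as the engine, the `zone` column name is invented so that the two sides differ, and the exact error R2 SQL returns is not shown here. Because SQLite is dynamically typed, the data type compatibility rule is not exercised.

```python
import sqlite3

# Assumption: SQLite stands in for R2 SQL; the schema below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE zones (zone_id TEXT, plan TEXT);
    CREATE TABLE firewall_events (zone TEXT, action TEXT);
    INSERT INTO zones VALUES ('z1', 'enterprise');
    INSERT INTO firewall_events VALUES ('z2', 'block');
""")

# Result column names come from the first SELECT, even though the
# second SELECT calls its column "zone".
cursor = conn.execute(
    "SELECT zone_id FROM zones UNION SELECT zone FROM firewall_events"
)
result_columns = [col[0] for col in cursor.description]
rows = sorted(cursor.fetchall())

# A different number of columns on each side is rejected.
try:
    conn.execute(
        "SELECT zone_id, plan FROM zones UNION SELECT zone FROM firewall_events"
    )
    mismatch_error = None
except sqlite3.OperationalError as exc:
    mismatch_error = str(exc)

print(result_columns, rows)
print("column count mismatch:", mismatch_error)
```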
|
|
||||||
| --- | ||||||
|
|
||||||
| ## EXPLAIN | ||||||
|
|
||||||
| Returns the execution plan for a query without running it. | ||||||
|
|
||||||