[GH-2885] Add ST_GeomFromBox2D and ST_AsText(box2d) #2899
jiayuasu merged 3 commits into apache:master
ST_GeomFromBox2D(box2d) -> Polygon converts a Box2D to a closed rectangular polygon. PostGIS exposes the same conversion as the implicit cast box2d::geometry; we expose it as a function because UDT-to-UDT implicit casts in Spark Catalyst require Catalyst-level work, and an explicit constructor fits the existing ST_GeomFrom* family. ST_AsText is overloaded to accept a Box2D and return the PostGIS-format string 'BOX(xmin ymin, xmax ymax)'. NULL on null input. Closes apache#2885.
Pull request overview
This PR extends Sedona’s new native Box2D surface by adding two conversions in the common and Spark SQL layers: a Box2D -> Geometry constructor and ST_AsText(Box2D) formatting. It fits into the broader Phase 1 Box2D work by making bbox values easier to round-trip through SQL without unpacking coordinates manually.
Changes:
- Added a common-layer Box2D -> Polygon constructor and Box2D -> BOX(...) text formatter.
- Wired those conversions into Spark SQL via a new ST_GeomFromBox2D expression and an added ST_AsText overload.
- Added Scala SQL tests covering happy-path behavior and null propagation for both new conversions.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| spark/common/src/test/scala/org/apache/sedona/sql/functionTestScala.scala | Adds SQL-level coverage for ST_AsText(Box2D). |
| spark/common/src/test/scala/org/apache/sedona/sql/constructorTestScala.scala | Adds SQL-level coverage for ST_GeomFromBox2D. |
| spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/Functions.scala | Extends ST_AsText expression dispatch to accept Box2D. |
| spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/Constructors.scala | Introduces the Spark SQL expression for ST_GeomFromBox2D. |
| spark/common/src/main/scala/org/apache/sedona/sql/UDF/Catalog.scala | Registers ST_GeomFromBox2D in the Spark SQL function catalog. |
| common/src/main/java/org/apache/sedona/common/Functions.java | Adds the common helper that formats Box2D as BOX(xmin ymin, xmax ymax). |
| common/src/main/java/org/apache/sedona/common/Constructors.java | Adds the common helper that converts a Box2D into a polygon geometry. |
    /** PostGIS-format text for a Box2D: {@code BOX(xmin ymin, xmax ymax)}. NULL on null input. */
    public static String asWKT(Box2D box) {
      if (box == null) {
        return null;
      }
      return "BOX("
          + box.getXMin()
          + " "
          + box.getYMin()
          + ", "
          + box.getXMax()
          + " "
          + box.getYMax()
          + ")";
Renamed in 8007abd to Functions.box2dAsText so the WKT serializer family stays a strict serializer. The Spark-side expression is still ST_AsText, which is a label rather than a strict WKT contract: it already mixes Geometry WKT and Geography text formats.
    Coordinate[] coords =
        new Coordinate[] {
          new Coordinate(xmin, ymin),
          new Coordinate(xmin, ymax),
          new Coordinate(xmax, ymax),
          new Coordinate(xmax, ymin),
          new Coordinate(xmin, ymin)
        };
    return GEOMETRY_FACTORY.createPolygon(coords);
Done in 8007abd. geomFromBox2D now delegates to polygonFromEnvelope, so there is one source of truth for the ring assembly.
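The delegation described in this reply can be sketched without JTS. This is a minimal illustration rather than Sedona's code: Box is a hypothetical stand-in for Box2D, coordinates are plain {x, y} pairs instead of JTS Coordinate objects, and the ring order is copied from the review excerpt above, not from the real polygonFromEnvelope.

```java
/** Illustrative sketch only; names mirror the PR but every type here is a stand-in. */
final class EnvelopeRingSketch {

  /** Hypothetical stand-in for Sedona's Box2D (assumption, not the real class). */
  static final class Box {
    final double xmin, ymin, xmax, ymax;

    Box(double xmin, double ymin, double xmax, double ymax) {
      this.xmin = xmin;
      this.ymin = ymin;
      this.xmax = xmax;
      this.ymax = ymax;
    }
  }

  /** The single place that assembles the closed ring, returned as {x, y} pairs. */
  static double[][] polygonFromEnvelope(double xmin, double ymin, double xmax, double ymax) {
    return new double[][] {
      {xmin, ymin}, {xmin, ymax}, {xmax, ymax}, {xmax, ymin},
      {xmin, ymin} // repeating the first vertex closes the ring
    };
  }

  /** Null-safe wrapper: unpacks the box and delegates, with no ring logic of its own. */
  static double[][] geomFromBox2D(Box box) {
    if (box == null) {
      return null;
    }
    return polygonFromEnvelope(box.xmin, box.ymin, box.xmax, box.ymax);
  }
}
```

Because the wrapper holds no ring logic, any later change to vertex order or closure only has to be made in polygonFromEnvelope.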
     * Convert a {@link Box2D} to a closed rectangular polygon. NULL on null input. Mirrors PostGIS
     * {@code box2d::geometry}.
     */
    public static Geometry geomFromBox2D(Box2D box) {
Added direct ConstructorsTest.geomFromBox2D in 8007abd covering happy path, degenerate box (collapsed to a point), and null input.
    }

    /** PostGIS-format text for a Box2D: {@code BOX(xmin ymin, xmax ymax)}. NULL on null input. */
    public static String asWKT(Box2D box) {
Added direct FunctionsTest.box2dAsText in 8007abd covering typical case, full-globe extent, and null input.
- Rename Functions.asWKT(Box2D) to Functions.box2dAsText. The output 'BOX(...)' is not WKT, so it shouldn't share the WKT serializer's name. The Spark expression remains ST_AsText (which has always been a label, not a strict WKT contract).
- Have Constructors.geomFromBox2D delegate to polygonFromEnvelope instead of duplicating the ring assembly. One source of truth for envelope-to-polygon construction.
- Add common-layer unit coverage in ConstructorsTest.geomFromBox2D (happy path, degenerate box, null) and FunctionsTest.box2dAsText (typical case, full-globe extent, null), so regressions are caught before the Spark integration suite runs.
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
     * Convert a {@link Box2D} to a closed rectangular polygon. NULL on null input. Mirrors PostGIS
     * {@code box2d::geometry}.
     */
    public static Geometry geomFromBox2D(Box2D box) {
      if (box == null) {
        return null;
      }
      return polygonFromEnvelope(box.getXMin(), box.getYMin(), box.getXMax(), box.getYMax());
Fixed in c8eecaf. ST_GeomFromBox2D now dispatches on dimensionality: POINT for 0-D (xmin==xmax && ymin==ymax), LINESTRING for 1-D (one axis collapsed), POLYGON otherwise. ST_GeomFromBox2D(ST_Box2D(geom)) now matches ST_Envelope(geom) for points, axis-aligned lines, and 2-D geometries. Added unit coverage in ConstructorsTest and SQL coverage in constructorTestScala for all three branches.
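The three-way dispatch in this reply can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, and the method returns hand-built WKT-like strings so the example needs no JTS dependency. The real function returns a Geometry, and the number formatting of a real WKT writer may differ from Java's default double formatting used here.

```java
/** Illustrative sketch of dimensionality dispatch; not Sedona's implementation. */
final class Box2DDispatchSketch {

  /** Picks POINT, LINESTRING, or POLYGON depending on which axes are collapsed. */
  static String toWkt(double xmin, double ymin, double xmax, double ymax) {
    boolean xCollapsed = xmin == xmax;
    boolean yCollapsed = ymin == ymax;

    if (xCollapsed && yCollapsed) {
      // 0-D: both axes collapsed, so the box is a single point.
      return "POINT (" + xmin + " " + ymin + ")";
    }
    if (xCollapsed || yCollapsed) {
      // 1-D: exactly one axis collapsed, so the box is an axis-aligned segment.
      return "LINESTRING (" + xmin + " " + ymin + ", " + xmax + " " + ymax + ")";
    }
    // 2-D: closed rectangular ring, same vertex order as the earlier excerpt.
    return "POLYGON (("
        + xmin + " " + ymin + ", "
        + xmin + " " + ymax + ", "
        + xmax + " " + ymax + ", "
        + xmax + " " + ymin + ", "
        + xmin + " " + ymin + "))";
  }
}
```

The exact equality checks are deliberate in this sketch: a box counts as collapsed only when the stored bounds are identical, which is how the reply above states the 0-D condition.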
    return "BOX("
        + box.getXMin()
        + " "
        + box.getYMin()
        + ", "
        + box.getXMax()
        + " "
        + box.getYMax()
        + ")";
Keeping the as-stored emit semantics, but tightened the Javadoc in c8eecaf to make the contract explicit. Reasoning: the rest of the Box2D API (ST_XMin/XMax/YMin/YMax) returns stored values, not normalized values; if box2dAsText normalized, text output would diverge from accessor output and a round-trip via text would be lossy. The swapped-corner ordering is reserved for future antimeridian semantics on geography bboxes (cf. sedona-db WraparoundInterval) and was decided in #2883. Inputs that arrive swapped today are user error rather than a contract violation, and faithful text output is the right debug aid.
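To make the as-stored contract concrete, here is a small sketch of a formatter that never reorders its inputs. It assumes a hypothetical Box class in place of Sedona's Box2D and is not the project's code; the method body follows the concatenation shown in the excerpt above.

```java
/** Illustrative sketch of as-stored text output; Box is a stand-in for Box2D. */
final class Box2DTextSketch {

  /** Hypothetical value holder; stores the four bounds exactly as given. */
  static final class Box {
    final double xmin, ymin, xmax, ymax;

    Box(double xmin, double ymin, double xmax, double ymax) {
      this.xmin = xmin;
      this.ymin = ymin;
      this.xmax = xmax;
      this.ymax = ymax;
    }
  }

  /** Emits BOX(xmin ymin, xmax ymax) from the stored values; null in, null out. */
  static String box2dAsText(Box box) {
    if (box == null) {
      return null;
    }
    // No min/max normalization: swapped corners are printed as they were stored.
    return "BOX(" + box.xmin + " " + box.ymin + ", " + box.xmax + " " + box.ymax + ")";
  }
}
```

With this sketch, a box stored as (1, 2, 4, 5) prints as BOX(1.0 2.0, 4.0 5.0), and a box stored with xmin = 170 and xmax = -170 prints those bounds in that order, so the text agrees with what accessor-style reads of the same fields would return.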
- ST_GeomFromBox2D now dispatches on dimensionality, matching PostGIS
box2d::geometry and Sedona's own ST_Envelope(geom):
- POINT for 0-D boxes (xmin == xmax && ymin == ymax)
- LINESTRING for 1-D boxes (one axis collapsed)
- POLYGON otherwise.
Adds direct unit coverage in ConstructorsTest plus SQL coverage in
constructorTestScala for all three branches.
- box2dAsText: keep the as-stored emit semantics (consistent with
ST_XMin/XMax/etc. which also return stored values, not normalized
values), but tighten the Javadoc to make that contract explicit and
flag the antimeridian-reservation rationale.
Mirrors the Phase 1 SQL surface added in apache#2890, apache#2895, apache#2897, apache#2898, apache#2899 in PySpark wrappers:
- ST_Box2D in st_functions
- ST_MakeBox2D and ST_GeomFromBox2D in st_constructors
- ST_Extent in st_aggregates

Accessor overloads (ST_XMin/XMax/YMin/YMax) and ST_AsText already worked with Box2D inputs through their existing wrappers; SQL overload resolution happens on the JVM side. The Python Box2DType UDT and Box2D value class were merged in apache#2878, so collected results materialize as Box2D Python objects with xmin/ymin/xmax/ymax attributes. Closes apache#2887.
Did you read the Contributor Guide?
Is this PR related to a ticket?
[GH-XXX] my subject. Closes #2885 (Add CAST(box2d AS geometry) and ST_AsText(box2d)).
What changes were proposed in this PR?
Two related conversions on Box2D:

1. ST_GeomFromBox2D(box2d) -> Polygon

Converts a Box2D to a closed rectangular polygon. Equivalent to PostGIS box2d::geometry.

Why a function instead of an implicit cast. PostGIS uses box2d::geometry cast syntax. Spark Catalyst supports UDT-to-UDT implicit casts, but registering one requires extending Catalyst's Cast resolution rule, which is a real piece of work and orthogonal to the bbox feature itself. Sedona's existing ST_GeomFrom* family (ST_GeomFromWKT, ST_GeomFromGeoJSON, etc.) is the natural home for an explicit constructor. NULL input → NULL output.

2. ST_AsText(box2d) -> 'BOX(xmin ymin, xmax ymax)'

Overloads the existing ST_AsText (which already handles Geometry and Geography) to accept a Box2D and return PostGIS-format text of the form BOX(xmin ymin, xmax ymax). NULL on null input.
How was this patch tested?
- constructorTestScala: ST_MakeBox2D → ST_GeomFromBox2D → ST_AsText, and NULL propagation.
- functionTestScala: BOX(1.0 2.0, 4.0 5.0), and NULL propagation.

Did this PR include necessary documentation updates?