Add major world cities as bounding-box templates in from_template #3534

Merged: brendancol merged 7 commits into main from issue-3533 on Jun 26, 2026

@brendancol brendancol commented Jun 26, 2026

Closes #3533

Adds world cities to from_template, following the existing nyc pattern.

  • 558 cities, generated from Natural Earth populated places and committed as static _CITIES data next to the country bboxes. No runtime network or geopandas dependency. Selection = every national capital + metros with POP_MAX >= 1.2M + a curated set of recognizable US secondary cities (Austin, New Orleans, Las Vegas, Portland, ...), since Natural Earth's POP_MAX/SCALERANK underrate US metros.
  • Each city resolves to its UTM zone (EPSG:326xx north / 327xx south), picked from the centroid — a standard EPSG code, never a synthesized projection. Bounding box is a metro-scale lon/lat box (half-width scales with population) projected into that zone.
  • Wired into _resolve between the curated regions and the country codes, so preserve='area'/'shape' work exactly as they do for nyc. No backend-dispatch changes.
  • Name collisions: the larger-population city keeps the bare slug, the others take an _<iso2> suffix (e.g. hyderabad vs hyderabad_pk, birmingham for the UK city vs birmingham_us).

```python
from xrspatial import from_template
from_template("tokyo").attrs["crs"]        # 32654
from_template("austin").attrs["crs"]       # 32614
from_template("new_orleans").attrs["crs"]  # 32615
```
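
For readers unfamiliar with the UTM rule described above, here is a minimal sketch of how a centroid maps to an EPSG:326xx / 327xx code. `utm_epsg` is a hypothetical helper, not the function used in this PR, and the centroid coordinates are approximate values chosen for illustration. It applies the plain 6-degree zone formula and ignores the Norway and Svalbard zone exceptions.

```python
import math


def utm_epsg(lon: float, lat: float) -> int:
    """Return the WGS 84 / UTM EPSG code for a lon/lat point.

    Hypothetical helper for illustration only.
    """
    # Zones are 6 degrees wide and numbered 1..60 eastward from 180W.
    zone = int(math.floor((lon + 180.0) / 6.0)) + 1
    # Clamp so that lon == 180 lands in zone 60 rather than 61.
    zone = min(max(zone, 1), 60)
    # 326xx is the northern hemisphere, 327xx the southern.
    base = 32600 if lat >= 0 else 32700
    return base + zone


# Approximate city centroids (assumed values, not taken from _CITIES).
print(utm_epsg(139.69, 35.68))   # Tokyo       -> 32654
print(utm_epsg(-97.74, 30.27))   # Austin      -> 32614
print(utm_epsg(-90.07, 29.95))   # New Orleans -> 32615
print(utm_epsg(-46.63, -23.55))  # Sao Paulo   -> 32723
```

The first three results agree with the CRS values shown in the example above.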

Backend coverage: numpy / cupy / dask+numpy / dask+cupy — cities reuse the same _make_data paths as every other template (verified locally on all four).
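
The name-collision rule from the description can be illustrated with a small sketch. `assign_slugs` is a hypothetical function, and the population figures below are placeholders, not Natural Earth values. The generator script used for this PR may differ in detail, for example in how it normalizes names.

```python
from collections import defaultdict


def assign_slugs(cities):
    """Map (name, iso2, population) tuples to unique slugs.

    Hypothetical sketch: the most populous city keeps the bare slug,
    every other city with the same name gets an _<iso2> suffix.
    """
    by_base = defaultdict(list)
    for name, iso2, pop in cities:
        base = name.lower().replace(" ", "_")
        by_base[base].append((name, iso2, pop))

    slugs = {}
    for base, group in by_base.items():
        # Largest population first, so it claims the bare slug.
        group.sort(key=lambda c: c[2], reverse=True)
        slugs[base] = group[0]
        for city in group[1:]:
            slugs[f"{base}_{city[1].lower()}"] = city
    return slugs


# Placeholder populations, chosen only to show the ordering.
sample = [
    ("Hyderabad", "PK", 1_700_000),
    ("Hyderabad", "IN", 9_000_000),
    ("Birmingham", "US", 1_100_000),
    ("Birmingham", "GB", 2_400_000),
]
slugs = assign_slugs(sample)
print(sorted(slugs))
```

This sketch does not handle two same-named cities within one country; a real generator would need an extra rule for that case.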

Test plan

  • pytest xrspatial/tests/test_templates.py (56 passed)
  • Integrity loop over all cities (keys, UTM-range CRS, ordered lonlat/bounds, ascii lowercase keys)
  • Sample builds obey the array contract; pixel centers stay inside the bbox; southern-hemisphere build exercised
  • UTM spot-checks (london 32630, tokyo 32654, southern -> 327xx), case-insensitivity, collision disambiguation
  • Backend parity on dask+numpy / cupy / dask+cupy
  • flake8 + isort clean on touched files
  • Reconciled with main after #3532 ("from_template: use CF Conventions metadata instead of crs_units/crs_name"), which switched to CF grid-mapping metadata
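
As a rough illustration of the integrity loop listed above, the sketch below checks a registry for the same four properties. The entry layout (a dict with `crs`, `lonlat`, and `bounds` keys holding min/max tuples) is an assumption made for this example; the real `_CITIES` structure and the real test may look different. The sample numbers are made up.

```python
def check_registry(cities):
    """Return a list of problems found in a {slug: entry} registry.

    Assumed entry layout (hypothetical): {"crs": int,
    "lonlat": (xmin, ymin, xmax, ymax), "bounds": (xmin, ymin, xmax, ymax)}.
    """
    problems = []
    for key, entry in cities.items():
        # Keys must be ASCII and lowercase.
        if not (key.isascii() and key == key.lower()):
            problems.append(f"{key}: key is not ASCII lowercase")
        # CRS must be a WGS 84 / UTM code: 32601-32660 or 32701-32760.
        crs = entry["crs"]
        if not (32601 <= crs <= 32660 or 32701 <= crs <= 32760):
            problems.append(f"{key}: CRS {crs} is outside the UTM range")
        # Both boxes must be ordered min < max on each axis.
        for field in ("lonlat", "bounds"):
            xmin, ymin, xmax, ymax = entry[field]
            if not (xmin < xmax and ymin < ymax):
                problems.append(f"{key}: {field} is not ordered")
    return problems


# Made-up entries for illustration only.
good = {"tokyo": {"crs": 32654,
                  "lonlat": (139.4, 35.4, 140.0, 36.0),
                  "bounds": (350000.0, 3920000.0, 410000.0, 3990000.0)}}
bad = {"Bad_City": {"crs": 4326,
                    "lonlat": (10.0, 5.0, 9.0, 6.0),
                    "bounds": (0.0, 0.0, 1.0, 1.0)}}
print(check_registry(good))
print(check_registry(bad))
```

Collecting problems into a list instead of asserting on the first failure makes a bad regeneration easier to diagnose, since every broken entry is reported at once.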

519 cities (national capitals + metros >=1.2M pop) from Natural Earth,
each in its UTM zone (EPSG:326xx/327xx). Wired into _resolve between the
curated regions and country codes; preserve='area'/'shape' work as for nyc.
- 5 tests: registry integrity over all 519 cities, sample builds, UTM
  spot-checks, case-insensitivity, name-collision disambiguation.
- Reformat generated _CITIES entries to a hanging indent (flake8 max-line
  100) and apply isort to the touched imports.

@brendancol brendancol left a comment

Review: world-city bounding-box templates

Blockers

None.

Suggestions

  • Discoverability. The unknown-name error points to "the templates reference" (templates.py:69-70), but neither templates.rst nor any public API lists the 519 city names — _CITIES is private. Country codes have the same gap, except ISO-3166 is a discoverable standard and an ad-hoc city list isn't. A follow-up that exposes a names listing (or a reference note on how to find them) would help. Not blocking.

Nits

  • test_city_sample_builds (test_templates.py:187) builds only sorted(_CITIES)[:20], which is alphabetical and northern-heavy. A southern-hemisphere build is covered in practice because test_city_utm_spot_checks calls from_template("sao_paulo"), but adding one southern city to the sample loop would make that explicit.

What looks good

  • Cities slot into _resolve between regions and countries (templates.py:49-55) and reuse _make_data unchanged, so all four backends work with no dispatch edits (checked locally on dask+numpy / cupy / dask+cupy).
  • UTM-zone-from-centroid keeps every city on a standard EPSG code, and preserve='area'/'shape' fall through to the same nyc behavior for free.
  • Data is committed static (no runtime geopandas or network), and the _CITIES header documents the source, the UTM rule, the buffer sizing, and the collision rule.
  • test_city_registry_integrity checks the shape of all 519 entries, so a bad regeneration fails in CI instead of at call time.
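
The lookup order noted above (curated regions, then cities, then country codes) can be sketched as follows. The three toy registries and the `resolve` function are hypothetical stand-ins, not the contents of templates.py; the sketch shows only the precedence and the case-insensitive matching that the tests exercise.

```python
# Toy registries (hypothetical contents).
REGIONS = {"nyc": "curated region"}
CITIES = {"tokyo": "city", "austin": "city"}
COUNTRIES = {"jp": "country", "us": "country"}


def resolve(name):
    """Return (kind, entry) for a template name, checking registries in order."""
    key = name.strip().lower()  # lookups are case-insensitive
    for kind, registry in (
        ("region", REGIONS),
        ("city", CITIES),
        ("country", COUNTRIES),
    ):
        if key in registry:
            return kind, registry[key]
    raise ValueError(f"unknown template name: {name!r}")


print(resolve("NYC"))
print(resolve("Tokyo"))
print(resolve("jp"))
```

Because the first match wins, a name present in more than one registry would resolve to the earlier one, which is why the position of the city lookup matters.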

Checklist

  • CRS is a standard EPSG code (UTM), not synthesized
  • All four backends consistent
  • crs_units correct (metres for UTM cities)
  • Integrity covered for every entry
  • No dask/dispatch changes; reuses existing paths
  • No premature materialization (static dict)
  • No benchmark needed (allocation-only, matches existing from_template)
  • README + reference + notebook updated
  • Docstring updated with city example and collision rule

Add sao_paulo/sydney (327xx) to test_city_sample_builds so the southern
build is explicit, not only implied by the UTM spot-checks. Discoverability
suggestion deferred to follow-up #3535 (new public listing API needs its
own design).

main replaced crs_units/crs_name with CF grid-mapping attrs. Update the
city sample test to assert CF coord units (x.units=='m', standard_name)
and the notebook to print grid_mapping_name instead of crs_name.

@brendancol brendancol left a comment

Follow-up review

Changes since the first pass:

  • Nit fixed: test_city_sample_builds now includes sao_paulo and sydney, so the southern-hemisphere (327xx) build path is exercised directly.
  • Discoverability suggestion deferred to #3535 — a public listing API is a separate design decision (name, return shape, whether to include the 240 country codes).
  • Merged origin/main and reconciled with #3532, which replaced crs_units/crs_name with CF grid-mapping attributes. The city test now asserts CF coord units (x.attrs["units"] == "m", standard_name == "projection_x_coordinate") and the user-guide cell prints grid_mapping_name instead of crs_name.

No new blockers. Local: 56 passed in test_templates.py, flake8 + isort clean, backend parity holds on dask+numpy / cupy / dask+cupy.

Natural Earth's POP_MAX/SCALERANK underrate US metros (Las Vegas listed at
17k, scalerank 8), so a population cutoff misses Austin, New Orleans, Las
Vegas, Portland, etc. Add a curated US name list (top match per name) run
through the same UTM-zone + metro-buffer machinery. 519 -> 558 cities.

@brendancol brendancol merged commit 7509fe8 into main Jun 26, 2026
12 checks passed