Reduce time cost of tests #1116

Open
LucaMarconato wants to merge 9 commits into main from faster-tests

@LucaMarconato
Member

This PR reduces the time required to run the tests. It complements #1113, which reduces the number of emitted warnings (@melonora observed that the warnings were impacting the runtime).

LucaMarconato and others added 4 commits May 5, 2026 17:01
Cache `blobs(256, 300, 3)` and `BlobsDataset()._labels_blobs()` once
per session via private session-scoped fixtures, then deepcopy into
each function-scoped fixture. Cuts fixture setup from 44.8s to 35.0s
(-9.8s) and total suite from 186s to 180s. All 1332 tests still pass.

Benchmark CSVs committed for reference (pytest_*.csv).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
to_circles() on labels scales linearly with pixel count.
Shrinking the test label array from 512×512 to 128×128 (16× fewer pixels) cuts
test_labels_2d_to_circles from ~3.9s to ~1.0s per parametrized variant
(−4.7s across the file). Updated the hardcoded coordinate/radius assertions to
match the new size.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@codecov

codecov Bot commented May 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.83%. Comparing base (8d7f72f) to head (89a2202).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1116      +/-   ##
==========================================
- Coverage   91.93%   91.83%   -0.11%     
==========================================
  Files          51       51              
  Lines        7772     7750      -22     
==========================================
- Hits         7145     7117      -28     
- Misses        627      633       +6     
| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/spatialdata/_core/_elements.py | 93.15% <100.00%> (+0.92%) | ⬆️ |
| src/spatialdata/_core/centroids.py | 100.00% <100.00%> (ø) | |
| src/spatialdata/models/models.py | 87.80% <100.00%> (-0.87%) | ⬇️ |

... and 4 files with indirect coverage changes

… deepcopy

Two orthogonal wins:

1. _elements.py: `get_model()` already calls `schema.validate()` internally;
   the explicit second validate() + get_axes_names() call in every __setitem__
   was redundant. Removing it halves the DataTree (_to_dataset_view) overhead
   per element insertion — directly speeds up fixture setup.

2. conftest.py: introduce `_fast_deepcopy_sdata` (copy.deepcopy + manual attrs
   restoration for DaskDataFrame/#503 and GeoDataFrame/#286) that is ~13x faster
   than sd_deepcopy (7ms vs 93ms for full_sdata). Session-scope full_sdata,
   images, labels and the 'full' sdata parametrized fixture; each test gets
   a fresh 7ms copy instead of an 87ms full reconstruction. Also switch
   sdata_blobs from sd_deepcopy to _fast_deepcopy_sdata (2ms vs 25ms).
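
As an illustration of point 2, here is a rough sketch of "copy.deepcopy plus manual attrs restoration". `FrameLike` and `fast_deepcopy_elements` are hypothetical names. `FrameLike` only simulates a table type whose deepcopy loses `.attrs`; it is not how DaskDataFrame or GeoDataFrame behave internally, and this is not the PR's `_fast_deepcopy_sdata`.

```python
import copy


class FrameLike:
    """Hypothetical stand-in for a table type whose deepcopy drops .attrs."""

    def __init__(self, rows, attrs=None):
        self.rows = rows
        self.attrs = dict(attrs or {})

    def __deepcopy__(self, memo):
        # Simulates the problematic behaviour: data is copied, attrs are lost.
        return FrameLike(copy.deepcopy(self.rows, memo))


def fast_deepcopy_elements(elements):
    """Deep-copy a dict of elements, then restore attrs that the copy dropped."""
    copied = copy.deepcopy(elements)
    for name, original in elements.items():
        if isinstance(original, FrameLike):
            # Copy the attrs too, so the result shares no state with the source.
            copied[name].attrs = copy.deepcopy(original.attrs)
    return copied
```

The restoration step is what keeps the plain deepcopy safe to hand to tests: without it, metadata stored in attrs (for example transformations) would silently be missing from the copy.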

Full suite: 186s → 163s (~12% reduction, ~23s saved).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@LucaMarconato
Member Author

This PR also adjusts the CI test matrix (modified in #1114) to keep only the lower-bound and upper-bound Python versions, which is the philosophy used in integration-testing.

Since the per-commit CI tests 3.11 and 3.14, and integration testing runs 3.12 every night, the 3.12 case is still covered while the per-commit CI stays leaner.

…ion comment

- get_model() now accepts validate=False to skip schema.validate() when the
  caller only needs to infer the element type without re-running validation
- add comment to Elements.__setitem__ noting that subclass overrides call
  get_model() which performs the validation
- drop Python 3.13 CI matrix entries (superseded by 3.14)
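
The `validate=False` idea can be sketched as follows. `ImageSchema`, `SCHEMAS` and the dict-based element are hypothetical stand-ins, and this `get_model` is a simplified illustration rather than the function in spatialdata.models; only the `validate` keyword is taken from the commit message.

```python
class ImageSchema:
    """Hypothetical schema; counts validate() calls so the saving is visible."""

    calls = 0

    @classmethod
    def matches(cls, element):
        return isinstance(element, dict) and "dims" in element

    @classmethod
    def validate(cls, element):
        cls.calls += 1
        if tuple(element["dims"]) not in {("c", "y", "x"), ("y", "x")}:
            raise ValueError(f"unsupported dims: {element['dims']}")


SCHEMAS = (ImageSchema,)


def get_model(element, validate=True):
    """Return the schema matching element; optionally skip the validation pass."""
    for schema in SCHEMAS:
        if schema.matches(element):
            if validate:
                schema.validate(element)
            return schema
    raise TypeError("element does not match any known schema")
```

With the default `validate=True`, a caller that also validates explicitly pays for validation twice, which is the redundancy the earlier commit removed from `__setitem__`.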

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
LucaMarconato marked this pull request as ready for review May 5, 2026 17:07
@LucaMarconato
Member Author

So far the tests should be running ~12% faster than on main. We could merge already, or optimize further.

LucaMarconato and others added 3 commits May 5, 2026 19:20
… elements

Add skip_element_validation() context manager (backed by a ContextVar) that
makes __setitem__ call get_model(validate=False) — type inference only, no
schema.validate().  Use it in every code path that constructs a SpatialData
from elements that originated from an existing SpatialData and were never
externally mutated: bounding_box_query, polygon_query, query_by_coordinate_system,
transform_to_coordinate_system, subset, and init_from_elements.

test_query_spatial_data: 0.77s → 0.64s (the remaining time is the query work
itself — filtering, shapely ops, raster cropping).

Also inline a minimal 2-image SpatialData in
test_transformations_between_coordinate_systems instead of relying on the
full 8-element images fixture; the test only ever uses image2d and
image2d_multiscale, so writing the other 6 to disk was pure waste.
test_transformations_between_coordinate_systems: 0.61s → 0.44s.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace per-slice O(H+W) approach (512 dask compute() calls for a 256×256
array) with a single array materialization + np.bincount O(n_pixels) pass.

This speeds up get_centroids() on labels from ~1.5s to ~50ms, cutting
to_circles(labels) from ~1.6s to ~53ms. Affects test_validation dataloader
variants (~2.5s → ~0.2s each, saving ~9s), test_labels_2d_to_circles, and
any production call to get_centroids or to_circles on label arrays.
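
The single-pass idea can be sketched with NumPy as below. `label_centroids` is a hypothetical helper, not the code in centroids.py. It assumes a 2D, non-negative integer label array that is already in memory, treats 0 as background, and reports centroids in plain index coordinates; the real function may use a different convention.

```python
import numpy as np


def label_centroids(labels):
    """Return {label: (y, x)} centroids for a 2D integer label array, skipping 0."""
    labels = np.asarray(labels)
    # bincount needs non-negative platform integers.
    flat = labels.ravel().astype(np.intp, copy=False)
    # Row and column index of every pixel, flattened in the same order as `flat`.
    yy, xx = np.indices(labels.shape)
    counts = np.bincount(flat)
    sum_y = np.bincount(flat, weights=yy.ravel())
    sum_x = np.bincount(flat, weights=xx.ravel())
    present = np.nonzero(counts)[0]
    present = present[present != 0]  # treat 0 as background
    return {
        int(k): (float(sum_y[k] / counts[k]), float(sum_x[k] / counts[k]))
        for k in present
    }
```

Each `np.bincount` call is one pass over the pixels, so the total work is proportional to the number of pixels and does not involve one compute call per row or column.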

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>