Reduce time cost of tests #1116
Open
LucaMarconato wants to merge 9 commits into main
Cache `blobs(256, 300, 3)` and `BlobsDataset()._labels_blobs()` once per session via private session-scoped fixtures, then deepcopy into each function-scoped fixture. Cuts fixture setup from 44.8s to 35.0s (-9.8s) and total suite from 186s to 180s. All 1332 tests still pass. Benchmark CSVs committed for reference (pytest_*.csv). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
to_circles() on labels scales linearly with pixel count. Dropping from 512×512 to 128×128 cuts test_labels_2d_to_circles from ~3.9s to ~1.0s per parametrized variant (−4.7s across the file). Updated hardcoded coordinate/radius assertions to match the new size. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #1116 +/- ##
==========================================
- Coverage 91.93% 91.83% -0.11%
==========================================
Files 51 51
Lines 7772 7750 -22
==========================================
- Hits 7145 7117 -28
- Misses 627 633 +6
… deepcopy

Two orthogonal wins:

1. _elements.py: `get_model()` already calls `schema.validate()` internally, so the explicit second validate() + get_axes_names() call in every __setitem__ was redundant. Removing it halves the DataTree (_to_dataset_view) overhead per element insertion, which directly speeds up fixture setup.
2. conftest.py: introduce `_fast_deepcopy_sdata` (copy.deepcopy + manual attrs restoration for DaskDataFrame/#503 and GeoDataFrame/#286) that is ~13x faster than sd_deepcopy (7ms vs 93ms for full_sdata). Session-scope full_sdata, images, labels and the 'full' sdata parametrized fixture; each test gets a fresh 7ms copy instead of an 87ms full reconstruction. Also switch sdata_blobs from sd_deepcopy to fast_deepcopy_sdata (2ms vs 25ms).

Full suite: 186s → 163s (~12% reduction, ~23s saved).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
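The "copy.deepcopy + manual attrs restoration" idea from the second point can be illustrated with a toy example. Everything here is an assumption made for illustration: `Frame` is a hypothetical class that loses its `attrs` on deepcopy (imitating the behaviour the commit works around for the two dataframe types), and `fast_deepcopy` is not the project's `_fast_deepcopy_sdata`.

```python
import copy


class Frame:
    """Hypothetical container whose deepcopy drops `attrs`."""

    def __init__(self, data, attrs=None):
        self.data = data
        self.attrs = attrs if attrs is not None else {}

    def __deepcopy__(self, memo):
        # Deliberately loses attrs to simulate the problem being worked around.
        return Frame(copy.deepcopy(self.data, memo))


def fast_deepcopy(elements):
    """Deep-copy a dict of elements, then restore the attrs the copy dropped."""
    copied = copy.deepcopy(elements)
    for name, original in elements.items():
        if isinstance(original, Frame):
            # Copy attrs too, so the result shares no state with the source.
            copied[name].attrs = copy.deepcopy(original.attrs)
    return copied


source = {"points": Frame([1, 2, 3], {"transform": {"global": "identity"}})}
result = fast_deepcopy(source)
print(result["points"].attrs)
```

Restoring the metadata by hand keeps the speed of a plain `copy.deepcopy` while still yielding an element that carries the same attributes as the original.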
This PR also adjusts the CI tests (modified in #1114) to keep only the lower-bound and upper-bound Python versions, which is the philosophy used in integration testing. Since we test 3.11 and 3.14 here, and integration testing tests 3.12 every night, the 3.12 case is still covered while the per-commit CI stays leaner.
…ion comment

- get_model() now accepts validate=False to skip schema.validate() when the caller only needs to infer the element type without re-running validation
- add comment to Elements.__setitem__ noting that subclass overrides call get_model() which performs the validation
- drop Python 3.13 CI matrix entries (superseded by 3.14)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
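A toy analogue of the `validate=False` switch, to show why it helps when only the element type is needed. `ToySchema` and `infer_schema` are hypothetical names; the real `get_model()` dispatches over several schemas and its exact signature is not reproduced here.

```python
class ToySchema:
    """Hypothetical schema whose validate() stands in for the expensive step."""

    calls = 0  # counts how often validation actually ran

    @classmethod
    def validate(cls, element):
        cls.calls += 1
        if "dims" not in element:
            raise ValueError("element has no 'dims' entry")


def infer_schema(element, validate=True):
    """Return the schema for `element`, optionally skipping validation."""
    schema = ToySchema  # real code would choose a schema based on the element type
    if validate:
        schema.validate(element)
    return schema


element = {"dims": ("y", "x")}
infer_schema(element, validate=False)  # type inference only
infer_schema(element)                  # inference plus validation
print(ToySchema.calls)
```

Keeping `validate=True` as the default means existing callers are unaffected, and only call sites that knowingly handle already-validated elements opt out.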
So far the tests should be running ~12% faster than in |
… elements

Add skip_element_validation() context manager (backed by a ContextVar) that makes __setitem__ call get_model(validate=False): type inference only, no schema.validate(). Use it in every code path that constructs a SpatialData from elements that originated from an existing SpatialData and were never externally mutated: bounding_box_query, polygon_query, query_by_coordinate_system, transform_to_coordinate_system, subset, and init_from_elements.

test_query_spatial_data: 0.77s → 0.64s (the remaining time is the query work itself: filtering, shapely ops, raster cropping).

Also inline a minimal 2-image SpatialData in test_transformations_between_coordinate_systems instead of relying on the full 8-element images fixture; the test only ever uses image2d and image2d_multiscale, so writing the other 6 to disk was pure waste. test_transformations_between_coordinate_systems: 0.61s → 0.44s.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…dy-valid elements" This reverts commit acb25b1.
Replace per-slice O(H+W) approach (512 dask compute() calls for a 256×256 array) with a single array materialization + np.bincount O(n_pixels) pass. This speeds up get_centroids() on labels from ~1.5s to ~50ms, cutting to_circles(labels) from ~1.6s to ~53ms. Affects test_validation dataloader variants (~2.5s → ~0.2s each, saving ~9s), test_labels_2d_to_circles, and any production call to get_centroids or to_circles on label arrays. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
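The single-pass np.bincount idea can be sketched on a plain NumPy array as below. This is an illustrative version under stated assumptions: `label_centroids` is a hypothetical helper, label 0 is treated as background, centroids are reported in pixel-index coordinates, and the array is assumed to be already materialized in memory (for a dask array, that would be the one compute call mentioned above).

```python
import numpy as np


def label_centroids(labels):
    """Return {label: (row, col)} centroids of a 2D integer label image.

    Label 0 is treated as background and skipped (an assumption of this sketch).
    """
    labels = np.asarray(labels)
    rows, cols = np.indices(labels.shape)
    flat = labels.ravel()

    # One pass each: pixel count and coordinate sums per label value.
    counts = np.bincount(flat)
    row_sums = np.bincount(flat, weights=rows.ravel())
    col_sums = np.bincount(flat, weights=cols.ravel())

    # Keep only labels that occur, excluding the background.
    present = np.nonzero(counts)[0]
    present = present[present != 0]
    return {
        int(i): (row_sums[i] / counts[i], col_sums[i] / counts[i])
        for i in present
    }


image = np.array(
    [
        [1, 1, 0],
        [0, 2, 2],
        [0, 2, 2],
    ]
)
print(label_centroids(image))
```

Each `np.bincount` call visits every pixel once, so the total work grows with the number of pixels instead of with the number of separate per-slice computations.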
This PR reduces the time required to run the tests. It complements #1113, which reduces the number of emitted warnings (@melonora observed that this was impacting the runtime).