feat: implement tensorstore cache bounding in CellMapDataLoader #59
rhoadesScholar merged 13 commits into main
Pull request overview
This PR implements TensorStore cache bounding in CellMapDataLoader to prevent unbounded RAM growth in persistent worker processes during multi-epoch training. The feature allows users to limit the chunk cache size either via a constructor parameter or an environment variable, with automatic per-worker budget splitting.
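The budget resolution described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the helper names `resolve_cache_bytes`, `per_worker_cache_bytes`, and `cache_context_spec` are hypothetical, and the even split across workers (with `num_workers=0` treated as a single process) is an assumption. The returned dictionary follows TensorStore's `cache_pool` / `total_bytes_limit` context spec and could be passed to `tensorstore.Context(...)`.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "CELLMAP_TENSORSTORE_CACHE_BYTES"


def resolve_cache_bytes(
    tensorstore_cache_bytes: Optional[int] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[int]:
    """Return the total cache budget in bytes, or None for no bound.

    The explicit argument wins; the environment variable is the fallback.
    """
    if env is None:
        env = os.environ
    if tensorstore_cache_bytes is None:
        raw = env.get(ENV_VAR)
        if raw is None:
            return None  # neither source set: leave the cache unbounded
        tensorstore_cache_bytes = int(raw)
    if tensorstore_cache_bytes < 0:
        raise ValueError("tensorstore_cache_bytes must be non-negative")
    return tensorstore_cache_bytes


def per_worker_cache_bytes(total_bytes: int, num_workers: int) -> int:
    """Split the total budget evenly across workers (assumed policy)."""
    return total_bytes // max(1, num_workers)


def cache_context_spec(limit_bytes: int) -> dict:
    """Build a TensorStore context spec that bounds the chunk cache pool."""
    return {"cache_pool": {"total_bytes_limit": limit_bytes}}


if __name__ == "__main__":
    total = resolve_cache_bytes(1_000_000_000)
    print(cache_context_spec(per_worker_cache_bytes(total, num_workers=4)))
```

Splitting the budget per worker matters because each persistent worker process holds its own cache pool, so a single shared number would otherwise be multiplied by the worker count.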
Changes:
- Added `tensorstore_cache_bytes` parameter to `CellMapDataLoader.__init__()` with environment variable fallback (`CELLMAP_TENSORSTORE_CACHE_BYTES`)
- Implemented recursive context propagation to all `CellMapImage` instances across dataset types (single, multi, subset)
- Added comprehensive test suite covering parameter storage, per-worker division, env var handling, multi-dataset traversal, warning on pre-opened arrays, and functional data loading
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| `src/cellmap_data/dataloader.py` | Added cache bounding logic with helper functions `_set_tensorstore_context()` and `_apply_context_to_image()` to recursively set TensorStore contexts on all images; integrated cache limit calculation and context assignment into loader initialization |
| `tests/test_dataloader.py` | Added 13 comprehensive test cases covering cache parameter storage, per-worker byte division, environment variable handling, multi-dataset traversal, pre-opened array warnings, and functional data loading verification |
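The recursive propagation summarized in the table can be illustrated with a small stand-alone sketch. Everything here is hypothetical: the `Image`, `SingleDataset`, `MultiDataset`, and `SubsetDataset` classes and their attribute names are stand-ins rather than the real `cellmap_data` types, and the choice to warn but still assign the context on an already-opened image is an assumption about how the pre-opened case might be handled.

```python
import warnings
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Image:
    """Stand-in for an image object that lazily opens a TensorStore array."""
    name: str
    context: Optional[dict] = None
    opened: bool = False  # True once the underlying array has been opened


@dataclass
class SingleDataset:
    images: List[Image] = field(default_factory=list)


@dataclass
class MultiDataset:
    datasets: List[Any] = field(default_factory=list)


@dataclass
class SubsetDataset:
    dataset: Any = None


def apply_context_to_image(image: Image, context: dict) -> None:
    """Attach the context; warn if the array was opened before the bound."""
    if image.opened:
        warnings.warn(
            f"{image.name} is already open; the new cache limit may not apply",
            UserWarning,
        )
    image.context = context


def set_context(dataset: Any, context: dict) -> int:
    """Recursively apply the context and return the number of images visited."""
    if isinstance(dataset, MultiDataset):
        return sum(set_context(d, context) for d in dataset.datasets)
    if isinstance(dataset, SubsetDataset):
        return set_context(dataset.dataset, context)
    for image in dataset.images:
        apply_context_to_image(image, context)
    return len(dataset.images)
```

Sharing one context object across every image in a worker is what makes the limit a single pooled budget rather than a per-image allowance.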
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@copilot open a new pull request to apply changes based on the comments in this thread

@rhoadesScholar I've opened a new pull request, #60, to work on those changes. Once the pull request is ready, I'll request review from you.
…alues Co-authored-by: rhoadesScholar <37990507+rhoadesScholar@users.noreply.github.com>
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #59      +/-   ##
==========================================
+ Coverage   69.33%   70.38%   +1.04%
==========================================
  Files          28       28
  Lines        2576     2630      +54
==========================================
+ Hits         1786     1851      +65
+ Misses        790      779      -11
Add validation and test coverage for tensorstore_cache_bytes edge cases
No description provided.