feat: implement tensorstore cache bounding in CellMapDataLoader #59

Merged: rhoadesScholar merged 13 commits into main from cache_bound on Feb 24, 2026

@rhoadesScholar (Member)

No description provided.

Copilot AI (Contributor) left a comment

Pull request overview

This PR implements TensorStore cache bounding in CellMapDataLoader to prevent unbounded RAM growth in persistent worker processes during multi-epoch training. The feature allows users to limit the chunk cache size either via a constructor parameter or an environment variable, with automatic per-worker budget splitting.

Changes:

  • Added tensorstore_cache_bytes parameter to CellMapDataLoader.__init__() with environment variable fallback (CELLMAP_TENSORSTORE_CACHE_BYTES)
  • Implemented recursive context propagation to all CellMapImage instances across dataset types (single, multi, subset)
  • Added comprehensive test suite covering parameter storage, per-worker division, env var handling, multi-dataset traversal, warning on pre-opened arrays, and functional data loading
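The budget resolution described above (constructor parameter, environment variable fallback, per-worker split) can be sketched roughly as follows. This is an illustration only, not the PR's code: the function names `resolve_cache_bytes`, `per_worker_cache_bytes`, and `cache_pool_spec` are hypothetical, and the integer division, the treatment of zero workers, and the rejection of negative values are assumptions. The `cache_pool` / `total_bytes_limit` keys follow TensorStore's documented context spec.

```python
import os
from typing import Optional

# Environment variable name as given in the PR summary.
ENV_VAR = "CELLMAP_TENSORSTORE_CACHE_BYTES"


def resolve_cache_bytes(explicit: Optional[int] = None) -> Optional[int]:
    """Pick the total budget: explicit argument first, then the env var."""
    if explicit is not None:
        value = explicit
    else:
        raw = os.environ.get(ENV_VAR)
        if raw is None:
            return None  # no bound requested; cache stays unbounded
        value = int(raw)
    if value < 0:
        # Assumption: negative budgets are rejected.
        raise ValueError(f"cache size must be non-negative, got {value}")
    return value


def per_worker_cache_bytes(total: Optional[int], num_workers: int) -> Optional[int]:
    """Split the total budget evenly across workers (assumed integer division)."""
    if total is None:
        return None
    # Assumption: num_workers == 0 (main-process loading) gets the full budget.
    return total // max(1, num_workers)


def cache_pool_spec(limit_bytes: int) -> dict:
    """Build a context spec that could be passed to tensorstore.Context(...)."""
    return {"cache_pool": {"total_bytes_limit": limit_bytes}}
```

With these assumptions, a 1,000,000-byte budget and 4 workers gives each worker a 250,000-byte cache pool.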

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

File | Description
src/cellmap_data/dataloader.py | Added cache bounding logic with helper functions _set_tensorstore_context() and _apply_context_to_image() to recursively set TensorStore contexts on all images; integrated cache limit calculation and context assignment into loader initialization
tests/test_dataloader.py | Added 13 comprehensive test cases covering cache parameter storage, per-worker byte division, environment variable handling, multi-dataset traversal, pre-opened array warnings, and functional data loading verification
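The recursive propagation summarized above might look something like the sketch below. The bodies of `_set_tensorstore_context()` and `_apply_context_to_image()` are not shown in this conversation, so everything here is a stand-in: the `Fake*` classes, their attribute names (`sources`, `datasets`, `dataset`, `opened`), and the choice to still assign the context after warning are all assumptions made for illustration.

```python
import warnings
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class FakeImage:  # hypothetical stand-in for CellMapImage
    context: Optional[Any] = None
    opened: bool = False  # True if the underlying array was already opened


@dataclass
class FakeDataset:  # hypothetical single dataset holding named images
    sources: Dict[str, FakeImage] = field(default_factory=dict)


@dataclass
class FakeMultiDataset:  # hypothetical container of datasets
    datasets: List[Any] = field(default_factory=list)


@dataclass
class FakeSubset:  # hypothetical subset wrapper around another dataset
    dataset: Any = None


def apply_context_to_image(image: FakeImage, context: Any) -> None:
    """Assign the context; warn if the array was opened before the bound existed."""
    if image.opened:
        warnings.warn(
            "array already opened; the new cache limit may not apply to it",
            stacklevel=2,
        )
    image.context = context


def set_context_recursive(obj: Any, context: Any) -> int:
    """Walk subset, multi, and single datasets; return how many images were visited."""
    if isinstance(obj, FakeImage):
        apply_context_to_image(obj, context)
        return 1
    if isinstance(obj, FakeSubset):
        return set_context_recursive(obj.dataset, context)
    if isinstance(obj, FakeMultiDataset):
        return sum(set_context_recursive(d, context) for d in obj.datasets)
    if isinstance(obj, FakeDataset):
        return sum(set_context_recursive(i, context) for i in obj.sources.values())
    return 0  # unknown object types are ignored in this sketch
```

Returning the visit count is a convenience of this sketch that makes the traversal easy to check in a test; the real helpers may return nothing.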

rhoadesScholar and others added 4 commits February 23, 2026 14:38
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@rhoadesScholar (Member, Author)

@copilot open a new pull request to apply changes based on the comments in this thread

Copilot AI (Contributor) commented Feb 23, 2026

@rhoadesScholar I've opened a new pull request, #60, to work on those changes. Once the pull request is ready, I'll request review from you.

codecov bot commented Feb 23, 2026

Codecov Report

❌ Patch coverage is 93.22034% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.38%. Comparing base (034f1f2) to head (99b745b).
⚠️ Report is 14 commits behind head on main.

Files with missing lines | Patch % | Lines
src/cellmap_data/dataloader.py | 95.00% | 2 Missing ⚠️
src/cellmap_data/dataset.py | 89.47% | 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #59      +/-   ##
==========================================
+ Coverage   69.33%   70.38%   +1.04%     
==========================================
  Files          28       28              
  Lines        2576     2630      +54     
==========================================
+ Hits         1786     1851      +65     
+ Misses        790      779      -11     

Add validation and test coverage for tensorstore_cache_bytes edge cases

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

@rhoadesScholar merged commit 183268e into main on Feb 24, 2026 (14 checks passed)
@rhoadesScholar deleted the cache_bound branch on February 24, 2026 00:54