fix: allow writing to 0-dimensional arrays with sharding#3966
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3966 +/- ##
==========================================
+ Coverage 93.28% 93.29% +0.01%
==========================================
Files 87 87
Lines 11745 11752 +7
==========================================
+ Hits 10956 10964 +8
+ Misses 789 788 -1
🚀 New features to boost your workflow:
|
chuckwondo
left a comment
There was a problem hiding this comment.
Thanks @NIK-TIGER-BILL!
Would you mind also adding a test that covers your patch that handles the 0D case? The test you added shows that the original case no longer errors, but that case no longer enters the function that you patched, so your patch itself remains untested, hence the failed codecov run.
|
@chuckwondo Good catch — the integration test bypassed the patched function. I just pushed a direct unit test that exercises _ShardIndex.get_chunk_slices_vectorized with a 0-D offsets_and_lengths array, covering both the written and unwritten chunk cases. This should satisfy codecov for the new branch. Let me know if you would like anything else adjusted! |
|
Thanks @NIK-TIGER-BILL. Would you mind updating your branch and addressing the ruff pre-commit error? (See https://results.pre-commit.ci/run/github/48049137/1778641889.s58oPP4KQbGdn7pJdtUPAg) @maxrjones and/or @d-v-b, please review. |
Signed-off-by: NIK-TIGER-BILL <nik.tiger.bill@github.com>
Signed-off-by: NIK-TIGER-BILL <nik.tiger.bill@github.com>
Signed-off-by: NIK-TIGER-BILL <nik.tiger.bill@github.com>
411cf37 to
bb315a8
Compare
|
@chuckwondo Done — rebased on latest upstream/main and fixed the ruff unused-variable error (prefixed |
|
@chuckwondo I believe I've addressed all feedback — rebased on latest upstream/main and fixed the ruff unused-variable issue (prefixed with ). Could you please take another look when you have a moment? Thanks! |
|
@d-v-b, this is a very small PR. Looks good to me, but would you mind reviewing as well? |
chuckwondo
left a comment
There was a problem hiding this comment.
Thanks @NIK-TIGER-BILL!
|
this looks good! there's some baseline code smell in these classes but this fix is appropriately targeted. I'll open an issue with my follow-up recommendations. |
Closes #3751
This PR fixes a bug where writing to a 0-dimensional array with sharding enabled would crash with an
IndexErrorin the sharding codec's vectorized chunk slice lookup.The root cause was that
get_chunk_slices_vectorizedinzarr/codecs/sharding.pyassumedoffsets_and_lengthswas at least 2-dimensional, but for 0-dimensional arrays it has shape(2,)instead of(1, 2). This causedoffsets_and_lengths[:, 0]to fail with "too many indices for array".Fix: Add an early return in
get_chunk_slices_vectorizedto handle the 0-dimensional case by reshapingoffsets_and_lengthsto(1, 2)before slicing.Test: Added
test_sharding_zero_dimensionalintests/test_codecs/test_sharding.pyas a regression test.