
fix: apply drop_axes squeeze in partial decode path for sharding (#3691) #3763

Open
abishop1990 wants to merge 3 commits into zarr-developers:main from abishop1990:fix/sharded-mixed-indexing-3691


@abishop1990
Contributor

Summary

Fixes #3691.

Mixed integer/list indexing on sharded arrays (e.g. arr[0:10, 0, [0, 1]]) raised:

ValueError: could not broadcast input array from shape (10,1,2) into shape (10,2)

Root Cause

When OrthogonalIndexer processes advanced indexing (slices + integers + arrays), it applies ix_() to chunk_selection to set up orthogonal numpy indexing. Integer indices become 1-element ranges (size-1 dimensions) via ix_().

CodecPipeline.read_batch() has two paths:

  1. Non-partial decode (regular codecs): Applies chunk_array.squeeze(axis=drop_axes) to remove size-1 integer dims ✅
  2. Partial decode (ShardingCodec): Does not apply the squeeze, so the size-1 integer dims are kept

ShardingCodec._decode_partial_single() receives the ix_()-transformed chunk_selection, which looks like pure fancy indexing to get_indexer(), so it routes to CoordinateIndexer. The result is reshaped to the broadcast coordinate shape (10, 1, 2) instead of (10, 2).
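The shape mismatch can be reproduced with NumPy alone. The following is a minimal sketch, not the zarr code path itself: the array `chunk` and its shape are assumptions chosen for illustration, and `np.ix_` stands in for what the indexer does to the selection.

```python
import numpy as np

# Stand-in for a decoded chunk; the shape (10, 4, 3) is an arbitrary assumption.
chunk = np.arange(10 * 4 * 3).reshape(10, 4, 3)

# Orthogonal form of arr[0:10, 0, [0, 1]]: the integer index 0 is
# represented as a one-element array, so its axis survives with size 1.
selection = np.ix_(np.arange(10), np.array([0]), np.array([0, 1]))
picked = chunk[selection]
print(picked.shape)  # (10, 1, 2)

# The output buffer has the shape the caller expects, (10, 2).
out = np.empty((10, 2), dtype=chunk.dtype)
error_message = None
try:
    out[...] = picked  # the size-1 middle axis cannot be broadcast away
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```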

Fix

Apply drop_axes squeeze to chunk_array in the partial decode branch of read_batch(), matching the non-partial path behaviour:

if drop_axes != ():
    chunk_array = chunk_array.squeeze(axis=drop_axes)
out[out_selection] = chunk_array
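The effect of the squeeze can be checked in isolation with NumPy. This is a sketch under assumptions: `chunk`, its shape, and `drop_axes = (1,)` are illustrative values, not taken from the zarr source.

```python
import numpy as np

# Stand-in for a decoded chunk; the shape (10, 4, 3) is an arbitrary assumption.
chunk = np.arange(10 * 4 * 3).reshape(10, 4, 3)

# ix_()-style selection for arr[0:10, 0, [0, 1]]; axis 1 holds the integer index.
chunk_array = chunk[np.ix_(np.arange(10), [0], [0, 1])]  # shape (10, 1, 2)
drop_axes = (1,)  # assumed: the axis that was indexed with an integer

# Same guard and squeeze as in the snippet above.
if drop_axes != ():
    chunk_array = chunk_array.squeeze(axis=drop_axes)

out = np.empty((10, 2), dtype=chunk.dtype)
out[...] = chunk_array  # shapes now match, so the assignment succeeds

# Reference result built from plain slicing followed by a list index.
expected = chunk[0:10, 0, :][:, [0, 1]]
print(chunk_array.shape, np.array_equal(out, expected))
```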

Testing

Added test_sharding_mixed_integer_list_indexing that verifies:

  • Shape and data equality between chunked and sharded arrays for mixed indexing
  • Multiple integer axes (arr[0, 0, [0, 1, 2]])
  • Slice + integer + slice (arr[0:5, 1, 0:3])

tests/test_codecs/test_sharding.py  125 passed, 1 skipped
tests/test_indexing.py  149 passed, 1 skipped, 5 xfailed

    chunk_array_batch, batch_info, strict=False
):
    if chunk_array is not None:
        if drop_axes != ():
Contributor

are we sure that drop_axes is always a tuple? maybe it is, but if not, might be safer to check if its length is 0

Contributor Author

Good catch — switched to if drop_axes: (truthy check) throughout codec_pipeline.py. It's both more Pythonic and type-agnostic, even though drop_axes is annotated as tuple[int, ...] everywhere. Pushed in the latest commit.
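The difference between the two guards shows up only when drop_axes is an empty non-tuple sequence. A small pure-Python sketch follows; the helper names `needs_squeeze_strict` and `needs_squeeze_truthy` are hypothetical and exist only for this comparison.

```python
def needs_squeeze_strict(drop_axes):
    """Original guard: compares against the empty tuple specifically."""
    return drop_axes != ()


def needs_squeeze_truthy(drop_axes):
    """Revised guard: any empty sequence counts as 'nothing to drop'."""
    return bool(drop_axes)


# An empty list is not equal to an empty tuple, so the strict guard
# reports that a squeeze is needed even though there are no axes to drop.
for candidate in ((), [], (1,), [1]):
    print(
        repr(candidate),
        needs_squeeze_strict(candidate),
        needs_squeeze_truthy(candidate),
    )
```

Both guards agree for tuples, which is the annotated type, so the change matters only for callers that pass another sequence type.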

When reading sharded arrays with mixed integer/list indexing (e.g.
arr[0:10, 0, [0, 1]]), the outer OrthogonalIndexer produces chunk
selections that have been ix_()-transformed for orthogonal advanced
indexing. Integer indices become single-element ranges (size-1 dims)
via ix_() to enable NumPy orthogonal indexing.

In CodecPipeline.read_batch(), the non-partial path correctly applies
drop_axes.squeeze() to remove those size-1 integer dimensions before
writing to the output buffer. However, the partial decode path (used
by ShardingCodec) was missing this squeeze step.

Fixes zarr-developers#3691

Also: Fix line length violation in test error message to comply with
100 character linting limit.
@abishop1990 abishop1990 force-pushed the fix/sharded-mixed-indexing-3691 branch from a65a546 to 07b6fb7 Compare March 11, 2026 09:40
Cipher added 2 commits March 11, 2026 02:42
…rding test

The test uses complex indexing patterns (mixed integer/list indices) that
mypy's zarr.Array stubs don't recognize as valid. Add specific type ignore
comments for [index] and [union-attr] errors to suppress false positives.
…arding test

- Line 542: Fix assert accessing .shape by changing from [index] to [union-attr]
- Line 544: Add missing type-ignore[union-attr] for f-string .shape access
- Lines 554-555: Remove unused type-ignore[index] comments on assignments

The mypy errors were caused by indexing operations returning union types that
include scalar types (int, float, etc.), which don't have a .shape attribute.
The proper fix uses type-ignore[union-attr] for attribute access, not [index].
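The kind of error described here can be illustrated without zarr. In the sketch below, `FakeArray` and `select` are hypothetical stand-ins for an indexing call whose return type is a union of an array type and a scalar; the `isinstance` narrowing shown is an alternative to a type-ignore comment, not what this PR does.

```python
from typing import Union


class FakeArray:
    """Hypothetical array-like type that has a .shape attribute."""

    def __init__(self, shape: tuple) -> None:
        self.shape = shape


def select(scalar: bool) -> Union[int, FakeArray]:
    """Hypothetical indexing call returning either a scalar or an array."""
    return 7 if scalar else FakeArray((10, 2))


result = select(scalar=False)

# Accessing result.shape directly would be reported by mypy as [union-attr],
# because the int member of the union has no .shape attribute.
# Narrowing the type first makes the attribute access acceptable to the checker.
assert isinstance(result, FakeArray)
print(result.shape)
```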
Development

Successfully merging this pull request may close these issues.

Shape mismatch with mixed integer/list indexing on Sharded arrays

2 participants