Return a 1-D vector from torch.diagonal converter by john-rocky · Pull Request #2682 · apple/coremltools

john-rocky · 2026-05-01T04:46:31Z

Summary

torch.diagonal(input, offset, dim1, dim2) returns the requested diagonal as a 1-D tensor (for 2-D input), but the converter used mb.band_part, which only zeros the off-diagonal entries and returns the same-shape matrix. As a result torch.diagonal(x) for a 5x5 matrix produced a 5x5 result instead of a length-5 vector.
Extract the diagonal by flattening the input and gathering elements at strides of m + 1, mirroring NumPy's row-major diagonal indexing.
Support offset and the (dim1, dim2) == (1, 0) transpose case in addition to the default (0, 1). Higher-rank input still raises NotImplementedError, matching the pre-existing scope.

Test plan

pytest test_torch_ops.py::TestDiagonal — 120 passed, 120 skipped (dim1==dim2 / empty-diagonal cases).
End-to-end mlpackage.predict matched the PyTorch reference on shapes {(5,5), (3,4), (4,3)} × offsets {-2,-1,0,1,2} × (dim1, dim2) ∈ {(0,1), (1,0)}.

`torch.diagonal(input, offset, dim1, dim2)` returns the requested diagonal as a 1-D tensor (for 2-D input), but the converter used `mb.band_part`, which only zeros the off-diagonal entries and returns a same-shape matrix. As a result, `torch.diagonal(x)` for a 5x5 matrix produced a 5x5 result instead of a length-5 vector. Extract the diagonal by flattening the input and gathering the elements at strides of `m + 1`, mirroring NumPy's row-major diagonal indexing. Support `offset` and the `(dim1, dim2) == (1, 0)` transpose case in addition to the default. Higher-rank input still raises `NotImplementedError`, matching the pre-existing scope. Verified end-to-end against PyTorch reference for shapes {(5,5), (3,4), (4,3)}, offsets {-2,-1,0,1,2}, and dim swaps. Fixes apple#2565.

TobyRoseman · 2026-05-18T15:51:51Z

+        raise NotImplementedError("diagonal requires a statically-shaped input")
+
+    if dim1 < 0:
+        dim1 += x.rank


I'm not sure the logic here is correct. It also looks like you don't have any unit test instances for when dim1 and dim2 are less than zero. Please add some.

TobyRoseman · 2026-05-18T16:02:45Z

CI Run: https://gitlab.com/coremltools1/coremltools/-/pipelines/2534725082

john-rocky · 2026-05-18T22:15:38Z

Thanks for the review — addressed both points in 93010490.

Negative-dim tests added: the parametrize for dim1/dim2 is now [-2, -1, 0, 1] (was [0, 1]), with the "same axis" skip rewritten as d1 % 2 == d2 % 2 so (-1, 1), (0, -2) etc. are filtered out. Effective-offset calc in the test also resolves negatives before checking the (1, 0) swap. End-to-end on a 3×4 input I sanity-checked off/dim combos including (off=-1, dim1=-1, dim2=-2) and (off=1, dim1=1, dim2=0) — MIL graph matches PyTorch (reshape → gather, correct output shape and values).

On the logic concern (line 8381): the section is doing three things and I think the worry might be about the offset flip — happy to push a comment-only commit if anything below still reads ambiguously.

Negative-dim resolution (dim1 += x.rank): since we already rejected anything but x.rank == 2 two lines up, this is just -1 → 1, -2 → 0. After this both dims are in {0, 1}.
(dim1, dim2) == (1, 0) ⇒ flip offset: torch.diagonal(A, k, dim1=1, dim2=0) is the same as torch.diagonal(A.T, k, dim1=0, dim2=1). For an (n, m) matrix A and offset k ≥ 0, the diagonal of A.T at +k reads A.T[i][i+k] = A[i+k][i], which equals the diagonal of A at offset -k. Same identity for k < 0. So we can keep n, m = x.shape (original, not transposed) and just negate offset.
Index formula on the flattened tensor: for row-major shape (n, m), element A[r][c] is at flat index r*m + c. The diagonal at offset k starts at (0, k) for k ≥ 0 (flat index k) and at (-k, 0) for k < 0 (flat index -k*m), then steps m + 1 each time. That's the start + i * (m + 1) you see.

I cross-checked this against torch.diagonal for all 280 combinations of shape ∈ {(5,5), (3,4), (4,3), (2,3), (3,2), (1,5), (5,1)}, offset ∈ {-3..3}, (dim1, dim2) over {0, 1, -1, -2}-valid pairs — zero mismatches (excluding the empty-diagonal cases, where the PR raises ValueError instead of returning an empty tensor; happy to switch that to a mb.reshape(x=..., shape=[0]) if matching PyTorch's empty-result semantics matters).

Re-running CI: https://gitlab.com/coremltools1/coremltools/-/pipelines/2534725082 covered the previous diff — pushed commit will need a fresh run.

TobyRoseman · 2026-05-19T23:59:34Z

Thanks for the additional tests.

Updated CI: https://gitlab.com/coremltools1/coremltools/-/pipelines/2538806799

If CI passes, this change should be good to merge.

john-rocky force-pushed the fix-diagonal-vector-output branch from e08d13d to 1ae5717 Compare May 6, 2026 00:28

TobyRoseman requested changes May 18, 2026

View reviewed changes

Extend diagonal test to cover negative dim1/dim2

9301049

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Return a 1-D vector from torch.diagonal converter#2682

Return a 1-D vector from torch.diagonal converter#2682
john-rocky wants to merge 2 commits into
apple:mainfrom
john-rocky:fix-diagonal-vector-output

john-rocky commented May 1, 2026

Uh oh!

TobyRoseman May 18, 2026

Uh oh!

TobyRoseman commented May 18, 2026

Uh oh!

john-rocky commented May 18, 2026

Uh oh!

TobyRoseman commented May 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

john-rocky commented May 1, 2026

Summary

Test plan

Uh oh!

TobyRoseman May 18, 2026

Choose a reason for hiding this comment

Uh oh!

TobyRoseman commented May 18, 2026

Uh oh!

john-rocky commented May 18, 2026

Uh oh!

TobyRoseman commented May 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants