Fix SGEMM returning wrong results in multithreading on NeoverseV2 #5643

Open
martin-frbg wants to merge 1 commit into OpenMathLib:develop from martin-frbg:neov2_param
@mattip
Contributor

mattip commented Feb 19, 2026

Is this ready to go?

@martin-frbg
Collaborator Author

It definitely fixes the problem on NeoverseV2, but (a) there may be other arm64 CPUs similarly affected, and (b) I haven't fully understood the underlying issue with that specific parameter, which was introduced fairly recently. It may be papering over a missing tail call in the gemm kernel it rode in on, or something else entirely.

@martin-frbg
Collaborator Author

@Mousius I'm a bit confused now, as I notice NeoverseV2 was already using the SVE SGEMM kernel (via ARMV8SVE) until you switched its KERNEL file to be based on NEOVERSEN2 rather than ARMV8SVE in order to reuse N2's sbgemm kernel in the otherwise unrelated #5399. Was that a conscious decision to return the V2 to the basic Neon kernel due to its shorter vector register size compared to V1 and A64FX, or collateral damage that I missed at the time?

Development

Successfully merging this pull request may close these issues.

BUG: np.linalg.norm returns different value on ubuntu-24.04-arm runner after update to 2.4.2

2 participants