Conversation
I have a general concern over all the SVE-specific microbenchmarks being added. Benchmarks are fairly expensive in terms of runtime, and even a small number of them can have a significant cost to CI and our tracking. We don't really have any platform-specific intrinsic benchmarks correspondingly (i.e. you don't see explicit benchmarks covering […]). Rather, we typically have our normal benchmarks, like for […]. I would expect here that we aren't directly testing SVE either, but rather would be testing with SVE enabled and comparing that against a run with it disabled. This will require a bit more work in the JIT to enable first, but it significantly reduces cost and gives better metrics as to the benefit customers will see.

CC. @DrewScoggins
Pull request overview
Adds a new SVE-focused microbenchmark (OddEvenSort) to measure scalar, AdvSimd (Vector128), and SVE implementations of odd-even sort on Arm64/SVE-capable platforms, following the existing SveBenchmarks patterns.
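For readers unfamiliar with the algorithm being benchmarked: odd-even (brick) sort repeatedly alternates compare-and-swap passes over even-indexed pairs and odd-indexed pairs until no swap occurs. A minimal scalar sketch in Python (illustrative only; the benchmark itself is C#):

```python
def odd_even_sort(a):
    """Odd-even transposition sort: alternate compare-and-swap passes over
    pairs (0,1),(2,3),... and (1,2),(3,4),... until a full round makes no swap."""
    n = len(a)
    swapped = True
    while swapped:
        swapped = False
        for start in (0, 1):              # even phase, then odd phase
            for j in range(start, n - 1, 2):
                if a[j] > a[j + 1]:
                    a[j], a[j + 1] = a[j + 1], a[j]
                    swapped = True
    return a
```

Each phase's compare-and-swaps are independent of one another, which is what makes the algorithm amenable to the Vector128 and SVE implementations measured here.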
Changes:
- Introduces an `OddEvenSort` benchmark class with four benchmark methods: `Scalar`, `Vector128OddEvenSort`, `SveOddEvenSort`, and `SveTail`.
- Adds per-benchmark SVE support filtering via a local `ManualConfig` + `SimpleFilter`.
- Adds `GlobalSetup` input generation and `GlobalCleanup` verification against the scalar reference implementation.
```csharp
// Find elements that are not in order.
Vector<uint> pCmp = Sve.ConditionalSelect(pLoop, Sve.CompareGreaterThanOrEqual(a0, a1), Sve.CreateFalseMaskUInt32());
```

```csharp
Vector<uint> pCmp = Sve.CompareGreaterThanOrEqual(a0, a1);
// Swap those elements.
Vector<uint> b0 = Sve.ConditionalSelect(pCmp, a1, a0);
Vector<uint> b1 = Sve.ConditionalSelect(pCmp, a0, a1);
```
```csharp
Vector<uint> pLoop = Sve.CreateWhileLessThanMask32Bit(0, (n - j) / 2);
(Vector<uint> a0, Vector<uint> a1) = Sve.Load2xVectorAndUnzip(pLoop, source + j);

Vector<uint> pCmp = Sve.ConditionalSelect(pLoop, Sve.CompareGreaterThanOrEqual(a0, a1), Sve.CreateFalseMaskUInt32());
```
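Roughly, the SVE loop above de-interleaves consecutive pairs into two vectors (`Load2xVectorAndUnzip`), predicate-compares them, selects the smaller/larger element of each pair with `ConditionalSelect`, and zips the results back on store. A plain-Python emulation of one such pass (the function name and slicing are illustrative, not from the PR):

```python
def vector_pass(src, j, n):
    """Emulate one SVE compare-and-swap pass over the pairs starting at index j.

    a0/a1 hold the first/second element of each pair (the "unzip" load),
    p_cmp marks out-of-order pairs, and the final loop is the "zip" store.
    """
    count = (n - j) // 2                      # pairs covered by the predicate
    a0 = src[j:j + 2 * count:2]               # first element of each pair
    a1 = src[j + 1:j + 2 * count:2]           # second element of each pair
    p_cmp = [x >= y for x, y in zip(a0, a1)]  # CompareGreaterThanOrEqual
    b0 = [y if c else x for c, x, y in zip(p_cmp, a0, a1)]  # per-pair smaller
    b1 = [x if c else y for c, x, y in zip(p_cmp, a0, a1)]  # per-pair larger
    for k in range(count):                    # StoreVectorAndZip equivalent
        src[j + 2 * k] = b0[k]
        src[j + 2 * k + 1] = b1[k]
```

Calling it with `j = 0` performs the even phase, `j = 1` the odd phase, mirroring how the benchmark's loop offsets alternate.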
```csharp
// Find elements that are not in order.
Vector128<uint> cmp = AdvSimd.CompareGreaterThanOrEqual(a0, a1);
```
```csharp
// Handle remaining elements in scalar.
for (; j < n; j += 2)
{
    if (source[j - 1] > source[j])
    {
        // Swap source[j - 1] and source[j].
        uint tmp = source[j - 1];
        source[j - 1] = source[j];
        source[j] = tmp;
        sorted = 0;
    }
}
```
```csharp
for (; j < n - 1; j += 2)
{
    if (source[j] > source[j + 1])
    {
        // Swap source[j] and source[j + 1].
        uint tmp = source[j];
        source[j] = source[j + 1];
        source[j + 1] = tmp;
        sorted = 0;
    }
}
```
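The scalar tails above cover the two alternating phases, whose compared index pairs differ by one; using the wrong pair indices in the swap body silently corrupts the sort. A tiny illustrative helper (hypothetical, not part of the PR) enumerating the pairs each phase compares:

```python
def phase_pairs(n, even_phase):
    """Index pairs compared in one phase of odd-even sort over n items.

    Even phase pairs (0,1), (2,3), ...; odd phase pairs (1,2), (3,4), ...
    """
    start = 0 if even_phase else 1
    return [(j, j + 1) for j in range(start, n - 1, 2)]
```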
```csharp
    AdvSimd.Arm64.StoreVectorAndZip(source + j, (b0, b1));
}
```
Performance Results
Run on Neoverse-V2
cc @dotnet/arm64-contrib @SwapnilGaikwad @LoopedBard3