[release-0.9] Add test for kernels with multiple shared buffers (#672) #678

Merged: vchuravy merged 1 commit into release-0.9 from backport672 on Feb 3, 2026
christiangnrd (Member) commented Feb 2, 2026

This probably doesn't merit a release on its own, but we could get #673 in and make a small release.

christiangnrd changed the title from "[Backport 0.9] Add test for kernels with multiple shared buffers (#672)" to "[release-0.9] Add test for kernels with multiple shared buffers (#672)" on Feb 2, 2026

github-actions bot (Contributor) commented Feb 2, 2026

Benchmark Results

| Benchmark | main | 970f7a3... | main / 970f7a3... |
|---|---|---|---|
| saxpy/default/Float32/1024 | 0.074 ± 0.0081 ms | 0.651 ± 0.015 μs | 114 ± 13 |
| saxpy/default/Float32/1048576 | 0.453 ± 0.02 ms | 0.209 ± 0.021 ms | 2.17 ± 0.24 |
| saxpy/default/Float32/16384 | 0.0613 ± 0.027 ms | 2.78 ± 0.15 μs | 22 ± 9.9 |
| saxpy/default/Float32/2048 | 0.0658 ± 0.026 ms | 0.78 ± 0.049 μs | 84.3 ± 34 |
| saxpy/default/Float32/256 | 0.0727 ± 0.013 ms | 0.582 ± 0.012 μs | 125 ± 22 |
| saxpy/default/Float32/262144 | 0.164 ± 0.025 ms | 0.0449 ± 0.00098 ms | 3.66 ± 0.56 |
| saxpy/default/Float32/32768 | 0.0666 ± 0.027 ms | 5.32 ± 0.28 μs | 12.5 ± 5.1 |
| saxpy/default/Float32/4096 | 0.0656 ± 0.024 ms | 1.15 ± 0.083 μs | 57.1 ± 21 |
| saxpy/default/Float32/512 | 0.0727 ± 0.0076 ms | 0.619 ± 0.0092 μs | 117 ± 12 |
| saxpy/default/Float32/64 | 0.0727 ± 0.011 ms | 0.572 ± 0.0084 μs | 127 ± 19 |
| saxpy/default/Float32/65536 | 0.088 ± 0.027 ms | 12 ± 1.2 μs | 7.35 ± 2.4 |
| saxpy/default/Float64/1024 | 0.0711 ± 0.022 ms | 0.777 ± 0.067 μs | 91.6 ± 29 |
| saxpy/default/Float64/1048576 | 0.568 ± 0.065 ms | 0.481 ± 0.047 ms | 1.18 ± 0.18 |
| saxpy/default/Float64/16384 | 0.0626 ± 0.027 ms | 5.3 ± 0.24 μs | 11.8 ± 5.2 |
| saxpy/default/Float64/2048 | 0.0675 ± 0.025 ms | 1.16 ± 0.086 μs | 58.1 ± 22 |
| saxpy/default/Float64/256 | 0.0724 ± 0.011 ms | 0.591 ± 0.0082 μs | 123 ± 18 |
| saxpy/default/Float64/262144 | 0.172 ± 0.028 ms | 0.0886 ± 0.0075 ms | 1.94 ± 0.36 |
| saxpy/default/Float64/32768 | 0.0725 ± 0.027 ms | 12 ± 0.86 μs | 6.05 ± 2.3 |
| saxpy/default/Float64/4096 | 0.0652 ± 0.024 ms | 1.73 ± 0.12 μs | 37.7 ± 14 |
| saxpy/default/Float64/512 | 0.0731 ± 0.013 ms | 0.645 ± 0.011 μs | 113 ± 20 |
| saxpy/default/Float64/64 | 0.073 ± 0.014 ms | 0.568 ± 0.011 μs | 129 ± 24 |
| saxpy/default/Float64/65536 | 0.0915 ± 0.027 ms | 24 ± 1.9 μs | 3.81 ± 1.2 |
| saxpy/static workgroup=(1024,)/Float32/1024 | 0.0707 ± 0.0072 ms | 2.18 ± 0.028 μs | 32.4 ± 3.3 |
| saxpy/static workgroup=(1024,)/Float32/1048576 | 0.443 ± 0.019 ms | 0.196 ± 0.019 ms | 2.26 ± 0.24 |
| saxpy/static workgroup=(1024,)/Float32/16384 | 0.0576 ± 0.026 ms | 4.31 ± 0.2 μs | 13.4 ± 6.1 |
| saxpy/static workgroup=(1024,)/Float32/2048 | 0.0634 ± 0.026 ms | 2.33 ± 0.061 μs | 27.2 ± 11 |
| saxpy/static workgroup=(1024,)/Float32/256 | 0.0712 ± 0.012 ms | 2.65 ± 0.016 μs | 26.9 ± 4.5 |
| saxpy/static workgroup=(1024,)/Float32/262144 | 0.162 ± 0.025 ms | 0.0476 ± 0.0023 ms | 3.4 ± 0.56 |
| saxpy/static workgroup=(1024,)/Float32/32768 | 0.0636 ± 0.027 ms | 7.4 ± 0.25 μs | 8.6 ± 3.6 |
| saxpy/static workgroup=(1024,)/Float32/4096 | 0.0651 ± 0.024 ms | 2.61 ± 0.07 μs | 25 ± 9.1 |
| saxpy/static workgroup=(1024,)/Float32/512 | 0.0713 ± 0.0092 ms | 2.81 ± 0.023 μs | 25.4 ± 3.3 |
| saxpy/static workgroup=(1024,)/Float32/64 | 0.0714 ± 0.0094 ms | 2.54 ± 0.016 μs | 28.1 ± 3.7 |
| saxpy/static workgroup=(1024,)/Float32/65536 | 0.0817 ± 0.027 ms | 14.6 ± 1.4 μs | 5.59 ± 1.9 |
| saxpy/static workgroup=(1024,)/Float64/1024 | 0.0709 ± 0.016 ms | 2.3 ± 0.071 μs | 30.7 ± 7 |
| saxpy/static workgroup=(1024,)/Float64/1048576 | 0.489 ± 0.051 ms | 0.488 ± 0.044 ms | 1 ± 0.14 |
| saxpy/static workgroup=(1024,)/Float64/16384 | 0.0602 ± 0.026 ms | 7.4 ± 0.3 μs | 8.13 ± 3.6 |
| saxpy/static workgroup=(1024,)/Float64/2048 | 0.0643 ± 0.026 ms | 2.6 ± 0.079 μs | 24.7 ± 10 |
| saxpy/static workgroup=(1024,)/Float64/256 | 0.0714 ± 0.01 ms | 2.63 ± 0.02 μs | 27.2 ± 4 |
| saxpy/static workgroup=(1024,)/Float64/262144 | 0.179 ± 0.026 ms | 0.093 ± 0.0082 ms | 1.93 ± 0.33 |
| saxpy/static workgroup=(1024,)/Float64/32768 | 0.0675 ± 0.026 ms | 14.7 ± 1.1 μs | 4.6 ± 1.8 |
| saxpy/static workgroup=(1024,)/Float64/4096 | 0.0645 ± 0.024 ms | 3.13 ± 0.12 μs | 20.6 ± 7.6 |
| saxpy/static workgroup=(1024,)/Float64/512 | 0.0716 ± 0.011 ms | 2.79 ± 0.029 μs | 25.7 ± 3.8 |
| saxpy/static workgroup=(1024,)/Float64/64 | 0.0718 ± 0.011 ms | 2.52 ± 0.018 μs | 28.5 ± 4.3 |
| saxpy/static workgroup=(1024,)/Float64/65536 | 0.0916 ± 0.028 ms | 26.3 ± 2 μs | 3.48 ± 1.1 |
| time_to_load | 0.966 ± 0.024 s | 0.278 ± 0.0014 s | 3.47 ± 0.087 |
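
When reading the last column, note that the two timing columns can be in different units (ms versus μs), so both values must be converted to the same unit before dividing. The sketch below shows one way the ratio and its uncertainty could be reproduced. It assumes the two measurements are independent and that their relative errors combine in quadrature; the source does not confirm that the benchmark bot uses this method, and `ratio_with_uncertainty` is a hypothetical helper name.

```python
import math


def ratio_with_uncertainty(a, sigma_a, b, sigma_b):
    """Return (a / b, uncertainty of a / b).

    Assumption (not stated in the benchmark report): a and b are
    independent, so their relative errors add in quadrature.
    """
    ratio = a / b
    relative_error = math.hypot(sigma_a / a, sigma_b / b)
    return ratio, abs(ratio) * relative_error


# First saxpy row, with both timings converted to microseconds:
# 0.074 ± 0.0081 ms -> 74.0 ± 8.1 us, compared with 0.651 ± 0.015 us.
ratio, sigma = ratio_with_uncertainty(74.0, 8.1, 0.651, 0.015)
print(f"{ratio:.0f} ± {sigma:.0f}")  # prints "114 ± 13"
```

Under these assumptions the result agrees with the reported value for that row; other rows may differ in the last digit because the table shows rounded inputs.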

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
Go to "Actions" -> "Benchmark a pull request" -> [the most recent run] -> "Artifacts" (at the bottom).

codecov bot commented Feb 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.85%. Comparing base (1f84b17) to head (970f7a3).
⚠️ Report is 2 commits behind head on release-0.9.

Additional details and impacted files
@@             Coverage Diff              @@
##           release-0.9     #678   +/-   ##
============================================
  Coverage        71.85%   71.85%           
============================================
  Files               14       14           
  Lines              906      906           
============================================
  Hits               651      651           
  Misses             255      255           

vchuravy merged commit 06aa020 into release-0.9 on Feb 3, 2026
38 of 39 checks passed
vchuravy deleted the backport672 branch on February 3, 2026 13:35