Add test for kernels with multiple shared buffers #672

Merged
vchuravy merged 1 commit into main from localmemtest
Feb 2, 2026
christiangnrd (Member) commented Jan 11, 2026

Split off from #666.

This is a regression test for an issue that came up with AMDGPU.jl when I made the id field unnecessary, as part of my (in-progress) efforts to get KernelIntrinsics ready.
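For readers unfamiliar with the scenario, here is a minimal sketch of a kernel that uses two shared (local-memory) buffers at once. It is not the test added in this PR: the kernel name `two_buffers!`, the group size of 64, the element types, and the mirrored-index access pattern are all assumptions chosen for illustration. It uses only the public KernelAbstractions macros `@kernel`, `@index`, `@localmem`, and `@synchronize`, and runs on the `CPU()` backend.

```julia
using KernelAbstractions

# Hypothetical kernel: each workgroup stages its inputs in two distinct
# local-memory buffers with different element types.
@kernel function two_buffers!(out, @Const(a), @Const(b))
    li = @index(Local, Linear)
    gi = @index(Global, Linear)

    buf_a = @localmem Float32 (64,)
    buf_b = @localmem Int32 (64,)

    @inbounds buf_a[li] = a[gi]
    @inbounds buf_b[li] = b[gi]
    @synchronize

    # Read the mirrored slot so the result depends on other work-items'
    # writes to both buffers.
    lj = 65 - li
    @inbounds out[gi] = buf_a[lj] - Float32(buf_b[lj])
end

backend = CPU()
n = 256                       # a multiple of the group size (assumption)
a = Float32.(1:n)
b = Int32.(n:-1:1)
out = zeros(Float32, n)

two_buffers!(backend, 64)(out, a, b; ndrange = n)
KernelAbstractions.synchronize(backend)

# Host-side reference: the same mirrored access within each group of 64.
expected = similar(out)
for gi in 1:n
    base = 64 * div(gi - 1, 64)
    lj = 65 - (gi - base)
    expected[gi] = a[base + lj] - Float32(b[base + lj])
end
```

If a backend handed out the same memory for both buffers, the second store would overwrite the first and `out` would no longer match `expected`, which is the kind of failure a regression test like this is meant to catch.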

github-actions bot (Contributor) commented Jan 11, 2026

Benchmark Results

| Benchmark | main | 9362542... | main / 9362542... |
|---|---|---|---|
| saxpy/default/Float32/1024 | 0.034 ± 0.02 ms | 0.0342 ± 0.017 ms | 0.997 ± 0.76 |
| saxpy/default/Float32/1048576 | 0.326 ± 0.034 ms | 0.348 ± 0.05 ms | 0.935 ± 0.16 |
| saxpy/default/Float32/16384 | 0.0471 ± 0.023 ms | 0.049 ± 0.023 ms | 0.961 ± 0.65 |
| saxpy/default/Float32/2048 | 0.0453 ± 0.026 ms | 0.0384 ± 0.022 ms | 1.18 ± 0.97 |
| saxpy/default/Float32/256 | 30.8 ± 12 μs | 0.0332 ± 0.014 ms | 0.927 ± 0.53 |
| saxpy/default/Float32/262144 | 0.124 ± 0.023 ms | 0.128 ± 0.023 ms | 0.969 ± 0.25 |
| saxpy/default/Float32/32768 | 0.048 ± 0.022 ms | 0.0486 ± 0.021 ms | 0.988 ± 0.62 |
| saxpy/default/Float32/4096 | 0.0423 ± 0.026 ms | 0.039 ± 0.025 ms | 1.08 ± 0.96 |
| saxpy/default/Float32/512 | 0.0336 ± 0.016 ms | 0.0348 ± 0.019 ms | 0.967 ± 0.7 |
| saxpy/default/Float32/64 | 29.7 ± 9.8 μs | 0.0327 ± 0.013 ms | 0.909 ± 0.47 |
| saxpy/default/Float32/65536 | 0.0636 ± 0.024 ms | 0.0612 ± 0.023 ms | 1.04 ± 0.55 |
| saxpy/default/Float64/1024 | 0.0369 ± 0.022 ms | 0.0378 ± 0.022 ms | 0.975 ± 0.82 |
| saxpy/default/Float64/1048576 | 0.621 ± 0.1 ms | 0.68 ± 0.064 ms | 0.913 ± 0.17 |
| saxpy/default/Float64/16384 | 0.0521 ± 0.025 ms | 0.0487 ± 0.023 ms | 1.07 ± 0.73 |
| saxpy/default/Float64/2048 | 0.0415 ± 0.024 ms | 0.0451 ± 0.025 ms | 0.921 ± 0.74 |
| saxpy/default/Float64/256 | 31.4 ± 12 μs | 0.0328 ± 0.015 ms | 0.957 ± 0.56 |
| saxpy/default/Float64/262144 | 0.195 ± 0.02 ms | 0.199 ± 0.02 ms | 0.978 ± 0.14 |
| saxpy/default/Float64/32768 | 0.0554 ± 0.023 ms | 0.0582 ± 0.024 ms | 0.953 ± 0.57 |
| saxpy/default/Float64/4096 | 0.0456 ± 0.028 ms | 0.0462 ± 0.026 ms | 0.987 ± 0.82 |
| saxpy/default/Float64/512 | 0.0356 ± 0.021 ms | 0.033 ± 0.018 ms | 1.08 ± 0.86 |
| saxpy/default/Float64/64 | 29.1 ± 10 μs | 30.1 ± 9.3 μs | 0.966 ± 0.45 |
| saxpy/default/Float64/65536 | 0.078 ± 0.024 ms | 0.0791 ± 0.023 ms | 0.987 ± 0.41 |
| saxpy/static workgroup=(1024,)/Float32/1024 | 0.0342 ± 0.02 ms | 0.0324 ± 0.017 ms | 1.06 ± 0.85 |
| saxpy/static workgroup=(1024,)/Float32/1048576 | 0.33 ± 0.033 ms | 0.325 ± 0.03 ms | 1.02 ± 0.14 |
| saxpy/static workgroup=(1024,)/Float32/16384 | 0.0466 ± 0.023 ms | 0.0464 ± 0.022 ms | 1.01 ± 0.68 |
| saxpy/static workgroup=(1024,)/Float32/2048 | 0.0381 ± 0.022 ms | 0.0419 ± 0.025 ms | 0.91 ± 0.76 |
| saxpy/static workgroup=(1024,)/Float32/256 | 0.032 ± 0.011 ms | 0.0323 ± 0.012 ms | 0.991 ± 0.52 |
| saxpy/static workgroup=(1024,)/Float32/262144 | 0.124 ± 0.022 ms | 0.12 ± 0.024 ms | 1.03 ± 0.28 |
| saxpy/static workgroup=(1024,)/Float32/32768 | 0.0482 ± 0.022 ms | 0.0465 ± 0.02 ms | 1.04 ± 0.65 |
| saxpy/static workgroup=(1024,)/Float32/4096 | 0.0461 ± 0.025 ms | 0.0385 ± 0.022 ms | 1.2 ± 0.94 |
| saxpy/static workgroup=(1024,)/Float32/512 | 0.0333 ± 0.017 ms | 0.0366 ± 0.017 ms | 0.909 ± 0.62 |
| saxpy/static workgroup=(1024,)/Float32/64 | 31.4 ± 9.5 μs | 0.0317 ± 0.011 ms | 0.992 ± 0.46 |
| saxpy/static workgroup=(1024,)/Float32/65536 | 0.0608 ± 0.022 ms | 0.0584 ± 0.021 ms | 1.04 ± 0.53 |
| saxpy/static workgroup=(1024,)/Float64/1024 | 0.0363 ± 0.023 ms | 0.0345 ± 0.022 ms | 1.05 ± 0.94 |
| saxpy/static workgroup=(1024,)/Float64/1048576 | 0.688 ± 0.066 ms | 0.669 ± 0.066 ms | 1.03 ± 0.14 |
| saxpy/static workgroup=(1024,)/Float64/16384 | 0.0458 ± 0.022 ms | 0.049 ± 0.023 ms | 0.935 ± 0.63 |
| saxpy/static workgroup=(1024,)/Float64/2048 | 0.0397 ± 0.023 ms | 0.039 ± 0.022 ms | 1.02 ± 0.82 |
| saxpy/static workgroup=(1024,)/Float64/256 | 0.0372 ± 0.017 ms | 0.0369 ± 0.015 ms | 1.01 ± 0.62 |
| saxpy/static workgroup=(1024,)/Float64/262144 | 0.185 ± 0.02 ms | 0.188 ± 0.021 ms | 0.983 ± 0.15 |
| saxpy/static workgroup=(1024,)/Float64/32768 | 0.0565 ± 0.024 ms | 0.0548 ± 0.023 ms | 1.03 ± 0.62 |
| saxpy/static workgroup=(1024,)/Float64/4096 | 0.042 ± 0.024 ms | 0.0412 ± 0.024 ms | 1.02 ± 0.83 |
| saxpy/static workgroup=(1024,)/Float64/512 | 0.0322 ± 0.018 ms | 0.0336 ± 0.019 ms | 0.959 ± 0.75 |
| saxpy/static workgroup=(1024,)/Float64/64 | 30.9 ± 9.5 μs | 30.5 ± 8.6 μs | 1.01 ± 0.42 |
| saxpy/static workgroup=(1024,)/Float64/65536 | 0.0751 ± 0.022 ms | 0.0762 ± 0.021 ms | 0.985 ± 0.4 |
| time_to_load | 0.981 ± 0.0061 s | 0.982 ± 0.013 s | 0.999 ± 0.015 |

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

codecov bot commented Jan 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 52.27%. Comparing base (dc57cd0) to head (9362542).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #672   +/-   ##
=======================================
  Coverage   52.27%   52.27%           
=======================================
  Files          22       22           
  Lines        1689     1689           
=======================================
  Hits          883      883           
  Misses        806      806           

@vchuravy vchuravy merged commit 627eb0e into main Feb 2, 2026
34 checks passed
@vchuravy vchuravy deleted the localmemtest branch February 2, 2026 13:46
vchuravy (Member) commented Feb 2, 2026

Backport to release-0.9 as well?

christiangnrd (Member, Author) commented

> Backport to release-0.9 as well?

Might as well. #678
