
Fix adv benchmarks.yaml: test args do not match kernel compile-time constants#247

Merged
zjin-lcf merged 1 commit into ORNL:master from zhihuidu-amd:fix/adv-benchmarks-yaml-wrong-args
Jun 6, 2026

@zhihuidu-amd zhihuidu-amd commented Jun 6, 2026

Summary

Fixes incorrect test arguments for the adv benchmark that cause correctness failures on all GPU platforms (CUDA, HIP, SYCL, OpenMP).

Fixes #246

Root Cause

The advCubatureHex3D kernel in src/adv-{cuda,hip}/adv.h hardcodes compile-time constants:

  • p_Nq = 8 → requires N = 7 (Nq = N+1 = 8)
  • p_cubNq = 16 → requires cubN = 15 (cubNq = 16)

The previous args [16, 16, 16] gave Nq = 17 ≠ p_Nq = 8, causing out-of-bounds shared-memory reads and wrong results on all platforms.
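The mismatch described above can be checked mechanically. The following is a minimal sketch, not part of this PR: the helper name `check_adv_args` is hypothetical, and the constants are copied by hand from the values quoted above rather than parsed from `adv.h`.

```python
# Compile-time constants reported in this PR for advCubatureHex3D.
# They are copied by hand here (assumption: they are not read from adv.h).
P_NQ = 8       # p_Nq
P_CUBNQ = 16   # p_cubNq


def check_adv_args(n, cub_n, p_nq=P_NQ, p_cubnq=P_CUBNQ):
    """Return a list of mismatch messages; an empty list means the args fit the kernel."""
    problems = []
    # The kernel expects Nq = N + 1 to equal the hardcoded p_Nq.
    if n + 1 != p_nq:
        problems.append(f"Nq = N+1 = {n + 1} != p_Nq = {p_nq}")
    # Likewise cubNq = cubN + 1 must equal the hardcoded p_cubNq.
    if cub_n + 1 != p_cubnq:
        problems.append(f"cubNq = cubN+1 = {cub_n + 1} != p_cubNq = {p_cubnq}")
    return problems


if __name__ == "__main__":
    print(check_adv_args(16, 16))  # old args: both checks report a mismatch
    print(check_adv_args(7, 15))   # new args: empty list
```

A check along these lines could run before launching the benchmark, so that a bad argument set is rejected up front instead of producing silently wrong results.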

Change

-    args: [16, 16, 16]
+    args: [7, 15, 16]

Verification (AMD MI300X gfx942, ROCm 7.2)

| Command | Result |
| --- | --- |
| adv 16 16 16 1 | ❌ FAIL (index 8192: 0.691294 ≠ 0.583546) |
| adv 7 15 1 1 | ✅ PASS |
| adv 7 15 4 1 | ✅ PASS |
| adv 7 15 16 1 | ✅ PASS |
| adv 7 15 64 1 | ✅ PASS |

The failure occurs on all platforms (CUDA, HIP, SYCL, OpenMP); it is not AMD-specific.
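For anyone re-running the verification on another platform, the passing sweep can be generated rather than typed by hand. This is a rough sketch under assumptions: the helper `adv_commands` is hypothetical, the binary is assumed to be reachable as `adv`, and the trailing `1` is reproduced from the commands in this PR without its meaning being stated here.

```python
import shlex


def adv_commands(n=7, cub_n=15, nelements=(1, 4, 16, 64), last=1, binary="adv"):
    """Build one argv list per Nelements value, mirroring the commands in this PR."""
    return [[binary, str(n), str(cub_n), str(e), str(last)] for e in nelements]


if __name__ == "__main__":
    for cmd in adv_commands():
        # Print the command line; pass `cmd` to subprocess.run to execute it.
        print(shlex.join(cmd))
```

Each returned list can be handed to `subprocess.run` directly, which avoids shell quoting issues if the binary path contains spaces.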

The advCubatureHex3D kernel hardcodes p_Nq=8 and p_cubNq=16.
These require N=7 (Nq=8) and cubN=15 (cubNq=16) respectively.
Previous args [16,16,16] gave Nq=17 != p_Nq=8, causing shared
memory out-of-bounds reads and incorrect results on all platforms.

Verified on AMD MI300X (gfx942):
  adv 7 15 16 1 -> PASS (Nelements=1,4,16,64 all pass)
  adv 16 16 16 1 -> FAIL (index 8192: 0.691294 != 0.583546)

Fixes: ORNL#246
@zjin-lcf zjin-lcf merged commit 4a40821 into ORNL:master Jun 6, 2026

zjin-lcf commented Jun 6, 2026

Thank you for the pull request.

Development

Successfully merging this pull request may close these issues.

benchmarks.yaml: adv test args do not match kernel compile-time constants (N=16 should be N=7)
