Empirical kernel scheduling characterization for NVIDIA GB10 (SM121a). Sweeps GEMM tile configurations, classifies PTX instruction paths, captures hardware telemetry
benchmark gpu cuda nvidia empirical performance-analysis profiling cutlass gemm ptx black-box-testing unified-memory kernel-scheduling nvidia-tools gb10 dgx-spark sm121
Updated May 10, 2026 - C++