sgemm

Here are 20 public repositories matching this topic...

This repository targets performance optimization of the OpenCL gemm function. It compares the sgemm performance of several BLAS libraries (clBLAS, clBLAST, MIOpenGemm, Intel MKL (CPU), and cuBLAS (CUDA)) across different matrix sizes, vendor hardware, and operating systems. Prebuilt x86_64 binaries for MSVC, MinGW, and Linux (CentOS) are provided, so it works out of the box.

  • Updated Mar 28, 2019
  • C

CUDA Kernel Academy: A systematic learning path for high-performance CUDA kernel development, from SGEMM basics to Tensor Core mastery, in 4 progressive modules.

  • Updated Apr 22, 2026
  • C++
