A translator from Intel SSE intrinsics to Arm/AArch64 NEON implementations
Updated Apr 12, 2026 · C++
Actually-sparse dynamic training for PyTorch. CPU-native, Apple Silicon first. Pluggable routers, drop-in SparseLinear.
Optimizing a convolution function using Arm's NEON intrinsics
Extended morphological filters using ARM SIMD instructions for Python on Raspberry Pi (deep networks for Gaussian denoising and image completion are included)