ampere
Here are 28 public repositories matching this topic...
Pre-built wheels for llama-cpp-python across platforms and CUDA versions
Updated Apr 18, 2026
Cross-platform FlashAttention-2 Triton implementation for Turing+ GPUs with custom configuration mode
Updated Jan 12, 2026 - Python
AML's goal is to make benchmarking of various AI architectures on Ampere CPUs a pleasurable experience :)
Updated Feb 26, 2026 - Python
Setup instructions for running Valheim on Oracle Cloud Infrastructure using Arm. Also available at https://codeberg.org/husjon/valheim_server_oci_setup
Updated Feb 23, 2026 - Shell
Arduino energy monitor, using SCT-013-030 current sensors
Updated Jun 3, 2019 - OpenSCAD
Production-grade runtime patches for vLLM (45+ patches) — Qwen3.6-35B-A3B-FP8 hybrid GDN+MoE on NVIDIA Ampere (SM 80-86). 127 tok/s MTP free-form, 99 tok/s suffix tool-call (max 175). TurboQuant k8v4 KV cache, 256K context verified to 252K. P67 multi-query kernel + Suffix Decoding + adaptive ngram K. Zero source modifications.
Updated Apr 26, 2026 - Python
First public benchmark of llama.cpp speculative decoding on Qwen3.6-35B-A3B with a single RTX 3090 (post PR #19493 merge, 2026-04-19). 19 configurations covering ngram-cache, ngram-mod, and classic draft with vocab-matched Qwen3.5-0.8B. Finding: no variant achieves net speedup on Ampere + A3B MoE. Raw JSON, plots, full reproducibility.
Updated Apr 26, 2026 - Python
Deploy a complete, self-hosted AI stack for private LLMs, agentic workflows, and content generation. One-command Docker Compose deployment on any cloud.
Updated Jan 1, 2026
📦 A fully automated method for installing Nvidia drivers on Arch Linux
Updated Jan 11, 2026 - Shell
How to deploy Ampere Altra Arm–based processors on Azure for Azure Virtual Machines and Azure Kubernetes Service
Updated May 9, 2023
A VR application to help you understand the seven fundamental units in physics. And more...
Updated Jun 19, 2024 - C++
16-step CUDA optimization of FlashAttention-2 achieving 99.2% of official performance on A100 — Ampere architecture
Updated Mar 6, 2026 - Cuda