# ampere

Here are 28 public repositories matching this topic...

RobTillaart / INA226

Arduino library for INA226 power sensor

arduino sensor power voltage ampere

Updated Jan 10, 2026
C++

dougeeai / llama-cpp-python-wheels

Pre-built wheels for llama-cpp-python across platforms and CUDA versions

Updated Apr 18, 2026

AmpereComputingAI / llama.cpp

Ampere optimized llama.cpp

meta ai llama arm64 ampere llm llamacpp

Updated Jan 30, 2026
Python

egaoharu-kensei / flash-attention-triton

Cross-platform FlashAttention-2 Triton implementation for Turing+ GPUs with custom configuration mode

Updated Jan 12, 2026
Python

AmpereComputingAI / ampere_model_library

AML's goal is to make benchmarking of various AI architectures on Ampere CPUs a pleasurable experience :)

machine-learning natural-language-processing computer-vision model-zoo tensorflow inference pytorch artificial-intelligence arm64 aarch64 ampere armv8-a onnxruntime mlperf-inference dlrm large-language-models yolov8 llama2

Updated Feb 26, 2026
Python

husjon / valheim_server_oci_setup

Setup instructions for running Valheim on Oracle Cloud Infrastructure using Arm. Also available at https://codeberg.org/husjon/valheim_server_oci_setup

arm ampere valheim

Updated Feb 23, 2026
Shell

BmdOnline / EnergyMonitor

Arduino energy monitor, using SCT-013-030 current sensors

Updated Jun 3, 2019
OpenSCAD

Sandermage / genesis-vllm-patches

Production-grade runtime patches for vLLM (45+ patches) — Qwen3.6-35B-A3B-FP8 hybrid GDN+MoE on NVIDIA Ampere (SM 80-86). 127 tok/s MTP free-form, 99 tok/s suffix tool-call (max 175). TurboQuant k8v4 KV cache, 256K context verified to 252K. P67 multi-query kernel + Suffix Decoding + adaptive ngram K. Zero source modifications.

cuda nvidia moe gdn ampere structured-output long-context fp8 vllm llm-inference qwen speculative-decoding tool-calling block-verify turboquant suffix-decoding adaptive-speculation ampere-sm86

Updated Apr 26, 2026
Python

thc1006 / qwen3.6-speculative-decoding-rtx3090

First public benchmark of llama.cpp speculative decoding on Qwen3.6-35B-A3B with a single RTX 3090 (post PR #19493 merge, 2026-04-19). 19 configurations covering ngram-cache, ngram-mod, and classic draft with vocab-matched Qwen3.5-0.8B. Finding: no variant achieves net speedup on Ampere + A3B MoE. Raw JSON, plots, full reproducibility.

benchmark cuda moe ampere mixture-of-experts inference-benchmark llama-cpp ggml local-llm llm-inference qwen speculative-decoding qwen3 rtx-3090

Updated Apr 26, 2026
Python

fosshostorg / aarch64.com

We are a fosshost project delivering ARM-based hardware to multiple global data centers. We document the project in a daily diary.

ecosystem arm arm64 ampere fosshost

Updated May 19, 2022
TypeScript

groxaxo / GPTQ-Pro

Validated Gemma4 GPTQ on Ampere (RTX 3090/3060) — Marlin JIT nvcc compatibility fixes included

nvidia marlin ampere gptq gemma4

Updated Apr 23, 2026
Python

badr42 / oke_A1

Terraform to provision an OCI OKE cluster on Ampere A1 Processors, and then deploy nginx on it

nginx oci ampere iaac

Updated Dec 10, 2022
HCL

ADLINK / meta-adlink-ampere

Single Yocto layer for all Ampere Altra Arm 64-bit based Computer on Modules (COM-HPC). This layer supports the following products: AADP, AADK, AADR, AVA

linux cloud embedded server yocto ampere adlink ampere-altra com-hpc soafee

Updated Dec 17, 2025
BitBake

pantaleone-ai / private-ai-stack

Deploy a complete, self-hosted AI stack for private LLMs, agentic workflows, and content generation. One-command Docker Compose deployment on any cloud.

aws ai docker-compose azure gcp self-hosted oci llama traefik ampere homelab oracle-cloud nvidia-cuda n8n llm llamacpp openwebui private-ai

Updated Jan 1, 2026

manu-p-1 / PowerPlug

🔌 A PowerShell Cmdlet library powered by Ampere

c-sharp library utilities dotnet powershell cmdlets ampere docfx

Updated Mar 29, 2026
C#

Justus0405 / Nvidiainstall

📦 A fully automated method for installing Nvidia drivers on Arch Linux

Updated Jan 11, 2026
Shell

bbenz / ampereonazure

How to deploy Ampere Altra Arm–based processors on Azure for Azure Virtual Machines and Azure Kubernetes Service

kubernetes azure virtual-machine arm64 ampere arm64-images azurekubernetesservice

Updated May 9, 2023

m96-chan / PyGPUkit

Minimal GPU runtime for Python - high-performance CUDA kernels, memory management, and LLM inference without heavy dependencies

python rust gpu numpy cuda inference hopper ampere tensorcore blackwell llm safetensors

Updated Feb 20, 2026
Python

pierre-auguste / system-of-units

A VR application to help you understand the seven fundamental units in physics. And more...

open-source simulation vr oculus meter virtual-reality ampere mole metrology second kelvin kilogram candella unreal-engine-5

Updated Jun 19, 2024
C++

kalyani-25 / Reimplementation_flash-attention-from-scratch

16-step CUDA optimization of FlashAttention-2 achieving 99.2% of official performance on A100 — Ampere architecture

deep-learning cuda pytorch ampere gpu-kernels nsight llm-inference flashattention

Updated Mar 6, 2026
Cuda
