several types of attention modules written in PyTorch for learning purposes
Image Captioning With MobileNet-LLaMA 3
Examines cost-effective methods for optimizing GQA configurations, comparing its performance with counterparts such as Multi-Head Attention (MHA) and Multi-Query Attention (MQA).
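For context on how these variants relate, here is a minimal sketch (not taken from any of the listed repos) of grouped-query attention in PyTorch. It relies only on standard `torch` APIs; setting `n_kv_heads == n_heads` recovers MHA, `n_kv_heads == 1` recovers MQA, and intermediate values give GQA.

```python
import torch
import torch.nn.functional as F
from torch import nn

class GroupedQueryAttention(nn.Module):
    """Sketch of GQA: groups of query heads share a smaller set of KV heads."""

    def __init__(self, d_model: int, n_heads: int, n_kv_heads: int):
        super().__init__()
        assert n_heads % n_kv_heads == 0
        self.n_heads = n_heads
        self.n_kv_heads = n_kv_heads
        self.head_dim = d_model // n_heads
        self.wq = nn.Linear(d_model, n_heads * self.head_dim, bias=False)
        self.wk = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.wv = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.wo = nn.Linear(n_heads * self.head_dim, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.wq(x).view(b, t, self.n_heads, self.head_dim).transpose(1, 2)
        k = self.wk(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.wv(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        # Each group of query heads shares one KV head, so K/V are repeated.
        rep = self.n_heads // self.n_kv_heads
        k = k.repeat_interleave(rep, dim=1)
        v = v.repeat_interleave(rep, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.wo(out.transpose(1, 2).reshape(b, t, -1))

# n_kv_heads=8 -> MHA, n_kv_heads=1 -> MQA, n_kv_heads=2 -> GQA
gqa = GroupedQueryAttention(d_model=64, n_heads=8, n_kv_heads=2)
print(gqa(torch.randn(1, 16, 64)).shape)  # torch.Size([1, 16, 64])
```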
(Unofficial) PyTorch implementation of Hugging Face SmolLM, a blazingly fast and small language model, built with grouped query attention (GQA).
Transformer models for humorous text generation, fine-tuned on a Russian jokes dataset with ALiBi, RoPE, GQA, and SwiGLU, plus a custom byte-level BPE tokenizer.
Building a Transformer model from scratch with positional encoding / trainable positions, Multi-Head Attention, KV Cache, and Grouped Attention, trained on a few Brazilian books.
A single-file implementation of LLaMA 3, with support for jitting, KV caching and prompting
My LLaMA 3 implementation.
Building a Transformer model from scratch with variants such as Multi-Head Attention and Grouped Query Attention, trained on works by Machado de Assis.
This repository shows how to build a DeepSeek language model from scratch using PyTorch. It includes clean, well-structured implementations of advanced attention techniques such as key–value caching for fast decoding, multi-query attention, grouped-query attention, and multi-head latent attention.
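To illustrate the key–value caching idea mentioned above, here is a minimal sketch (assumptions only, not the repository's actual code) of incremental decoding: previously computed K/V tensors are kept in a cache so each new token contributes a single query row instead of recomputing attention over the whole prefix.

```python
import torch
import torch.nn.functional as F

def decode_step(q_new, k_new, v_new, cache):
    """Append the new token's K/V to the cache and attend over the full history."""
    if cache is None:
        k_all, v_all = k_new, v_new
    else:
        k_all = torch.cat([cache[0], k_new], dim=2)
        v_all = torch.cat([cache[1], v_new], dim=2)
    # q_new has sequence length 1, so per-step attention cost grows linearly.
    out = F.scaled_dot_product_attention(q_new, k_all, v_all)
    return out, (k_all, v_all)

# Shapes are (batch, heads, seq, head_dim); decode 4 tokens one at a time.
cache = None
for _ in range(4):
    q = torch.randn(1, 8, 1, 16)
    k = torch.randn(1, 8, 1, 16)
    v = torch.randn(1, 8, 1, 16)
    out, cache = decode_step(q, k, v, cache)
print(cache[0].shape)  # torch.Size([1, 8, 4, 16])
```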
Decoder-only LLM trained on the Harry Potter books.