small-models

Here are 36 public repositories matching this topic...

PrismML-Eng / Bonsai-demo

Bonsai Demo

bonsai mlx llm small-models llamacpp prism-ml

Updated May 31, 2026
Shell

SqueezeAILab / SqueezeLLM

[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization

natural-language-processing text-generation transformer llama quantization model-compression efficient-inference post-training-quantization large-language-models llm small-models localllm

Updated Aug 13, 2024
Python

PrismML-Eng / Bonsai-Image-Demo

Generate images locally

image-generation ternary bonsai 1-bit on-device-ai small-models

Updated Jun 14, 2026
PowerShell

SqueezeAILab / KVQuant

[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

natural-language-processing compression text-generation transformer llama quantization mistral model-compression efficient-inference efficient-model large-language-models llm small-models localllm localllama

Updated Aug 13, 2024
Python

aitomatic / openssa

OpenSSA: Small Specialist Agents based on Domain-Aware Neurosymbolic Agent (DANA) architecture for industrial problem-solving

domain-knowledge industrial-ai small-models specialist-agents

Updated Aug 14, 2025
Python

markendo / downscaling_intelligence

Downscaling Intelligence: Exploring Perception and Reasoning Bottlenecks in Small Multimodal Models

computer-vision small-models instruction-tuning mllm multimodal-large-language-models

Updated Mar 21, 2026
Python

brainless / nocodo

Coding agent built around small (<10B) models for local development with an opinionated Rust (Actix Web, Diesel) / TypeScript (Solid, Tailwind) stack. No coding knowledge needed. Multiple agents and roles (Product Owner, PM, Engineering Manager, Rust Engineer, SolidJS Engineer, etc.) work together to build full-stack apps.

small-models llama-cpp local-llm local-ai unsloth coding-agent

Updated Jun 28, 2026
Rust

MCG-NJU / AMD

[CVPR 2024] Asymmetric Masked Distillation for Pre-Training Small Foundation Models

action-recognition video-understanding distillation self-supervised-learning temporal-action-detection foundation-models small-models cvpr2024

Updated Jan 11, 2026
Python

logic-OT / Decoder-Only-LLM

This repository features a custom-built decoder-only language model (LLM) with a total of 37 million parameters 🔥. I trained the model to ask questions from a given context.

nlp computer-vision deep-learning inference transformer attention-mechanism decoder-model large-language-models llm small-models

Updated Aug 27, 2024
Jupyter Notebook

logic-OT / BobVLM

BobVLM – A 1.5B multimodal model built from scratch and pre-trained on a single P100 GPU. It is capable of image description and moderate question answering. 🤗🎉

nlp experiment library deep-learning gpu multimodal huggingface huggingface-transformers vision-transformer llm llms small-models vlms

Updated Feb 17, 2025
Python

zhangyifei01 / Awesome-Self-supervised-Learning-of-Tiny-Models

Overview of self-supervised learning of tiny models, including distillation-based methods (a.k.a. self-supervised distillation) and non-distillation methods.

knowledge-distillation self-supervised binary-neural-networks self-supervised-distillation lightweight-models tiny-models small-models

Updated Nov 13, 2022

sfarhat / dapt

Code for "On the Surprising Efficacy of Distillation as an Alternative to Pre-Training Small Models"

synthetic-data distillation pre-training contrastive-learning small-models

Updated Apr 5, 2024
Python

NoaiRox / specialist-agent

Provide specialized AI agents that develop, review, debug, and deploy production-ready code efficiently across various programming tasks.

Updated Jul 7, 2026
JavaScript

JarvisPei / HarnessForge

Open-source harness distillation: frontier teachers improve prompts, tools, validators, skills, and runtime policies around weaker models.

model-evaluation ai-agents tool-use prompt-engineering small-models llm-systems runtime-policies harness-distillation

Updated Jul 7, 2026
Python

ENSTA-U2IS-AI / optuMNIST

Help us define the Pareto front of small models for MNIST classification. Frugal AI.

deep-neural-networks deep-learning mnist-classification frugality small-models

Updated Jul 13, 2023
Python

KapitalSP / VOID

If OpenAI builds engines, VOID builds the chassis.

python modular cross-platform api-gateway self-hosted termux ai-framework embedded-ai edge-ai small-models llama-cpp local-llm gguf offline-ai experimental-ai lightweight-ai rule-based-ai

Updated Feb 19, 2026
Python

weirenong / simpleagent

Tiny local AI agent for edge devices, built to make small models useful with workflows, personas, file/web context, and safe tool-like actions.

edge-ai llm small-models local-llm local-ai ollama coding-agent nemotron-nano

Updated May 24, 2026
Python

Joello2925 / hermes-mod

Manage Hermes CLI skins in a web UI, edit live YAML fields, and save or activate skins with built-in previews.

xmlhttprequest r modules shiny xml ajax hermes rna-seq-analysis runbook ipc-hermes-9852 small-models pinokio llm-agents deterministic-workflows hermes-agent hermes-plugin model-discipline

Updated Jul 7, 2026

dane-codes / TellMeWhy-Context-Injection

Fine-tunes a T5-small model on the TellMeWhy dataset using context injection from a large language model (Gemini) to improve causal reasoning for “why” questions in narratives. Combines efficient training with human and automated evaluations to assess impact.

nlp transformers gemini question-answering language-model fine-tuning huggingface t5 context-injection commensense human-evaluation ai-evaluation small-models bleurt t5-small

Updated May 18, 2025
Jupyter Notebook

hermes-labs-ai / quickthink

quickthink is a local-first CLI and Python library that wraps Ollama-backed LLM calls with a compressed plan-then-answer scaffold and latency-aware routing. It adds a short validated planning step for multi-step prompts and routes simple ones straight through. Local inference control for small models.

cli latency routing scaffold inference structured-output local-first ai-tools llm prompt-engineering small-models local-llm llm-inference ollama llm-ops small-llm ai-reliability plan-then-answer

Updated Jun 7, 2026
Python
