[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
Updated Aug 13, 2024 - Python
[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
OpenSSA: Small Specialist Agents based on Domain-Aware Neurosymbolic Agent (DANA) architecture for industrial problem-solving
Downscaling Intelligence: Exploring Perception and Reasoning Bottlenecks in Small Multimodal Models
[CVPR 2024] Asymmetric Masked Distillation for Pre-Training Small Foundation Models
This repository features a custom-built decoder-only language model (LLM) with 37 million parameters 🔥. I train the model to ask questions from a given context.
BobVLM – A 1.5B multimodal model built from scratch and pre-trained on a single P100 GPU, capable of image description and moderate question answering. 🤗🎉
Overview of self-supervised learning of tiny models, including distillation-based methods (a.k.a. self-supervised distillation) and non-distillation methods.
Code for "On the Surprising Efficacy of Distillation as an Alternative to Pre-Training Small Models"
Help us define the Pareto front of small models for MNIST classification. Frugal AI.
If OpenAI builds engines, VOID builds the chassis.
Provide specialized AI agents that develop, review, debug, and deploy production-ready code efficiently across various programming tasks.
Portable Knowledge — Prepared For AI
Universal AI agent framework — 1B models compose multi-step plans like 12B with structured context (CTT). 6 domains, 8 MCP tools, 167 tests, zero runtime deps.
Local-first coding agent built with small models in mind and tight VRAM budgets.
LlamaTalks is a modern web application designed to facilitate seamless conversations with powerful language models,
Fine-tunes a T5-small model on the TellMeWhy dataset using context injection from a large language model (Gemini) to improve causal reasoning for “why” questions in narratives. Combines efficient training with human and automated evaluations to assess impact.
Phi-3-Vision model test - running locally
Hermes Agent plugin: bounded runbook-driven workflows for small local models. Deterministic scripts, verifier-first discipline, guarded publication.