Model used in tests and examples: qwen2.5-0.5b-instruct_q4_K_M.gguf
HuggingFace Repo: provetgrizzner/qwen-bundle
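To fetch the test model, you can pull it straight from that repo. The URL below assumes the standard HuggingFace `resolve` path and the exact filename listed above:

```sh
# Assumed URL: HuggingFace resolve path for the file named above
wget https://huggingface.co/provetgrizzner/qwen-bundle/resolve/main/qwen2.5-0.5b-instruct_q4_K_M.gguf
```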
A universal, OS-agnostic AI model bundler and runner. Build single-file AI executables that run natively on Linux, Windows, and macOS without dependencies, using Cosmopolitan Libc.
- Polyglot Binaries: One file (`.baremetallama`) runs on Windows (as `.exe`), Linux, and macOS.
- Embedded Inference: The engine and weights are fused into a single executable (see the sketch after this list).
- Zero Dependencies: No Python, no CUDA, no DLLs required on the target machine.
- Bare-Metal Vision: Roadmap for `.pureblm`, a bootable RTOS runner that runs AI directly on hardware.
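As a rough mental model for the embedded-inference point above (a conceptual sketch only; the real bundler's on-disk format is not documented here, and it also records where the weights start so the runner can find them), fusing engine and weights amounts to:

```sh
# Conceptual sketch only -- not the actual bundler format.
cp llama-server.com qwen.baremetallama       # start from the portable runner
cat your_model.gguf >> qwen.baremetallama    # append the GGUF weights
```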
- `bundler/`: CLI tool to package models into `.baremetallama` files.
- `vendor/llama.cpp/`: Modified llama.cpp source for Cosmopolitan compatibility.
- `Makefile.cosmo`: The primary build system for universal binaries.
- `PUREBLM_ARCHITECTURE.md`: Technical roadmap for the bare-metal bootable runner.
- `docs/diagrams.md`: Mermaid.js diagrams of the system architecture.
You need the Cosmocc toolchain to compile universal binaries.
```sh
# Download and set up cosmocc
wget https://cosmo.zip/pub/cosmocc/cosmocc.zip
unzip cosmocc.zip -d cosmocc/
export PATH="$PWD/cosmocc/bin:$PATH"
```
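As an optional sanity check (hypothetical file names), confirm the toolchain works by compiling a trivial program:

```sh
# Smoke test: build and run a minimal portable binary
printf 'int main(){return 0;}\n' > hello.c
cosmocc -o hello.com hello.c
./hello.com && echo "cosmocc works"
```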
Use the custom Cosmopolitan Makefile to build the portable llama-server.com:

```sh
make -f Makefile.cosmo -j$(nproc)
```

Then compile the bundler CLI itself:

```sh
cosmoc++ -O3 -mcosmo bundler/bundler.cpp -o bundler/baremetallama.com
```

Pack a GGUF model into a standalone `.baremetallama` file:
```sh
./bundler/baremetallama.com llama-server.com your_model.gguf qwen.baremetallama
```
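For example, to bundle the test model from the top of this README into the `qwen.baremetallama` file used in the run examples below (assuming the GGUF is in the working directory):

```sh
./bundler/baremetallama.com llama-server.com qwen2.5-0.5b-instruct_q4_K_M.gguf qwen.baremetallama
```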
If you don't want to install cosmocc locally, you can use Docker to bundle your models:

```sh
docker build -t baremetallama .
```

Mount your current directory to /work inside the container:
```sh
docker run -v $(pwd):/work baremetallama /work/model.gguf /work/output.baremetallama
```

This command writes an `output.baremetallama` file to your local folder that runs on any OS.
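To verify the result looks like a portable executable, you can check for Cosmopolitan's APE magic bytes (this assumes the bundle is a standard APE file):

```sh
head -c 6 output.baremetallama   # should print: MZqFpD
```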
Rename to `.exe` or run directly from CMD/PowerShell:

```bat
.\qwen.baremetallama
```

On Linux/macOS, make it executable and run it:

```sh
chmod +x qwen.baremetallama
./qwen.baremetallama
```

By default, running the bundle without arguments launches an interactive Chat TUI in your terminal.
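Because the engine is a modified llama-server, the bundle may also accept upstream llama-server flags and expose llama-server's OpenAI-compatible HTTP API; that is an assumption, not something this README confirms:

```sh
# Hypothetical: assumes standard llama-server flags are forwarded
./qwen.baremetallama --host 127.0.0.1 --port 8080

# llama-server normally serves an OpenAI-compatible endpoint
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}]}'
```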
- Repository: RedLordezh7Venom/baremetallama
- TUI Engine: Modified llama-server (llama.cpp)
- Runtime: Cosmopolitan Libc