v0.4.0 — preLLM becomes an automatic persistent context layer for small LLMs.
Small LLMs (Bielik 7B/11B, Qwen 3B, Phi3) lose context after 5–10 exchanges, hallucinate without pre-prompts, and don't know the execution environment. Users must manually craft system prompts with env info, project structure, and session history.
preLLM automatically:
- Collects runtime context (env, process, locale, network, git, system)
- Compresses codebase into token-efficient `.toon` format
- Persists session history across restarts (SQLite)
- Retrieves relevant context via RAG-style similarity search
- Filters sensitive data (API keys, tokens) before anything is sent to the large LLM
All of this happens with zero manual pre-prompts.
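
To make the persistence and retrieval bullets concrete, here is a minimal sketch of the idea using only the standard library. It is not preLLM's implementation: the `SessionStore` class, its table schema, and the word-overlap (Jaccard) score are assumptions chosen for illustration, and a real RAG setup would more likely rank by embeddings.

```python
import re
import sqlite3


def _tokens(text: str) -> set:
    """Lowercased word set used for the overlap score."""
    return set(re.findall(r"\w+", text.lower()))


def _jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a and b else 0.0


class SessionStore:
    """Hypothetical persistent session store (illustrative, not preLLM's API)."""

    def __init__(self, path: str = ":memory:") -> None:
        # Passing a file path instead of ":memory:" makes history survive restarts.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS history ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, role TEXT, content TEXT)"
        )

    def add(self, role: str, content: str) -> None:
        with self.conn:  # commits the insert on success
            self.conn.execute(
                "INSERT INTO history (role, content) VALUES (?, ?)", (role, content)
            )

    def relevant(self, query: str, top_k: int = 2) -> list:
        """Return up to top_k stored messages ranked by word overlap with query."""
        q = _tokens(query)
        rows = self.conn.execute("SELECT content FROM history ORDER BY id").fetchall()
        scored = [(_jaccard(q, _tokens(text)), text) for (text,) in rows]
        scored = [pair for pair in scored if pair[0] > 0]  # drop unrelated messages
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:top_k]]


store = SessionStore()
store.add("user", "The ESP32 board reports temperature over MQTT")
store.add("user", "Please use Polish in all answers")
store.add("assistant", "MQTT topic for the ESP32 sensor is home/esp32/temp")
hits = store.relevant("optimize ESP32 MQTT monitoring")
print(hits)
```

Only the two ESP32/MQTT messages share words with the query, so the language-preference message is left out of the retrieved context.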
```text
User Query
    │
    ▼
┌─────────────────────────────────────────────────┐
│  CONTEXT LAYER (automatic)                      │
│                                                 │
│  RuntimeContext     → env, process, locale,     │
│                       network, git, system      │
│  SessionPersistence → SQLite history + prefs    │
│  CodebaseCompressor → .toon project summary     │
│  SensitiveFilter    → block API keys/tokens     │
└──────────────┬──────────────────────────────────┘
               │ enriched context
               ▼
┌─────────────────────────────────────────────────┐
│  PREPROCESSOR AGENT (small LLM ≤24B)            │
│  auto-strategy selection + pipeline execution   │
│  receives: query + runtime + history + codebase │
└──────────────┬──────────────────────────────────┘
               │ structured executor_input
               ▼
┌─────────────────────────────────────────────────┐
│  EXECUTOR AGENT (large LLM)                     │
│  receives: sanitized prompt (no secrets)        │
└─────────────────────────────────────────────────┘
```
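
As a rough illustration of what the `RuntimeContext` entry in the diagram might hold, the sketch below collects a few of those fields with the standard library. The `gather_runtime_sketch` function, the `_git` helper, and the dictionary layout are hypothetical and are not taken from preLLM's source; the local IP lookup is left out to keep the example offline-safe.

```python
import os
import platform
import socket
import subprocess
import time
from typing import Optional


def _git(*args: str) -> Optional[str]:
    """Run a git command; return None outside a repo or when git is missing."""
    try:
        out = subprocess.run(
            ["git", *args], capture_output=True, text=True, timeout=2, check=True
        )
        return out.stdout.strip()
    except (OSError, subprocess.SubprocessError):
        return None


def gather_runtime_sketch() -> dict:
    """Collect a small, assumed subset of the runtime context."""
    return {
        "process": {"pid": os.getpid(), "cwd": os.getcwd()},
        "locale": {"lang": os.environ.get("LANG", ""), "timezone": time.tzname[0]},
        "network": {"hostname": socket.gethostname()},
        "git": {
            "branch": _git("rev-parse", "--abbrev-ref", "HEAD"),
            "short_sha": _git("rev-parse", "--short", "HEAD"),
        },
        "system": {
            "os": platform.system(),
            "arch": platform.machine(),
            "python": platform.python_version(),
        },
    }


runtime = gather_runtime_sketch()
print(runtime["system"])
```

Returning `None` for the git fields instead of raising keeps the collector usable in directories that are not repositories.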
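```python
from prellm import preprocess_and_execute

# Zero-config: everything is automatic in v0.4.
# Note: `await` must run inside an async function (or an async-aware REPL).
result = await preprocess_and_execute(
    query="Zoptymalizuj monitoring ESP32",  # Polish: "Optimize ESP32 monitoring"
    small_llm="ollama/bielik:7b",
    large_llm="openrouter/google/gemini-3-flash-preview",
)
```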
### Full Persistent Context
```python
result = await preprocess_and_execute(
    query="Zoptymalizuj monitoring ESP32",
    small_llm="ollama/bielik:7b",
    large_llm="openrouter/google/gemini-3-flash-preview",
    strategy="auto",                     # auto-select best strategy
    collect_runtime=True,                # full env/shell snapshot
    session_path=".prellm/sessions.db",  # persistent history
    codebase_path=".",                   # compress project → .toon
    sanitize=True,                       # filter secrets
)
```

The RuntimeContext model captures the full execution environment:

```python
from prellm.analyzers.context_engine import ContextEngine

engine = ContextEngine()
runtime = engine.gather_runtime()

print(runtime.env_safe)                 # filtered env vars (no secrets)
print(runtime.process)                  # {"pid": 1234, "cwd": "/project", ...}
print(runtime.locale)                   # {"lang": "pl_PL.UTF-8", "timezone": "CET", ...}
print(runtime.network)                  # {"hostname": "nvidia", "local_ip": "192.168.1.10"}
print(runtime.git)                      # {"branch": "main", "short_sha": "abc1234"}
print(runtime.system)                   # {"os": "Linux", "arch": "x86_64", "python": "3.13"}
print(runtime.sensitive_blocked_count)  # 7 (env vars blocked)
print(runtime.token_estimate)           # 350 tokens
```

```bash
prellm context show                # formatted runtime context
prellm context show --json         # as JSON
prellm context show --blocked      # show env vars + what was blocked
prellm context show --codebase .   # include compressed project
```

| Parameter | Default | Description |
|---|---|---|
| `strategy` | `"auto"` | Small-LLM auto-selects strategy (was: `"classify"`) |
| `collect_runtime` | `True` | Collect full env/process/locale/network/git/system |
| `session_path` | `None` | Path to session persistence SQLite DB |
| `sanitize` | `True` | Filter sensitive data before large-LLM |
| `sensitive_rules` | `None` | Custom YAML rules for sensitive data |
| `codebase_path` | `None` | Folder to compress for context |

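
The `sanitize` and `sensitive_rules` options are about keeping secrets out of the large-LLM prompt. A minimal sketch of name-based filtering for environment variables follows; the regex pattern and the `split_env` helper are illustrative assumptions, not preLLM's built-in rules or its YAML rule format.

```python
import re

# Assumed pattern: variables whose names contain these words are treated as secrets.
SENSITIVE_NAME = re.compile(r"(KEY|TOKEN|SECRET|PASSWORD|CREDENTIAL)", re.IGNORECASE)


def split_env(env: dict) -> tuple:
    """Return (safe variables, names of blocked variables)."""
    safe, blocked = {}, []
    for name, value in env.items():
        if SENSITIVE_NAME.search(name):
            blocked.append(name)  # keep only the name, never the value
        else:
            safe[name] = value
    return safe, blocked


sample = {
    "LANG": "pl_PL.UTF-8",
    "OPENROUTER_API_KEY": "sk-example",
    "GITHUB_TOKEN": "ghp_example",
    "HOME": "/home/user",
}
env_safe, blocked = split_env(sample)
print(env_safe)      # {'LANG': 'pl_PL.UTF-8', 'HOME': '/home/user'}
print(len(blocked))  # 2
```

Matching on names alone is deliberately conservative and can over-block (a variable such as `KEYBOARD_LAYOUT` would also match), which is one reason custom rules are useful.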
The new `context_aware` pipeline in `configs/pipelines.yaml` runs 6 steps:
- `collect_runtime`: gather `RuntimeContext`
- `inject_session`: RAG-retrieve relevant history from `UserMemory`
- `classify_with_context`: auto-select strategy using runtime context
- `decompose`: classify/structure/split/enrich the query
- `compose`: build optimized prompt for large-LLM
- `sanitize`: filter sensitive data before output
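
One way to picture a step list like this is a chain of functions that each take and return a shared context dictionary. The sketch below is hypothetical: the step bodies are placeholders and `run_pipeline` is not preLLM's engine; only the step names come from the list above.

```python
from typing import Callable, List

Context = dict
Step = Callable[[Context], Context]


def collect_runtime(ctx: Context) -> Context:
    return {**ctx, "runtime": {"os": "Linux"}}  # placeholder values


def inject_session(ctx: Context) -> Context:
    return {**ctx, "history": ["earlier exchange"]}  # placeholder history


def classify_with_context(ctx: Context) -> Context:
    return {**ctx, "strategy": "structure"}  # placeholder decision


def decompose(ctx: Context) -> Context:
    return {**ctx, "parts": ctx["query"].split(" and ")}


def compose(ctx: Context) -> Context:
    return {**ctx, "prompt": f"[{ctx['strategy']}] " + "; ".join(ctx["parts"])}


def sanitize(ctx: Context) -> Context:
    # Placeholder: a real filter would apply rules, not replace one literal.
    return {**ctx, "prompt": ctx["prompt"].replace("sk-example", "[REDACTED]")}


CONTEXT_AWARE: List[Step] = [
    collect_runtime, inject_session, classify_with_context,
    decompose, compose, sanitize,
]


def run_pipeline(steps: List[Step], query: str) -> Context:
    ctx: Context = {"query": query}
    for step in steps:  # each step sees everything earlier steps added
        ctx = step(ctx)
    return ctx


result = run_pipeline(CONTEXT_AWARE, "build the image and deploy with key sk-example")
print(result["prompt"])
# [structure] build the image; deploy with key [REDACTED]
```

Running `sanitize` last means the redaction covers text produced by every earlier step, not just the original query.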

To use the `context_aware` pipeline, pass its name via `pipeline`:

```python
result = await preprocess_and_execute(
    query="Deploy to production",
    pipeline="context_aware",  # use the full context pipeline
)
```

See also:
- Session Persistence: export/import/RAG
- Sensitive Data Filtering: rules and config
- CHANGELOG: v0.4.0 details
- ROADMAP: future plans