| 1 | +# LLMVCS |
| 2 | + |
| 3 | +LLMVCS reduces LLM agent token costs by turning “what to do” into tiny, stable instruction references that a deterministic interpreter can execute. |
| 4 | + |
| 5 | +Instead of having the LLM repeatedly re-describe common operations in natural language, you define those operations once in human-readable `.txt` catalogs, index them for semantic search, and then have the LLM output compact `.vcs` programs that reference operations by numeric IDs. |
| 6 | + |
| 7 | +## The algorithm (in simple terms) |
| 8 | + |
| 9 | +1. **Write operations once (human-readable)** |
| 10 | + Put reusable operations in `.txt` files under `vc-database/source/`. Entries are separated by blank lines, and each entry's position in the file (its index) is its stable ID. |
| 11 | + |
| 12 | +2. **Build a searchable index** |
| 13 | + Run `vc-database/vcdb.py` to convert the `.txt` catalogs into `.dat` files under `vc-database/vectors/` (TF-IDF vectors stored as JSON). |
| 14 | + |
| 15 | +3. **At runtime: semantic lookup (via MCP in your system)** |
| 16 | + When a user asks for something, the agent calls your vector-search tool to retrieve the best matching **module** and **operation** IDs. |
| 17 | + |
| 18 | +4. **Emit low-token `.vcs`** |
| 19 | + The agent outputs a tiny instruction stack like `moduleId.opcode(params)` instead of verbose code. |
| 20 | + |
| 21 | +5. **Execute `.vcs` in an embedded interpreter** |
| 22 | + A small, language-agnostic, single-file interpreter executes the instructions deterministically inside your app. |
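Steps 1 to 3 can be illustrated with a minimal, zero-dependency sketch. It is not the code in `vc-database/vcdb.py`: the function names, the tokenizer, the smoothed IDF formula, the sample catalog, and the 0-based IDs are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def split_entries(catalog_text):
    """Split a catalog into blank-line-separated entries; list position is the ID."""
    return [e.strip() for e in re.split(r"\n\s*\n", catalog_text.strip()) if e.strip()]


def tokenize(text):
    """Lowercase and split into alphanumeric terms (an assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(entries):
    """Return (idf, vectors); vectors[i] is the TF-IDF dict for entry i."""
    docs = [Counter(tokenize(e)) for e in entries]
    n = len(docs)
    df = Counter(term for doc in docs for term in doc)
    # Smoothed IDF; the real index may weight terms differently.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, idf, vectors):
    """Return the ID (list position) of the entry most similar to the query."""
    counts = Counter(tokenize(query))
    q = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scores = [cosine(q, v) for v in vectors]
    return max(range(len(vectors)), key=scores.__getitem__)


# Hypothetical catalog content, for illustration only.
CATALOG = """halt: stop the program

set variable: store a value under a name

add: add two numbers and store the sum
"""

entries = split_entries(CATALOG)
idf, vectors = build_index(entries)
best_id = search("add numbers", idf, vectors)
```

The agent only needs the returned ID; the matching entry text stays in the catalog and never has to be repeated in the model's output.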
| 23 | + |
| 24 | +## Why this reduces tokens |
| 25 | + |
| 26 | +- **Knowledge is stored once** in the operation catalogs (`.txt`) and searchable indexes (`.dat`). |
| 27 | +- **Agent output stays small** because `.vcs` references operations numerically (stable IDs) instead of repeating long explanations or boilerplate. |
| 28 | + |
| 29 | +## Repository layout |
| 30 | + |
| 31 | +- `vc-database/` — builds/searches `.dat` indexes from human-readable `.txt` sources (zero-dependency Python). |
| 32 | +- `vcs-js/` — a working JavaScript interpreter + `.vcs` compiler and example stacks. |
| 33 | +- `vcs-pseudo/` — the language-agnostic pseudo-code spec used to generate new interpreters. |
| 34 | + |
| 35 | +## `.vcs` format (vectorized code stack) |
| 36 | + |
| 37 | +Each line is an instruction: |
| 38 | + |
| 39 | +```text |
| 40 | +MODULE_ID.OPCODE(param1, param2, ...) |
| 41 | +``` |
| 42 | + |
| 43 | +- Comments start with `//`; the JS compiler also ignores `#` comments. |
| 44 | +- Variables are referenced as `$name` (resolved to a register slot at compile time). |
| 45 | +- Strings can be `"double quoted"` or `'single quoted'`. |
| 46 | +- Labels are supported by the JS interpreter for authoring convenience: |
| 47 | + - define: `:loop` |
| 48 | + - jump to: `@loop` (compile-time label reference) |
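To make the grammar above concrete, here is a small parsing sketch. It is an assumption-laden illustration, not the compiler in `vcs-js/`: `parse_line`, `parse_params`, and the `(kind, value)` tuples are hypothetical, only whole-line comments are handled, and labels are left out.

```python
import re

# MODULE_ID.OPCODE(params) with both IDs written as integers.
LINE_RE = re.compile(r"^(\d+)\.(\d+)\((.*)\)$")

# One parameter: "double quoted", 'single quoted', $name, or a number,
# followed by a comma or the end of the parameter list.
PARAM_RE = re.compile(
    r'\s*(?:"([^"]*)"' r"|'([^']*)'" r"|\$(\w+)|(-?\d+(?:\.\d+)?))\s*(?:,|$)"
)


def parse_params(text):
    """Parse a comma-separated parameter list into (kind, value) tuples."""
    text = text.strip()
    params, pos = [], 0
    while pos < len(text):
        match = PARAM_RE.match(text, pos)
        if match is None:
            raise ValueError(f"cannot parse parameter at offset {pos}: {text!r}")
        double, single, var, num = match.groups()
        if double is not None:
            params.append(("str", double))
        elif single is not None:
            params.append(("str", single))
        elif var is not None:
            params.append(("var", var))
        else:
            params.append(("num", float(num) if "." in num else int(num)))
        pos = match.end()
    return params


def parse_line(line):
    """Return (module_id, opcode, params), or None for blank and comment lines."""
    line = line.strip()
    if not line or line.startswith(("//", "#")):
        return None
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"not a valid instruction: {line!r}")
    module_id, opcode, raw_params = match.groups()
    return int(module_id), int(opcode), parse_params(raw_params)
```

For instance, `parse_line('0.17($x, 5)')` yields module `0`, opcode `17`, and the parameters `("var", "x")` and `("num", 5)`.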
| 49 | + |
| 50 | +Example: |
| 51 | + |
| 52 | +```text |
| 53 | +// x = 10 |
| 54 | +0.11("x", 10) |
| 55 | +// x = x + 5 |
| 56 | +0.17($x, 5) |
| 57 | +0.11("x", $result) |
| 58 | +// print x, then stop |
| 59 | +0.13($x) |
| 60 | +0.1() |
| 61 | +``` |
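The example can be traced with a toy dispatch loop. This is only a sketch: the opcode meanings (`0.1` halt, `0.11` set, `0.13` print, `0.17` add into `result`) are inferred from the comments in the example, the program is written as pre-parsed tuples, and variables live in a plain dictionary rather than compile-time register slots.

```python
def run(program):
    """Execute pre-parsed (module_id, opcode, params) tuples; return (output, variables)."""
    variables = {}
    output = []

    def value(param):
        # Resolve a ("var", name) parameter to its current value; pass literals through.
        kind, payload = param
        return variables[payload] if kind == "var" else payload

    def op_halt(params):
        return False  # returning False stops the loop

    def op_set(params):
        variables[value(params[0])] = value(params[1])
        return True

    def op_print(params):
        output.append(value(params[0]))
        print(output[-1])
        return True

    def op_add(params):
        # Assumed from the example: the sum is stored in a variable named "result".
        variables["result"] = value(params[0]) + value(params[1])
        return True

    # Opcode numbers are inferred from the example's comments, not from the real catalogs.
    handlers = {(0, 1): op_halt, (0, 11): op_set, (0, 13): op_print, (0, 17): op_add}

    for module_id, opcode, params in program:
        if not handlers[(module_id, opcode)](params):
            break
    return output, variables


# The example program above, already parsed into tuples.
PROGRAM = [
    (0, 11, [("str", "x"), ("num", 10)]),
    (0, 17, [("var", "x"), ("num", 5)]),
    (0, 11, [("str", "x"), ("var", "result")]),
    (0, 13, [("var", "x")]),
    (0, 1, []),
]

output, variables = run(PROGRAM)
```

Because every instruction maps to exactly one handler, the same program produces the same result on every run, which is the deterministic behaviour the interpreter is meant to provide.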
| 62 | + |
| 63 | +## Quickstart |
| 64 | + |
| 65 | +### 1) Build the vector databases (`.dat`) |
| 66 | + |
| 67 | +From the repo root: |
| 68 | + |
| 69 | +```bash |
| 70 | +python vc-database/vcdb.py |
| 71 | +``` |
| 72 | + |
| 73 | +This reads `vc-database/source/*.txt` and writes `vc-database/vectors/*.dat`. |
| 74 | + |
| 75 | +To demo search output: |
| 76 | + |
| 77 | +```bash |
| 78 | +python vc-database/vcdb.py --demo-search add numbers |
| 79 | +``` |
| 80 | + |
| 81 | +### 2) Run the JavaScript interpreter smoke test |
| 82 | + |
| 83 | +From the repo root: |
| 84 | + |
| 85 | +```bash |
| 86 | +node vcs-js/smoke-test.js |
| 87 | +``` |
| 88 | + |
| 89 | +### 3) (Optional) Run the browser demo |
| 90 | + |
| 91 | +Open `vcs-js/index.html` in a browser. |
| 92 | + |
| 93 | +## Porting to other languages |
| 94 | + |
| 95 | +Use the single-file interpreter spec in `vcs-pseudo/vcs.pseudo` as the authoritative reference. An LLM can generate an interpreter for a new language by following that spec (data structures, compiler, engine loop, module interface). |
| 96 | + |
| 97 | +## Compatibility: stable IDs (important) |
| 98 | + |
| 99 | +LLMVCS relies on **stable numeric IDs**: |
| 100 | + |
| 101 | +- **Module IDs** come from the entry order in `vc-database/source/vector-categories.txt`. |
| 102 | +- **Operation IDs / opcodes** come from the entry order in each `vc-database/source/*.txt` module catalog. |
| 103 | + |
| 104 | +Because IDs are positions, reordering, inserting, or deleting entries shifts the IDs of later entries and is a breaking change. Only append new entries so that existing IDs stay valid. |
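The effect of position-based IDs can be checked with a short sketch. The helper names and catalog contents are hypothetical, and 0-based numbering is an assumption.

```python
def assign_ids(catalog_text):
    """Map each blank-line-separated entry to its position (assumed 0-based)."""
    entries = [block.strip() for block in catalog_text.split("\n\n") if block.strip()]
    return {entry: index for index, entry in enumerate(entries)}


def changed_ids(old, new):
    """Return the entries whose ID differs, or that disappeared, between two versions."""
    return sorted(name for name, index in old.items() if new.get(name) != index)


# Hypothetical catalog versions, for illustration only.
original = "halt\n\nset\n\nadd"
appended = original + "\n\nmultiply"
reordered = "add\n\nhalt\n\nset"

# Appending keeps every existing ID.
safe = changed_ids(assign_ids(original), assign_ids(appended))

# Reordering changes every ID, so old .vcs programs would call the wrong operations.
broken = changed_ids(assign_ids(original), assign_ids(reordered))
```

A check like this could run before committing catalog changes to catch accidental renumbering.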
| 105 | + |
| 106 | +Recommended convention (optional): put a version header at the top of `.vcs` programs, for example: |
| 107 | + |
| 108 | +```text |
| 109 | +// vcs:1 |
| 110 | +``` |
| 111 | + |
| 112 | +## Safety / security note |
| 113 | + |
| 114 | +`.vcs` is executable instruction input. Treat it like code: |
| 115 | + |
| 116 | +- Only enable modules/opcodes you intend to allow. |
| 117 | +- Sandbox or validate inputs if `.vcs` can come from untrusted sources. |
| 118 | +- Keep module/opcode mappings stable; changing IDs is a breaking change. |
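One way to apply the first two points is to validate a program against an allowlist before it reaches the interpreter. The sketch below is illustrative only: `find_violations`, the allowlist contents, and the skipping of comment and label lines are assumptions, not part of the shipped interpreter.

```python
import re

# Hypothetical allowlist of (module_id, opcode) pairs the host app permits.
ALLOWED = {(0, 1), (0, 11), (0, 13), (0, 17)}

INSTRUCTION_RE = re.compile(r"^(\d+)\.(\d+)\(")


def find_violations(vcs_source, allowed=ALLOWED):
    """Return (line_number, reason) for every line that must not be executed."""
    violations = []
    for line_no, raw in enumerate(vcs_source.splitlines(), start=1):
        line = raw.strip()
        # Blank lines, comments, and label definitions carry no opcode.
        if not line or line.startswith(("//", "#", ":")):
            continue
        match = INSTRUCTION_RE.match(line)
        if match is None:
            violations.append((line_no, "not an instruction"))
        elif (int(match[1]), int(match[2])) not in allowed:
            violations.append((line_no, f"opcode {match[1]}.{match[2]} is not allowed"))
    return violations
```

Rejecting the whole program when this list is non-empty is simpler and safer than skipping individual lines, because a partially executed stack can leave the app in an unexpected state.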
| 119 | + |