Commit 6a345c6

Initial Commit.
Author: Tolin Simpson
0 parents, commit 6a345c6

37 files changed

Lines changed: 8030 additions & 0 deletions

.cursor/plans/vcs_pseudo-code_interpreter_c5fd4cea.plan.md

Lines changed: 645 additions & 0 deletions
Large diffs are not rendered by default.

.github/workflows/ci.yml

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
name: ci

on:
  push:
  pull_request:

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: "20"

      - name: Build vector databases
        run: python vc-database/vcdb.py

      - name: Demo search
        run: python vc-database/vcdb.py --demo-search add numbers

      - name: Python tests
        run: python -m unittest discover -s vc-database -p "test_*.py" -v

      - name: JS interpreter smoke test
        run: node vcs-js/smoke-test.js

      - name: JS tests
        run: node vcs-js/tests/test_vcs.js

.gitignore

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
# Python
__pycache__/
*.py[cod]
*.pyd
.Python

# Virtualenvs
.venv/
venv/
ENV/
env/

# Node
node_modules/
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*

# OS / editor
.DS_Store
Thumbs.db
Desktop.ini

.vscode/
.idea/

# Build / cache
dist/
build/
out/
coverage/

CHANGELOG.md

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
# Changelog

All notable changes to this project will be documented in this file.

## Unreleased

- (placeholder)

CODE_OF_CONDUCT.md

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
# Code of Conduct

This project follows a simple standard: be respectful and constructive.

Unacceptable behavior includes harassment, discrimination, threats, or sustained disruption.

Maintainers may remove content or restrict participation that violates this standard.

CONTRIBUTING.md

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
# Contributing

This repository is currently **proprietary** (see `LICENSE.txt`). External contributions may not be accepted unless explicitly requested by the maintainers.

If you want to help, the most useful contributions are:

- Clear bug reports (include OS, versions, and reproduction steps)
- Feature requests with concrete use-cases
- Documentation improvements (typos, clarifications, examples)

## Development guidelines

- Keep `.vcs` compatibility stable (avoid reordering `vc-database/source/*.txt` entries).
- Prefer small, focused changes with clear commit messages.
- Add/extend tests when changing interpreter parsing/execution or vector search behavior.

DOCUMENTATION.md

Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
# LLMVCS - LLM-Friendly Vectorized Code Stack Interpreter

A modular, cross-language stack-based interpreter with semantic search capabilities.

## Architecture

```
LLMVCS/
├── vc-database/
│   ├── source/          # Human-readable function definitions (.txt)
│   ├── vectors/         # Vectorized outputs (.dat)
│   └── vcdb.py          # Converts .txt → .dat; semantic search via TF-IDF (pure Python)
├── vcs-js/              # Working JavaScript interpreter + compiler for .vcs
│   ├── modules/         # Opcode modules (registered into the engine by moduleId)
│   ├── stacks/          # Example .vcs programs
│   └── vcs.js           # Single-file interpreter + compiler (JS)
└── vcs-pseudo/          # Language-agnostic single-file interpreter specification
    └── vcs.pseudo
```

## Database

### Source Files (`vc-database/source/*.txt`)

Define operations in a simple format. Entries are separated by a blank line (`\n\n`), and each entry gets a sequential ID (0, 1, 2, ...) based on its position in the file.

```
add | Adds two types.

subtract | Subtracts two types.

multiply | Multiplies two types.
```
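
To make the ID assignment concrete, here is a minimal sketch of how such a catalog could be read. It is not the code in `vcdb.py`; the function name `parse_catalog` and the choice to skip empty chunks are assumptions made for illustration.

```python
def parse_catalog(text):
    """Split a catalog into entries and assign sequential IDs by position."""
    entries = []
    # Entries are separated by a blank line; normalise Windows newlines first.
    blocks = [b.strip() for b in text.replace("\r\n", "\n").split("\n\n")]
    for block in blocks:
        if not block:
            continue  # assumption: extra blank lines do not consume an ID
        name, _, description = block.partition("|")
        entries.append({
            "id": len(entries),
            "name": name.strip(),
            "description": description.strip(),
        })
    return entries


catalog = (
    "add | Adds two types.\n\n"
    "subtract | Subtracts two types.\n\n"
    "multiply | Multiplies two types.\n"
)
for entry in parse_catalog(catalog):
    print(entry["id"], entry["name"])
```

Because the ID is just the position in the file, inserting an entry in the middle shifts every later ID, which is why the compatibility policy asks for append-only changes.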
34+
35+
### Vectorization
36+
37+
```bash
38+
cd vc-database
39+
python vcdb.py
40+
```
41+
42+
Converts `.txt` files to `.dat` (JSON TF-IDF vectors) for semantic search.
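
For readers unfamiliar with TF-IDF, the sketch below shows one common way to build sparse vectors in pure Python and serialize them as JSON. The tokenizer, the smoothed IDF formula, and the `{"idf": ..., "vectors": ...}` layout are assumptions; the actual `.dat` schema written by `vcdb.py` may differ.

```python
import json
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_tfidf(docs):
    """Return (idf, vectors); each vector is a sparse {term: weight} dict."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF, so a term present in every document keeps a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return idf, vectors


docs = [
    "add Adds two types.",
    "subtract Subtracts two types.",
    "multiply Multiplies two types.",
]
idf, vectors = build_tfidf(docs)
payload = json.dumps({"idf": idf, "vectors": vectors})  # hypothetical .dat layout
print(len(vectors), "vectors,", len(payload), "bytes of JSON")
```

Terms shared by every entry (here `two` and `types`) receive the lowest IDF, so the distinguishing words dominate each vector.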

### Vector Search

```python
# Run this from inside `vc-database/`
from vcdb import search

results = search(
    query="how to add numbers",
    db_path="vectors/llmvcc-ops.dat",
    top_k=3,
)
```
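
The ranking step behind a TF-IDF search is typically cosine similarity between the query vector and each entry vector. The sketch below illustrates that step on hand-written sparse vectors. The helper names `cosine` and `rank` are hypothetical, and the return shape of the real `search` function is not documented here, so treat the `(entry_id, score)` pairs as an assumption.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight}."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs, top_k=3):
    """Return the top_k (entry_id, score) pairs, best match first."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors; the list index plays the role of the operation ID.
doc_vecs = [
    {"add": 1.0, "numbers": 0.5},
    {"subtract": 1.0, "numbers": 0.5},
    {"print": 1.0},
]
query_vec = {"add": 1.0, "numbers": 1.0}
print(rank(query_vec, doc_vecs, top_k=2))
```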
56+
57+
## .vcs File Format
58+
59+
Compact instruction format: `moduleId.methodId(params)`
60+
61+
```vcs
62+
// Comments start with //
63+
0.13("Hello") // print("Hello")
64+
0.11("x", 10) // store("x", 10)
65+
0.17($x, 5) // add($x, 5)
66+
0.5(0, true) // jump_if(0, true)
67+
0.1() // stop()
68+
```
69+
70+
### Note on naming
71+
72+
The current JS compiler (`vcs-js/vcs.js`) parses numeric `moduleId.methodId(...)` only. Named variants like `math.add(...)` are a potential extension, but are not implemented today.
73+
74+
### Parameters
75+
76+
- Numbers: `10`, `3.14`
77+
- Strings: `"hello"`, `'world'`
78+
- Booleans: `true`, `false`
79+
- Variables: `$varName`
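
As an illustration of the format above, the following sketch parses one instruction line into `(module_id, method_id, params)`. It is not the compiler in `vcs-js/vcs.js`. The names `parse_instruction` and `convert`, the `("var", name)` representation of variables, and the lack of escape handling inside strings are all simplifying assumptions.

```python
import re

HEAD = re.compile(r"\s*(\d+)\.(\d+)\(")
TOKEN = re.compile(r"""
    \s*(?:
        (?P<str>"[^"]*"|'[^']*')    # quoted string (no escape handling)
      | (?P<var>\$[A-Za-z_]\w*)     # variable reference such as $x
      | (?P<bool>true|false)\b      # boolean literal
      | (?P<num>-?\d+(?:\.\d+)?)    # integer or decimal number
    )\s*
""", re.VERBOSE)


def convert(token):
    """Turn a matched parameter token into a Python value."""
    if token.group("str") is not None:
        return token.group("str")[1:-1]
    if token.group("var") is not None:
        return ("var", token.group("var")[1:])
    if token.group("bool") is not None:
        return token.group("bool") == "true"
    text = token.group("num")
    return float(text) if "." in text else int(text)


def parse_instruction(line):
    head = HEAD.match(line)
    if head is None:
        raise ValueError(f"not an instruction: {line!r}")
    module_id, method_id = int(head.group(1)), int(head.group(2))
    pos, params = head.end(), []
    if line[pos:].lstrip().startswith(")"):
        return module_id, method_id, params  # no parameters, e.g. 0.1()
    while True:
        token = TOKEN.match(line, pos)
        if token is None:
            raise ValueError(f"bad parameter at column {pos}: {line!r}")
        params.append(convert(token))
        pos = token.end()
        if pos < len(line) and line[pos] == ",":
            pos += 1
        elif pos < len(line) and line[pos] == ")":
            return module_id, method_id, params  # trailing comment is ignored
        else:
            raise ValueError(f"expected ',' or ')' at column {pos}: {line!r}")


print(parse_instruction('0.13("Hello")  // print("Hello")'))
print(parse_instruction("0.17($x, 5)"))
```

Scanning parameters token by token, rather than matching everything up to the last `)`, keeps a trailing comment such as `// print("Hello")` from being mistaken for part of the argument list.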

## Compatibility policy (stable IDs)

LLMVCS relies on **stable numeric IDs**:

- **Module IDs** are derived from the entry order in `vc-database/source/vector-categories.txt`
- **Operation IDs / opcodes** are derived from the entry order in each `vc-database/source/*.txt`

Reordering existing entries is a breaking change. Prefer only appending new entries to preserve old IDs.

Recommended convention (optional): add a format version header to `.vcs` programs:

```text
// vcs:1
```
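
One way to guard the append-only rule is a small check that compares the entry names of a catalog before and after a change. This is a hypothetical helper, not part of the repository; `check_append_only` and its message format are made up for illustration.

```python
def check_append_only(old_names, new_names):
    """Return a list of problems; an empty list means every old ID is preserved."""
    problems = []
    for entry_id, old in enumerate(old_names):
        if entry_id >= len(new_names):
            problems.append(f"id {entry_id} ({old}) was removed")
        elif new_names[entry_id] != old:
            problems.append(f"id {entry_id} changed: {old} -> {new_names[entry_id]}")
    return problems


old = ["add", "subtract", "multiply"]
print(check_append_only(old, old + ["divide"]))                  # appending is safe
print(check_append_only(old, ["subtract", "add", "multiply"]))   # reordering is not
```

A check like this could run in CI against the previous revision of each catalog, so an accidental reorder is caught before it invalidates existing `.vcs` programs.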


## Adding New Modules

1. Create `vc-database/source/your-module.txt` with operations
2. Add entry to `vc-database/source/vector-categories.txt`
3. Implement the module in your interpreter (for JS, add a class under `vcs-js/modules/` with `eval(methodId, args, ctx, regs, strings)`)
4. Register it into the engine at the desired `moduleId`
5. Run `vcdb.py` to update the search index
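
Steps 3 and 4 can be pictured with the following Python analogue of the JS module interface. Everything here is illustrative: the `ArithmeticModule` and `Engine` classes, module ID `7`, and method IDs `0` and `1` are hypothetical and do not correspond to real catalog entries. Real IDs come from the entry order in the source catalogs.

```python
class ArithmeticModule:
    """Hypothetical module; method IDs 0 and 1 are illustrative only."""

    def eval(self, method_id, args, ctx, regs, strings):
        # Mirrors the documented JS signature eval(methodId, args, ctx, regs, strings).
        if method_id == 0:
            return args[0] + args[1]
        if method_id == 1:
            return args[0] - args[1]
        raise ValueError(f"unknown method id {method_id}")


class Engine:
    """Minimal registry that dispatches calls by numeric module ID."""

    def __init__(self):
        self.modules = {}

    def register(self, module_id, module):
        if module_id in self.modules:
            raise ValueError(f"module id {module_id} is already registered")
        self.modules[module_id] = module

    def call(self, module_id, method_id, args):
        module = self.modules[module_id]
        # ctx, regs and strings are empty placeholders in this sketch.
        return module.eval(method_id, args, ctx={}, regs=[], strings=[])


engine = Engine()
engine.register(7, ArithmeticModule())
print(engine.call(7, 0, [2, 3]))
```

Refusing to register two modules at the same ID is a design choice of this sketch; it makes an accidental ID collision fail loudly instead of silently replacing a module.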

## Cross-Language Compatibility

The interpreter design is intended to be portable across languages. The authoritative portability reference is `vcs-pseudo/vcs.pseudo`, and `vcs-js/` is a working reference implementation.

LICENSE.txt

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
Copyright (c) 2026

All rights reserved.

This software and associated documentation files (the "Software") are proprietary.
No permission is granted to use, copy, modify, merge, publish, distribute,
sublicense, and/or sell copies of the Software, in whole or in part, without
prior written permission from the copyright holder.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Prompt.md

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
# VCS File Generator

Convert user tasks into `.vcs` files using the vector database MCP tool.

## Process

1. **Search categories**: `search(query, "vc-database/vectors/vector-categories.dat", top_k=2)` → get relevant module names
2. **Search operations**: `search(query, "vc-database/vectors/<module>.dat", top_k=5)` → get opcodes
3. **Generate .vcs**: Use `MODULE_ID.OPCODE(params)` format
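
The two search steps can be sketched as a small helper that takes the search tool as a parameter. The result shape (dicts with `id` and `name` keys), the helper name `plan_lookup`, the stand-in `fake_search`, and the module name `core` are assumptions for illustration; the real tool's output format is not specified here.

```python
def plan_lookup(query, search):
    """Two-stage lookup: pick modules first, then operations inside each module."""
    modules = search(query, "vc-database/vectors/vector-categories.dat", top_k=2)
    candidates = []
    for module in modules:
        db_path = f"vc-database/vectors/{module['name']}.dat"
        for op in search(query, db_path, top_k=5):
            candidates.append((module["id"], op["id"], op["name"]))
    return candidates


def fake_search(query, db_path, top_k):
    """Stand-in for the real search tool; returns canned results."""
    if db_path.endswith("vector-categories.dat"):
        return [{"id": 0, "name": "core"}][:top_k]
    return [{"id": 13, "name": "print"}, {"id": 1, "name": "stop"}][:top_k]


for module_id, opcode, name in plan_lookup("print hello then stop", fake_search):
    print(f"{module_id}.{opcode}  // {name}")
```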

## .vcs Syntax

```
// Comments start with //
MODULE_ID.OPCODE(param1, param2)
```

- Strings: `"text"`
- Variables: `$varname`
- Numbers/booleans: literals

## Example

Task: "Print hello then stop"

```vcs
// Print and stop
0.13("Hello")
0.1()
```

Output the .vcs file with inline comments explaining each line.

README.md

Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
# LLMVCS

LLMVCS reduces LLM agent token costs by turning “what to do” into tiny, stable instruction references that a deterministic interpreter can execute.

Instead of having the LLM repeatedly re-describe common operations in natural language, you define those operations once in human-readable `.txt` catalogs, index them for semantic search, and then have the LLM output compact `.vcs` programs that reference operations by numeric IDs.

## The algorithm (in simple terms)

1. **Write operations once (human-readable)**
   Put reusable operations in `.txt` files under `vc-database/source/`. Each entry is separated by a blank line and has a stable position (its index).

2. **Build a searchable index**
   Run `vc-database/vcdb.py` to convert the `.txt` catalogs into `.dat` files under `vc-database/vectors/` (JSON TF‑IDF vectors).

3. **At runtime: semantic lookup (via MCP in your system)**
   When a user asks for something, the agent calls your vector-search tool to retrieve the best matching **module** and **operation** IDs.

4. **Emit low-token `.vcs`**
   The agent outputs a tiny instruction stack like `moduleId.opcode(params)` instead of verbose code.

5. **Execute `.vcs` in an embedded interpreter**
   A small, language-agnostic, single-file interpreter executes the instructions deterministically inside your app.

## Why this reduces tokens

- **Knowledge is stored once** in the operation catalogs (`.txt`) and searchable indexes (`.dat`).
- **Agent output stays small** because `.vcs` references operations numerically (stable IDs) instead of repeating long explanations or boilerplate.

## Repository layout

- `vc-database/` — builds/searches `.dat` indexes from human-readable `.txt` sources (zero-dependency Python).
- `vcs-js/` — a working JavaScript interpreter + `.vcs` compiler and example stacks.
- `vcs-pseudo/` — the language-agnostic pseudo-code spec used to generate new interpreters.

## `.vcs` format (vectorized code stack)

Each line is an instruction:

```text
MODULE_ID.OPCODE(param1, param2, ...)
```

- Comments start with `//` (and `#` is also ignored by the JS compiler).
- Variables are referenced as `$name` (resolved to a register slot at compile time).
- Strings can be `"double quoted"` or `'single quoted'`.
- Labels are supported by the JS interpreter for authoring convenience:
  - define: `:loop`
  - jump to: `@loop` (compile-time label reference)

Example:

```text
// x = 10
0.11("x", 10)
// x = x + 5
0.17($x, 5)
0.11("x", $result)
// print x, then stop
0.13($x)
0.1()
```
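
Compile-time label references can be resolved with a two-pass scan, sketched below. This is not the JS compiler's code. It assumes a label resolves to the zero-based index of the next instruction, and it does not protect `@` characters that appear inside string literals.

```python
import re


def resolve_labels(lines):
    """Two-pass sketch: record label positions, then replace @label references."""
    labels, instructions = {}, []
    # Pass 1: collect instructions and remember where each label points.
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("//") or line.startswith("#"):
            continue  # blank lines and comments produce no instruction
        if line.startswith(":"):
            labels[line[1:]] = len(instructions)  # index of the next instruction
        else:
            instructions.append(line)
    # Pass 2: substitute each @name with its instruction index.
    # An unknown label raises KeyError, which surfaces typos early.
    return [
        re.sub(r"@(\w+)", lambda m: str(labels[m.group(1)]), line)
        for line in instructions
    ]


program = [
    '0.11("i", 0)',
    ":loop",
    "0.13($i)",
    "0.5(@loop, true)",
    "0.1()",
]
for line in resolve_labels(program):
    print(line)
```

Collecting labels in a first pass allows forward references, so a jump may target a label defined later in the file.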
62+
63+
## Quickstart
64+
65+
### 1) Build the vector databases (`.dat`)
66+
67+
From the repo root:
68+
69+
```bash
70+
python vc-database/vcdb.py
71+
```
72+
73+
This reads `vc-database/source/*.txt` and writes `vc-database/vectors/*.dat`.
74+
75+
To demo search output:
76+
77+
```bash
78+
python vc-database/vcdb.py --demo-search add numbers
79+
```
80+
81+
### 2) Run the JavaScript interpreter smoke test
82+
83+
From the repo root:
84+
85+
```bash
86+
node vcs-js/smoke-test.js
87+
```
88+
89+
### 3) (Optional) Run the browser demo
90+
91+
Open `vcs-js/index.html` in a browser.
92+
93+
## Porting to other languages
94+
95+
Use the single-file interpreter spec in `vcs-pseudo/vcs.pseudo` as the authoritative reference. New-language interpreters can be generated by an LLM by following that spec (data structures, compiler, engine loop, module interface).
96+
97+
## Compatibility: stable IDs (important)
98+
99+
LLMVCS relies on **stable numeric IDs**:
100+
101+
- **Module IDs** come from the entry order in `vc-database/source/vector-categories.txt`.
102+
- **Operation IDs / opcodes** come from the entry order in each `vc-database/source/*.txt` module catalog.
103+
104+
This means reordering entries is a breaking change. Prefer only appending new entries to preserve old IDs.
105+
106+
Recommended convention (optional): put a version header at the top of `.vcs` programs, for example:
107+
108+
```text
109+
// vcs:1
110+
```
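
If a host application wants to act on that header, it could read it before compiling. The sketch below is an assumption about how such a check might look. Nothing in this repository states that the interpreters inspect the header, and `read_format_version` is a hypothetical helper that only looks at the first non-blank line.

```python
import re

VERSION_HEADER = re.compile(r"^//\s*vcs:(\d+)\s*$")


def read_format_version(program_text, default=None):
    """Return the version from a leading '// vcs:N' comment, else `default`."""
    for line in program_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        match = VERSION_HEADER.match(stripped)
        # Only the first non-blank line may carry the header.
        return int(match.group(1)) if match else default
    return default


print(read_format_version('// vcs:1\n0.13("Hello")\n0.1()'))
print(read_format_version("0.1()"))
```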

## Safety / security note

`.vcs` is executable instruction input. Treat it like code:

- Only enable modules/opcodes you intend to allow.
- Sandbox or validate inputs if `.vcs` can come from untrusted sources.
- Keep module/opcode mappings stable; changing IDs is a breaking change.
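
A simple form of validation is an allowlist check run before a program reaches the interpreter. The sketch below is illustrative only: `find_disallowed` is a hypothetical helper, the allowed pairs are examples, and the check covers opcode pairs alone. It does not validate parameters and is not a substitute for sandboxing.

```python
import re

INSTRUCTION = re.compile(r"^\s*(\d+)\.(\d+)\(")


def find_disallowed(program_text, allowed):
    """Return (line_number, module_id, opcode) for each instruction not in `allowed`."""
    violations = []
    for number, line in enumerate(program_text.splitlines(), start=1):
        match = INSTRUCTION.match(line)
        if match is None:
            continue  # comments, labels and blank lines carry no opcode
        pair = (int(match.group(1)), int(match.group(2)))
        if pair not in allowed:
            violations.append((number, *pair))
    return violations


allowed = {(0, 13), (0, 1)}  # example allowlist: only 0.13 and 0.1
program = '// demo\n0.13("hi")\n9.2("x")\n0.1()'
print(find_disallowed(program, allowed))
```

Rejecting the whole program when the returned list is non-empty is the conservative choice, since a partially executed instruction stack may leave the host in an unexpected state.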