
Commit e1c76b9

Hooman Mehr and claude committed
Rename loom → heddle, framework → baseline
Update all references for the repo reorganization:

- Package dep: loom-ai → heddle-ai
- All imports: from loom.* → from heddle.*
- CLI commands: loom worker/mcp/pipeline → heddle worker/mcp/pipeline
- GitHub URLs: IranTransitionProject/loom → getheddle/heddle
- GitHub URLs: IranTransitionProject/framework → IranTransitionProject/baseline
- CI, configs, scripts, docs all updated
- All 124 tests passing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 3de22a4 commit e1c76b9

45 files changed

Lines changed: 293 additions & 293 deletions


.github/workflows/ci.yml

Lines changed: 11 additions & 11 deletions
````diff
@@ -22,14 +22,14 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
-      - name: Checkout loom
+      - name: Checkout heddle
         uses: actions/checkout@v4
         with:
-          repository: ${{ github.repository_owner }}/loom
-          path: loom
+          repository: getheddle/heddle
+          path: heddle
       - uses: astral-sh/setup-uv@v5
-      # Symlink so [tool.uv.sources] path "../loom" resolves in CI
-      - run: ln -s "$GITHUB_WORKSPACE/loom" "$GITHUB_WORKSPACE/../loom"
+      # Symlink so [tool.uv.sources] path "../heddle" resolves in CI
+      - run: ln -s "$GITHUB_WORKSPACE/heddle" "$GITHUB_WORKSPACE/../heddle"
       - run: uv sync --extra dev
       - run: uv run ruff check src/ tests/
       - run: uv run ruff format --check src/ tests/
````

````diff
@@ -42,19 +42,19 @@ jobs:
     steps:
       - uses: actions/checkout@v4
 
-      # Docman depends on loom — check it out and install from source.
-      - name: Checkout loom
+      # Docman depends on heddle — check it out and install from source.
+      - name: Checkout heddle
         uses: actions/checkout@v4
         with:
-          repository: ${{ github.repository_owner }}/loom
-          path: loom
+          repository: getheddle/heddle
+          path: heddle
 
       - uses: astral-sh/setup-uv@v5
         with:
           python-version: ${{ matrix.python-version }}
 
-      # Symlink so [tool.uv.sources] path "../loom" resolves in CI
-      - run: ln -s "$GITHUB_WORKSPACE/loom" "$GITHUB_WORKSPACE/../loom"
+      # Symlink so [tool.uv.sources] path "../heddle" resolves in CI
+      - run: ln -s "$GITHUB_WORKSPACE/heddle" "$GITHUB_WORKSPACE/../heddle"
 
       - name: Install dependencies
         run: uv sync --extra dev
````
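The symlink step in both jobs exists because `[tool.uv.sources]` resolves the relative path `../heddle` against the consumer project, while in CI both repos are checked out *inside* `$GITHUB_WORKSPACE`, so the parent-level path must be created by hand. A minimal sketch of the path arithmetic (the directory layout is illustrative):

```python
import os
import tempfile
from pathlib import Path

# Mimic the CI layout: both repos end up inside the workspace directory.
root = Path(tempfile.mkdtemp())
workspace = root / "work" / "docman" / "docman"  # stands in for $GITHUB_WORKSPACE
(workspace / "heddle").mkdir(parents=True)       # actions/checkout with path: heddle

# Without the symlink, the path dependency "../heddle" (relative to the
# workspace) points outside the checkout and does not exist.
dep = (workspace / ".." / "heddle").resolve()
assert not dep.exists()

# The CI step: ln -s "$GITHUB_WORKSPACE/heddle" "$GITHUB_WORKSPACE/../heddle"
os.symlink(workspace / "heddle", workspace.parent / "heddle")
assert dep.exists()  # "../heddle" now resolves
```

The same trick would apply to any relative `path` source that assumes a sibling checkout layout.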

CLAUDE.md

Lines changed: 42 additions & 42 deletions
````diff
@@ -2,9 +2,9 @@
 
 ## What this project is
 
-Docman (v0.5.0) is a document processing pipeline built on the Loom framework. It extracts content from PDF, DOCX, PPTX, XLSX, and HTML files using an adaptive two-tier extraction strategy (MarkItDown for speed, Docling for depth), with LLM-based classification and summarization stages.
+Docman (v0.5.0) is a document processing pipeline built on the Heddle framework. It extracts content from PDF, DOCX, PPTX, XLSX, and HTML files using an adaptive two-tier extraction strategy (MarkItDown for speed, Docling for depth), with LLM-based classification and summarization stages.
 
-This is a **consumer** of the Loom framework — it provides concrete worker configs, processing backends, and pipeline definitions. The Loom framework itself lives in a separate repo.
+This is a **consumer** of the Heddle framework — it provides concrete worker configs, processing backends, and pipeline definitions. The Heddle framework itself lives in a separate repo.
 
 ## Project structure
 
````

````diff
@@ -16,18 +16,18 @@ src/docman/
     markitdown_backend.py  # MarkItDownBackend — fast extraction via Microsoft MarkItDown (no ML)
     smart_extractor.py     # SmartExtractorBackend — MarkItDown-first, Docling fallback
     duckdb_ingest.py       # DuckDBIngestBackend — document persistence (serialize_writes=True)
-    duckdb_query.py        # DocmanQueryBackend — thin subclass of loom.contrib.duckdb.DuckDBQueryBackend
+    duckdb_query.py        # DocmanQueryBackend — thin subclass of heddle.contrib.duckdb.DuckDBQueryBackend
   tools/
-    vector_search.py       # DuckDBVectorTool — thin wrapper around loom.contrib.duckdb.DuckDBVectorTool
-  manifest.yaml            # App manifest for Loom Workshop deployment
+    vector_search.py       # DuckDBVectorTool — thin wrapper around heddle.contrib.duckdb.DuckDBVectorTool
+  manifest.yaml            # App manifest for Heddle Workshop deployment
 configs/
   workers/        # YAML configs for doc_extractor, doc_classifier, doc_summarizer, doc_ingest, doc_query
   orchestrators/  # Pipeline configs (doc_pipeline, doc_pipeline_local, doc_pipeline_smart)
   mcp/            # MCP gateway config (docman.yaml)
 scripts/
   dev-start.sh    # Local development launcher
   dev-start.ps1   # Windows development launcher
-  build-app.sh    # Build deployment ZIP for Loom Workshop
+  build-app.sh    # Build deployment ZIP for Heddle Workshop
 docs/
   ARCHITECTURE.md # System architecture overview
   CONTRIBUTING.md # Contribution standards and CLA
````

````diff
@@ -37,18 +37,18 @@ docs/
 tests/            # Unit tests (mock backends, in-memory DuckDB, no infrastructure)
 ```
 
-## Relationship to Loom
+## Relationship to Heddle
 
-Docman depends on `loom-ai[duckdb]` as a package. It uses:
+Docman depends on `heddle[duckdb]` as a package. It uses:
 
 - `ProcessingBackend` ABC — DoclingBackend, MarkItDownBackend, SmartExtractorBackend, DuckDBIngestBackend implement this
-- `resolve_schema_refs()` — worker configs use `input_schema_ref` / `output_schema_ref` pointing to `docman.contracts.*` Pydantic models (Loom resolves to JSON Schema at load time)
-- `loom.contrib.duckdb.DuckDBQueryBackend` — DocmanQueryBackend subclasses this with Docman-specific schema defaults
-- `loom.contrib.duckdb.DuckDBVectorTool` — DuckDBVectorTool wraps this with Docman-specific column/table defaults
-- `loom.contrib.duckdb.DuckDBViewTool` — used directly (no Docman wrapper needed, already generic)
-- `ProcessorWorker` — runs extraction and DuckDB backends via `loom processor` CLI
-- `LLMWorker` — runs classifier and summarizer via `loom worker` CLI
-- `PipelineOrchestrator` — orchestrates the 4-stage pipeline via `loom pipeline` CLI (with dependency-aware parallel stage execution)
+- `resolve_schema_refs()` — worker configs use `input_schema_ref` / `output_schema_ref` pointing to `docman.contracts.*` Pydantic models (Heddle resolves to JSON Schema at load time)
+- `heddle.contrib.duckdb.DuckDBQueryBackend` — DocmanQueryBackend subclasses this with Docman-specific schema defaults
+- `heddle.contrib.duckdb.DuckDBVectorTool` — DuckDBVectorTool wraps this with Docman-specific column/table defaults
+- `heddle.contrib.duckdb.DuckDBViewTool` — used directly (no Docman wrapper needed, already generic)
+- `ProcessorWorker` — runs extraction and DuckDB backends via `heddle processor` CLI
+- `LLMWorker` — runs classifier and summarizer via `heddle worker` CLI
+- `PipelineOrchestrator` — orchestrates the 4-stage pipeline via `heddle pipeline` CLI (with dependency-aware parallel stage execution)
 
 The CLI loads backends by fully qualified class path from worker configs:
 
````

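The config example the context line above introduces falls outside this hunk, but loading a class from a fully qualified dotted path is a standard importlib pattern; a minimal stdlib sketch (the helper name and demo path are illustrative, not Heddle's actual API):

```python
import importlib

def load_class(dotted_path: str) -> type:
    """Resolve 'package.module.ClassName' to the class object."""
    module_path, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)

# A worker config's backend entry such as
# "docman.backends.docling_backend.DoclingBackend" would resolve the same
# way; a stdlib class works for demonstration:
cls = load_class("collections.OrderedDict")
print(cls.__name__)  # → OrderedDict
```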
````diff
@@ -71,7 +71,7 @@ Docman provides three extraction backends, all producing the same output contrac
 3. **doc_summarizer** (LLMWorker) — LLM summarizes based on document type and extracted content. Returns summary, key_points, word_count.
 4. **doc_ingest** (ProcessorWorker + DuckDBIngestBackend) — Persists all pipeline results (metadata, classification, summary, full text) into DuckDB. Reads full extracted text from workspace JSON. Returns document_id.
 
-**Pipeline execution order:** Loom's `PipelineOrchestrator` auto-infers dependencies from `input_mapping` paths and runs independent stages concurrently. Docman's pipeline has genuinely sequential dependencies (classify depends on extract, summarize depends on both, ingest depends on all three), so it produces 4 levels of 1 stage each — sequential execution.
+**Pipeline execution order:** Heddle's `PipelineOrchestrator` auto-infers dependencies from `input_mapping` paths and runs independent stages concurrently. Docman's pipeline has genuinely sequential dependencies (classify depends on extract, summarize depends on both, ingest depends on all three), so it produces 4 levels of 1 stage each — sequential execution.
 
 **Pipeline variants:**
 
````

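The execution-order note in the hunk above describes grouping stages into levels that can run concurrently. A sketch of that leveling under Docman's stated dependencies (stage names from the pipeline; the algorithm is an illustration, not Heddle's implementation):

```python
def level_stages(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group stages into levels; each level depends only on earlier levels."""
    levels: list[list[str]] = []
    placed: set[str] = set()
    while len(placed) < len(deps):
        # A stage is ready once every dependency has been placed.
        ready = sorted(s for s, d in deps.items() if s not in placed and d <= placed)
        if not ready:
            raise ValueError("dependency cycle")
        levels.append(ready)
        placed.update(ready)
    return levels

# Docman's pipeline is strictly sequential: 4 levels of 1 stage each.
deps = {
    "extract": set(),
    "classify": {"extract"},
    "summarize": {"extract", "classify"},
    "ingest": {"extract", "classify", "summarize"},
}
print(level_stages(deps))
# → [['extract'], ['classify'], ['summarize'], ['ingest']]
```

If the summarizer dropped its dependency on classify, the same inference would place classify and summarize in one level and run them concurrently.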
````diff
@@ -83,9 +83,9 @@ Docman provides three extraction backends, all producing the same output contrac
 
 ```bash
 # Process 3 documents concurrently — each instance handles one goal
-loom pipeline --config configs/orchestrators/doc_pipeline_smart.yaml &
-loom pipeline --config configs/orchestrators/doc_pipeline_smart.yaml &
-loom pipeline --config configs/orchestrators/doc_pipeline_smart.yaml &
+heddle pipeline --config configs/orchestrators/doc_pipeline_smart.yaml &
+heddle pipeline --config configs/orchestrators/doc_pipeline_smart.yaml &
+heddle pipeline --config configs/orchestrators/doc_pipeline_smart.yaml &
 ```
 
 ## Standalone workers
````

````diff
@@ -94,7 +94,7 @@ loom pipeline --config configs/orchestrators/doc_pipeline_smart.yaml &
 
 ## I/O contracts
 
-Worker I/O schemas are defined as Pydantic models in `src/docman/contracts.py`. Worker YAML configs reference them via `input_schema_ref` / `output_schema_ref`, and Loom's `resolve_schema_refs()` converts them to JSON Schema at load time.
+Worker I/O schemas are defined as Pydantic models in `src/docman/contracts.py`. Worker YAML configs reference them via `input_schema_ref` / `output_schema_ref`, and Heddle's `resolve_schema_refs()` converts them to JSON Schema at load time.
 
 Models: `ExtractorInput`, `ExtractorOutput`, `ClassifierInput`, `ClassifierOutput`, `SummarizerInput`, `SummarizerOutput`, `IngestInput`, `IngestOutput`, `QueryInput`, `QueryOutput`.
 
````

````diff
@@ -103,7 +103,7 @@ Models: `ExtractorInput`, `ExtractorOutput`, `ClassifierInput`, `ClassifierOutpu
 - Large data passes via **file references** in a shared workspace directory (`--workspace-dir`)
 - Messages carry only file_ref strings, not inline content
 - Extraction backends (MarkItDown, Docling) read source file from workspace, write extracted JSON to workspace
-- **Summarizer file resolution:** `resolve_file_refs: ["file_ref"]` and `workspace_dir` are set in the summarizer config — Loom's LLMWorker reads extracted JSON from workspace automatically.
+- **Summarizer file resolution:** `resolve_file_refs: ["file_ref"]` and `workspace_dir` are set in the summarizer config — Heddle's LLMWorker reads extracted JSON from workspace automatically.
 
 ## Docling configuration
 
````

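The file-reference convention in the hunk above can be sketched end to end: the producer writes the large payload into the shared workspace, and the message carries only the reference string (file names and message shape here are illustrative, not Heddle's wire format):

```python
import json
from pathlib import Path
from tempfile import TemporaryDirectory

with TemporaryDirectory() as ws:
    workspace = Path(ws)  # stands in for --workspace-dir

    # Producer (extractor) writes extracted JSON into the workspace...
    extracted = {"text": "full document text", "page_count": 12}
    ref = "extracted/report.json"
    out = workspace / ref
    out.parent.mkdir(parents=True)
    out.write_text(json.dumps(extracted))

    # ...and the NATS message stays small: just the reference string.
    message = {"file_ref": ref}

    # Consumer (summarizer) resolves the ref against its own workspace dir.
    payload = json.loads((workspace / message["file_ref"]).read_text())
    print(payload["page_count"])  # → 12
```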
````diff
@@ -120,14 +120,14 @@ Full guide: `docs/docling-setup.md`
 
 ## MCP gateway
 
-Docman can be exposed as an MCP (Model Context Protocol) server using Loom's built-in MCP gateway — zero MCP-specific code needed.
+Docman can be exposed as an MCP (Model Context Protocol) server using Heddle's built-in MCP gateway — zero MCP-specific code needed.
 
 ```bash
-# Start Docman as an MCP server (requires loom[mcp] and NATS + workers running)
-loom mcp --config configs/mcp/docman.yaml
+# Start Docman as an MCP server (requires heddle[mcp] and NATS + workers running)
+heddle mcp --config configs/mcp/docman.yaml
 
 # Or with streamable-http transport
-loom mcp --config configs/mcp/docman.yaml --transport streamable-http --port 8000
+heddle mcp --config configs/mcp/docman.yaml --transport streamable-http --port 8000
 ```
 
 The MCP config (`configs/mcp/docman.yaml`) maps Docman's workers and query backend to MCP tools:
````

````diff
@@ -136,7 +136,7 @@ The MCP config (`configs/mcp/docman.yaml`) maps Docman's workers and query backe
 - Query backend → `docman_search`, `docman_filter`, `docman_stats`, `docman_get` tools
 - Workspace files exposed as MCP resources
 
-See Loom's [Building Workflows](https://github.com/IranTransitionProject/loom/blob/main/docs/building-workflows.md) Part 11 for full MCP gateway documentation.
+See Heddle's [Building Workflows](https://github.com/getheddle/heddle/blob/main/docs/building-workflows.md) Part 11 for full MCP gateway documentation.
 
 ## Key design rules
 
````

````diff
@@ -152,13 +152,13 @@ See Loom's [Building Workflows](https://github.com/IranTransitionProject/loom/bl
 - Query results exclude `full_text` column by default to keep NATS messages small; use `get` action for full content
 - Vector embeddings use `FLOAT[]` (variable-length) column in DuckDB — use `list_cosine_similarity` (NOT `array_cosine_similarity` which requires fixed-size `FLOAT[N]`)
 - Embedding generation is optional — controlled by `embedding` config section in `doc_ingest.yaml`. When absent, embedding column stores NULL
-- DuckDBViewTool and DuckDBVectorTool implement Loom's `SyncToolProvider` for LLM function-calling via `knowledge_silos` config
+- DuckDBViewTool and DuckDBVectorTool implement Heddle's `SyncToolProvider` for LLM function-calling via `knowledge_silos` config
 
 ## Build and test commands
 
 ```bash
 # Install all dependencies (requires Python 3.11+, uses uv)
-# Loom is resolved from ../loom via [tool.uv.sources] in pyproject.toml
+# Heddle is resolved from ../heddle via [tool.uv.sources] in pyproject.toml
 uv sync --extra dev
 
 # Pre-download Docling detection models (avoids delay on first run)
````
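The `FLOAT[]` note in the hunk above comes down to column typing: `list_cosine_similarity` accepts variable-length lists, while `array_cosine_similarity` relies on the fixed-size `FLOAT[N]` type. The computation itself is plain cosine similarity; a stdlib Python stand-in (my sketch, not DuckDB's implementation):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors."""
    # Variable-length lists force a runtime length check — the check the
    # fixed-size FLOAT[N] column type makes unnecessary for array_* functions.
    if len(a) != len(b):
        raise ValueError("vectors must be the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # → 1.0
```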
````diff
@@ -167,23 +167,23 @@ uv run docling-tools models download
 # Run unit tests (no infrastructure needed)
 uv run pytest tests/ -v
 
-# Run with infrastructure (needs NATS + Loom installed)
+# Run with infrastructure (needs NATS + Heddle installed)
 # Terminal 1: docker run -p 4222:4222 nats:latest
-# Terminal 2: uv run loom router --nats-url nats://localhost:4222
-# Terminal 3: uv run loom processor --config configs/workers/doc_extractor.yaml --nats-url nats://localhost:4222
-# Terminal 4: OLLAMA_URL=http://localhost:11434 uv run loom worker --config configs/workers/doc_classifier.yaml --tier local --nats-url nats://localhost:4222
-# Terminal 5: ANTHROPIC_API_KEY=sk-... uv run loom worker --config configs/workers/doc_summarizer.yaml --tier standard --nats-url nats://localhost:4222
-# Terminal 6: uv run loom processor --config configs/workers/doc_ingest.yaml --nats-url nats://localhost:4222
-# Terminal 7: uv run loom processor --config configs/workers/doc_query.yaml --nats-url nats://localhost:4222
-# Terminal 8: uv run loom pipeline --config configs/orchestrators/doc_pipeline.yaml --nats-url nats://localhost:4222
-# Submit: uv run loom submit "Process document" --context file_ref=test.pdf --nats-url nats://localhost:4222
+# Terminal 2: uv run heddle router --nats-url nats://localhost:4222
+# Terminal 3: uv run heddle processor --config configs/workers/doc_extractor.yaml --nats-url nats://localhost:4222
+# Terminal 4: OLLAMA_URL=http://localhost:11434 uv run heddle worker --config configs/workers/doc_classifier.yaml --tier local --nats-url nats://localhost:4222
+# Terminal 5: ANTHROPIC_API_KEY=sk-... uv run heddle worker --config configs/workers/doc_summarizer.yaml --tier standard --nats-url nats://localhost:4222
+# Terminal 6: uv run heddle processor --config configs/workers/doc_ingest.yaml --nats-url nats://localhost:4222
+# Terminal 7: uv run heddle processor --config configs/workers/doc_query.yaml --nats-url nats://localhost:4222
+# Terminal 8: uv run heddle pipeline --config configs/orchestrators/doc_pipeline.yaml --nats-url nats://localhost:4222
+# Submit: uv run heddle submit "Process document" --context file_ref=test.pdf --nats-url nats://localhost:4222
 ```
 
 ## Current state
 
 The following items are **implemented and working**:
 
-- Pydantic I/O contracts (`src/docman/contracts.py`) — source of truth for all worker schemas, resolved at load time via Loom's `resolve_schema_refs()`
+- Pydantic I/O contracts (`src/docman/contracts.py`) — source of truth for all worker schemas, resolved at load time via Heddle's `resolve_schema_refs()`
 - MarkItDownBackend (`src/docman/backends/markitdown_backend.py`) — fast extraction via Microsoft MarkItDown, derives metadata from Markdown output
 - SmartExtractorBackend (`src/docman/backends/smart_extractor.py`) — composite MarkItDown-first with Docling fallback, configurable thresholds
 - DoclingBackend (`src/docman/backends/docling_backend.py`) — deep extraction with OCR, table structure, layout analysis
````

````diff
@@ -192,14 +192,14 @@ The following items are **implemented and working**:
 - DuckDBVectorTool (`src/docman/tools/vector_search.py`) — thin wrapper with Docman-specific defaults
 - Worker configs for all pipeline stages + standalone query worker — using `input_schema_ref`/`output_schema_ref`
 - Pipeline configs: `doc_pipeline.yaml` (Docling), `doc_pipeline_local.yaml` (all local), `doc_pipeline_smart.yaml` (MarkItDown-first)
-- App manifest (`manifest.yaml`) — declares all configs, Python package, and required Loom extras
-- Build script (`scripts/build-app.sh`) — generates deployment ZIP for Loom Workshop
+- App manifest (`manifest.yaml`) — declares all configs, Python package, and required Heddle extras
+- Build script (`scripts/build-app.sh`) — generates deployment ZIP for Heddle Workshop
 
 ## What to implement next
 
 1. **End-to-end test** — With NATS, Valkey, and Ollama running locally
-2. **Design a parallel pipeline variant** — Current pipeline is inherently sequential, but a variant could run classify and summarize concurrently if the summarizer doesn't need `document_type` (Loom's pipeline parallelism would auto-detect this from input_mapping)
-3. **MCP progress notifications** — When Loom's MCP bridge wires progress callbacks to MCP progress tokens, Docman's pipeline would automatically report per-stage progress to MCP clients
+2. **Design a parallel pipeline variant** — Current pipeline is inherently sequential, but a variant could run classify and summarize concurrently if the summarizer doesn't need `document_type` (Heddle's pipeline parallelism would auto-detect this from input_mapping)
+3. **MCP progress notifications** — When Heddle's MCP bridge wires progress callbacks to MCP progress tokens, Docman's pipeline would automatically report per-stage progress to MCP clients
 
 ## Environment
 
````

GOVERNANCE.md

Lines changed: 4 additions & 4 deletions
````diff
@@ -9,12 +9,12 @@
 ## Mission Constraint
 
 This project exists to provide a reference implementation and test harness for
-the Loom framework's document processing capabilities. All governance decisions
+the Heddle framework's document processing capabilities. All governance decisions
 must be evaluated against a single test: does this decision serve that mission
 or does it serve something else?
 
-Decisions that compromise Docman's role as a faithful test of the Loom framework
-architecture — by circumventing Loom abstractions, hard-coding provider-specific
+Decisions that compromise Docman's role as a faithful test of the Heddle framework
+architecture — by circumventing Heddle abstractions, hard-coding provider-specific
 logic, or abandoning the pipeline model — are incompatible with the mission and
 constitute grounds for leadership review.
 
````

````diff
@@ -61,7 +61,7 @@ If the Founder is unable to act and the project has been inactive for 90+ days:
 A successor must:
 
 - Accept the mission constraint without reservation
-- Commit to maintaining the project as a Loom framework consumer
+- Commit to maintaining the project as a Heddle framework consumer
 - Maintain the MPL 2.0 public license (alternative licensing rights
   revert to the copyright holder and do not automatically transfer)
 
````
