Releases: IranTransitionProject/docman
v0.5.0 — Schema Refs, Smart Extraction
What's New
Pydantic Schema Refs
All worker configs migrated from inline JSON Schema to input_schema_ref / output_schema_ref pointing to typed Pydantic models in src/docman/contracts.py. Schemas are resolved at config load time via Loom's resolve_schema_refs().
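As a rough illustration of the idea, the sketch below expands `input_schema_ref` / `output_schema_ref` keys into JSON Schema generated from Pydantic models. It is not Loom's `resolve_schema_refs()`, whose signature is not shown in these notes. The model names, their fields, the dotted ref format, the `CONTRACTS` lookup table, and `expand_schema_refs` are all assumptions made for the example. Pydantic v2 is assumed.

```python
from pydantic import BaseModel


# Hypothetical contract models. The real ones live in src/docman/contracts.py
# and their fields are not listed in these release notes.
class ExtractInput(BaseModel):
    file_ref: str


class ExtractOutput(BaseModel):
    text: str
    page_count: int


# Stand-in for importing a dotted path: maps ref strings to model classes.
# The ref string format is an assumption.
CONTRACTS = {
    "docman.contracts.ExtractInput": ExtractInput,
    "docman.contracts.ExtractOutput": ExtractOutput,
}


def expand_schema_refs(config: dict) -> dict:
    """Return a copy of config with *_schema_ref keys replaced by JSON Schema."""
    expanded = dict(config)
    for key in ("input_schema", "output_schema"):
        ref = expanded.pop(f"{key}_ref", None)
        if ref is not None:
            # model_json_schema() is the Pydantic v2 call that emits JSON Schema.
            expanded[key] = CONTRACTS[ref].model_json_schema()
    return expanded


# Hypothetical worker config, as it might look after being loaded from disk.
worker_config = {
    "name": "doc_extractor",
    "input_schema_ref": "docman.contracts.ExtractInput",
    "output_schema_ref": "docman.contracts.ExtractOutput",
}
expanded = expand_schema_refs(worker_config)
print(sorted(expanded["output_schema"]["properties"]))
```

Resolving at load time means an unknown ref fails immediately (here as a `KeyError`) instead of surfacing later when a worker first validates a message.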
Smart Extraction
SmartExtractorBackend — composite backend that tries MarkItDown first (fast, no ML) and falls back to Docling (deep OCR, table recognition) when needed. Configurable fallback thresholds.
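The fast-then-deep pattern can be sketched roughly as follows. This is not the actual `SmartExtractorBackend`: the class name, the `min_chars` threshold, and the stand-in extractor functions are hypothetical, and the real backend's fallback criteria may differ. No MarkItDown or Docling calls are made here. Plain callables take their place.

```python
from dataclasses import dataclass
from typing import Callable

# An extractor takes a file path and returns the extracted text.
Extractor = Callable[[str], str]


@dataclass
class SmartExtractorSketch:
    """Try a cheap extractor first; fall back to an expensive one if needed."""

    fast: Extractor       # stands in for a MarkItDown-style backend
    deep: Extractor       # stands in for a Docling-style backend
    min_chars: int = 200  # hypothetical fallback threshold

    def extract(self, path: str) -> tuple[str, str]:
        """Return (backend_used, text)."""
        try:
            text = self.fast(path)
        except Exception:
            # Treat a failed fast pass like an empty result.
            text = ""
        if len(text.strip()) >= self.min_chars:
            return "fast", text
        # Too little text recovered: assume a scan or complex layout.
        return "deep", self.deep(path)


# Stand-in extractors for demonstration only.
def fake_fast(path: str) -> str:
    return "" if path.endswith(".scan.pdf") else "x" * 500


def fake_deep(path: str) -> str:
    return "ocr text for " + path


backend = SmartExtractorSketch(fast=fake_fast, deep=fake_deep)
digital = backend.extract("report.pdf")
scanned = backend.extract("invoice.scan.pdf")
print(digital[0], scanned[0])
```

A character-count threshold is only one possible trigger. Any signal that the fast pass recovered too little could take its place in the same structure.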
Built on Loom v0.8.0
Requires Loom v0.8.0 or later.
Installation
# Requires the Loom repository cloned into an adjacent directory
git clone https://github.com/IranTransitionProject/docman.git
cd docman
uv sync --extra dev
uv run pytest tests/ -v  # 63 tests
Stats
- 5 worker configs, 3 pipeline variants, 1 MCP gateway config
- 63 unit tests
- 3 extraction backends (MarkItDown, Docling, Smart)