Teach machines to see. Let agents decide what to do about it.
日本語 · Quickstart · Architecture · Docs Site
Arcnem Vision is an open-source image ingestion and orchestration platform. Clients holding workflow API keys can upload images with org/project-scoped credentials that bind directly to a default workflow; operators can upload images from the dashboard; and every document can be routed through a customizable agent graph stored in Postgres and executed by Go services.
That core service is the product: a control plane for defining workflows, attaching them to workflow keys, running ad-hoc analyses from the dashboard, persisting OCR/descriptions/embeddings/segmentations, and inspecting every run with step-level state transitions. The Flutter app is included as a demo client for capture and GenUI experiments, but it is not the center of the system.
- Two ingestion paths: workflow-key uploads for automated pipelines, plus dashboard uploads for ad-hoc operator review.
- Configurable workflows: graphs live in the database and can be created, edited, templated, cloned, and assigned without redeploying code.
- Multiple analysis modes: OCR, descriptions, image embeddings, description embeddings, prompt-based segmentation, semantic segmentation, similarity search, and grounded collection chat.
- Mixed orchestration styles: combine LLM workers, MCP tool nodes, supervisor routing, and deterministic condition nodes in the same graph.
- Persistent run history: track `agent_graph_runs` and `agent_graph_run_steps` with initial state, final state, per-step deltas, timing, and errors.
- Operator-first dashboard: manage projects, workflow keys, service keys, workflow templates, uploads, search, chat, and live run inspection from one UI.
- Worker nodes: choose a model, prompt, input mode, and optional MCP tools.
- Tool nodes: call one MCP tool with configurable input/output mappings and literal constants.
- Supervisor nodes: route between specialist workers, cap iterations, and choose explicit finish targets.
- Condition nodes: branch on extracted state with `contains` or `equals` rules.
- State reducers: configure append vs overwrite behavior per state key.
- Template library: save reusable workflow versions, then start editable copies from them in the dashboard.
- AI workflow drafting: describe a pipeline in plain language and start from an unsaved graph generated from the live model and tool catalog.
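The condition-node and state-reducer bullets above can be sketched as pure functions. This is an illustrative TypeScript sketch, not the service's actual Go implementation; the field names (`stateKey`, `operator`, `target`, the reducer modes map) are assumptions, since the real schema lives in the Postgres graph tables.

```typescript
// Hypothetical condition rule shape (assumed field names).
type ConditionRule = {
  stateKey: string;
  operator: "contains" | "equals";
  value: string;
  target: string; // node to route to when the rule matches
};

// Pick the first matching branch, falling back to a default target.
function routeCondition(
  state: Record<string, string>,
  rules: ConditionRule[],
  fallback: string,
): string {
  for (const rule of rules) {
    const current = state[rule.stateKey] ?? "";
    const matched =
      rule.operator === "contains"
        ? current.includes(rule.value)
        : current === rule.value;
    if (matched) return rule.target;
  }
  return fallback;
}

// Per-key reducer: "append" accumulates values, "overwrite" keeps the latest.
type ReducerMode = "append" | "overwrite";

function applyReducers(
  state: Record<string, string[]>,
  delta: Record<string, string>,
  modes: Record<string, ReducerMode>,
): Record<string, string[]> {
  const next = { ...state };
  for (const [key, value] of Object.entries(delta)) {
    next[key] =
      (modes[key] ?? "overwrite") === "append"
        ? [...(next[key] ?? []), value]
        : [value];
  }
  return next;
}
```

The point of the sketch is the separation: routing reads state but never mutates it, while reducers own all state writes, which is what lets deterministic condition nodes and LLM workers share one graph safely.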
The seeded workflows show the kinds of pipelines this service is built for:
- Document Processing Pipeline: describe an image, save the description, embed the image and text, then find similar items.
- Image Quality Review: use a supervisor to route an upload to a good-image or bad-image specialist.
- Document Segmentation Showcase: derive a prompt from the image, run prompt-based segmentation, then summarize the segmented output.
- Semantic Document Segmentation Showcase: run semantic segmentation directly and describe the result.
- OCR Keyword Condition Router: extract OCR, branch deterministically on keywords, and save a summary.
- OCR Review Supervisor: extract OCR with confidence, route to domain specialists, and save a reviewed summary.
Behind those workflows, the MCP layer currently exposes document description, document embedding, OCR, segmentation, description embedding, similarity search, scoped search, scoped browse, and grounded document-context reads.
| Layer | Tech | What it does |
|---|---|---|
| API | Bun, Hono, better-auth, Inngest, Pino | Presigned upload flow, auth, orchestration triggers, realtime publishing |
| Dashboard | React 19, TanStack Router, Tailwind, shadcn/ui | Projects/API keys, workflow builder, docs view, chat, run inspection |
| Agents | Go, Gin, LangGraph, LangChain, inngestgo | Loads graph snapshots from DB, executes worker/tool/supervisor/condition nodes |
| MCP | Go, MCP go-sdk, replicate-go, GORM | OCR, descriptions, embeddings, segmentation, retrieval, grounded reads |
| Storage | Postgres 18 + pgvector, S3-compatible storage, Redis | Documents, derived artifacts, vector indexes, sessions, realtime fan-out |
| Client | Flutter, Dart, flutter_gemma, GenUI | Optional demo client for capture, preview, and UI experiments |
```
+----------------------+                          +--------------------------+
| Workflow Key Client  |--x-api-key upload flow-->|                          |
+----------------------+                          |         Hono API         |
                                                  |   presign / ack / auth   |
+----------------------+                          |                          |
| Dashboard Operators  |--session uploads/queue-->|                          |
+----------+-----------+                          +------------+-------------+
           |                                                   |
           | search, chat, runs, config                        | enqueue
           v                                                   v
   +---------------+                                   +---------------+
   |   Dashboard   |<-----------realtime SSE-----------|    Inngest    |
   |  Control UI   |                                   +-------+-------+
   +-------+-------+                                           |
           |                                                   v
           |          +---------------+                +---------------+
           +--------->|   Postgres    |<---------------|   Go Agents   |
                      |  + pgvector   |                |   LangGraph   |
                      +---------------+                +-------+-------+
                                                               |
                                                               v
                                                       +---------------+
                                                       |  MCP Server   |
                                                       | OCR / desc /  |
                                                       | embed / seg / |
                                                       |   retrieval   |
                                                       +---------------+
```
Workflow-key flow: a workflow-key client or external integration calls /api/uploads/presign, uploads to S3, then calls /api/uploads/ack. The API verifies the object, creates the document, and emits document/process.upload. The agents service then loads the document's bound workflow from the API key in the database and runs it.
Dashboard flow: an operator uploads an image to a project from the Docs tab. The dashboard path creates the document without binding it to an API key, and the operator can then queue any saved workflow against that document from the UI.
Execution flow: the agents service builds the graph from DB rows, runs LangGraph nodes, calls MCP tools as needed, and persists run records, step deltas, errors, OCR results, descriptions, embeddings, and segmentations back into Postgres.
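The run records described above make every run replayable: given the initial state and the ordered step deltas, the final state is just a fold. This TypeScript sketch shows the idea with a deliberately simplified step shape (`delta`, `startedAt`, `finishedAt` are assumptions); the real rows live in `agent_graph_runs` and `agent_graph_run_steps`, and the real executor is Go.

```typescript
// Hypothetical, simplified run-step record for illustration.
type RunStep = {
  node: string;
  delta: Record<string, unknown>; // keys this step changed
  startedAt: number;
  finishedAt: number;
};

// Replay a run: fold step deltas over the initial state to get the final state.
function replayRun(
  initial: Record<string, unknown>,
  steps: RunStep[],
): Record<string, unknown> {
  return steps.reduce(
    (state, step) => ({ ...state, ...step.delta }),
    { ...initial },
  );
}

// Diff two state snapshots into the delta a step would persist.
// JSON comparison is a naive stand-in for a real deep-equality check.
function stateDelta(
  before: Record<string, unknown>,
  after: Record<string, unknown>,
): Record<string, unknown> {
  const delta: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(after)) {
    if (JSON.stringify(before[key]) !== JSON.stringify(value)) {
      delta[key] = value;
    }
  }
  return delta;
}
```

Storing deltas rather than full snapshots keeps rows small while still letting the dashboard's run inspector reconstruct the state at any step.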
| Projects & API Keys | Graph Editor & Inspector |
|---|---|
| ![]() | ![]() |

| Docs Search & Chat | Run Details |
|---|---|
| ![]() | ![]() |

| Image Detail Workspace | AI Workflow Drafting |
|---|---|
| ![]() | ![]() |
```sh
git clone https://github.com/arcnem-ai/arcnem-vision.git
cd arcnem-vision
```

Copy every `.env.example` to `.env`:
```sh
cp server/packages/api/.env.example server/packages/api/.env
cp server/packages/db/.env.example server/packages/db/.env
cp server/packages/dashboard/.env.example server/packages/dashboard/.env
cp models/agents/.env.example models/agents/.env
cp models/mcp/.env.example models/mcp/.env
cp client/.env.example client/.env
```

Add your provider keys:
- OpenAI API key → `OPENAI_API_KEY` in `models/agents/.env`
- Same OpenAI key (recommended) → `OPENAI_API_KEY` in `server/packages/api/.env` for dashboard collection chat and AI workflow draft generation
- Replicate API token → `REPLICATE_API_TOKEN` in `models/mcp/.env`
Everything else is wired for local development. Postgres, Redis, and MinIO come from `docker-compose.yaml`.
```sh
tilt up
```

Tilt starts every service in the repository, including the dashboard, API, agents, MCP server, Inngest, the docs site, and the Flutter demo client. If you're evaluating the core service, your first stop should be the dashboard at http://localhost:3001.
In the Tilt UI, trigger `seed-database`.
The seed creates:
- a demo organization, project, workflow keys, service keys, and API keys
- reusable workflow templates and editable workflows
- sample documents for the document, OCR, quality-review, and segmentation paths
- example OCR results, descriptions, embeddings, segmentations, and run history
- a local debug dashboard session
Because `server/packages/api/.env.example` enables `API_DEBUG=true`, the dashboard can auto-bootstrap into the seeded local session after seeding.
- Open `http://localhost:3001`.
- In Projects & API Keys, inspect seeded workflow keys, service keys, and their attached workflows.
- In Workflow Library, browse templates, click Generate With AI from a short brief, or open the canvas to see how graphs are composed.
- In Docs, inspect seeded documents or upload a new one from the dashboard.
- In Runs, expand a run to inspect initial state, per-step deltas, final state, and errors.
- Open `http://localhost:3000/api/openapi.json` to inspect the generated service API spec.
Use a workflow API key to exercise the automated path:
```sh
# 1. Ask the API for a presigned upload URL
curl -X POST http://localhost:3000/api/uploads/presign \
  -H "Content-Type: application/json" \
  -H "x-api-key: ${API_KEY}" \
  -d '{"contentType":"image/png","size":12345}'

# 2. Upload directly to storage
curl -X PUT "${UPLOAD_URL}" --data-binary @photo.png

# 3. Acknowledge the upload
curl -X POST http://localhost:3000/api/uploads/ack \
  -H "Content-Type: application/json" \
  -H "x-api-key: ${API_KEY}" \
  -d '{"objectKey":"uploads/.../photo.png"}'
```

For workflow-key uploads, step 3 verifies the object, creates the document, and enqueues `document/process.upload` for the key's bound workflow.
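The same flow can be driven from code. The TypeScript sketch below builds the presign and ack calls as plain request descriptors you could hand to `fetch`; the endpoint paths and payload fields mirror the curl example, while the `RequestSpec` helper shape itself is hypothetical, not part of the API.

```typescript
// Illustrative request descriptor, not an SDK type.
type RequestSpec = {
  method: "POST" | "PUT";
  url: string;
  headers: Record<string, string>;
  body?: string;
};

function jsonHeaders(apiKey: string): Record<string, string> {
  return { "Content-Type": "application/json", "x-api-key": apiKey };
}

// Step 1: ask the API for a presigned upload URL.
function presignRequest(
  baseUrl: string,
  apiKey: string,
  contentType: string,
  size: number,
): RequestSpec {
  return {
    method: "POST",
    url: `${baseUrl}/api/uploads/presign`,
    headers: jsonHeaders(apiKey),
    body: JSON.stringify({ contentType, size }),
  };
}

// Step 3: acknowledge the upload so the API verifies the object,
// creates the document, and enqueues processing.
function ackRequest(
  baseUrl: string,
  apiKey: string,
  objectKey: string,
): RequestSpec {
  return {
    method: "POST",
    url: `${baseUrl}/api/uploads/ack`,
    headers: jsonHeaders(apiKey),
    body: JSON.stringify({ objectKey }),
  };
}
```

Step 2 is deliberately omitted: the presign response's upload URL is consumed with a plain `PUT` of the raw bytes, with no API-key header, exactly as in the curl example.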
- Docker + Docker Compose
- Bun
- Go 1.25+
- CompileDaemon (`go install github.com/githubnemo/CompileDaemon@latest`)
- Flutter SDK
- Tilt
```
arcnem-vision/
├── server/                  Bun workspace
│   ├── packages/api/        Upload/auth routes, dashboard APIs, Inngest triggers
│   ├── packages/db/         Drizzle schema, migrations, seed data, templates
│   ├── packages/dashboard/  React control plane for operators
│   └── packages/shared/     Shared env helpers
├── models/                  Go workspace
│   ├── agents/              Workflow loader, LangGraph execution, run tracker
│   ├── mcp/                 OCR, embeddings, descriptions, segmentation, retrieval
│   ├── db/                  GORM model generation
│   └── shared/              Shared env, S3, realtime utilities
└── client/                  Optional Flutter demo client
```
| Doc | What's in it |
|---|---|
| site/ | Docs site for onboarding, architecture, guides, and API examples |
| site/src/content/docs/architecture.md | Service architecture, ingestion paths, workflow model, persistence |
| site/src/content/docs/guides/dashboard-workflow-editor.md | Dashboard operations, workflow canvas, docs tab, runs tab |
| site/src/content/docs/reference/api.md | Workflow-key ingestion, dashboard uploads, run queueing, realtime feed |
Use the CI-safe suites for the default feedback loop:
```sh
cd server && bun test
cd models/agents && go test ./...
```
For the fuller local service API flow, run:
```sh
make live-service-test
```

That command is the full local entrypoint. It brings up infra, migrates and seeds the database, starts the required services, verifies health, then uploads a real image, acknowledges it, executes the selected workflow, polls for completion, and verifies the document can be published.
It reuses the existing `.env.docker` files for secrets and service config, but it overrides the infra endpoints at runtime so the probe runs against a separate local Postgres, Redis, and MinIO stack instead of your normal dev state. The isolated API container also gets an explicit local `S3_PUBLIC_BASE_URL`, so the publish step verifies a real public URL instead of relying on fallback behavior.
Use `make live-service-stack-down` afterward to stop the stack that the live test started and remove its isolated volumes.
The service API is the project-scoped orchestration surface for service integrations and other non-dashboard clients.
- `GET /api/service/workflows` lists available workflows for the API key's organization.
- `POST /api/service/uploads/presign` declares upload intent, including visibility.
- `POST /api/service/uploads/ack` verifies the object and creates the document.
- `POST /api/service/workflow-executions` queues a workflow against explicit document ids or a scoped selection.
- `GET /api/service/workflow-executions/:id` reads execution state.
- `GET /api/service/documents` and `POST /api/service/documents/visibility` cover document inspection and publishing.
- `GET /api/openapi.json` serves the generated OpenAPI spec for this surface.
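Because workflow executions are asynchronous, a service client typically queues one and then polls `GET /api/service/workflow-executions/:id` until it settles. A minimal TypeScript sketch of the client-side polling pieces follows; the status strings here (`completed`, `failed`, `cancelled`) are assumptions, so check the spec at `GET /api/openapi.json` for the real enum.

```typescript
// Assumed terminal statuses for a workflow execution; verify against the
// OpenAPI spec before relying on these values.
const TERMINAL_STATUSES = new Set(["completed", "failed", "cancelled"]);

function isTerminal(status: string): boolean {
  return TERMINAL_STATUSES.has(status);
}

// Capped exponential backoff schedule (in ms) for polling execution state:
// 500, 1000, 2000, ... up to capMs.
function backoffSchedule(attempts: number, baseMs = 500, capMs = 8000): number[] {
  return Array.from({ length: attempts }, (_, i) =>
    Math.min(baseMs * 2 ** i, capMs),
  );
}
```

A real client would sleep for each scheduled delay between `GET` calls and stop as soon as `isTerminal` returns true, treating schedule exhaustion as a timeout.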
See CONTRIBUTING.md for contributor workflow. If you use AI coding agents, also read AGENTS.md.
Built by Arcnem AI in Tokyo.





