# AI-First Process Automation with Large Multimodal Models (LMMs)
OpenAdapt is the open source software adapter between Large Multimodal Models (LMMs) and traditional desktop and web GUIs.
Collect human demonstrations, learn agent policies, and evaluate autonomous execution - all from a unified CLI.
Join Discord{ .md-button .md-button--primary } View on GitHub{ .md-button }
OpenAdapt bridges the gap between powerful AI models and everyday software automation. Instead of writing complex scripts or learning APIs, you simply:

- **Demonstrate**: Show the agent how to perform a task by doing it yourself
- **Learn**: Let OpenAdapt learn an agent policy from your demonstration trajectory
- **Execute**: Deploy your trained agent to autonomously perform the task
- **Evaluate**: Measure agent performance on standardized benchmarks
```mermaid
flowchart LR
    subgraph Demonstrate["1. Demonstrate"]
        A[Human Trajectory] --> B[Capture]
    end
    subgraph Learn["2. Learn"]
        B --> C[Policy Learning]
    end
    subgraph Execute["3. Execute"]
        C --> D[Trained Policy]
        D --> E[Agent Deployment]
    end
    subgraph Evaluate["4. Evaluate"]
        D --> F[Benchmark]
        F --> G[Metrics]
    end
    GROUND[Grounding] -.-> E
    RETRIEVE[Retrieval] -.-> C
    PRIV[Privacy] -.-> B
```
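In CLI terms, a compressed end-to-end pass uses the commands from the Quick Start below (the task and model names here are the ones from those examples):

```bash
# 1. Demonstrate: record yourself performing the task once (stop with Ctrl+C).
openadapt capture start --name my-task

# 2. Learn: fit an agent policy to the recorded trajectory.
openadapt train start --capture my-task --model qwen3vl-2b

# 3-4. Execute & Evaluate: run the trained policy against a benchmark.
openadapt eval run --checkpoint training_output/model.pt --benchmark waa
```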
- Works with any Large Multimodal Model: Claude, GPT-4V, Gemini, Qwen-VL, or your own fine-tuned models.
- No manual prompt engineering required: OpenAdapt learns agent policies directly from your demonstration trajectories.
- Works with all desktop GUIs, including native applications, web browsers, and virtualized environments.
- MIT licensed: full transparency, community-driven development, and no vendor lock-in.
Install OpenAdapt with the features you need:

```bash
pip install openadapt[all]  # Everything
```

**What You'll See:**

```text
Successfully installed openadapt-1.0.0
Successfully installed openadapt-capture-1.0.0
Successfully installed openadapt-ml-1.0.0
Successfully installed openadapt-evals-1.0.0
...
```
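A quick sanity check that the unified CLI landed on your PATH (assuming the standard `--help` flag, as in most CLIs):

```bash
# Should list the subcommands provided by the sub-packages (capture, train, eval, ...).
openadapt --help
```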
**Capture a demonstration:**

```bash
openadapt capture start --name my-task
# Perform your task, then press Ctrl+C
```

**What You'll See:**

```text
[INFO] Starting capture session: my-task
[INFO] Recording started. Press Ctrl+C to stop.
[INFO] Capturing events...
^C
[INFO] Capture stopped
[INFO] Saved 127 events to database
[SUCCESS] Capture 'my-task' completed successfully
```
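Captures are addressed by name, so you can record several demonstrations of related tasks under distinct names and train on them separately later (the names below are illustrative):

```bash
# Record one demonstration per task; end each recording with Ctrl+C.
openadapt capture start --name login-flow
openadapt capture start --name invoice-entry
```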
**Train an agent policy:**

```bash
openadapt train start --capture my-task --model qwen3vl-2b
```

**What You'll See:**

```text
[INFO] Loading capture: my-task
[INFO] Found 127 events
[INFO] Initializing model: qwen3vl-2b
[INFO] Starting training...
Epoch 1/10: 100%|████████████| 127/127 [00:45<00:00]
Epoch 2/10: 100%|████████████| 127/127 [00:43<00:00]
...
[SUCCESS] Training complete. Model saved to: training_output/model.pt
```
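Training is parameterized by capture name and model id, so sweeping several recorded demonstrations with the same base model is a small loop (the capture names are illustrative; `qwen3vl-2b` is the model id from the example above):

```bash
# Train one policy per recorded demonstration.
for capture in login-flow invoice-entry; do
  openadapt train start --capture "$capture" --model qwen3vl-2b
done
```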
**Evaluate on a benchmark:**

```bash
openadapt eval run --checkpoint training_output/model.pt --benchmark waa
```

**What You'll See:**

```text
[INFO] Loading checkpoint: training_output/model.pt
[INFO] Running benchmark: waa
[INFO] Processing task 1/10...
[INFO] Processing task 2/10...
...
[SUCCESS] Evaluation complete
Results:
  Success Rate: 85.0%
  Average Steps: 12.3
  Total Time: 5m 32s
```
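Because `eval run` takes an explicit checkpoint path, comparing two training runs on the same benchmark just means pointing it at each checkpoint in turn (the second path is hypothetical):

```bash
# Evaluate both checkpoints on the same benchmark for a like-for-like comparison.
for ckpt in training_output/model.pt training_output/model_v2.pt; do
  openadapt eval run --checkpoint "$ckpt" --benchmark waa
done
```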
**Success Indicators:**

- Green checkmarks or `[SUCCESS]` messages indicate completion
- No error or warning messages in the output
- Output files created in expected locations
- Metrics show reasonable values (success rate > 0%)
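For unattended runs these checks can be scripted; a minimal sketch, assuming you redirect the CLI output to a log file:

```bash
# Capture the run's output, then fail fast if it did not end in [SUCCESS].
openadapt eval run --checkpoint training_output/model.pt --benchmark waa 2>&1 | tee run.log
grep -q '\[SUCCESS\]' run.log || { echo "run did not complete" >&2; exit 1; }
# Surface any warnings or errors for manual review.
grep -iE 'error|warn' run.log || true
```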
See the Installation Guide for detailed setup instructions.
OpenAdapt v1.0+ uses a modular meta-package architecture. The main `openadapt` package provides a unified CLI and depends on focused sub-packages:
| Package | Description |
|---|---|
| openadapt-capture | Demonstration collection and storage |
| openadapt-ml | Policy learning, training, inference |
| openadapt-evals | Benchmark evaluation |
| openadapt-viewer | Trajectory visualization |
| openadapt-grounding | UI element grounding |
| openadapt-retrieval | Trajectory retrieval |
| openadapt-privacy | PII/PHI scrubbing |
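Because the meta-package simply aggregates these components, you can keep the footprint lean by installing only what you need (assuming the sub-packages are published on PyPI under the names above):

```bash
# Demonstration capture plus policy learning, without the eval/viewer stack.
pip install openadapt-capture openadapt-ml
```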
See the full Architecture Documentation for detailed diagrams.
- Discord: Join our community
- GitHub: OpenAdaptAI
- Twitter: @OpenAdaptAI
OpenAdapt is released under the MIT License.