PYOB is a high-fidelity autonomous agent that performs surgical code modifications, cross-file dependency tracking, and self-healing verification, all without destroying your codebase.
Getting Started · How It Works · Architecture · Documentation
PYOB is an autonomous code review and feature engineering system that continuously analyzes, patches, and evolves your codebase through a multi-stage verification pipeline. Unlike "black-box" coding assistants that rewrite entire files, PyOB operates with surgical XML-based edits, a persistent symbolic dependency ledger, and human-in-the-loop checkpoints, ensuring your project is never left in a broken state.
| Feature | Traditional AI Assistants | PYOB |
|---|---|---|
| Edit Strategy | Full file rewrites | Surgical <SEARCH>/<REPLACE> XML blocks |
| Dependency Awareness | None | Symbolic ledger (SYMBOLS.json) with ripple detection |
| Error Recovery | Manual | Context-aware self-healing with auto-rollback |
| Verification | None | 4-layer pipeline: XML matching → linting → PIR → runtime test |
| State Persistence | Stateless | MEMORY.md, ANALYSIS.md, HISTORY.md, SYMBOLS.json |
| API Resilience | Single key, fails on rate limit | Multi-key rotation with automatic local LLM fallback |
Every proposed change uses exact <SEARCH>/<REPLACE> blocks matched against the source. PyOB utilizes a Multi-Strategy Matcher (Exact, Stripped, Normalized, Regex Fuzzy, and Robust Line-Matching). If any block fails to align, the entire patch is rejected and auto-regenerated to prevent partial, broken edits.
PyOB maintains a live dependency graph (SYMBOLS.json). When a function signature or constant changes, every file that references that symbol is automatically queued for synchronized updates via the Symbolic Cascade system, ensuring cross-file integrity.
- Atomic XML Matching: Strict anchor validation with Smart Indent Alignment.
- Syntactic Validation: `ruff` (Python), `node --check` (JS), brace-balancing (CSS).
- Downstream Mypy Checks: Mandatory workspace-wide type checking after every edit.
- Context-Aware Self-Healing (PIR): Feeds the original goal + error + broken code back to the AI for automated repair.
- Runtime Smoke Test: Launches the app for 10 seconds, monitoring `stdout`/`stderr` for tracebacks.
PyOB is capable of targeting its own source code. It can refactor its mixins, optimize its engine logic, and add new features to itself.
- Recursive Safety Pods: Before a self-edit, PyOB shelters its working source code in an external backup directory (`~/Documents/PyOuroBoros_Backups`).
- Autonomous Forge: If the compiled DMG version evolves itself, it triggers a background build, replaces the binary in `/Applications`, and reboots.
Interactive terminal checkpoints at every stage:
- `AUGMENT_PROMPT`: Inject instructions into the AI's mental process.
- `EDIT_CODE` / `EDIT_XML`: Polish proposed changes in your terminal editor (Nano/Vim).
- `FULL_DIFF`: View the complete unified diff in a pager.
- `REGENERATE`: Force the AI to rethink the implementation.
- Primary: Gemini 2.5 Flash with multi-key rotation and 429-aware cooldowns
- Fallback: Local Ollama (`qwen3-coder:30b`) activates automatically when all API keys are rate-limited
- Progress Spinner: Real-time token estimation with animated progress bar during inference
PyOB can be used either as a pre-compiled standalone application (Recommended) or by running the source code directly.
Regardless of installation method, PyOB utilizes external tools for code verification and local LLM fallback.
| Requirement | Purpose | Required? |
|---|---|---|
| Ollama | Local model server (fallback) | ⚡ Recommended |
| `ruff` | Python linting & formatting | ⚡ Recommended |
| `mypy` | Static type checking | ⚡ Recommended |
| Node.js | JavaScript syntax validation | Optional |
Download the latest pre-built binaries from the Releases Page.
- Download and Mount: Open `PyOB-v0.2.0.dmg`.
- Install: Drag the PyOB icon into your `/Applications` folder.
- Launch: Open PyOB via Spotlight (`Cmd + Space` > "PyOB").
- Setup: A Terminal window will open automatically. Follow the prompts to enter your Gemini API keys and select your AI models. These settings are saved to `~/.pyob_config`.
- Download: Save `PyOB.exe` to a known directory.
- Launch: Double-click the executable or run it via PowerShell/CMD.
- Setup: Follow the on-screen prompts to configure your API keys and model preferences.
Use this method if you wish to modify PyOB or contribute to its development.
# Clone the repository
git clone https://github.com/vicsanity623/PyOB.git
cd PyOB
# 1. Install Python 3.12 (Homebrew)
brew install python@3.12
# 2. Wipe the old environment
deactivate 2>/dev/null
rm -rf build_env
# 3. Create the 3.12 environment
python3.12 -m venv build_env
# 4. Activate and Install
source build_env/bin/activate
pip install --upgrade pip
pip install ruff mypy requests ollama pyinstaller psutil chardet charset-normalizer types-chardet
If you intend to use the local fallback feature, pull the recommended model:
ollama pull qwen3-coder:30b
Run the launcher directly. On the first run, you will be prompted to configure your API keys and model settings.
python PyOB_launcher.py
- Targeting: Provide the path to the project you want PyOB to manage.
- Dashboard: Open `http://localhost:5000` to watch the "Observer" dashboard in real time.
- Approve: When PyOB proposes a fix or feature, review the diff and hit `ENTER`.
- Observe: Watch the Cascade Queue trigger as PyOB ripples changes through your files.
- Self-Evolution: To have PyOB improve itself, target its own root: `python pyob_launcher.py .`
- Verification: The system runs the 4-layer pipeline (XML Match → Lint → PIR → Runtime Test) to ensure the code is functional.
- Persistence: Your `MEMORY.md` and `HISTORY.md` are updated to maintain context for the next iteration.
PyOB will:
1. Bootstrap: Generate `ANALYSIS.md` (project map) and `SYMBOLS.json` (dependency graph)
2. Target: Intelligently select the next file to review based on history and dependencies
3. Analyze: Scan for bugs, lint errors, and architectural gaps
4. Propose: Generate a `PEER_REVIEW.md` (bug fixes) or `FEATURE.md` (new features)
5. Checkpoint: Wait for your approval before applying any changes
6. Verify: Run the 4-layer verification pipeline and auto-heal if needed
7. Cascade: Detect and queue downstream dependency impacts
8. Persist: Update `MEMORY.md` and `HISTORY.md` with session context
9. Iterate: Loop back to step 2 with a 2-minute cooldown
PyOB is built using a Mixin-based architecture to separate concerns and prevent context bloat:
| Component | File | Role |
|---|---|---|
| Entrance Controller | `entrance.py` | Master loop, symbolic targeting, and Recursive Forge management. |
| Auto Reviewer | `autoreviewer.py` | Orchestrates the 6-phase pipeline and feature implementation. |
| Core Utilities | `core_utils.py` | LLM streaming, Smart Python detection, and Cyberpunk Logging. |
| Prompts & Memory | `prompts_and_memory.py` | 9 specialized prompt templates and Transactional Memory logic. |
| Structure Parser | `structure_parser.py` | High-fidelity AST parsing for Python/JS signatures. |

┌───────────────────────────────────────────────────────────────┐
│                      ENTRANCE CONTROLLER                      │
│                         (entrance.py)                         │
│   ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐   │
│   │ Target   │   │ Analysis │   │ Symbolic │   │ History  │   │
│   │ Selector │   │ Builder  │   │ Ripple   │   │ Tracker  │   │
│   │          │   │          │   │ Engine   │   │          │   │
│   └────┬─────┘   └────┬─────┘   └────┬─────┘   └────┬─────┘   │
│        │              │              │              │         │
│        ▼              ▼              ▼              ▼         │
│   ┌───────────────────────────────────────────────────────┐   │
│   │                   TARGETED REVIEWER                   │   │
│   │            (Scoped AutoReviewer Instance)             │   │
│   └───────────────────────────┬───────────────────────────┘   │
│                               │                               │
│   ┌───────────────────────────▼───────────────────────────┐   │
│   │             FINAL VERIFICATION & HEALING              │   │
│   │          (10s Runtime Test + Auto-Rollback)           │   │
│   └───────────────────────────────────────────────────────┘   │
└───────────────────────────────┬───────────────────────────────┘
                                │
                                ▼
┌───────────────────────────────────────────────────────────────┐
│                         AUTO REVIEWER                         │
│                       (autoreviewer.py)                       │
│   ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐   │
│   │ 6-Phase  │   │ XML Edit │   │ Linter   │   │ Runtime  │   │
│   │ Pipeline │   │ Engine   │   │ Fix Loop │   │ Verifier │   │
│   └────┬─────┘   └────┬─────┘   └────┬─────┘   └────┬─────┘   │
│        │              │              │              │         │
│   ┌────▼──────────────▼──────────────▼──────────────▼─────┐   │
│   │                 CORE UTILITIES MIXIN                  │   │
│   │                    (core_utils.py)                    │   │
│   │ • Gemini Streaming • Ollama Streaming • Key Rotation  │   │
│   │ • User Approval • Workspace Backup • XML Parser       │   │
│   └───────────────────────────────────────────────────────┘   │
│                                                               │
│   ┌───────────────────────────────────────────────────────┐   │
│   │                PROMPTS & MEMORY MIXIN                 │   │
│   │                (prompts_and_memory.py)                │   │
│   │ • Template Management • Rich Context Builder          │   │
│   │ • Memory Update/Refactor                              │   │
│   │ • Impactful History Extraction                        │   │
│   └───────────────────────────────────────────────────────┘   │
└───────────────────────────────────────────────────────────────┘
| File | Purpose | Managed By |
|---|---|---|
| `ANALYSIS.md` | Recursive project map with file summaries and structural dropdowns | entrance.py |
| `SYMBOLS.json` | Dependency graph: definitions → files, references → call sites | entrance.py |
| `MEMORY.md` | Synthesized session memory; auto-refactored every 2 iterations | prompts_and_memory.py |
| `HISTORY.md` | Append-only ledger of every unified diff applied to the project | entrance.py |
| `PEER_REVIEW.md` | Generated bug fix proposals (created during Phase 1) | autoreviewer.py |
| `FEATURE.md` | Generated feature proposals (created during Phase 2) | autoreviewer.py |
Scans all supported files (.py, .js, .ts, .html, .css, .json, .sh), runs linters, detects lazy code patterns (e.g., typing.Any), and generates surgical patch proposals.
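
The lazy-code check described above can be pictured as a small AST walk. This is a minimal sketch, not PyOB's actual `scan_for_lazy_code()`; the function name `scan_for_any_hints` and the message format are assumptions made for illustration.

```python
import ast


def scan_for_any_hints(source: str) -> list[str]:
    """Return one message per `Any` found inside a type annotation."""
    issues = []
    for node in ast.walk(ast.parse(source)):
        # Gather the annotation expressions attached to this node, if any.
        annotations = []
        if isinstance(node, ast.arg) and node.annotation is not None:
            annotations.append(node.annotation)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.returns is not None:
                annotations.append(node.returns)
        elif isinstance(node, ast.AnnAssign):
            annotations.append(node.annotation)

        for annotation in annotations:
            for sub in ast.walk(annotation):
                # Matches both `Any` and `typing.Any`.
                is_any = (isinstance(sub, ast.Name) and sub.id == "Any") or (
                    isinstance(sub, ast.Attribute) and sub.attr == "Any"
                )
                if is_any:
                    issues.append(f"line {sub.lineno}: uses typing.Any")
    return issues
```

Each message could then be handed to the patch prompt as a custom issue alongside the linter output.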
If no bugs were found in Phase 1, the AI analyzes a randomly selected file and proposes one interactive, user-facing feature with a <SNIPPET> code block.
After any modification, PyOB runs mypy across the workspace. If type errors surface in dependent files, it generates cascade fixes using the PCF.md prompt template.
Identifies the project entry point (if __name__ == "__main__": or main.py/app.py), launches it for 10 seconds, and monitors stdout/stderr for crashes. Auto-installs missing pip packages on ModuleNotFoundError.
Synthesizes all session actions into MEMORY.md using the UM.md prompt template, preserving architectural decisions and dependency mappings.
Aggressively summarizes MEMORY.md to prevent context bloat, consolidating repetitive logs into a concise knowledge base.
PyOB uses 9 specialized prompt templates, auto-generated as .md files in the target directory:
| Template | Full Name | Purpose |
|---|---|---|
| `PP.md` | Patch Prompt | Code review and bug fix generation |
| `PF.md` | Propose Feature | Interactive feature proposal |
| `IF.md` | Implement Feature | Surgical feature implementation |
| `ALF.md` | Auto Linter Fix | Syntax error repair |
| `FRE.md` | Fix Runtime Error | Runtime crash diagnosis and repair |
| `PIR.md` | Post-Implementation Repair | Context-aware error recovery (knows the original goal) |
| `PCF.md` | Propose Cascade Fix | Cross-file dependency repair |
| `UM.md` | Update Memory | Memory synthesis and consolidation |
| `RM.md` | Refactor Memory | Aggressive memory summarization |
Gemini API keys are configured in core_utils.py in the GEMINI_API_KEYS list. Multiple keys enable automatic rotation and rate-limit resilience.
| Setting | Default | Location |
|---|---|---|
| Gemini Model | `gemini-2.5-flash` | core_utils.py → GEMINI_MODEL |
| Local Model | `qwen3-coder:30b` | core_utils.py → LOCAL_MODEL |
| Temperature | `0.1` | core_utils.py → stream_gemini() / stream_ollama() |
PyOB automatically skips certain directories and files to avoid self-modification and virtual environments:
Ignored Directories
.git, autovenv, venv, .venv, code, .mypy_cache, .ruff_cache, patch_test, env, __pycache__, node_modules, .vscode, .idea, other_dir
Ignored Files
core_utils.py, prompts_and_memory.py, autoreviewer.py, entrance.py, all prompt templates (ALF.md, FRE.md, etc.), sw.js, manifest.json, package-lock.json, auto.py, any_other_file_to_ignore.filetype
.py · .js · .ts · .html · .css · .json · .sh
PyOB's edit engine is a multi-strategy matcher that ensures reliable code modifications:
1. Exact Match: Direct string replacement
2. Stripped Match: Leading/trailing whitespace tolerance
3. Normalized Match: Ignores comments and collapses whitespace
4. Regex Fuzzy Match: Line-by-line regex matching with indent tolerance
5. Robust Line Match: Stripped line-by-line content comparison
If all 5 strategies fail for any <SEARCH> block, the entire multi-block edit is rejected and the AI is asked to regenerate.
The engine detects the base indentation of both the <SEARCH> and <REPLACE> blocks, then re-aligns the replacement to match the source file's indentation style, preventing whitespace corruption.
| Mechanism | Description |
|---|---|
| Workspace Backup | Full in-memory snapshot before every modification attempt |
| Atomic Rollback | Restores the entire workspace if verification fails |
| Import Preservation | AST-based import retention ensures no imports are accidentally deleted |
| Cascaded Healing | Downstream type errors trigger automatic synchronized repairs |
| Rate-Limit Quarantine | 429'd API keys get a 20-minute timeout; system auto-falls back to local LLM |
| Timeout Protection | All user prompts have configurable timeouts (default: 220s) with auto-proceed |
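
The rate-limit quarantine in the table above can be sketched as a small key pool that remembers when each key becomes usable again. This is an illustrative sketch under stated assumptions: the class name `KeyRotator` and its methods are hypothetical, while `key_cooldowns` and the 20-minute window follow the description in this document.

```python
import time


class KeyRotator:
    """Hands out API keys, skipping any key that is still in quarantine."""

    def __init__(self, keys, cooldown_seconds=20 * 60, clock=time.time):
        self.keys = list(keys)
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self.key_cooldowns = {}  # key -> timestamp at which it is usable again

    def quarantine(self, key):
        """Call this when a request with `key` returns HTTP 429."""
        self.key_cooldowns[key] = self.clock() + self.cooldown_seconds

    def next_available(self):
        """Return the first usable key, or None if every key is cooling down."""
        now = self.clock()
        for key in self.keys:
            if self.key_cooldowns.get(key, 0) <= now:
                return key
        return None  # the caller would fall back to the local model here
```

Returning `None` rather than raising keeps the fallback decision with the caller, which matches the "auto-falls back to local LLM" behavior described in the table.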
PyOB includes a built-in Real-Time Control Room. While the engine runs in the terminal, you can monitor the process through a glowing, cyberpunk-themed web interface.
- Iteration Tracking: See exactly which turn the engine is on.
- Cascade Monitoring: Watch files enter and exit the symbolic queue.
- Live Memory Stream: Read the engine's updated mental model as it develops.
- URL: `http://localhost:5000` (built automatically on launch).
For in-depth technical documentation covering the verification pipeline, symbolic dependency management, prompt engineering, and more, see the Technical Documentation.
PyOB/
├── entrance.py              # Entrance Controller: master loop & symbolic orchestration
├── autoreviewer.py          # Auto Reviewer: 6-phase pipeline & edit engine
├── core_utils.py            # Core Utilities: LLM streaming, XML parser, key rotation
├── prompts_and_memory.py    # Prompts & Memory: template management & persistence
├── docs/
│   └── DOCUMENTATION.md     # Full technical documentation
└── README.md                # This file
your-project/
├── ANALYSIS.md              # Auto-generated project map
├── SYMBOLS.json             # Dependency graph
├── MEMORY.md                # Persistent session memory
├── HISTORY.md               # Change history ledger
├── PEER_REVIEW.md           # Pending bug fix proposals (temporary)
├── FEATURE.md               # Pending feature proposals (temporary)
├── FAILED_PEER_REVIEW.md    # Rolled-back bug fixes (for debugging)
├── FAILED_FEATURE.md        # Rolled-back features (for debugging)
├── PP.md, PF.md, IF.md ...  # Prompt templates (auto-generated)
└── [your source files]
Built with surgical precision.
Version: 2.0 · Last Updated: March 2026 · Architecture: Python 3.10+ · Gemini 2.5 Flash / Ollama Local LLM
- System Philosophy
- Architecture Overview
- Module Reference
- The Verification & Healing Pipeline
- Symbolic Dependency Management
- The XML Edit Engine
- Prompt Template System
- Human-in-the-Loop Bridging
- LLM Backend & Resilience
- Persistence & State Management
- Safety & Rollback Mechanisms
- Configuration Reference
- Internal Constants & Defaults
- Operational Workflow
- Troubleshooting
PyOB is built on the principle of constrained agency. Rather than giving an AI free rein to rewrite files, PyOB forces every modification through:
- Surgical XML blocks: Small, verifiable `<SEARCH>`/`<REPLACE>` patches instead of full file rewrites
- Symbolic verification: A persistent dependency ledger that tracks the global impact of every change
- Multi-layer healing: Four independent verification layers that catch errors at different levels (syntax, type, runtime)
- Human checkpoints: Interactive approval gates at every critical decision point
This design eliminates the "hallucination-deletion" spiral common in autonomous coding agents, where an AI hallucinates a bug, deletes working code to "fix" it, then cascades errors throughout the project.
| Principle | Implementation |
|---|---|
| Never leave broken state | Atomic workspace backup/restore before every modification |
| Verify, don't trust | Every AI output is validated before disk write |
| Surgical over wholesale | <SEARCH> blocks must be 2-5 lines; no full-file rewrites |
| Context over repetition | PIR protocol feeds the original goal back on failure |
| Human sovereignty | Every change requires explicit or timeout-based approval |
CoreUtilsMixin (core_utils.py)
├── Provides: LLM streaming, XML edit engine, key rotation,
│             user approval, workspace backup/restore,
│             entry file detection, import preservation
│
PromptsAndMemoryMixin (prompts_and_memory.py)
├── Provides: Prompt template management, memory CRUD,
│             rich context building, history extraction
│
AutoReviewer(CoreUtilsMixin, PromptsAndMemoryMixin) (autoreviewer.py)
├── Provides: 6-phase review pipeline, file analysis,
│             feature proposal/implementation, PR generation,
│             linter fix loops, runtime verification,
│             downstream cascade checks
│
├── TargetedReviewer(AutoReviewer) (entrance.py)
│   └── Overrides scan_directory() to target a single file
│
└── EntranceController (entrance.py)
    ├── Owns: AutoReviewer instance (self.llm_engine)
    ├── Provides: Master loop, symbolic targeting, ripple detection,
    │             analysis/ledger management, structure parsing,
    │             final verification & healing
    └── Entry Point: __main__ → run_master_loop()
User runs: python entrance.py /path/to/project
             │
             ▼
┌──────────────────────────┐
│ EntranceController       │
│ __init__()               │
│ • Sets target_dir        │
│ • Creates AutoReviewer   │
│ • Loads SYMBOLS.json     │
└────────────┬─────────────┘
             │
             ▼
┌──────────────────────────┐
│ run_master_loop()        │◀──────────────┐
│ 1. Bootstrap if          │               │
│    ANALYSIS.md missing   │               │
│ 2. Call execute_         │               │
│    targeted_iteration()  │               │
└────────────┬─────────────┘               │
             │                             │
             ▼                             │
┌──────────────────────────┐               │
│ execute_targeted_        │               │
│ iteration()              │               │
│ 1. Backup workspace      │               │
│ 2. Pick target file      │               │
│ 3. Create Targeted       │               │
│    Reviewer              │               │
│ 4. Run pipeline          │               │
│ 5. Update analysis       │               │
│ 6. Detect ripples        │               │
│ 7. Final verify          │               │
└────────────┬─────────────┘               │
             │                             │
             ▼                             │
┌──────────────────────────┐               │
│ AutoReviewer.            │               │
│ run_pipeline()           │               │
│ Phase 1: Scan/Fix        │               │
│ Phase 2: Propose         │               │
│ Phase 3: Cascade         │               │
│ Phase 4: Runtime         │               │
│ Phase 5: Memory          │               │
│ Phase 6: Refactor        │               │
└────────────┬─────────────┘               │
             │                             │
             ▼                             │
┌──────────────────────────┐               │
│ 120s cooldown            │───────────────┘
└──────────────────────────┘
The top-level orchestrator that manages symbolic targeting, dependency tracking, and final runtime verification.
A scoped subclass of AutoReviewer that overrides scan_directory() to operate on exactly one file.
| Method | Signature | Description |
|---|---|---|
| `__init__` | `(target_dir: str, target_file: str)` | Sets the forced target file |
| `scan_directory` | `() → list[str]` | Returns only `[self.forced_target_file]` if it exists |
The master controller that owns the main event loop.
| Method | Signature | Description |
|---|---|---|
| `__init__` | `(target_dir: str)` | Initializes paths, creates AutoReviewer, loads SYMBOLS.json |
| `run_master_loop` | `()` | Infinite loop: bootstrap → target → iterate → cooldown (120s) |
| `execute_targeted_iteration` | `(iteration: int)` | Single iteration: backup → pick target → run pipeline → verify → cascade |
| `_run_final_verification_and_heal` | `(backup_state: dict) → bool` | Launches app for 10s; auto-heals up to 3 times; rolls back on failure |
| `detect_symbolic_ripples` | `(old, new, source_file) → list` | Finds files referencing symbols defined in the modified file |
| `pick_target_file` | `() → str` | Uses LLM to intelligently select next file based on ANALYSIS.md and HISTORY.md |
| `build_initial_analysis` | `()` | Genesis scan: builds ANALYSIS.md and SYMBOLS.json from scratch |
| `update_analysis_for_single_file` | `(target_abs_path, rel_path)` | Updates one file's section in ANALYSIS.md |
| `update_ledger_for_file` | `(rel_path, code)` | Parses definitions (AST for Python, regex for JS/TS) and references |
| `generate_structure_dropdowns` | `(filepath, code) → str` | Generates HTML `<details>` dropdowns for imports, classes, functions, constants |
| `append_to_history` | `(rel_path, old_code, new_code)` | Appends truncated unified diff to HISTORY.md |
| `load_ledger` | `() → dict` | Loads SYMBOLS.json or returns empty schema |
| `save_ledger` | `()` | Writes SYMBOLS.json to disk |
Internal Parsers:
| Method | Language | Extracts |
|---|---|---|
| `_parse_python` | Python | Imports, classes, functions (with args), uppercase constants |
| `_parse_javascript` | JS/TS | Imports, classes, functions (3 patterns including arrows), constants/entities |
| `_parse_html` | HTML | Script sources, stylesheet links, element IDs |
| `_parse_css` | CSS | Class selectors (first 50) |
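
To make the Python row of the table concrete, here is a rough sketch of that kind of extraction using the standard `ast` module. It is not PyOB's `_parse_python`; the function name `parse_python_structure` and the returned dictionary layout are assumptions, and only top-level statements are inspected.

```python
import ast


def parse_python_structure(code: str) -> dict[str, list[str]]:
    """Collect imports, classes, functions (with args), and UPPERCASE constants."""
    found = {"imports": [], "classes": [], "functions": [], "constants": []}
    for node in ast.parse(code).body:  # top-level statements only
        if isinstance(node, ast.Import):
            found["imports"] += [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            found["imports"].append(node.module or ".")
        elif isinstance(node, ast.ClassDef):
            found["classes"].append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(arg.arg for arg in node.args.args)
            found["functions"].append(f"{node.name}({args})")
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                # Treat ALL_CAPS names as constants.
                if isinstance(target, ast.Name) and target.id.isupper():
                    found["constants"].append(target.id)
    return found
```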
The core review and modification engine. Inherits from both CoreUtilsMixin and PromptsAndMemoryMixin.
| Attribute | Type | Description |
|---|---|---|
| `target_dir` | `str` | Absolute path to the project being reviewed |
| `pr_file` | `str` | Path to PEER_REVIEW.md |
| `feature_file` | `str` | Path to FEATURE.md |
| `failed_pr_file` | `str` | Path to FAILED_PEER_REVIEW.md |
| `failed_feature_file` | `str` | Path to FAILED_FEATURE.md |
| `memory_file` | `str` | Path to MEMORY.md |
| `analysis_path` | `str` | Path to ANALYSIS.md |
| `history_path` | `str` | Path to HISTORY.md |
| `symbols_path` | `str` | Path to SYMBOLS.json |
| `memory` | `str` | Loaded content of MEMORY.md |
| `session_context` | `list[str]` | Running log of actions in the current session |
| `key_cooldowns` | `dict` | Maps API keys to their cooldown expiry timestamps |
| Method | Signature | Description |
|---|---|---|
| `get_language_info` | `(filepath) → tuple[str, str]` | Returns (language_name, language_tag) for syntax highlighting |
| `scan_for_lazy_code` | `(filepath, content) → list[str]` | AST walker that flags Any type hints |
| `run_linters` | `(filepath) → tuple[str, str]` | Runs ruff check and mypy on a single file |
| `build_patch_prompt` | `(lang_name, lang_tag, content, ruff_out, mypy_out, custom_issues) → str` | Assembles the PP.md prompt with all context |
| `get_valid_edit` | `(prompt, source_code, require_edit, target_filepath) → tuple[str, str, str]` | Core edit loop: streams LLM → validates XML → shows diff → gets approval |
| `run_linter_fix_loop` | `(context_of_change) → bool` | Runs ruff/node/CSS checks; auto-fixes up to 3 times per language |
| `run_and_verify_app` | `(context_of_change) → bool` | Launches entry file for 10s; auto-fixes crashes up to 3 times |
| `analyze_file` | `(filepath, current_index, total_files)` | Phase 1 per-file analysis: lint → scan → patch prompt → AI review |
| `scan_directory` | `() → list[str]` | Walks target_dir finding supported files, skipping ignored paths |
| `propose_feature` | `(target_path)` | Phase 2: generates a feature proposal with `<SNIPPET>` block |
| `implement_feature` | `(feature_content) → bool` | Applies an approved feature from FEATURE.md into the source |
| `implement_pr` | `(pr_content) → bool` | Applies all approved patches from PEER_REVIEW.md |
| `check_downstream_breakages` | `(target_path, rel_path) → bool` | Phase 3: runs workspace-wide mypy to detect cascading errors |
| `propose_cascade_fix` | `(mypy_errors, trigger_file) → bool` | Generates and applies a fix for downstream type errors |
| `write_pr` | `(filepath, explanation, llm_response)` | Appends a patch proposal to PEER_REVIEW.md |
| `run_pipeline` | `(current_iteration)` | Master pipeline: Phase 1→6 with approval checkpoints |
`get_valid_edit()` is the most complex method in PyOB. It handles:
- Pre-LLM Checkpoint: User can `EDIT_PROMPT`, `AUGMENT_PROMPT`, or `SKIP`
- Key Rotation: Cycles through available Gemini keys; falls back to Ollama
- 429 Handling: Rate-limited keys get 20-minute quarantine
- XML Validation: Calls `apply_xml_edits()` and rejects partial failures
- Diff Display: Shows colorized unified diff (green=added, red=removed, blue=hunks)
- Post-LLM Checkpoint: User can `APPLY`, `FULL_DIFF`, `EDIT_CODE`, `EDIT_XML`, `REGENERATE`, or `SKIP`
get_valid_edit() Flow:
┌─────────────┐   ┌──────────┐   ┌───────────┐   ┌──────────┐
│ Pre-LLM     │──▶│ Stream   │──▶│ Validate  │──▶│ Show     │
│ Checkpoint  │   │ LLM      │   │ XML Edits │   │ Diff     │
│ (User)      │   │ Response │   │ (5-layer) │   │ (color)  │
└─────────────┘   └──────────┘   └───────────┘   └────┬─────┘
                       ▲               │ Fail         │
                       └───────────────┘              ▼
                                               ┌──────────────┐
                                               │ Post-LLM     │
                                               │ Checkpoint   │
                                               │ (User)       │
                                               └──────────────┘
Provides foundational infrastructure shared across all components.
| Constant | Value | Description |
|---|---|---|
| `GEMINI_API_KEYS` | `list[str]` | Pool of Gemini API keys for rotation |
| `GEMINI_MODEL` | `"gemini-2.5-flash"` | Primary cloud LLM model |
| `LOCAL_MODEL` | `"qwen3-coder:30b"` | Fallback local Ollama model |
| `PR_FILE_NAME` | `"PEER_REVIEW.md"` | Bug fix proposal filename |
| `FEATURE_FILE_NAME` | `"FEATURE.md"` | Feature proposal filename |
| `FAILED_PR_FILE_NAME` | `"FAILED_PEER_REVIEW.md"` | Rolled-back PR filename |
| `FAILED_FEATURE_FILE_NAME` | `"FAILED_FEATURE.md"` | Rolled-back feature filename |
| `MEMORY_FILE_NAME` | `"MEMORY.md"` | Persistent memory filename |
| `ANALYSIS_FILE` | `"ANALYSIS.md"` | Project analysis filename |
| `HISTORY_FILE` | `"HISTORY.md"` | Change history filename |
| `SYMBOLS_FILE` | `"SYMBOLS.json"` | Dependency graph filename |
| `IGNORE_DIRS` | `set` | Directories excluded from scanning |
| `IGNORE_FILES` | `set` | Files excluded from scanning (includes PyOB's own source files) |
| `SUPPORTED_EXTENSIONS` | `set` | .py, .js, .ts, .html, .css, .json, .sh |
| Method | Signature | Description |
|---|---|---|
| `get_user_approval` | `(prompt_text, timeout=220) → str` | Non-blocking terminal input with countdown timer; supports Windows (msvcrt) and Unix (tty/termios/select) |
| `_launch_external_code_editor` | `(initial_content, file_suffix=".py") → str` | Opens proposed code in $EDITOR (default: nano) for manual refinement |
| `_edit_prompt_with_external_editor` | `(initial_prompt) → str` | Opens a prompt in $EDITOR for manual editing |
| `_get_user_prompt_augmentation` | `(initial_text="") → str` | Opens a temp .txt file for quick instruction injection |
| `backup_workspace` | `() → dict` | Snapshots all supported files into an in-memory dictionary |
| `restore_workspace` | `(state: dict)` | Writes all files in the snapshot back to disk |
| `load_memory` | `() → str` | Reads MEMORY.md content or returns empty string |
| `stream_gemini` | `(prompt, api_key, on_chunk) → str` | Streams Gemini API response via SSE; returns ERROR_CODE_XXX on failure |
| `stream_ollama` | `(prompt, on_chunk) → str` | Streams Ollama local model response |
| `_stream_single_llm` | `(prompt, key=None, context="") → str` | Unified LLM streamer with animated progress spinner |
| `get_valid_llm_response` | `(prompt, validator, context="") → str` | Loops LLM calls until validator(response) returns True |
| `ensure_imports_retained` | `(orig_code, new_code, filepath) → str` | AST-based comparison that prepends any imports dropped during editing |
| `apply_xml_edits` | `(source_code, llm_response) → tuple[str, str, bool]` | 5-strategy XML edit engine (see Section 6) |
| `_find_entry_file` | `() → str \| None` | Searches for `if __name__ == "__main__":`, then main.py/app.py |
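
The import-preservation idea behind `ensure_imports_retained` can be illustrated with a short AST comparison. This is a sketch under assumptions, not the real method: the helper names `import_lines` and `retain_imports` are hypothetical, and it simply prepends dropped imports without handling docstrings or `from __future__` ordering.

```python
import ast


def import_lines(code: str) -> list[str]:
    """Return the top-level import statements of `code` as normalized source lines."""
    tree = ast.parse(code)
    return [
        ast.unparse(node)
        for node in tree.body
        if isinstance(node, (ast.Import, ast.ImportFrom))
    ]


def retain_imports(orig_code: str, new_code: str) -> str:
    """Prepend any import that exists in orig_code but vanished from new_code."""
    try:
        before = import_lines(orig_code)
        after = set(import_lines(new_code))
    except SyntaxError:
        return new_code  # leave unparsable code for the linter layer to report
    missing = [line for line in before if line not in after]
    if not missing:
        return new_code
    return "\n".join(missing) + "\n" + new_code
```

A restored import that is genuinely unused would later be flagged by the linter, which is a safer failure mode than a silent `NameError` at runtime.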
Manages the prompt template lifecycle and persistent memory.
| Method | Signature | Description |
|---|---|---|
| `_ensure_prompt_files` | `()` | Writes all 9 prompt templates to the target directory on every initialization |
| `load_prompt` | `(filename, **kwargs) → str` | Loads a template and performs {key} → value substitution |
| `_get_impactful_history` | `() → str` | Extracts the 3 most recent HISTORY.md entries as a summary |
| `_get_rich_context` | `() → str` | Builds a comprehensive context block from ANALYSIS.md header + recent history + memory |
| `update_memory` | `()` | Synthesizes session actions into MEMORY.md via the UM.md template |
| `refactor_memory` | `()` | Aggressively summarizes MEMORY.md via the RM.md template to prevent bloat |
This is the most critical logic path in PyOB, ensuring codebase integrity through four distinct layers.
Edits are atomic. If the AI proposes five <EDIT> blocks and the system fails to find the exact <SEARCH> anchor for the fifth one, the entire multi-block patch is rejected. The system then triggers an automatic regeneration attempt rather than applying a partial (broken) fix.
Key behavior:
- The `apply_xml_edits()` method returns a boolean `all_edits_succeeded`
- If `False`, `get_valid_edit()` increments the `attempts` counter and loops
- No partial edits are written to disk
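
The all-or-nothing behavior can be shown with a deliberately simplified sketch. It uses exact matching only (the real engine tries five strategies), and the function name `apply_edits_atomically` and the regex are assumptions for illustration.

```python
import re

# Captures the body of each <SEARCH>/<REPLACE> pair inside an <EDIT> block.
EDIT_RE = re.compile(
    r"<EDIT>\s*<SEARCH>\n?(.*?)\n?</SEARCH>\s*<REPLACE>\n?(.*?)\n?</REPLACE>\s*</EDIT>",
    re.DOTALL,
)


def apply_edits_atomically(source: str, llm_response: str) -> tuple[str, bool]:
    """Apply every edit to a working copy; discard the copy if any anchor is missing."""
    working = source
    for search, replace in EDIT_RE.findall(llm_response):
        if search not in working:
            return source, False  # reject the whole patch, original untouched
        working = working.replace(search, replace, 1)
    return working, True
```

Because edits are applied to a working copy, a failure on the fifth block leaves the original text exactly as it was, and the caller can ask for a regenerated response.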
Immediately after file modification via run_linter_fix_loop():
| Language | Validator | Error Handling |
|---|---|---|
| Python | `ruff format` → `ruff check` | Groups errors by file; AI auto-fixes up to 3 times per file |
| JavaScript | `node --check` | Per-file validation; AI auto-fixes up to 3 times |
| CSS | Brace counting (`{` vs `}`) | Reports unbalanced braces; no AI auto-fix |
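
The CSS check in the last row is simple enough to sketch in a few lines. The function name `check_css_braces` and the message wording are assumptions; the counting approach is the one the table describes.

```python
from typing import Optional


def check_css_braces(css: str) -> Optional[str]:
    """Return an error message if '{' and '}' counts differ, else None."""
    opened, closed = css.count("{"), css.count("}")
    if opened != closed:
        return f"Unbalanced braces: {opened} '{{' vs {closed} '}}'"
    return None
```

Counting alone does not catch braces that are balanced but misordered (for example `} {`), so this is a cheap sanity check rather than a CSS parser.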
If Layer 2 or Layer 4 detects an error, PyOB initiates a Post-Implementation Repair (PIR).
| Fixer Type | Context Provided |
|---|---|
| Standard Fixer (`ALF.md`, `FRE.md`) | Error text + broken code |
| PIR Fixer (`PIR.md`) | Original feature request + error text + broken code |
The PIR advantage: When the AI knows what it was trying to do (e.g., "I duplicated a function while trying to add timezone support"), it can make a logically correct repair instead of a blind syntax fix.
Controlled by both autoreviewer.py (run_and_verify_app) and entrance.py (_run_final_verification_and_heal):
- Identifies the project's entry point (searches for `if __name__ == "__main__":`, then `main.py`/`app.py`)
- Launches the app with `subprocess.Popen` and monitors for 10 seconds
- Checks `stderr` for crash keywords: `Traceback`, `Exception:`, `Error:`, `NameError:`, `AttributeError:`
- On `ModuleNotFoundError`: auto-installs the missing package via `pip install`
- On crash: feeds the traceback to `_fix_runtime_errors()`, which identifies the most likely culprit file from the traceback path
- Retries up to 3 times before performing a full workspace rollback
Return codes considered non-crash: None, 0, 15, -15, 137, -9, 1 (process signals)
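
The launch-and-watch step can be approximated as follows. This is a simplified sketch, not the code in `run_and_verify_app`: the function name `smoke_test` is hypothetical, it only scans `stderr` for the keywords listed above, and it omits the return-code whitelist, auto-install, and retry logic.

```python
import subprocess
import sys

CRASH_KEYWORDS = ("Traceback", "Exception:", "Error:", "NameError:", "AttributeError:")


def smoke_test(entry_file: str, seconds: float = 10.0) -> tuple[bool, str]:
    """Run entry_file for up to `seconds`; return (looks_healthy, captured_stderr)."""
    proc = subprocess.Popen(
        [sys.executable, entry_file],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    try:
        _, stderr = proc.communicate(timeout=seconds)
    except subprocess.TimeoutExpired:
        # Still running after the window (typical for servers): stop it
        # and collect whatever it wrote so far.
        proc.kill()
        _, stderr = proc.communicate()
    crashed = any(keyword in stderr for keyword in CRASH_KEYWORDS)
    return (not crashed), stderr
```

A process that is killed at the end of the window is not treated as a crash here, which mirrors the idea that signal-style exit codes are considered normal for long-running apps.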
PyOB tracks the "Global Impact" of code changes via SYMBOLS.json.
{
"definitions": {
"MyClass": "models/user.py",
"calculate_total": "utils/math.py",
"initApp": "static/app.js"
},
"references": {
"main.py": ["MyClass", "calculate_total", "initApp"],
"views/dashboard.py": ["MyClass", "calculate_total"],
"static/app.js": ["initApp"]
}
}
During the Genesis Scan (build_initial_analysis()), the controller parses every file:
| Language | Definition Extraction | Reference Extraction |
|---|---|---|
| Python | AST: `FunctionDef`, `ClassDef` names | Regex: `[a-zA-Z0-9_$]{4,}` followed by `(` or `.` |
| JS/TS | Regex: `function`, `class`, `const`/`var`/`let` declarations | Same regex pattern as Python |
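
For the Python row, the ledger update might look roughly like the sketch below. The function name `update_ledger` is hypothetical, and the lookahead assumes that "followed by `(` or `.`" means immediately followed, with no whitespace in between.

```python
import ast
import re

# Identifiers of 4+ characters that are immediately followed by "(" or ".".
REFERENCE_RE = re.compile(r"([a-zA-Z0-9_$]{4,})(?=[(.])")


def update_ledger(ledger: dict, rel_path: str, code: str) -> dict:
    """Record definitions (via AST) and references (via regex) for one Python file."""
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            ledger["definitions"][node.name] = rel_path
    ledger["references"][rel_path] = sorted(set(REFERENCE_RE.findall(code)))
    return ledger
```

The regex is intentionally loose: it also picks up module names and definitions inside the same file, so consumers of `references` need to filter against `definitions`.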
When a file containing a definition is edited, detect_symbolic_ripples():
- Computes the unified diff between old and new content
- Extracts all identifiers (4+ chars) from added/removed lines
- Checks if any extracted identifier is a definition owned by the source file
- Finds all other files that reference those identifiers
- Adds impacted files to the `cascade_queue` for automatic review in subsequent iterations
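
The steps above can be sketched against the `SYMBOLS.json` schema shown earlier. This is an illustrative approximation of `detect_symbolic_ripples()`, not its source: the function name `detect_ripples` and the explicit `ledger` parameter are assumptions, and queueing is left to the caller.

```python
import difflib
import re

IDENT_RE = re.compile(r"[a-zA-Z0-9_$]{4,}")


def detect_ripples(old: str, new: str, source_file: str, ledger: dict) -> list[str]:
    """Return other files that reference symbols touched by the old -> new change."""
    diff = difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="")
    changed = set()
    for line in diff:
        if line.startswith(("+++", "---")):
            continue  # skip the file header lines of the diff
        if line.startswith(("+", "-")):
            changed.update(IDENT_RE.findall(line[1:]))

    # Keep only identifiers whose definition lives in the edited file.
    owned = {name for name in changed if ledger["definitions"].get(name) == source_file}

    return sorted(
        path
        for path, refs in ledger["references"].items()
        if path != source_file and owned.intersection(refs)
    )
```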
The cascade_queue is a FIFO list maintained by EntranceController:
- When a ripple is detected, impacted files are appended (deduplicated)
- Each cascade target also receives the triggering diff as `cascade_diffs[rel_path]`
- On the next iteration, cascade files take priority over LLM-selected targets
- The cascade reviewer's `session_context` includes the dependency change diff
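
A minimal model of this queue is shown below. The attribute names `cascade_queue` and `cascade_diffs` come from the description above; the class `CascadeQueue`, its methods, and the choice to let a newer diff overwrite an older one for the same file are assumptions.

```python
class CascadeQueue:
    """FIFO queue of files awaiting a cascade review, with the diff that triggered each."""

    def __init__(self):
        self.cascade_queue: list[str] = []
        self.cascade_diffs: dict[str, str] = {}

    def enqueue(self, impacted_files, diff_text):
        for rel_path in impacted_files:
            if rel_path not in self.cascade_queue:  # deduplicate
                self.cascade_queue.append(rel_path)
            self.cascade_diffs[rel_path] = diff_text

    def next_target(self, pick_with_llm):
        """Cascade targets win; otherwise defer to the normal target picker."""
        if self.cascade_queue:
            rel_path = self.cascade_queue.pop(0)
            return rel_path, self.cascade_diffs.pop(rel_path, None)
        return pick_with_llm(), None
```

The returned diff is what a cascade reviewer would add to its `session_context` so the model can see why the file was queued.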
<THOUGHT>
Explanation of what this edit does and why...
</THOUGHT>
<EDIT>
<SEARCH>
exact lines to find in source
</SEARCH>
<REPLACE>
new replacement lines
</REPLACE>
</EDIT>The apply_xml_edits() method in core_utils.py attempts to match each <SEARCH> block using progressively fuzzier strategies:
if raw_search in new_code:
new_code = new_code.replace(raw_search, raw_replace, 1)Direct substring replacement. Fastest and most reliable.
```python
clean_search = raw_search.strip("\n")
if clean_search in new_code:
    new_code = new_code.replace(clean_search, clean_replace, 1)
```

Stripped match: tolerates leading/trailing newlines added by the LLM.
```python
def normalize(t):
    t = re.sub(r"#.*", "", t)  # Strip comments
    return re.sub(r"\s+", " ", t).strip()  # Collapse whitespace

# Slides a window over source lines looking for a normalized match
```

Normalized match: ignores comments and whitespace differences by normalizing both the search block and the source into single-space strings.
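
The sliding window itself is only described in a comment, so here is one way it might work. `find_normalized_window` is a hypothetical name, and the variable-size window is a design choice of this sketch rather than documented behaviour. It lets the source contain extra comment-only or blank lines.

```python
import re


def normalize(text):
    text = re.sub(r"#.*", "", text)            # strip comments
    return re.sub(r"\s+", " ", text).strip()   # collapse whitespace


def find_normalized_window(code, search):
    """Return (start, end) line indices of the first window of `code` whose
    normalized text equals the normalized `search` block, or None."""
    code_lines = code.splitlines()
    target = normalize(search)
    if not target:
        return None
    for start in range(len(code_lines)):
        for end in range(start + 1, len(code_lines) + 1):
            candidate = normalize("\n".join(code_lines[start:end]))
            if candidate == target:
                return start, end
            if len(candidate) > len(target):
                break  # window already too long; try the next start line
    return None


code = "x = 1\ndef f(a,  b):  # add\n    return a + b\ny = 2\n"
search = "def f(a, b):\n    return a + b"
window = find_normalized_window(code, search)
```

Here `window` is `(1, 3)`: lines 1 and 2 (zero-based) match once the comment and the extra spaces are ignored.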
```python
# Builds a regex from each search line:
#   ^[ \t]*{escaped_line}[ \t]*\n+
# Allows flexible indentation
```

Regex fuzzy match: constructs a multiline regex that tolerates indentation differences between the AI's output and the actual source.
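
A small sketch of how such a pattern could be assembled is given below. It is a variation on the per-line template in the comment above, not the exact expression PyOB builds. `build_fuzzy_pattern` is a hypothetical name, and the sketch joins lines with `\n+` and anchors the end with `$`.

```python
import re


def build_fuzzy_pattern(search):
    """Compile a multiline regex matching the search lines regardless of
    surrounding spaces/tabs and of blank lines between them."""
    lines = [ln.strip() for ln in search.splitlines() if ln.strip()]
    parts = [r"[ \t]*" + re.escape(ln) + r"[ \t]*" for ln in lines]
    return re.compile(r"^" + r"\n+".join(parts) + r"$", re.MULTILINE)


source = "class A:\n\tdef run(self):\n\t\treturn 1\n"
search = "    def run(self):\n        return 1"
match = build_fuzzy_pattern(search).search(source)
```

The search block is indented with spaces and the source with tabs, yet `match.group(0)` is `"\tdef run(self):\n\t\treturn 1"`.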
```python
# Strips each line and checks if search_line is contained in code_line
for i in range(len(code_lines) - len(search_lines) + 1):
    match = all(sline in code_lines[i+j].strip() for j, sline in enumerate(search_lines_stripped))
```

Robust line matching: the most forgiving strategy. It checks whether each stripped search line appears as a substring of the corresponding source line.
Before replacement, the engine:
- Detects the base indentation of the `<SEARCH>` block (first non-empty line)
- Detects the base indentation of the `<REPLACE>` block (first non-empty line)
- Strips the replace block's base indent
- Prepends the search block's base indent to every non-empty line
This prevents indentation corruption when the AI outputs code at a different indentation level than the source.
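
The four alignment steps can be sketched as below. `base_indent` and `align_indent` are hypothetical names. Handling of lines shallower than the base indent is a best-effort assumption.

```python
def base_indent(block):
    """Leading whitespace of the first non-empty line, or '' if there is none."""
    for line in block.splitlines():
        if line.strip():
            return line[: len(line) - len(line.lstrip())]
    return ""


def align_indent(search, replace):
    """Re-indent `replace` so its base indent matches that of `search`."""
    src, dst = base_indent(replace), base_indent(search)
    out = []
    for line in replace.splitlines():
        if not line.strip():
            out.append("")                      # leave blank lines empty
        elif line.startswith(src):
            out.append(dst + line[len(src):])   # swap the base indent
        else:
            out.append(dst + line.lstrip())     # shallower than base: best effort
    return "\n".join(out)


search_block = "        total = a + b\n        return total"
replace_block = "total = a + b\nif total < 0:\n    total = 0\nreturn total"
aligned = align_indent(search_block, replace_block)
```

Every line of `aligned` gains the eight-space base indent of the search block, and the nested `total = 0` line keeps its extra four spaces relative to the `if`.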
If any of the <EDIT> blocks fails all 5 strategies:
- `all_edits_succeeded` is set to `False`
- `get_valid_edit()` detects this and increments the attempt counter
- The AI is asked to regenerate the entire response
- No partial edits are written to disk
All prompt templates are defined as Python strings in `prompts_and_memory.py` (in `_ensure_prompt_files()`) and written to the target directory as `.md` files on every initialization. This ensures templates are always fresh and match the current PyOB version.
Templates use {variable_name} placeholders. The load_prompt() method performs simple string replacement:
```python
for key, value in kwargs.items():
    template = template.replace(f"{{{key}}}", str(value))
```

Variables: `memory_section`, `ruff_section`, `mypy_section`, `custom_issues_section`, `lang_tag`, `content`
Purpose: Analyzes code for bugs, syntax errors, and architectural gaps. Strict rules: 2-5 line <SEARCH> blocks, no hallucinated bugs, no new features.
Variables: memory_section, lang_tag, content, rel_path
Purpose: Suggests one interactive feature. Must output <THOUGHT> + <SNIPPET> blocks. Checks for orphaned logic that needs UI connections.
Variables: memory_section, feature_content, lang_name, lang_tag, source_code, rel_path
Purpose: Surgically implements an approved feature. Respects function signatures from ANALYSIS.md. Uses multiple <EDIT> blocks (imports, __init__, logic).
Variables: rel_path, err_text, code
Purpose: Fixes syntax errors from linter validation. Minimal context β just the error and the code.
Variables: memory_section, logs, rel_path, code
Purpose: Diagnoses and fixes runtime crashes from traceback logs.
Variables: context_of_change, err_text, rel_path, code
Purpose: Context-aware error recovery. Receives the original goal that caused the breakage, enabling intelligent repair.
Variables: memory_section, trigger_file, rel_broken_path, mypy_errors, broken_code
Purpose: Fixes downstream type errors caused by changes in a dependency file.
Variables: current_memory, session_summary
Purpose: Synthesizes session actions into MEMORY.md. Merges rather than appends.
Variables: current_memory
Purpose: Aggressively summarizes bloated MEMORY.md. Consolidates repeated entries.
PyOB allows for "Supervised Autonomy" through interactive terminal checkpoints.
Before any LLM call in get_valid_edit():
| Command | Action |
|---|---|
| `ENTER` (empty) | Send prompt as-is |
| `EDIT_PROMPT` | Opens full prompt in `$EDITOR` for manual refinement |
| `AUGMENT_PROMPT` | Opens a blank file to add quick instructions (appended to prompt) |
| `SKIP` | Cancel the operation entirely |
After the AI generates proposed changes:
| Command | Action |
|---|---|
| `ENTER` (empty) | Apply the proposed change to disk |
| `FULL_DIFF` | View the complete unified diff in a pager (`$PAGER` or `less -R`) |
| `EDIT_CODE` | Open the proposed code in `$EDITOR`; save to apply your refinements |
| `EDIT_XML` | Open the raw AI XML response in `$EDITOR`; re-parse after editing |
| `REGENERATE` | Reject the proposal; increment attempts and ask AI again |
| `SKIP` | Cancel and keep original code |
All checkpoints have a configurable timeout (default: 220 seconds). If the timeout expires without user input, the system defaults to "PROCEED", auto-applying the change to maintain autonomous operation during unattended sessions.
The `get_user_approval()` method provides a real-time countdown display:

```
⏳ 185s remaining | You: FULL_DIFF
```

- Unix: Uses `tty.setcbreak()` + `select.select()` for non-blocking character-by-character input with 100ms polling
- Windows: Uses `msvcrt.kbhit()` + `msvcrt.getwch()` for the same behavior
Gemini (cloud):

- Endpoint: `https://generativelanguage.googleapis.com/v1beta/models/{model}:streamGenerateContent?alt=sse`
- Protocol: Server-Sent Events (SSE)
- Temperature: `0.1` (near-deterministic)
- Timeout: 220 seconds
- Real-time output: Chunks are printed to stdout as they arrive

Ollama (local fallback):

- Library: `ollama` Python package
- Model: `qwen3-coder:30b`
- Context window: 32,000 tokens
- Temperature: `0.1`
- Fallback condition: All Gemini API keys exhausted or rate-limited
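
To illustrate the SSE protocol used by the Gemini endpoint, the sketch below extracts text fragments from already-received `data:` lines. `iter_sse_text` is a hypothetical helper, not part of PyOB, and it performs no network I/O. The response shape it reads (`candidates`, then `content`, then `parts`, then `text`) is assumed from Google's public REST format.

```python
import json


def iter_sse_text(lines):
    """Yield text fragments from an iterable of Server-Sent Events lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and ":" comment lines
        payload = line[len("data:"):].strip()
        if not payload:
            continue
        event = json.loads(payload)
        # Assumed response shape: candidates -> content -> parts -> text
        for candidate in event.get("candidates", []):
            for part in candidate.get("content", {}).get("parts", []):
                if "text" in part:
                    yield part["text"]


sample_stream = [
    'data: {"candidates": [{"content": {"parts": [{"text": "Hello"}]}}]}',
    "",
    'data: {"candidates": [{"content": {"parts": [{"text": ", world"}]}}]}',
    ": keep-alive comment",
]
streamed = "".join(iter_sse_text(sample_stream))
```

Joining the fragments gives `"Hello, world"`. A real client would print each fragment as it arrives instead of joining them.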
```
Available Keys Pool:
┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐
│ K1  │ │ K2  │ │ K3  │ │ K4  │ │ K5  │
│ ✅  │ │ ⏳  │ │ ✅  │ │ ✅  │ │ ⏳  │
└─────┘ └─────┘ └─────┘ └─────┘ └─────┘
   ↓               ↓       ↓
Available      Available Available

Selection: key = available_keys[attempts % len(available_keys)]
```
- On each attempt, select from the pool of non-cooled-down keys using modular rotation
- On `HTTP 429`: quarantine the key for 20 minutes (`key_cooldowns[key] = time.time() + 1200`)
- If all keys are quarantined: seamlessly switch to local Ollama
- Keys are automatically reinstated when their cooldown expires
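
The rotation rules above could be modelled as follows. `KeyPool` and its injectable `clock` are hypothetical; only the modular selection and the 1200-second quarantine come from the text.

```python
import time


class KeyPool:
    """Hypothetical rotation helper with per-key cooldowns."""

    COOLDOWN_SECONDS = 1200  # 20 minutes, as documented

    def __init__(self, keys, clock=time.time):
        self.keys = list(keys)
        self.key_cooldowns = {}   # key -> timestamp when it becomes usable again
        self.clock = clock        # injectable so the behaviour can be tested

    def available_keys(self):
        now = self.clock()
        return [k for k in self.keys if self.key_cooldowns.get(k, 0) <= now]

    def pick(self, attempts):
        """Return the key for this attempt, or None to signal local fallback."""
        available = self.available_keys()
        if not available:
            return None
        return available[attempts % len(available)]

    def quarantine(self, key):
        """Call after an HTTP 429 response for `key`."""
        self.key_cooldowns[key] = self.clock() + self.COOLDOWN_SECONDS
```

A caller would treat `pick()` returning `None` as the cue to switch to the local Ollama model. Keys reappear in `available_keys()` on their own once the cooldown timestamp has passed.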
During LLM inference, a background thread displays:

```
⠹ Reading [game.py] ~1250 ctx... [████████████░░░░░░░░░░░░░] 48.0% (5.2s)
```

- Estimates progress based on `input_tokens / 12.0` seconds expected
- Transitions to `"100% - AI Inference..."` when estimate is exceeded
- Clears line and shows `"🤖 AI Output (Gemini ...abc1):"` when first chunk arrives
Generated during the genesis scan and updated after every file modification.
Structure:

```markdown
# 🧠 Project Analysis
**Project Summary:**
[AI-generated 2-sentence project description]
---
## 📁 File Directory
### `models/user.py`
**Summary:** [AI-generated one-sentence description]
<details><summary>Imports (5)</summary>...</details>
<details><summary>Classes/Structures (2)</summary>...</details>
<details><summary>Logic/Functions (8)</summary>...</details>
<details><summary>Entities/Constants (3)</summary>...</details>
---
```

Purpose: Allows the AI to "see" the entire project architecture without reading every file into the context window. Used by `pick_target_file()` to make intelligent targeting decisions.
Updated at the end of every pipeline iteration (Phase 5) and aggressively refactored every 2nd iteration (Phase 6).
Key behaviors:
- The `UM.md` template instructs the AI to merge recent actions into the existing memory rather than appending; a memory cap is checked via `mem_str` (`if len(mem_str) > 1500:`)
- The `RM.md` template consolidates repeated entries and removes redundant logs
- Memory content is injected into prompts via `_get_rich_context()` as `### Logic Memory:`
- Maximum memory size is kept manageable through periodic refactoring
Append-only log of every unified diff applied to the project.
Structure:
````markdown
## 2026-03-04 12:30:45 - `game.py`
```diff
--- Original
+++ Proposed
@@ -10,3 +10,5 @@
...
```
````

**Truncation:** Diffs longer than 20 lines are truncated to the first 5 and last 5 lines, with a `[TRUNCATED FOR MEMORY]` marker in between.
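
The truncation rule can be sketched in a few lines. `truncate_diff` and its parameter names are hypothetical; the thresholds (20 lines, first 5, last 5) and the marker text come from the description above.

```python
def truncate_diff(diff_text, max_lines=20, keep=5):
    """Keep short diffs whole; otherwise keep the first and last `keep` lines."""
    lines = diff_text.splitlines()
    if len(lines) <= max_lines:
        return diff_text
    return "\n".join(lines[:keep] + ["[TRUNCATED FOR MEMORY]"] + lines[-keep:])


long_diff = "\n".join(f"+line {i}" for i in range(30))
short_diff = "\n".join(f"+line {i}" for i in range(20))
truncated = truncate_diff(long_diff).splitlines()
```

For the 30-line diff, `truncated` has 11 lines with the marker at index 5. The 20-line diff is returned unchanged.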
### `SYMBOLS.json`: Dependency Graph
See [Section 5: Symbolic Dependency Management](#5-symbolic-dependency-management).
---
## 11. Safety & Rollback Mechanisms
### Workspace Backup/Restore
Before every modification attempt, `backup_workspace()` creates an in-memory snapshot:
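```python
# Simplified excerpt; the path join and file read are filled in for completeness
state = {}
for root, dirs, files in os.walk(self.target_dir):
    # Skip IGNORE_DIRS by pruning the walk in place
    dirs[:] = [d for d in dirs if d not in IGNORE_DIRS]
    for file in files:
        if any(file.endswith(ext) for ext in SUPPORTED_EXTENSIONS):
            path = os.path.join(root, file)
            with open(path, "r", encoding="utf-8") as f:
                state[path] = f.read()
```

If verification fails, `restore_workspace(state)` writes all files back to their backed-up content.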
The ensure_imports_retained() method prevents the AI from accidentally dropping imports:
- Parses both original and new code with `ast.parse()`
- Extracts all `Import` and `ImportFrom` nodes from the original
- Checks if each original import exists in the new code
- Prepends any missing imports to the new code
This runs automatically during implement_feature() for Python files.
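
A minimal sketch of the import-retention idea follows. It is not the actual `ensure_imports_retained()` implementation: `retain_imports` is a hypothetical name, only top-level imports are considered, and `ast.unparse()` (Python 3.9+) is used to compare statements in canonical form.

```python
import ast


def _import_statements(source):
    """Return the canonical text of each top-level import, in source order."""
    found = {}
    for node in ast.parse(source).body:
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            found[ast.unparse(node)] = None  # dict keeps order and deduplicates
    return list(found)


def retain_imports(original, new):
    """Prepend imports that exist in `original` but are missing from `new`."""
    present = set(_import_statements(new))
    missing = [s for s in _import_statements(original) if s not in present]
    if not missing:
        return new
    return "\n".join(missing) + "\n" + new


original_code = "import os\nimport re\nfrom typing import List\n\nprint(os.getcwd())\n"
edited_code = "import os\n\nprint(os.getcwd())\n"
repaired = retain_imports(original_code, edited_code)
```

In this example `repaired` begins with `import re` and `from typing import List`, followed by the edited code. A production version would also need to keep any `from __future__` import first, which this sketch does not handle.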
When a PR or feature implementation fails and the workspace is rolled back:
- `PEER_REVIEW.md` → renamed to `FAILED_PEER_REVIEW.md`
- `FEATURE.md` → renamed to `FAILED_FEATURE.md`
This preserves the failed proposal for debugging while clearing the active queue.
Files are only added to the cascade queue if they're not already present:
```python
if r not in self.cascade_queue:
    self.cascade_queue.append(r)
```

| Variable | Default | Description |
|---|---|---|
| `EDITOR` | `nano` | Terminal editor for prompt/code editing |
| `PAGER` | `less -R` | Pager for viewing full diffs |
| Constant | Default | Description |
|---|---|---|
| `GEMINI_API_KEYS` | 5 keys | API key pool for rotation |
| `GEMINI_MODEL` | `"gemini-2.5-flash"` | Gemini model identifier |
| `LOCAL_MODEL` | `"qwen3-coder:30b"` | Ollama model identifier |
| `IGNORE_DIRS` | 12 directories | Directories excluded from scanning |
| `IGNORE_FILES` | 14 files | Files excluded from scanning |
| `SUPPORTED_EXTENSIONS` | 7 extensions | File types PyOB can review |
| Constant | Value | Location |
|---|---|---|
| User approval timeout | 220 seconds | get_user_approval() |
| Key quarantine duration | 1200 seconds (20 min) | get_valid_edit() |
| API request timeout | 220 seconds | stream_gemini() |
| Master loop cooldown | 120 seconds | run_master_loop() |
| Runtime test duration | 10 seconds | run_and_verify_app() |
| Runtime process kill grace | 2 seconds | run_and_verify_app() |
| Memory refactor interval | Every 2 iterations | run_pipeline() |
| Constant | Value |
|---|---|
| `PR_FILE_NAME` | `"PEER_REVIEW.md"` |
| `FEATURE_FILE_NAME` | `"FEATURE.md"` |
| `FAILED_PR_FILE_NAME` | `"FAILED_PEER_REVIEW.md"` |
| `FAILED_FEATURE_FILE_NAME` | `"FAILED_FEATURE.md"` |
| `MEMORY_FILE_NAME` | `"MEMORY.md"` |
| `ANALYSIS_FILE` | `"ANALYSIS.md"` |
| `HISTORY_FILE` | `"HISTORY.md"` |
| `SYMBOLS_FILE` | `"SYMBOLS.json"` |
| Extension | Language Name | Tag |
|---|---|---|
| `.py` | Python | `python` |
| `.js` | JavaScript | `javascript` |
| `.ts` | TypeScript | `typescript` |
| `.html` | HTML | `html` |
| `.css` | CSS | `css` |
| `.json` | JSON | `json` |
| `.sh` | Bash | `bash` |
| `.md` | Markdown | `markdown` |
The scan_for_lazy_code() method flags:
- `ast.Name` nodes where `node.id == "Any"`: bare `Any` type hint usage
- `ast.Attribute` nodes where `node.attr == "Any"`: `typing.Any` usage
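
The two checks above map directly onto an `ast.walk()` loop. The sketch below is an illustration under assumptions: `find_any_usage` is a hypothetical name, and returning line numbers is a choice made here, not documented behaviour of `scan_for_lazy_code()`.

```python
import ast


def find_any_usage(source):
    """Return sorted line numbers where `Any` or `<module>.Any` is used."""
    lines = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and node.id == "Any":
            lines.add(node.lineno)          # bare `Any`
        elif isinstance(node, ast.Attribute) and node.attr == "Any":
            lines.add(node.lineno)          # e.g. `typing.Any`
    return sorted(lines)


example = (
    "import typing\n"
    "from typing import Any\n"
    "\n"
    "def f(x: Any) -> int:\n"
    "    return 1\n"
    "\n"
    "def g(y: typing.Any):\n"
    "    return y\n"
)
flagged = find_any_usage(example)
```

Here `flagged` is `[4, 7]`. The `from typing import Any` line is not flagged because an import alias is neither an `ast.Name` nor an `ast.Attribute` node.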
```
1. EntranceController.__init__()
   └── Creates AutoReviewer, loads empty ledger
2. run_master_loop()
   └── Checks for ANALYSIS.md → Not found
3. build_initial_analysis()
   ├── Scans all supported files
   ├── For each file:
   │   ├── Parses structure (AST/regex) → generates dropdowns
   │   ├── Updates SYMBOLS.json with definitions and references
   │   └── Asks LLM for one-sentence summary
   ├── Writes ANALYSIS.md
   └── Saves SYMBOLS.json
4. execute_targeted_iteration(1)
   ├── Backup workspace
   ├── pick_target_file() → LLM selects from ANALYSIS.md
   ├── Create TargetedReviewer for selected file
   ├── run_pipeline(1)
   │   ├── Phase 1: analyze_file() → scan, lint, review
   │   ├── Phase 2: propose_feature() → if no bugs found
   │   ├── User checkpoint → APPLY / SKIP
   │   ├── implement_pr() or implement_feature()
   │   ├── Phase 3: check_downstream_breakages()
   │   ├── Phase 4: run_and_verify_app()
   │   └── Phase 5: update_memory()
   ├── Update ANALYSIS.md for modified file
   ├── Update SYMBOLS.json
   ├── Detect symbolic ripples → queue cascades
   └── Final verification with healing
5. 120-second cooldown → loop back to step 4
```
If ANALYSIS.md already exists, step 3 is skipped. The system resumes the targeted iteration loop immediately.
```
Iteration N: Modified function `calculate()` in `math.py`
  ├── Ripple detected: `main.py` references `calculate`
  └── Added to cascade_queue
Iteration N+1: cascade_queue is not empty
  ├── Pops `main.py` from queue
  ├── Session context includes: "CRITICAL SYMBOLIC RIPPLE: ..."
  └── TargetedReviewer scans `main.py` with cascade context
```
| Problem | Cause | Solution |
|---|---|---|
| `Warning: 'ollama' package not found` | Ollama Python package not installed | `pip install ollama` |
| All keys rate-limited, no Ollama | Both backends unavailable | Install Ollama and pull `qwen3-coder:30b` |
| `ruff` / `mypy` not found | Linting tools not installed | `pip install ruff mypy` (PyOB will skip these checks gracefully) |
| Node.js not installed | JS validation unavailable | Install Node.js (PyOB will skip JS checks) |
| Edits keep failing to match | AI generating incorrect `<SEARCH>` blocks | System auto-retries; if persistent, use `EDIT_XML` to fix manually |
| App crashes during runtime test | Feature implementation introduced a bug | System auto-heals up to 3 times; then rolls back |
| Memory growing too large | Many iterations without refactoring | Memory auto-refactors every 2 iterations; can manually delete `MEMORY.md` |
| `FAILED_PEER_REVIEW.md` appears | PR implementation failed and was rolled back | Review the failed file; issues will be re-detected on next scan |
PyOB uses Python's built-in logging module at the INFO level:
```
2026-03-04 12:30:45,123 | [1/5] Scanning game.py (Python) - Reading 245 lines into AI context...
```
All output includes timestamps for debugging timing-related issues.
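
A log line in that shape can be produced with the standard `logging` module, as sketched below. The format string `"%(asctime)s | %(message)s"` and the logger name are assumptions inferred from the sample output, not PyOB's actual configuration.

```python
import io
import logging


def make_logger(stream):
    """Build an INFO-level logger that writes 'timestamp | message' lines."""
    logger = logging.getLogger("pyob_format_demo")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    # The default asctime format renders as "YYYY-MM-DD HH:MM:SS,mmm".
    handler.setFormatter(logging.Formatter("%(asctime)s | %(message)s"))
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = make_logger(buffer)
log.info("[1/5] Scanning game.py (Python) - Reading 245 lines into AI context...")
log.debug("this DEBUG message is filtered out at the INFO level")
```

Only the `info()` call reaches the stream, because the logger is set to `INFO`.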
PyOB: Surgical precision, never destructive.
