Note
This is an experiment that didnt quite work out. I think this is best as a layer above opencode, and not as a plugin. But feel free to try or fork.
An OpenCode plugin that turns an AI coding session into a persistent, self-correcting loop. Describe a goal, walk away, and come back to working code.
New here? Start with GETTINGSTARTEDGUIDE.md.
Two techniques combine to make this work:
- Ralph - the main session acting as strategist+supervisor. It reviews what failed, adjusts the plan and instructions, then delegates coding to a worker. It never writes code itself.
- RLM (Recursive Language Model worker) - a file-first coding session based on arXiv:2512.24601. Each attempt gets a clean context window and loads all state from files rather than inheriting noise from prior turns.
Long-running AI coding sessions degrade. By attempt 4 or 5, the context window contains echoes of three failed strategies, retracted plans, contradictory tool outputs, and the model's own hedging. The agent starts reasoning from a corrupted premise. You end up re-explaining what went wrong, manually pruning bad state, or starting over.
The standard response — "just use a bigger context window" — makes things worse. More capacity means more noise survives longer. The problem isn't window size, it's window hygiene.
ralph-rlm solves this by treating each attempt as disposable. State lives in files, not in the context window. Each new session loads exactly what it needs from those files and nothing else. The loop gets smarter with each failure by updating its instructions — not by accumulating turns.
The insight from the RLM paper is that context windows are not free. Every token of prior conversation competes with new information for the model's attention. Failed attempts, debug noise, and superseded plans don't disappear when you move on — they stay in the window and subtly bias future reasoning.
The solution is to make context windows ephemeral by design. Each worker session:
- Starts clean, with no memory of prior attempts
- Loads exactly the state it needs from protocol files
- Does one pass, then stops
The protocol files carry forward what matters. Everything else is discarded.
Context windows are session-local and finite. Files are persistent, inspectable, diff-able, and shared across sessions. By routing all persistent state through the filesystem, the loop gains properties that in-context memory cannot provide:
- Durability: state survives crashes, restarts, and context compression
- Inspectability: you can read
AGENT_CONTEXT_FOR_NEXT_RALPH.mdat any time to see exactly what the next attempt will see - Shareability: multiple sessions (Ralph, worker, sub-agents) read and write the same files concurrently
- Debuggability: the entire history of an overnight run is in plaintext files you can grep
Ralph keeps strategy and execution separate: you (main session) handle strategy and delegation, while workers implement. When a session that just wrote failing code is also responsible for diagnosing why it failed and planning the next approach, it pattern-matches against its own failed reasoning. It proposes variations on what didn't work rather than stepping back.
In this mode, the worker always gets a fresh window. It reads the failure record cold, without the accumulated baggage of having written the code. This mirrors how experienced engineering teams work: the reviewer of a failing PR is often not the one who writes the fix.
The loop is only as good as its exit condition. verify.command is the single source of truth for "done." A machine-verifiable criterion — tests pass, types check, linter clean — turns the exit question from a judgment call into a boolean. The model cannot talk its way out of a failing test suite.
This contract has a corollary: the better your verify command, the better the loop performs. A verify that checks only syntax will produce syntactically valid but logically broken code. A verify that runs the full test suite, typechecks, and lints will produce code that passes all three.
The RLM paper demonstrates that full-file reads are expensive and often counterproductive. When a model dumps a 2000-line file into its context to answer a question that requires 30 lines, the relevant section is buried in noise and the window fills with irrelevant code.
rlm_grep + rlm_slice give surgical access: search first to find line numbers, then read only the relevant range. CONTEXT_FOR_RLM.md is the designated large-reference file — a place to paste API docs, specs, or large codebases that should never be read in full.
NOTES_AND_LEARNINGS.md and RLM_INSTRUCTIONS.md are the loop's long-term memory. They survive context resets and accumulate across attempts. The loop doesn't just retry — it gets smarter with each failure.
RLM_INSTRUCTIONS.md is the inner loop's operating manual. The main strategist (you) updates it between attempts when a pattern of failures reveals a gap in guidance. By attempt 10, the instructions encode everything learned from attempts 1-9.
This is why the approach scales to overnight runs. A fresh worker in attempt 10 starts with the accumulated knowledge of 9 prior attempts, encoded in protocol files, without the accumulated noise.
You → main session (supervisor + strategist in one)
│
├─ attempt 1:
│ ├─ strategist (you): ralph_load_context() → review failures → update PLAN.md
│ ├─ ralph_spawn_worker() → spawns RLM worker session W1
│ │
│ └─ RLM worker session W1 ← fresh context
│ W1: ralph_load_context() → code → ralph_verify() → STOP
│
├─ plugin verifies on W1 idle
│ fail → roll state files → spawn attempt 2
│
├─ attempt 2:
│ ├─ strategist (you): reads AGENT_CONTEXT_FOR_NEXT_RALPH.md → adjusts strategy
│ ├─ ralph_spawn_worker() → spawns RLM worker session W2
│ │
│ └─ RLM worker session W2 ← fresh context
│ W2: loads compact state from files → code → STOP
│
└─ pass → done toast
Each role has a distinct purpose and fresh context window where applicable:
| Role | Session | Context | Responsibility |
|---|---|---|---|
| main | Your conversation | Persistent | Supervisor + strategist. Review failures, update PLAN.md / RLM_INSTRUCTIONS.md, call ralph_spawn_worker(). |
| worker | Per-attempt coder | Fresh | ralph_load_context() → code → ralph_verify() → stop. |
- Supervisor+strategist (main session): orchestration, planning, and delegation. Never edits files or runs code directly.
- RLM worker: does the actual coding and verification for this attempt. One pass per session.
- Sub-agent: narrow task helper; updates its own state files under
.opencode/agents/<name>/.
main idle
└─ strategist (you) calls ralph_spawn_worker()
└─ spawn Worker(1)
└─ Worker(1) calls ralph_verify() and goes idle
└─ plugin runs verify
├─ pass → done
└─ fail → roll state files
└─ strategist (you) handles next attempt
The plugin drives the loop from session.idle events. Neither Ralph nor the worker need to know about the outer loop — they just load context, do their job, and stop.
Each worker session is required to:
- Call
ralph_load_context()first — blocked fromwrite/edit/bashuntil it does. - Read
PLAN.mdandRLM_INSTRUCTIONS.mdas authoritative instructions. - Use
rlm_grep+rlm_sliceto access large reference documents — never dump them whole. - Write scratch work to
CURRENT_STATE.mdthroughout the attempt. - Promote durable changes (completed milestones, new constraints) to
PLAN.md. - Append insights to
NOTES_AND_LEARNINGS.md. - Call
ralph_verify()when ready, then stop.
The one-pass contract is enforced socially (system prompt) and mechanically (context gate on destructive tools). Workers do not re-prompt themselves. Ralph controls iteration.
For tasks that can be decomposed, a worker can subagent_spawn a child session with an isolated goal. Each sub-agent gets its own state directory under .opencode/agents/<name>/ and the same protocol file structure. The worker polls with subagent_await and integrates the result.
Sub-agents follow the same discipline as workers: one pass, file-first, fresh context.
Spawned sessions (Ralph and workers) can communicate back to the main conversation at runtime:
ralph_report()— fire-and-forget progress updates, appended toSUPERVISOR_LOG.mdandCONVERSATION.md, and posted to the main conversationralph_ask()— blocks the session until you respond viaralph_respond(), enabling interactive decision points mid-loop (e.g., "should I rewrite auth.ts or patch it?")
This is implemented via file-based IPC (.opencode/pending_input.json) so responses survive across any session boundary.
your-repo/
└── .opencode/
├── package.json ← add "effect" dependency
├── ralph.json ← verify command + tuning
└── plugins/
└── ralph-rlm.ts ← the plugin
Copy ralph-rlm.ts into .opencode/plugins/ and create .opencode/package.json:
{
"dependencies": {
"effect": "^3.13.0"
}
}OpenCode runs bun install at startup automatically.
Copy the plugin to ~/.config/opencode/plugins/ralph-rlm.ts and add the package.json to ~/.config/opencode/package.json.
Create .opencode/ralph.json. All fields are optional — the plugin runs with safe defaults if the file is absent.
{
"enabled": true,
"autoStartOnMainIdle": false,
"statusVerbosity": "normal",
"maxAttempts": 25,
"heartbeatMinutes": 15,
"verify": {
"command": ["bun", "run", "verify"],
"cwd": "."
},
"gateDestructiveToolsUntilContextLoaded": true,
"maxRlmSliceLines": 200,
"requireGrepBeforeLargeSlice": true,
"grepRequiredThresholdLines": 120,
"subAgentEnabled": true,
"maxSubAgents": 5,
"maxConversationLines": 1200,
"conversationArchiveCount": 3,
"reviewerEnabled": false,
"reviewerRequireExplicitReady": true,
"reviewerMaxRunsPerAttempt": 1,
"reviewerOutputDir": ".opencode/reviews",
"reviewerPostToConversation": true,
"agentMdPath": "AGENT.md"
}| Field | Default | Description |
|---|---|---|
enabled |
true |
Set to false to disable the outer loop without removing the plugin. |
autoStartOnMainIdle |
false |
Automatically start attempt 1 when the main session goes idle. Set true to enable auto-start, or keep false for manual starts via ralph_create_supervisor_session(). |
statusVerbosity |
"normal" |
Supervisor status emission level: minimal (warnings/errors), normal, or verbose. |
maxAttempts |
20 |
Hard stop after this many failed verify attempts. |
heartbeatMinutes |
15 |
Warn if active strategist/worker has no progress for this many minutes. |
strategistHandoffMinutes |
5 |
Warn/retry if the strategist does not spawn a worker within this many minutes. |
strategistHandoffMaxRetries |
2 |
Max retries to respawn strategist after a missed handoff. |
verifyTimeoutMinutes |
0 |
Timeout for verify command in minutes. 0 disables timeouts. |
verify.command |
- | Shell command to run as an array, e.g. ["bun", "run", "verify"]. If omitted, verify always returns unknown. |
verify.cwd |
"." |
Working directory for the verify command, relative to the repo root. |
gateDestructiveToolsUntilContextLoaded |
true |
Block write, edit, bash, etc. until ralph_load_context() has been called in the current attempt. |
maxRlmSliceLines |
200 |
Maximum lines a single rlm_slice call may return. |
requireGrepBeforeLargeSlice |
true |
Require a recent rlm_grep call before slices larger than grepRequiredThresholdLines. |
grepRequiredThresholdLines |
120 |
Line threshold above which grep-first is required. |
subAgentEnabled |
true |
Allow subagent_spawn. |
maxSubAgents |
5 |
Maximum concurrently running sub-agents per session. |
maxConversationLines |
1200 |
Rotate CONVERSATION.md when it grows beyond this many lines. |
conversationArchiveCount |
3 |
Number of rotated archives to keep (CONVERSATION.1.md, etc.). |
reviewerEnabled |
false |
Enable optional reviewer sub-agent tooling. |
reviewerRequireExplicitReady |
true |
Require explicit ralph_request_review() before reviewer runs. |
reviewerMaxRunsPerAttempt |
1 |
Max reviewer runs per attempt unless forced. |
reviewerOutputDir |
".opencode/reviews" |
Directory for reviewer report files. |
reviewerPostToConversation |
true |
Post reviewer lifecycle updates to main conversation feed. |
agentMdPath |
"AGENT.md" |
Path (relative to repo root) to the project AGENT.md. Read by ralph_load_context() and included in the context payload. Set to "" to disable. |
{ "command": ["bun", "run", "verify"] }
{ "command": ["bun", "test"] }
{ "command": ["npm", "test"] }
{ "command": ["cargo", "test"] }
{ "command": ["python", "-m", "pytest"] }
{ "command": ["make", "ci"] }
{ "command": ["./scripts/verify.sh"] }If your repo has no test files yet, avoid bun test as a verify command because Bun exits non-zero when no tests match. Use a command that reflects your current quality gate (for example bun run verify with typecheck + build).
The verify command is the loop's exit condition. It should be as comprehensive as you want the output to be. A verify that runs tests + typecheck + lint will produce code that passes all three; a verify that only checks syntax will produce syntactically valid code that may be logically broken.
If you are setting up a repo from scratch, use this sequence:
- Run
ralph_quickstart_wizard(...)for one-call setup, or useralph_doctor(autofix=true)+ralph_bootstrap_plan(...)manually. - Validate plan quality with
ralph_validate_plan(). - Start the loop with
ralph_create_supervisor_session()(or enable auto-start if desired). - Monitor progress in
SUPERVISOR_LOG.md(structured) andCONVERSATION.md(readable timeline).
If setup is incomplete, auto-start is skipped and the plugin emits a warning with next actions.
The plugin already manages the core loop roles (main / ralph / worker / subagent).
Use OpenCode agents to control posture and delegation, not to replace loop orchestration.
Recommended split:
supervisor(primary): your default top-level operator for safe orchestration.- built-in
plan(primary): dry analysis/planning without edits. - built-in
build(primary): full manual implementation when needed. - project subagents (optional): focused review/docs/security helpers.
This repo now includes project-local agent files under .opencode/agents/:
.opencode/agents/supervisor.md.opencode/agents/ralph-reviewer.md.opencode/agents/docs-writer.md.opencode/agents/security-auditor.md
These profiles intentionally keep loop ownership in ralph-rlm.
Do not model the strategist/worker roles as OpenCode primary/subagent replacements.
The plugin bootstraps these files on first run if they do not exist. They are the persistent memory of the loop — commit them to version control.
| File | Purpose |
|---|---|
PLAN.md |
Goals, milestones, definition of done, changelog. Updated via ralph_update_plan(). |
RLM_INSTRUCTIONS.md |
Inner loop operating manual and playbooks. Updated via ralph_update_rlm_instructions(). |
CURRENT_STATE.md |
Scratch pad for the current Ralph attempt. Reset on each rollover. |
PREVIOUS_STATE.md |
Snapshot of the last attempt's scratch. Automatically written on rollover. |
AGENT_CONTEXT_FOR_NEXT_RALPH.md |
Shim passed to the next attempt: verdict, summary, next step. |
CONTEXT_FOR_RLM.md |
Large reference document (API docs, specs, etc.). Always accessed via rlm_grep + rlm_slice. |
NOTES_AND_LEARNINGS.md |
Append-only log of durable insights. Survives all context resets. |
TODOS.md |
Optional lightweight task list. |
SUPERVISOR_LOG.md |
Append-only feed of all ralph_report() entries across all attempts and sessions. |
CONVERSATION.md |
Append-only human-readable timeline of supervisor updates, loop events, questions, and responses. |
Sub-agent state lives under .opencode/agents/<name>/ with the same structure.
Attempt N worker writes:
CURRENT_STATE.md ← scratch: what I tried, what I found
NOTES_AND_LEARNINGS.md ← append: durable insight from this attempt
On N→N+1 rollover, plugin writes:
PREVIOUS_STATE.md ← copy of CURRENT_STATE.md
CURRENT_STATE.md ← reset to blank template
AGENT_CONTEXT_FOR_NEXT_RALPH.md ← verdict + summary + next step
Ralph(N+1) reads and optionally updates:
AGENT_CONTEXT_FOR_NEXT_RALPH.md ← why it failed
PLAN.md ← adjusts strategy
RLM_INSTRUCTIONS.md ← adjusts worker guidance
Worker(N+1) reads:
All of the above via ralph_load_context()
This is why the loop can run overnight. Each fresh session starts with the accumulated knowledge of all prior attempts, encoded in files — not in a context window that would be reset.
OpenCode loads AGENT.md from the repo root into every session's system prompt automatically. The plugin coexists with this but the two files serve different roles:
AGENT.md |
RLM_INSTRUCTIONS.md |
|
|---|---|---|
| Scope | Static project-wide rules | Dynamic per-loop operating manual |
| Who writes it | You (developer) | Agent (via ralph_update_rlm_instructions()) |
| Changes | Rarely — git-committed conventions | Every loop — playbooks, learnings, constraints |
| Injected by | OpenCode automatically (system prompt) | ralph_load_context() return payload |
ralph_load_context() automatically reads AGENT.md (configurable via agentMdPath) and includes it in the context payload under agent_md. This means:
- Sub-agents, which run in isolated sessions that may not have AGENT.md injected, still see the project rules.
- Every attempt starts with both the static project context and the dynamic loop state in one payload.
To disable AGENT.md inclusion:
{ "agentMdPath": "" }To point to a non-standard location:
{ "agentMdPath": "docs/AGENT.md" }Keep AGENT.md focused on facts that never change loop-to-loop: repo layout, build commands, code style. Defer loop-specific guidance to RLM_INSTRUCTIONS.md.
# Project Agent Rules
## Repo layout
- `src/` — application source
- `tests/` — test suite (`bun test`)
- `docs/` — documentation
## Build and verify
- Install: `bun install`
- Test: `bun test`
- Typecheck: `bun run typecheck`
## Code style
- TypeScript strict mode; no `any`
- Prefer Effect-TS over raw Promises for async/error handling
## Loop guidance
This project uses the ralph-rlm plugin.
- Call `ralph_load_context()` at the start of every attempt.
- Task-specific playbooks live in `RLM_INSTRUCTIONS.md` — check there for the current strategy before starting work.
- Do NOT put attempt-specific state in AGENT.md; write it to `CURRENT_STATE.md` or `NOTES_AND_LEARNINGS.md`.If your AGENT.md contains instructions that clash with the plugin's file-first rules (e.g. "always read files in full"), add a note that defers to RLM_INSTRUCTIONS.md:
## Note on file access
When working with the ralph-rlm loop, prefer `rlm_grep` + `rlm_slice` for large files
over full reads. The loop-specific protocol in `RLM_INSTRUCTIONS.md` takes precedence
over general file-access guidance in this document.If you want your AGENT.md content appended to the plugin's system prompt fragment (instead of included in the context payload), use RALPH_SYSTEM_PROMPT_APPEND:
export RALPH_SYSTEM_PROMPT_APPEND="@AGENT.md"This injects the file on every turn rather than only when ralph_load_context() is called.
Check whether the repository is ready for Ralph/RLM execution. It validates core setup (verify command, baseline files, placeholders) and returns structured diagnostics.
With autofix: true, it applies safe bootstrap fixes such as creating .opencode/ralph.json defaults and a baseline AGENT.md when missing.
ralph_bootstrap_plan(goal, requirements?, stopping_conditions?, features?, steps?, todos?, overwrite_plan?, overwrite_todos?)
Generate PLAN.md and TODOS.md from explicit project requirements. This is the fastest way to turn a rough prompt into a concrete execution plan.
Bind the current session as the supervisor and optionally start attempt 1 immediately.
- Use this when
autoStartOnMainIdleis disabled. - Use
force_rebind: trueto move supervision to a different session. - Use
restart_if_done: trueto start a new run after completion or manual stop.
Stop supervision for the current process. This prevents further auto-loop orchestration until restarted.
- Use this when you want to pause/stop Ralph from spawning more sessions.
- Use this after verification passes and the user confirms they are done, or when the user asks to stop the loop.
- Active child sessions are aborted and any running verify command is stopped.
- Pass
delete_sessions: trueto delete child sessions after aborting them. - Resume later with
ralph_create_supervisor_session(restart_if_done=true).
Return current supervision state: bound session, attempt number, active strategist/worker session IDs, and done status.
Pause automatic loop orchestration without ending supervision state.
Resume from pause. Optionally start the loop immediately.
Reset protocol/runtime state. Requires confirm: "RESET_RALPH_STATE".
scope: "attempt"resets scratch files.scope: "full"resets scratch + baseline protocol scaffolding.
Validate PLAN.md structure (goal, requirements, stopping conditions, milestones/checklists) before long runs.
One-call setup helper: applies basic setup, writes PLAN.md/TODOS.md, validates plan, and can optionally start the loop.
Reads all protocol files and returns them as a structured JSON payload. Must be called at the start of every attempt. Calling it marks the session as context-loaded, which unblocks destructive tools.
args:
includeRlmContextHeadings boolean optional Return headings-only from CONTEXT_FOR_RLM.md (default true)
rlmHeadingsMax number optional Max headings to return (default 80)
Search a file by regex and return matching lines with line numbers. Defaults to CONTEXT_FOR_RLM.md. Use this to locate the relevant section before slicing.
Read a specific line range from a file. Enforces the maxRlmSliceLines limit. Requires a recent rlm_grep call if the slice exceeds grepRequiredThresholdLines.
Apply a unified diff patch to PLAN.md. Automatically appends a changelog entry. Use for durable changes only: completed milestones, new constraints, clarified acceptance criteria.
Apply a unified diff patch to RLM_INSTRUCTIONS.md. Appends a changelog entry. The Fixed Header section should not be modified.
Manually trigger a rollover: copies CURRENT_STATE.md to PREVIOUS_STATE.md, resets scratch, writes the next-attempt shim. Optionally appends a learning to NOTES_AND_LEARNINGS.md. The outer loop calls this automatically on verify failure; the agent can also call it explicitly.
Run the configured verify command. Returns { verdict: "pass"|"fail"|"unknown", output, error }.
Main strategist only. Spawn a fresh RLM worker session for this attempt. Call this after reviewing protocol files and optionally updating PLAN.md / RLM_INSTRUCTIONS.md. Then stop — the plugin handles verification and prompts you for the next attempt if needed.
If you call this from an unbound session you will get: ralph_spawn_worker() must be called from the bound supervisor session. Bind first with ralph_create_supervisor_session().
Spawn a child OpenCode session to handle an isolated sub-task. Creates .opencode/agents/<name>/ state files, then sends the initial prompt to the child session.
Poll a sub-agent's CURRENT_STATE.md for completion. Returns { status: "done"|"running"|"not_found", current_state }. The sub-agent signals completion by writing ## Final Result or outputting SUB_AGENT_DONE.
Read any protocol file from a sub-agent's state directory without waiting for completion. Useful for monitoring progress mid-run.
List all sub-agents registered in the current session with their name, goal, status, and spawn time.
These tools let spawned sessions (RLM worker + sub-agents) communicate back to the main conversation at runtime. State is carried in .opencode/pending_input.json for question/response pairs, SUPERVISOR_LOG.md for structured status entries, and CONVERSATION.md for the readable event timeline.
User answers to ralph_ask() are persisted too: when you reply via ralph_respond(), the response is appended to CONVERSATION.md.
Optionally publish explicit attempt status (running, blocked, done, error) for the current session. This improves observability and handoffs.
If an inner session does not call ralph_set_status(), the loop still works: idle + verify events continue with implicit status handling.
Mark the current attempt as review-ready. This is the recommended gate before running the reviewer so it does not execute too often.
Run an optional reviewer sub-agent and write the review report to a file (default: .opencode/reviews/review-attempt-N.md).
- Honors
reviewerRequireExplicitReadyandreviewerMaxRunsPerAttemptunlessforce=true. - Use
wait=true(default) to block until review completion. - Reviewer gate/runtime state is persisted in
.opencode/reviewer_state.jsonso restarts can resume tracking.
Show reviewer gate state: active reviewer, review requests, and runs per attempt.
Fire-and-forget progress report. Appends a timestamped entry to SUPERVISOR_LOG.md and CONVERSATION.md, shows a toast, and optionally posts into the main conversation so you can see what's happening without opening a separate session.
args:
message string required Progress message
level string optional "info" | "warning" | "error" (default: "info")
post_to_conversation boolean optional Post to main conversation (default: true)
Snapshot the active worker's CURRENT_STATE.md and optionally post it into the main conversation for quick "peek" access in the TUI.
args:
maxLines number optional Max lines to include (default: 120)
post_to_conversation boolean optional Post to main conversation (default: true)
Ask a question and block until you respond via ralph_respond(). The question is written to .opencode/pending_input.json, a toast appears in the main session, and the main conversation is prompted with the question ID and response instruction. The calling session polls every 5 seconds.
Use this for decisions that can't be inferred from the protocol files — e.g., "should I rewrite auth.ts from scratch or patch the existing implementation?"
args:
question string required The question
context string optional Additional context for the decision
timeout_minutes number optional Minutes to wait before timing out (default: 15)
Returns { id, answer } as JSON once you respond.
Respond to a pending question, unblocking the session that called ralph_ask(). The id is shown in the toast and in the main conversation prompt (format: ask-NNNN). If you mistype the ID, the tool returns an error listing all pending unanswered questions with their IDs.
args:
id string required Question ID (e.g. "ask-1234567890")
answer string required Your answer
Every internal prompt the plugin sends to the model is customisable through environment variables. Values are loaded once at startup.
# Literal text — use \n for newlines
RALPH_CONTINUE_PROMPT="Attempt {{attempt}}: fix the verify.\n\nCall ralph_verify() when done."
# File reference (relative to worktree)
RALPH_SYSTEM_PROMPT="@.opencode/prompts/system.txt"
# Absolute file path
RALPH_BOOTSTRAP_RLM_INSTRUCTIONS="@/home/user/prompts/rlm-instructions.md"| Variable | Tokens | Description |
|---|---|---|
RALPH_SYSTEM_PROMPT |
— | Full system prompt injected on every turn. Replaces the default. |
RALPH_SYSTEM_PROMPT_APPEND |
— | Appended after the system prompt. Useful for adding project-specific rules without replacing the base. |
RALPH_COMPACTION_CONTEXT |
— | Context block injected when the session is compacted (context window compressed). |
RALPH_CONTINUE_PROMPT |
{{attempt}} {{verdict}} |
Re-prompt sent to the agent after a failed verification attempt. |
RALPH_DONE_FILE_CONTENT |
{{timestamp}} |
Content written to AGENT_CONTEXT_FOR_NEXT_RALPH.md when verification passes. |
RALPH_SUBAGENT_PROMPT |
{{name}} {{goal}} {{context}} {{stateDir}} {{doneSentinel}} {{doneHeading}} |
Initial prompt sent to a spawned sub-agent. |
RALPH_SUBAGENT_DONE_SENTINEL |
— | Phrase the sub-agent must output to signal completion. Default: SUB_AGENT_DONE. |
RALPH_SUBAGENT_DONE_HEADING |
— | Heading in CURRENT_STATE.md that marks sub-agent completion. Default: ## Final Result. |
RALPH_BOOTSTRAP_RLM_INSTRUCTIONS |
{{timestamp}} |
Initial content written to RLM_INSTRUCTIONS.md when it does not exist. |
RALPH_BOOTSTRAP_CURRENT_STATE |
— | Template written to CURRENT_STATE.md on bootstrap and after each rollover. |
RALPH_CONTEXT_GATE_ERROR |
— | Error message thrown when the agent tries a destructive tool before loading context. |
RALPH_WORKER_SYSTEM_PROMPT |
— | System prompt injected into every RLM worker session. Describes the one-pass contract. |
RALPH_WORKER_PROMPT |
{{attempt}} |
Initial prompt sent to each spawned RLM worker session. |
RALPH_SESSION_SYSTEM_PROMPT |
— | Legacy: system prompt for separate strategist sessions (unused in main-as-strategist mode). |
RALPH_SESSION_PROMPT |
{{attempt}} |
Prompt sent to the main strategist session when an attempt starts. |
.opencode/prompts/continue.txt:
Ralph attempt {{attempt}} — last verify: {{verdict}}.
You are working in a TypeScript monorepo. Rules:
1. Call ralph_load_context() first.
2. Check PLAN.md for the current milestone.
3. Run `bun typecheck` before `bun test`.
4. Write all intermediate findings to CURRENT_STATE.md.
5. When the verify passes, stop.
.env or shell:
export RALPH_CONTINUE_PROMPT="@.opencode/prompts/continue.txt"Fill in your verify.command, write a goal in PLAN.md, and start a session. The loop runs automatically.
1. Edit PLAN.md — set your goal and definition of done.
2. Open OpenCode and describe the task.
3. Agent calls ralph_load_context(), reads PLAN.md, starts working.
4. Agent calls ralph_verify().
5. If fail → Ralph rolls state, re-prompts. Go to 3.
6. If pass → Ralph shows toast. Done.
Set maxAttempts high (25–50), write a detailed PLAN.md with a precise definition of done, and close your laptop. The loop will:
- Make an attempt.
- Run verify.
- On failure: roll state, prompt the strategist (you) to diagnose and adjust, spawn the next worker.
- Repeat until it passes or hits
maxAttempts.
In the morning, check SUPERVISOR_LOG.md and CONVERSATION.md for the progress feed, NOTES_AND_LEARNINGS.md for what the loop learned, and AGENT_CONTEXT_FOR_NEXT_RALPH.md for where it stopped.
Use ralph_report and ralph_ask to stay informed and make decisions without micromanaging:
Worker:
ralph_report("Finished refactoring auth module. 3 tests failing — all in legacy JWT path.")
ralph_ask("The legacy JWT path is only used by the mobile app. Rewrite or remove?")
← blocks until you call ralph_respond("ask-...", "Remove it, mobile app is deprecated")
(continues with the answer)
You stay in the loop for decisions that require human judgment. Everything else runs unattended.
To avoid over-running reviews, use explicit readiness:
- Worker marks readiness with
ralph_request_review("ready for correctness review"). - Supervisor runs
ralph_run_reviewer(). - Review report is written to
.opencode/reviews/review-attempt-N.md(or configured output path).
Reviewer execution is gated by reviewerRequireExplicitReady and reviewerMaxRunsPerAttempt unless forced.
Parent agent:
1. ralph_load_context()
2. Identify two independent sub-tasks
3. subagent_spawn("auth", "implement JWT auth", context)
4. subagent_spawn("api", "implement REST endpoints", context)
5. subagent_await("auth") — poll until done
6. subagent_await("api") — poll until done
7. Integrate results, update PLAN.md
8. ralph_verify()
Edit RLM_INSTRUCTIONS.md to add project-specific playbooks, register MCP tools, or adjust the debug workflow. Changes persist across attempts. Use ralph_update_rlm_instructions() from within a session, or edit the file directly.
The instructions file is the primary lever for improving loop performance. If the loop keeps making the same mistake, add a rule. If it keeps following an inefficient path, add a playbook. The main strategist (you) updates these instructions between attempts based on what it observes in the failure record.
| Hook | What it does |
|---|---|
event: session.idle |
Routes idle events: worker → handleWorkerIdle (verify + continue loop); main/other → handleMainIdle (kick off attempt 1). Also emits heartbeat/staleness warnings and supervisor status updates to SUPERVISOR_LOG.md and CONVERSATION.md. |
event: session.created |
Pre-allocates session state for known worker sessions. |
event: session.status |
Refreshes heartbeat/progress timestamps for active sessions and surfaces explicit session error statuses to the supervisor feed. |
experimental.chat.system.transform |
Two-way routing: worker → RLM file-first prompt; main/other → supervisor+strategist prompt. |
experimental.session.compacting |
Injects protocol file pointers into compaction context so state survives context compression. |
tool.execute.before |
Blocks destructive tools (write, edit, bash, delete, move, rename) in worker and sub-agent sessions until ralph_load_context() has been called. |
The outer loop is named after the Ralph Wiggum technique — a while loop that feeds a prompt to an AI agent until it succeeds. The name reflects the philosophy: persistent, not clever. The loop doesn't try to be smart about when to give up. It tries, records what happened, and tries again with better instructions.
The key addition in this plugin over a naive Ralph implementation is the separation of the strategist from the worker. The main session handles strategy and delegation, while each worker gets a fresh context to implement. This keeps planning clean while still benefiting from fresh worker windows.
The worker discipline is based on Recursive Language Models (arXiv:2512.24601). The paper's core finding: keeping large inputs in an external environment and having the model grep/slice/recurse over them significantly outperforms shoving everything into the context window at once. Models reason better when they can retrieve exactly what they need rather than filtering signal from a noisy dump.
This plugin approximates that approach using the filesystem as the external environment and rlm_grep + rlm_slice as the retrieval primitives. CONTEXT_FOR_RLM.md is the designated large-reference file — paste API docs, database schemas, or reference code there and the worker accesses it surgically rather than reading it whole.
The loop's correctness guarantee is only as strong as verify.command. This is a feature, not a limitation. It forces clarity about what "done" means before the loop starts. Ambiguous acceptance criteria produce ambiguous results regardless of how many attempts you give the loop.
The practical recommendation: make your verify command as strict as you can tolerate. If you would normally merge a PR that passes tests + typecheck + lint, configure that as your verify command. The loop will produce code that meets that bar.