wyf111/opencode-sop-engine

opencode-sop-engine

Production-grade Skill orchestration, SOP enforcement, and long-context runtime control for OpenCode.

As a general-purpose skill orchestration core for OpenCode, this engine moves LLM execution from ad-hoc "best effort" behavior to a predictable, auditable, and operationally stable workflow pipeline.

Chinese Documentation


Disciplining the AI: The OpenCode SOP Engine


Why SOP Engine

In typical code-agent setups, many skills are exposed at once and the model is expected to plan calls by itself. In real projects, that often fails in three ways:

  • Routing drift -> FSM enforcement

    • Symptom: stage skipping, wrong call order, or unsafe tool usage at the wrong time.
    • Fix: define execution order with Flow DSL (A -> B -> C) and enforce hard runtime checks in tool.execute.before. Non-compliant calls are blocked before execution.
  • Unverified completion -> evidence-gated submission

    • Symptom: model claims "done" without real tool activity.
    • Fix: progress requires explicit submit_step_result with evidence.callIds. The engine only accepts real call IDs completed in the current stage.
  • Long-context degradation -> dynamic history pruning

    • Symptom: repeated retries bloat context and reduce response quality.
    • Fix: with pruneHistory=true, outgoing messages are pruned before LLM calls, preserving only high-value context and stage checkpoints.

Design Philosophy: Countering Overconfidence and Process Laziness

This engine is built to address two common failure modes in real-world LLM-driven engineering:

  • Overconfidence: the model tends to skip critical analysis and validation, then outputs seemingly complete but unverified results.
  • Process laziness: the model tends to bypass required stages (architecture review, security checks, test reasoning) to optimize for short-term speed.

Production engineering quality cannot depend on model self-discipline. opencode-sop-engine turns AI execution from weak prompt guidance into hard state-machine routing: stage order, tool gating, evidence submission, and transition control are enforced by the runtime.

Core Value: Balancing Freedom and Order

The engine enforces process while preserving extensibility:

  1. Flow ownership: developers can define task pipelines with Flow DSL (A -> B -> C, with parallel branches and routes).
  2. Critical gates: teams can place organizational standard skills at key stages (for example security scan and release checks).
  3. Asset reuse: validated workflow/profile + skill setups can be versioned and reused as team assets.

Quick Start

1. Install package

npm install opencode-sop-engine

2. Register project-level plugin

Create .opencode/plugins/sop-engine.ts in your project:

import type { Plugin } from "@opencode-ai/plugin";
import SopEnginePlugin from "opencode-sop-engine";

export const SopEngineProjectPlugin: Plugin = SopEnginePlugin;

3. Choose workflow

Use the bundled default.workflow.json first:

/run-skill -p default.workflow.json --goal "Fix auth module concurrency bug"

For custom workflow files outside plugin profiles/, set SOP_ENGINE_TRUST_MODE=open and pass -p with that file path.

4. Stop pipeline (optional)

Stop current run:

/stop-skill

5. Manually skip current stage (optional)

Disabled by default. Enable it in workflow settings:

{
  "settings": {
    "allowManualSkip": true
  }
}

Then, during a run:

/skip-stage --reason "business approved skip for this stage"

Note: /skip-stage only skips the current active stage. It does not target arbitrary stages.

6. Check current session status (optional)

/sop-status

This prints the current session run snapshot, including:

  • sessionID, status, profile
  • current stage with progress (x/y)
  • activeTasks, pendingTasks
  • runtime counters (autoPrompts, loopRepeats, idleNoSubmit)
  • key switches (pruneHistory, allowManualSkip)
  • startedAt, updatedAt, and lastError (if any)

Technical Architecture

Module boundaries

| Module | Responsibility | Main files |
| --- | --- | --- |
| Plugin entry | register hooks/tools and orchestrate runtime flow | src/index.ts |
| Execution engine | FSM state, evidence tracking, stage routing | src/core/engine.ts |
| Profile loader | workflow loading, bundle resolving, trust mode | src/core/profile-registry.ts |
| Schema/compiler | schema validation and Flow DSL -> strict stages compile | src/core/workflow-schema.ts, src/core/workflow-compiler.ts |
| Prompt and guards | strict prompt, message pruning, runtime guard policies | src/core/prompt.ts, src/core/message-pruning.ts |
| Persistence | runtime snapshot IO and tombstone merge | src/core/state-store.ts |

Hook pipeline

| Hook | Purpose | Technical behavior |
| --- | --- | --- |
| chat.message | command routing | parse /run-skill, /stop-skill, /skip-stage, /sop-status, load profile, start/stop run, rewrite kickoff |
| experimental.chat.system.transform | SOP constraints injection | inject current-stage rules + checkpoint capsule |
| experimental.chat.messages.transform | pre-request pruning | prune completed-stage chatter when pruneHistory=true |
| tool.execute.before | hard gate before execution | validate stage-level allowed/blocked tools; allow in-stage auxiliary skill calls |
| tool.execute.after | metadata enrichment | attach metadata.sop (profileId, stageId, status) for observability (does not affect routing decisions) |
| event (message.part.updated / session.idle) | async orchestration | collect real tool evidence, auto-dispatch, submission nudges, finalize end states |
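
To make the tool.execute.before gate more concrete, here is a minimal sketch of a stage-level allow/block check using the glob patterns from the workflow examples ("*", "bash*"). The StagePolicy shape, the function names, and the rule that blocked patterns win over allowed ones are assumptions for illustration, not the engine's actual implementation.

```typescript
// Hypothetical policy shape; the engine's real types are not shown in this README.
interface StagePolicy {
  allowedTools: string[];
  blockedTools: string[];
}

// Convert a simple glob such as "bash*" into an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

function matchesAny(tool: string, patterns: string[]): boolean {
  return patterns.some((pattern) => globToRegExp(pattern).test(tool));
}

// Assumption: a blocked pattern always overrides an allowed pattern.
function isToolAllowed(tool: string, policy: StagePolicy): boolean {
  if (matchesAny(tool, policy.blockedTools)) return false;
  return matchesAny(tool, policy.allowedTools);
}

// A before-execution gate rejects by throwing, so the call never runs.
function guardToolCall(tool: string, policy: StagePolicy): void {
  if (!isToolAllowed(tool, policy)) {
    throw new Error(`SOP gate: tool "${tool}" is not allowed in the current stage`);
  }
}
```

With allowedTools ["*"] and blockedTools ["bash*", "shell*"], a call to read_file passes this check while bash is rejected before execution.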

Kickoff rewrite behavior

When /run-skill is matched, the original command text is not forwarded to the model as-is. chat.message rewrites the first user message into a kickoff instruction:

  • preserves your business goal text (if provided via --goal or -- <goal text>)
  • appends SOP kickoff context (profile + current stage + active tasks)
  • removes command noise such as /run-skill -p ... from model-facing input
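
The rewrite described above can be sketched as follows. This is an illustrative approximation only: the tokenizer, the KickoffContext shape, and the wording of the generated kickoff text are all assumptions, and the engine's real kickoff format is not documented here.

```typescript
// Hypothetical context injected into the kickoff message.
interface KickoffContext {
  profileId: string;
  stageId: string;
  activeTaskIds: string[];
}

// Split a command line into tokens, treating quoted text as one token.
function tokenize(input: string): string[] {
  const tokens: string[] = [];
  const re = /"([^"]*)"|'([^']*)'|(\S+)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(input)) !== null) {
    tokens.push(m[1] ?? m[2] ?? m[3]);
  }
  return tokens;
}

// The goal comes from --goal <text> or from everything after a bare "--".
function extractGoal(tokens: string[]): string | undefined {
  const flagIndex = tokens.indexOf("--goal");
  if (flagIndex !== -1 && flagIndex + 1 < tokens.length) {
    return tokens[flagIndex + 1];
  }
  const sepIndex = tokens.indexOf("--");
  if (sepIndex !== -1 && sepIndex + 1 < tokens.length) {
    return tokens.slice(sepIndex + 1).join(" ");
  }
  return undefined;
}

// Build the model-facing message: goal plus SOP context, without the command text.
function buildKickoff(raw: string, ctx: KickoffContext): string {
  const goal = extractGoal(tokenize(raw));
  return [
    goal !== undefined ? `Goal: ${goal}` : "Goal: (none provided)",
    `SOP profile: ${ctx.profileId}`,
    `Current stage: ${ctx.stageId}`,
    `Active tasks: ${ctx.activeTaskIds.join(", ")}`,
  ].join("\n");
}
```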

Runtime state model (core fields)

SessionRunState includes:

  • routing state: currentStageId, activeTaskIds
  • stage boundary: currentStageStartedAt
  • evidence state: pendingToolCalls, completedToolCalls, claim ownership
  • risk counters: autoPromptCount, loopRepeatCount, idleNoSubmitCount
  • memory capsule: contextState.completedStages
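
For orientation, the fields listed above could be modeled roughly like this. The field names come from the list, but every type, the ToolCallRecord shape, and the createInitialState helper are assumptions; claim ownership is left out because its structure is not described here.

```typescript
// Hypothetical record for one tool call observed during a run.
interface ToolCallRecord {
  callId: string;
  tool: string;
  stageId: string;
}

// Sketch of the state shape; types are guesses based on the field names.
interface SessionRunState {
  // routing state
  currentStageId: string;
  activeTaskIds: string[];
  // stage boundary (epoch milliseconds, an assumption)
  currentStageStartedAt: number;
  // evidence state
  pendingToolCalls: ToolCallRecord[];
  completedToolCalls: ToolCallRecord[];
  // risk counters
  autoPromptCount: number;
  loopRepeatCount: number;
  idleNoSubmitCount: number;
  // memory capsule
  contextState: { completedStages: string[] };
}

// Illustrative helper: a fresh state at the start of a run.
function createInitialState(stageId: string, taskIds: string[], now: number): SessionRunState {
  return {
    currentStageId: stageId,
    activeTaskIds: [...taskIds],
    currentStageStartedAt: now,
    pendingToolCalls: [],
    completedToolCalls: [],
    autoPromptCount: 0,
    loopRepeatCount: 0,
    idleNoSubmitCount: 0,
    contextState: { completedStages: [] },
  };
}
```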

Reliability semantics

  • debounced persist + in-flight queue to reduce high-frequency IO pressure
  • lock/tombstone strategy to reduce multi-process overwrite risk
  • fail-open pruning: skip pruning when message boundary is uncertain

Runtime Semantics (command to completion)

/run-skill
  -> chat.message (parse command, load profile, start run)
  -> system.transform (inject strict SOP constraints)
  -> messages.transform (optional history pruning)
  -> tool.execute.before (Skill/Tool guard)
  -> real tool execution
  -> submit_step_result (task submission)
  -> state transition (next stage / routes)
  -> session.idle (auto-dispatch or submission nudge)
  -> completed / failed

Manual skip path (optional):

/skip-stage
  -> chat.message (validate allowManualSkip)
  -> synthetic skipped_by_user submission for current stage
  -> routes.skip / routes.skipped_by_user / routes.default / nextOnSuccess
  -> next stage (or completed)
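
One way to read the skip path above is as a fallback chain. The sketch below assumes the listed order is also the precedence order and uses a hypothetical StageRouting shape; the engine may resolve routes differently.

```typescript
// Hypothetical subset of a compiled stage: only the routing fields.
interface StageRouting {
  routes?: { [status: string]: string | undefined };
  nextOnSuccess?: string;
}

// Returns the next stage id after a manual skip, or undefined when the run completes.
function resolveSkipTarget(stage: StageRouting): string | undefined {
  const routes = stage.routes ?? {};
  return (
    routes["skip"] ??
    routes["skipped_by_user"] ??
    routes["default"] ??
    stage.nextOnSuccess
  );
}
```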

Workflow Configuration (Advanced)

Flow DSL example (serial + parallel + dynamic routes)

{
  "profileId": "advanced-demo",
  "flow": "analyze -> (write_code | write_test) -> verify{pass:deliver,fail:write_code}",
  "settings": {
    "maxAutoPrompts": 20,
    "maxLoopRepeats": 3,
    "maxIdleNoSubmit": 3,
    "pruneHistory": true
  },
  "defaults": {
    "allowedTools": ["*"],
    "blockedTools": ["bash*", "shell*"]
  }
}

Note: Flow DSL profiles do not support a stages field. Node names in flow map 1:1 to taskId and, by default, to the task's skill name.

Mapping for the example above:

  • analyze -> skill: "analyze"
  • write_code -> skill: "write_code"
  • write_test -> skill: "write_test"
  • verify -> skill: "verify"
  • deliver -> skill: "deliver"
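
To illustrate how a flow string like the one above breaks down into steps, here is a small parser sketch. It handles only the three constructs visible in the example (serial arrows, a parenthesized parallel group, and a routed node in braces); the FlowStep shape is hypothetical and the real compiler in src/core/workflow-compiler.ts may accept a richer grammar.

```typescript
// Hypothetical intermediate form of one flow step.
interface FlowStep {
  taskIds: string[];              // more than one entry means a parallel group
  routes: Record<string, string>; // status -> target node name
}

function parseFlow(flow: string): FlowStep[] {
  return flow.split("->").map((raw): FlowStep => {
    const segment = raw.trim();
    if (segment === "") throw new Error("empty flow segment");

    // Parallel group, e.g. "(write_code | write_test)"
    if (segment.startsWith("(") && segment.endsWith(")")) {
      const taskIds = segment.slice(1, -1).split("|").map((t) => t.trim());
      if (taskIds.some((t) => t === "")) throw new Error(`bad parallel group: ${segment}`);
      return { taskIds, routes: {} };
    }

    // Plain or routed node, e.g. "verify{pass:deliver,fail:write_code}"
    const match = /^([\w-]+)(?:\{(.*)\})?$/.exec(segment);
    if (!match) throw new Error(`cannot parse flow segment: ${segment}`);
    const routes: Record<string, string> = {};
    for (const pair of (match[2] ?? "").split(",")) {
      if (pair.trim() === "") continue;
      const [status, target] = pair.split(":").map((p) => p.trim());
      if (!status || !target) throw new Error(`bad route: ${pair}`);
      routes[status] = target;
    }
    return { taskIds: [match[1]], routes };
  });
}
```

Applied to the example flow, this yields three steps: a single analyze task, a parallel group of write_code and write_test, and a verify task whose pass and fail routes point to deliver and write_code.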

If you need custom mapping (for example analyze -> sys-analyzer), use a Strict Stages profile (explicit stages/tasks):

{
  "schemaVersion": "1.2.0",
  "profileId": "advanced-demo-strict",
  "displayName": "Advanced Demo Strict",
  "settings": {
    "maxAutoPrompts": 20,
    "maxLoopRepeats": 3,
    "maxIdleNoSubmit": 3
  },
  "stages": [
    {
      "id": "analyze",
      "title": "Analyze",
      "mode": "single",
      "tasks": [
        {
          "id": "analyze",
          "title": "Analyze",
          "instruction": "Analyze requirement and code context.",
          "skill": "sys-analyzer",
          "allowedTools": ["*"],
          "completion": {
            "tool": "submit_step_result",
            "acceptedStatuses": ["done"]
          }
        }
      ],
      "nextOnSuccess": "deliver"
    },
    {
      "id": "deliver",
      "title": "Deliver",
      "mode": "single",
      "tasks": [
        {
          "id": "deliver",
          "title": "Deliver",
          "instruction": "Prepare final output and delivery notes.",
          "skill": "git-committer",
          "allowedTools": ["*"],
          "completion": {
            "tool": "submit_step_result",
            "acceptedStatuses": ["done"]
          }
        }
      ]
    }
  ]
}

settings fields

| Field | Default | Purpose |
| --- | --- | --- |
| maxAutoPrompts | 20 | max auto-dispatch count per run |
| maxLoopRepeats | 3 | max repeated dispatches on same stage signature |
| maxIdleNoSubmit | 3 | max idle rounds without submission |
| pruneHistory | true | enable pre-request history pruning |
| allowManualSkip | false | allow /skip-stage for current active stage |
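
The three numeric limits act as threshold guards on the runtime counters. A minimal sketch of such a check follows, using the defaults from the table. Whether the engine fails a run when a counter reaches the limit or only when it exceeds it is not stated here; this sketch assumes "exceeds", and the type and function names are hypothetical.

```typescript
interface GuardSettings {
  maxAutoPrompts: number;
  maxLoopRepeats: number;
  maxIdleNoSubmit: number;
}

interface GuardCounters {
  autoPromptCount: number;
  loopRepeatCount: number;
  idleNoSubmitCount: number;
}

// Defaults taken from the settings table.
const DEFAULT_GUARDS: GuardSettings = {
  maxAutoPrompts: 20,
  maxLoopRepeats: 3,
  maxIdleNoSubmit: 3,
};

// Returns the name of the first tripped limit, or undefined if the run may continue.
// Assumption: a guard trips only when the counter is strictly greater than the limit.
function checkGuards(
  counters: GuardCounters,
  settings: GuardSettings = DEFAULT_GUARDS,
): keyof GuardSettings | undefined {
  if (counters.autoPromptCount > settings.maxAutoPrompts) return "maxAutoPrompts";
  if (counters.loopRepeatCount > settings.maxLoopRepeats) return "maxLoopRepeats";
  if (counters.idleNoSubmitCount > settings.maxIdleNoSubmit) return "maxIdleNoSubmit";
  return undefined;
}
```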

Context Pruning (pruneHistory)

When enabled, the engine prunes outgoing messages before each LLM request:

  • keep: all system messages, the run kickoff goal message (first user message created after this run starts), active-stage window, pending/running tool messages
  • remove: intermediate chatter from completed stages
  • carry-forward: completed-stage outcomes through CURRENT_WORKFLOW_STATE_DATA
  • safety valve: fail-open on uncertain boundaries (skip pruning instead of risking corruption)
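
The keep/remove rules above can be approximated with a simple filter. This sketch assumes each message carries hypothetical tags (stageId, isKickoff, toolState) that mark where it belongs; the real pruning logic in src/core/message-pruning.ts works on OpenCode's message structures and likely detects boundaries differently. Keeping messages that precede the kickoff is also an assumption.

```typescript
// Hypothetical, simplified message shape for illustration.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  stageId?: string; // stage during which the message was produced
  isKickoff?: boolean; // true for the run kickoff goal message
  toolState?: "pending" | "running" | "completed";
}

function pruneMessages(messages: ChatMessage[], currentStageId: string): ChatMessage[] {
  // Fail open: without a kickoff marker the run boundary is uncertain.
  const kickoffIndex = messages.findIndex((m) => m.isKickoff === true);
  if (kickoffIndex === -1) return messages;

  // Fail open: an untagged message after kickoff cannot be assigned to a stage.
  const uncertain = messages.some(
    (m, i) => i > kickoffIndex && m.role !== "system" && m.stageId === undefined,
  );
  if (uncertain) return messages;

  return messages.filter(
    (m, i) =>
      i <= kickoffIndex || // kickoff and anything before the run
      m.role === "system" || // all system messages
      m.stageId === currentStageId || // active-stage window
      m.toolState === "pending" || // unfinished tool messages
      m.toolState === "running",
  );
}
```

Returning the input unchanged on any doubt mirrors the fail-open rule: a larger context is preferred over a corrupted one.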

submit_step_result Contract

Tool name: submit_step_result

Required fields:

  • taskId
  • status
  • summary
  • evidence.callIds (required for every submit_step_result)

Optional fields:

  • artifacts
  • evidence.notes

Example:

{
  "taskId": "analyze",
  "status": "done",
  "summary": "Root cause identified.",
  "artifacts": ["reports/analyze.md"],
  "evidence": {
    "callIds": ["call_123"],
    "notes": "Used read_file and grep"
  }
}

Notes:

  • evidence.callIds is required for every submission and must reference real tool calls completed in the current stage.
  • Stage completion still requires evidence that matches the stage required skill (task.skill), so auxiliary skill calls alone cannot advance the workflow.
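
The evidence rules above could be checked along these lines. This is a sketch under assumptions: the CompletedCall record (including its optional skill field) and the way a call is matched to the required skill are hypothetical, since the engine's internal evidence model is not shown here.

```typescript
// Hypothetical record of a finished tool call tracked by the engine.
interface CompletedCall {
  callId: string;
  stageId: string;
  skill?: string; // set when the call belongs to a skill invocation (assumption)
}

// Mirrors the submit_step_result fields listed above.
interface StepSubmission {
  taskId: string;
  status: string;
  summary: string;
  artifacts?: string[];
  evidence: { callIds: string[]; notes?: string };
}

// Returns a list of problems; an empty list means the evidence is acceptable.
function validateEvidence(
  submission: StepSubmission,
  completed: CompletedCall[],
  currentStageId: string,
  requiredSkill: string,
): string[] {
  const errors: string[] = [];
  const ids = submission.evidence.callIds;
  if (ids.length === 0) errors.push("evidence.callIds must not be empty");

  // Only calls completed in the current stage count as evidence.
  const inStage = new Map<string, CompletedCall>();
  for (const call of completed) {
    if (call.stageId === currentStageId) inStage.set(call.callId, call);
  }
  for (const id of ids) {
    if (!inStage.has(id)) errors.push(`unknown or cross-stage call ID: ${id}`);
  }

  // Auxiliary skill calls alone cannot advance the stage.
  const hasRequired = ids.some((id) => inStage.get(id)?.skill === requiredSkill);
  if (ids.length > 0 && !hasRequired) {
    errors.push(`no evidence for required skill: ${requiredSkill}`);
  }
  return errors;
}
```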

Command Reference

Syntax

/run-skill [flags] [-- goal_text]

Options

| Option | Type | Run mode | List mode | Description |
| --- | --- | --- | --- | --- |
| -p, --profile | string | required | optional | workflow JSON file |
| -f, --flow | string | optional | invalid | select flow in a bundle |
| --goal | string | optional | invalid | business goal text |
| --max-auto-prompts | number | optional | invalid | runtime override |
| --max-loop-repeats | number | optional | invalid | runtime override |
| -l, --list | flag | invalid | required | list available flows only |

Notes:

  • positional arguments are not accepted
  • --list does not accept --flow, --goal, or --max-*
  • if goal text contains spaces, pass it as one argument (e.g. --goal "fix auth race" or --goal 'fix auth race'), or use -- <goal text>
  • bare /run-skill -l works when profiles/default.workflow.json exists
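
The run-mode and list-mode rules in the options table can be expressed as a small validation function. The RunSkillOptions shape and the error messages below are hypothetical; they only restate the table and notes, not the engine's actual parser.

```typescript
// Hypothetical parsed form of /run-skill flags.
interface RunSkillOptions {
  profile?: string;
  flow?: string;
  goal?: string;
  maxAutoPrompts?: number;
  maxLoopRepeats?: number;
  list?: boolean;
}

// Returns a list of rule violations; an empty list means the combination is valid.
function validateOptions(options: RunSkillOptions): string[] {
  const errors: string[] = [];
  if (options.list === true) {
    // List mode: only --profile may accompany --list.
    if (options.flow !== undefined) errors.push("--flow is invalid with --list");
    if (options.goal !== undefined) errors.push("--goal is invalid with --list");
    if (options.maxAutoPrompts !== undefined) errors.push("--max-auto-prompts is invalid with --list");
    if (options.maxLoopRepeats !== undefined) errors.push("--max-loop-repeats is invalid with --list");
  } else if (options.profile === undefined) {
    // Run mode: a workflow file is mandatory.
    errors.push("--profile is required in run mode");
  }
  return errors;
}
```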

Manual stage skip command

/skip-stage --reason "approved skip"
/skip-stage -- <reason text>

Notes:

  • Only skips the current active stage.
  • Requires an active SOP run in this session.
  • Requires settings.allowManualSkip=true in the loaded workflow.

Session status command

/sop-status

Notes:

  • Shows status for the current chat session only.
  • If no run is active in this session, it returns status: none.

Troubleshooting

| Symptom | Likely cause | What to check |
| --- | --- | --- |
| /run-skill rejected | invalid flags | use -p for run mode; avoid positional args |
| flow not found | wrong -f value | run /run-skill -l -p <file> |
| /skip-stage not applied | manual skip disabled or no active run | set settings.allowManualSkip=true and ensure a run started with /run-skill is active |
| stage not advancing | missing submit_step_result | verify acceptedStatuses and submission timing |
| evidence rejected | invalid or cross-stage callIds | use real call IDs from current stage only |
| run fails quickly | threshold guard triggered | tune maxLoopRepeats / maxIdleNoSubmit |

Trust Mode

Environment variable: SOP_ENGINE_TRUST_MODE

  • restricted (default): only workflow files under plugin profiles/ are allowed
  • open: external/absolute workflow paths are allowed
