Plan mode can be bypassed after reading plan_sop without enter_plan_mode #458

@HamsteRider-m

Problem

When an agent reads memory/plan_sop.md (the standard operating procedure for plan mode), it is supposed to subsequently create a plan file and enter plan mode by calling handler.enter_plan_mode(...). In practice, the agent may skip this step and continue normal execution. The system does not enforce the transition—plan mode is effectively opt-in with no runtime guard.

Root cause

The existing implementation only enforces plan-mode constraints after enter_plan_mode() has been called:

  • do_no_tool() blocks premature completion claims only when already in plan mode.
  • turn_end_callback() appends the plan hint only when _in_plan_mode() is true.

There is no mechanism to detect that plan mode should have been entered but was not, so the agent can read plan_sop, extract the SOP points, and then proceed with ordinary tool calls—without ever creating plan.md or calling enter_plan_mode().
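
To make the gap concrete, here is a minimal sketch of the behavior described above. It is a toy model, not the project's actual code: `SketchHandler`, the hint text, and the method bodies are assumptions for illustration; only the names `enter_plan_mode`, `turn_end_callback`, `_in_plan_mode`, and `self.working["in_plan_mode"]` come from this issue.

```python
class SketchHandler:
    """Toy stand-in for the real handler; only plan-mode state is modeled."""

    def __init__(self):
        self.working = {}

    def _in_plan_mode(self):
        return bool(self.working.get("in_plan_mode"))

    def enter_plan_mode(self, plan_path):
        # Assumed behavior: the plan file path is stored as the plan-mode flag.
        self.working["in_plan_mode"] = plan_path

    def turn_end_callback(self, prompt):
        # The plan hint is appended only once plan mode is already active.
        if self._in_plan_mode():
            return prompt + "\n[plan hint] follow " + self.working["in_plan_mode"]
        return prompt


handler = SketchHandler()
# The agent reads memory/plan_sop.md here but never calls enter_plan_mode(),
# so nothing in the callback notices that plan mode was skipped.
unguarded = handler.turn_end_callback("next prompt")
print(unguarded == "next prompt")
```

Because every check is conditioned on `_in_plan_mode()`, a turn that skips `enter_plan_mode()` passes through unchanged.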

Impact

Without a guard, complex tasks lack the mandatory plan→execute→verify lifecycle:

  • No plan.md is created or maintained.
  • No per-step checklist is enforced.
  • The verify/verdict gate is bypassed.
  • Existing plan-mode UI and completion checks are never activated.

Proposed fix

Add a pending-state guard between "plan SOP read" and "plan mode entered":

  1. When the agent reads plan_sop (or otherwise triggers a plan-mode-required task), set a pending flag.
  2. If the agent ends the following turn without having called handler.enter_plan_mode(...), the turn-end callback injects a hard guard that instructs it to create the plan file and enter plan mode, and ordinary execution is disallowed until it does.
  3. Once enter_plan_mode() succeeds, clear the pending state.
  4. All existing in-plan-mode checks continue to work as before.
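
The four steps above could be sketched roughly as follows. This is one possible shape under stated assumptions, not the intended implementation: `GuardedHandler`, `on_file_read`, the `plan_mode_pending` key, and `GUARD_TEXT` are hypothetical names introduced here, and blocking of ordinary tool calls while the flag is set is left out for brevity.

```python
# Hypothetical guard message; the real wording is up to the implementer.
GUARD_TEXT = (
    "[plan guard] Create the plan file and call handler.enter_plan_mode(...) "
    "before doing anything else."
)


class GuardedHandler:
    """Toy handler showing the pending state between SOP read and plan mode."""

    def __init__(self):
        self.working = {}

    def _in_plan_mode(self):
        return bool(self.working.get("in_plan_mode"))

    def on_file_read(self, path):
        # Step 1: reading the SOP marks plan mode as pending (hypothetical hook).
        if path.endswith("plan_sop.md") and not self._in_plan_mode():
            self.working["plan_mode_pending"] = True

    def enter_plan_mode(self, plan_path):
        # Step 3: a successful entry clears the pending state.
        self.working["in_plan_mode"] = plan_path
        self.working.pop("plan_mode_pending", None)

    def turn_end_callback(self, prompt):
        # Step 4: existing in-plan-mode behavior is unchanged.
        if self._in_plan_mode():
            return prompt + "\n[plan hint] follow " + self.working["in_plan_mode"]
        # Step 2: still pending at turn end, so inject the hard guard.
        if self.working.get("plan_mode_pending"):
            return prompt + "\n" + GUARD_TEXT
        return prompt


handler = GuardedHandler()
handler.on_file_read("memory/plan_sop.md")
guarded = handler.turn_end_callback("next prompt")   # guard is appended
handler.enter_plan_mode("plan.md")
normal = handler.turn_end_callback("next prompt")    # plan hint, no guard
```

Keeping the pending flag in `self.working` next to `in_plan_mode` means it would follow the same lifecycle as the existing plan-mode state, though a separate attribute would work just as well.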

Acceptance criteria

  • After reading plan_sop, if enter_plan_mode() is not called, the next turn is intercepted and the agent is redirected to create the plan and enter plan mode.
  • After a successful enter_plan_mode() call, self.working["in_plan_mode"] is set to the plan file path.
  • The TUI plan/todo panel recognizes the active plan.
  • Unchecked plan items prevent premature completion.
  • Tests cover: pending flag set on SOP read, guard intercepts missing entry, pending cleared on successful entry.
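
As an illustration of the "unchecked items prevent premature completion" criterion, the check could look something like the sketch below. It assumes plan.md uses Markdown task-list syntax (`- [ ]` / `- [x]`), which this issue does not confirm; `unchecked_items` and `may_claim_completion` are hypothetical helper names.

```python
import re

# Matches an unchecked task-list line such as "- [ ] Add tests".
# Assumption: plan.md uses Markdown task lists; adjust to the real format.
UNCHECKED = re.compile(r"^[ \t]*[-*][ \t]+\[ \][ \t]+(.*)$", re.MULTILINE)


def unchecked_items(plan_text):
    """Return the text of every unchecked checklist item in the plan."""
    return [item.strip() for item in UNCHECKED.findall(plan_text)]


def may_claim_completion(plan_text):
    """Completion is allowed only when no unchecked items remain."""
    return not unchecked_items(plan_text)


plan = """# Plan
- [x] Read plan_sop
- [ ] Implement pending guard
- [ ] Add tests
"""

remaining = unchecked_items(plan)
```

A completion check such as do_no_tool() could refuse to finish while `may_claim_completion` returns False, listing `remaining` in its message.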
