From 364ce67484fd79f86edac42a3c413d472bb42d58 Mon Sep 17 00:00:00 2001
From: Hakula Chen
Date: Fri, 29 May 2026 15:12:02 +0800
Subject: [PATCH] docs(research): add tool permission/approval prior-art survey

---
 .cspell/words.txt                  |   1 +
 docs/research/README.md            |   7 +-
 docs/research/tools/permissions.md | 112 +++++++++++++++++++++++++++++
 3 files changed, 117 insertions(+), 3 deletions(-)
 create mode 100644 docs/research/tools/permissions.md

diff --git a/.cspell/words.txt b/.cspell/words.txt
index e744dff2..e651308b 100644
--- a/.cspell/words.txt
+++ b/.cspell/words.txt
@@ -60,6 +60,7 @@ MULT
 nixfmt
 nixos
 nixpkgs
+nohup
 nonewline
 nucleo
 numtide
diff --git a/docs/research/README.md b/docs/research/README.md
index bedece31..6e77f511 100644
--- a/docs/research/README.md
+++ b/docs/research/README.md
@@ -36,9 +36,10 @@ Organized by topic. Each subdirectory mirrors the corresponding directory in [`d
 
 ## Tools
 
-| Document                                 | Description                                    |
-| ---------------------------------------- | ---------------------------------------------- |
-| [Output Truncation](tools/truncation.md) | Per-tool vs central caps, spillover strategies |
+| Document                                 | Description                                            |
+| ---------------------------------------- | ------------------------------------------------------ |
+| [Output Truncation](tools/truncation.md) | Per-tool vs central caps, spillover strategies         |
+| [Permissions](tools/permissions.md)      | Tool approval modes, rule grammar, decision precedence |
 
 ## Terminal UI
 
diff --git a/docs/research/tools/permissions.md b/docs/research/tools/permissions.md
new file mode 100644
index 00000000..5402c30f
--- /dev/null
+++ b/docs/research/tools/permissions.md
@@ -0,0 +1,112 @@
+# Tool Permissions and Approval (Reference)
+
+Research on how coding agents gate mutating tool calls before they run. Based on [Claude Code](https://github.com/hakula139/claude-code), [OpenAI Codex](https://github.com/openai/codex), and [opencode](https://github.com/anomalyco/opencode).
+
+oxide-code today runs every tool unconditionally. `bash` executes an opaque `bash -c` string (`tool/bash.rs:104-107`), `edit` performs the replacement once the Read-before-Edit gate passes, and `write` can create a new file plus arbitrary parent directories with no prior read (`tool/write.rs:115-125`). The FileTracker gate is an _integrity_ check (it proves the model saw the current bytes), not an _authorization_ one: it never asks whether the path is inside the working directory and never asks the human. A permission layer has to decide, before each call, three things: which tool classes are dangerous enough to gate, how a standing policy (mode plus rules) resolves to allow / ask / deny, and how an interactive "ask" reaches the user and rides a decision back without deadlocking the turn loop. The three reference tools answer these differently, and one of the axes (OS sandboxing) collapses entirely because oxide-code has no sandbox to lean on.
+
+## Claude Code (TypeScript)
+
+Permission resolution is a layered pipeline keyed on a `ToolPermissionContext`: the current mode plus four source-keyed rule maps (`alwaysAllowRules` / `alwaysDenyRules` / `alwaysAskRules` and `additionalWorkingDirectories`). Every call runs through `hasPermissionsToUseTool` → `hasPermissionsToUseToolInner` (`src/utils/permissions/permissions.ts`), which yields a `PermissionDecision` of behavior `allow | ask | deny`, plus an internal `passthrough` meaning "no opinion, keep looking".
+
+Four external modes (`src/types/permissions.ts`) set the standing posture: `default` prompts on anything not pre-allowed, `acceptEdits` auto-allows file writes and edits inside the working directories while still prompting Bash and MCP, `plan` is read-only analysis, and `bypassPermissions` auto-allows everything (the "dangerously skip" mode). Two further modes, `dontAsk` (turns every `ask` into `deny`) and an Anthropic-only `auto` (routes `ask` to an AI classifier), exist but are internal. Shift+Tab cycles default → acceptEdits → plan → bypass.
+
+The decision order is the load-bearing part. Tool-wide deny is checked first, then tool-wide ask, then the tool's own `checkPermissions(input, ctx)`, then tool-level deny, then a `requiresUserInteraction()` class that always prompts, then content-specific ask rules and `safetyCheck` results that are _immune to bypass_. Only after all of that does `bypassPermissions` short-circuit to allow, then a tool-wide allow rule, and any leftover `passthrough` becomes `ask`. The invariant: precedence is **deny > ask > allow**, and deny / ask are evaluated _before_ any mode-based auto-allow, so an explicit deny can never be downgraded by `acceptEdits` or an allow rule.
+
+Rules are `ToolName` or `ToolName(specifier)` strings parsed by `permissionRuleParser.ts`: the first unescaped `(` opens the content, the last unescaped `)` must be the final char, and `Bash` / `Bash()` / `Bash(*)` all collapse to a tool-wide rule. Bash specifiers (`src/tools/BashTool/bashPermissions.ts`) come in exact, prefix (`npm run test:*`), and wildcard (`git *`) shapes. The matcher splits compound commands and refuses to let a prefix or wildcard _allow_ rule match a chain (`cd:*` must not allow `cd x && rm -rf /`), strips safe wrappers (`timeout`, `nice`, `nohup`) and safe env-var prefixes before matching, and for deny / ask rules strips _all_ leading env vars so `FOO=bar denied` stays denied. Read and Edit specifiers (`src/utils/permissions/filesystem.ts`) are gitignore-style path globs resolved against a root inferred from the leading character (`//abs/**`, `~/...`, `/rel`, plain `x/**` matches anywhere). Edit access implies read access, and reads / writes inside a working directory need no rule.
+
+Config lives in the `permissions` block of `settings.json`: `allow`, `deny`, `ask` string arrays, `defaultMode`, `additionalDirectories`, and `disableBypassPermissionsMode`. Concrete examples from source and tests: `Bash(npm run test:*)`, `Bash(git *)`, `Read(./src/**)`, `Read(~/.zshrc)`, `Edit(/.claude/skills/my-skill/**)`, `mcp__github` (a whole server), `WebFetch(domain:example.com)`.
+
+When a decision resolves to `ask`, a focus-grabbing dialog renders (`src/components/permissions/`). The option set is tool-specific. File operations offer `Yes`, a session option ("allow all edits during this session"), and `No`. Bash adds `Yes, and don't ask again for ` with an _editable prefix input_ pre-filled with the suggested rule, so the user can widen or narrow `npm run:*` before saving. Both Yes and No can expand into a feedback row. The message from `createPermissionRequestMessage` explains _why_ approval is needed (which rule, mode, hook, or safety check fired).
+
+Persistence spans three editable disk sources ordered low → high precedence: `userSettings` (`~/.claude/settings.json`), `projectSettings` (`PROJECT/.claude/settings.json`, committed), `localSettings` (`PROJECT/.claude/settings.local.json`, gitignored), plus a read-only enterprise `policySettings` at the top. A `PermissionUpdate` carries a `destination` and is the single unit of both in-memory application and disk write. "Yes" writes nothing, "Yes during this session" applies an update with `destination: 'session'` (memory only, dropped at exit), and "don't ask again" writes the rule string into the chosen settings file's `permissions.allow` array. Safety checks gate edits to `.git/`, `.claude/`, `.vscode/`, and shell rc files even under `acceptEdits` and `bypassPermissions`.
+
+## OpenAI Codex (Rust)
+
+Codex crosses two orthogonal axes per command. The first is an _approval policy_ (`enum AskForApproval`, `protocol/src/protocol.rs`): `UnlessTrusted` (only known-safe reads auto-run, else prompt), `OnFailure` (deprecated: auto-run sandboxed, escalate on denial), `OnRequest` (the default: the model decides when to ask, auto-running sandboxed unless flagged dangerous), `Never` (never prompt, failures go to the model), and `Granular` (per-flow boolean gates where a `false` auto-rejects). The second is a _sandbox policy_ (`enum SandboxPolicy`): `read-only`, `workspace-write` (with `writable_roots`, network toggle, tmpdir excludes), `danger-full-access`, and `external-sandbox`.
+
+The two cross in `render_decision_for_unmatched_command` (`core/src/exec_policy.rs`), which returns `Decision::{Allow, Prompt, Forbidden}`. Starlark-style exec-policy rules are checked first; only unmatched commands hit the heuristic. A known-safe command under `UnlessTrusted` allows; a dangerous command (`rm -f`, `sudo `) or one with no sandbox protection forbids under `Never` and prompts otherwise; everything else auto-runs under `Never` / `OnFailure` (relying on the sandbox), prompts under `UnlessTrusted`, and under `OnRequest` allows when the sandbox is unrestricted or external but prompts when it is restricted and the command requests escalation. File writes go through a parallel `assess_patch_safety` (`core/src/safety.rs`) returning `SafetyCheck::{AutoApprove, AskUser, Reject}`: auto-approve only when the patch is provably inside the writable roots _and_ a real platform sandbox exists, otherwise ask.
+
+Config is TOML in `$CODEX_HOME/config.toml`: `approval_policy` (`untrusted` / `on-failure` / `on-request` / `never`), `sandbox_mode` (`read-only` / `workspace-write` / `danger-full-access`), and a `[sandbox_workspace_write]` table with `writable_roots` and the network / tmpdir toggles. CLI flags `-a/--ask-for-approval` and `-s/--sandbox` override per run, and `--dangerously-bypass-approvals-and-sandbox` (alias `--yolo`) sets `Never` plus `danger-full-access` together, documented as "EXTREMELY DANGEROUS. solely for externally-sandboxed envs". The former `--full-auto` was removed and now warns the user toward `--sandbox workspace-write`. Defaults are `OnRequest` approval and `read-only` sandbox.
+
+Approval surfaces as a bottom-pane `SelectionView` modal (`tui/src/bottom_pane/approval_overlay.rs`) with per-request titles ("Would you like to run the following command?", "Would you like to make the following edits?"). Exec options map to a `ReviewDecision`: "Yes, proceed" (`Approved`), "don't ask again for commands that start with ``" (`ApprovedExecpolicyAmendment`, persisted), "don't ask again for this command in this session" (`ApprovedForSession`), "No, continue without running it" (`Denied`, the model retries something else), and "No, and tell Codex what to do differently" (`Abort`, control returns to the user). Network prompts get host-scoped variants. The `Denied`-vs-`Abort` split (continue-and-retry versus stop-and-return) is the distinctive piece of the decision shape.
+
+Persistence has three tiers. Session-scoped decisions live in an in-memory `ApprovalStore: HashMap` (`core/src/tools/sandboxing.rs`) on `SessionServices`, checked by `with_cached_approval` before prompting and lost at session end; a patch returns multiple keys so a subset of files stays auto-approved. Persisted prefix rules ("don't ask again for commands starting with X") write an execpolicy amendment to disk and survive across sessions. The standing posture (`approval_policy`, `sandbox_mode`) lives in `config.toml` and profile layers. Notably, metadata protection is structural rather than a rule: even under `workspace-write`, `.git`, `.codex`, and `.git/hooks` are forced read-only subpaths (`protocol/src/permissions.rs`) to block privilege escalation through hook injection.
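
The session tier described above can be pictured as a small cache that is consulted before prompting. The following is a minimal Rust sketch under stated assumptions: `SessionApprovals`, `Cached`, `record`, and `needs_prompt` are hypothetical names for illustration, not Codex's actual `ApprovalStore` or `with_cached_approval` signatures. It shows only the multi-key idea: a request carries one key per affected item, and the prompt is skipped only when every key is already approved for the session.

```rust
use std::collections::HashMap;

/// Hypothetical cached decision; not Codex's `ReviewDecision`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Cached {
    ApprovedForSession,
    Denied,
}

/// In-memory store (assumed shape), dropped when the session ends.
#[derive(Default)]
struct SessionApprovals {
    decisions: HashMap<String, Cached>,
}

impl SessionApprovals {
    /// Record the same decision for every key of one request.
    fn record(&mut self, keys: &[&str], decision: Cached) {
        for key in keys {
            self.decisions.insert((*key).to_string(), decision);
        }
    }

    /// A prompt is needed unless every key is already approved for the session.
    /// Note: an empty key list needs no prompt under this definition.
    fn needs_prompt(&self, keys: &[&str]) -> bool {
        !keys
            .iter()
            .all(|k| self.decisions.get(*k) == Some(&Cached::ApprovedForSession))
    }
}
```

With two files approved for the session, a later patch touching only one of them skips the prompt, while a patch that also touches a new file prompts again. Whether denials should be cached at all is a design choice this sketch leaves open.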
+
+## opencode (TypeScript)
+
+opencode evaluates a flat ruleset of `{permission, pattern, action}` tuples where `action ∈ {allow, deny, ask}`. `permission` is a tool-keyed string (`read`, `edit`, `glob`, `grep`, `bash`, `webfetch`, `external_directory`, plus MCP tool names), and `pattern` is a glob matched against a tool-specific target: the file path for `edit` / `read`, the literal command string for `bash`, `*` for catch-all. Evaluation (`packages/core/src/permission.ts`) flattens every ruleset and runs `findLast` over the rules where both the permission and the pattern match. **Last match wins** and the default fallback is `ask`, so precedence is purely positional: order rules general → specific. The wildcard matcher converts globs to regex and treats a trailing space-star as optional so a bare `git status` still matches a `git status *` rule.
+
+Granularity is per-tool _and_ per-agent: each agent carries its own resolved ruleset, and the effective set at ask time is `Permission.merge(agent.permission, session.permission)`. `edit`, `write`, and `apply_patch` all collapse to the single `edit` permission key, so a user grants once. A `{permission}: {"*": "deny"}` rule removes the tool from the agent's toolset entirely rather than merely gating it.
+
+Config is the `permission` key, which accepts either a bare action shorthand (`"permission": "ask"`, normalized to `{"*": action}`) or a per-tool object whose value is itself an action or a `{pattern: action}` map:
+
+```jsonc
+{
+  "permission": {
+    "edit": "ask",
+    "bash": { "*": "allow", "git push": "ask" },
+    "read": { "*": "allow", "*.env": "ask", "*.env.example": "allow" },
+    "webfetch": "deny",
+    "external_directory": { "*": "ask", "~/safe/*": "allow" }
+  }
+}
+```
+
+Path patterns expand `~` and `$HOME`, and key order is preserved (`propertyOrder: "original"`) so positional precedence is stable. Per-agent overrides live under `agent..permission`, and the env var `OPENCODE_PERMISSION` merges a JSON blob on top.
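
The last-match-wins evaluation described in this section can be sketched in a few lines. This is a minimal Rust illustration, not opencode's TypeScript implementation: the names `Rule`, `glob_match`, `pattern_matches`, and `evaluate` are hypothetical, the glob supports only `*`, and permission keys are compared literally for brevity.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Action {
    Allow,
    Deny,
    Ask,
}

/// One `{permission, pattern, action}` tuple (hypothetical Rust shape).
struct Rule {
    permission: &'static str,
    pattern: &'static str,
    action: Action,
}

/// Glob match where `*` matches any run of bytes (including none).
fn glob_match(pattern: &str, text: &str) -> bool {
    let (p, t) = (pattern.as_bytes(), text.as_bytes());
    let (mut pi, mut ti) = (0, 0);
    // Position of the last `*` seen and the text index it has absorbed up to.
    let mut star: Option<(usize, usize)> = None;
    while ti < t.len() {
        if pi < p.len() && p[pi] == b'*' {
            star = Some((pi, ti));
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Backtrack: let the last `*` absorb one more byte.
            pi = sp + 1;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    // Any pattern remainder must consist of `*` only.
    p[pi..].iter().all(|&b| b == b'*')
}

/// Treat a trailing " *" as optional, so `git status` matches `git status *`.
fn pattern_matches(pattern: &str, text: &str) -> bool {
    glob_match(pattern, text)
        || pattern
            .strip_suffix(" *")
            .map_or(false, |head| glob_match(head, text))
}

/// Last matching rule wins; with no match the fallback is `Ask`.
fn evaluate(rules: &[Rule], permission: &str, target: &str) -> Action {
    rules
        .iter()
        .rev()
        .find(|r| r.permission == permission && pattern_matches(r.pattern, target))
        .map(|r| r.action)
        .unwrap_or(Action::Ask)
}
```

With the `read` rules from the config example in order (`*` allow, `*.env` ask, `*.env.example` allow), `prod.env` resolves to ask and `prod.env.example` to allow, because the more specific rule comes later. Reordering the same rules changes the result, which is the practical cost of positional precedence.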
+
+The gate lives _inside_ each tool's `execute` via `ctx.ask(...)`, so the tool can do its own work (compute a diff, scan for external-directory access) before asking. `ask()` evaluates each requested pattern: any `deny` throws a model-facing `DeniedError`, all-allow proceeds silently, otherwise it publishes a bus event and blocks on a deferred until the user replies. Rejection fails the deferred with `RejectedError` or `CorrectedError(feedback)`, both surfacing to the model as tool-result text that names the declining and carries the feedback.
+
+Persistence is the cautionary tale. Config rules are the durable "set and forget" surface, loaded fresh each run. Session-scoped rules merge into a `SessionTable.permission` column, and subagents inherit a filtered slice. But "allow always" at runtime, in this dev-branch revision, only pushes a new rule into an in-memory `state.approved` array (`permission/index.ts`): the UI literally says "until OpenCode is restarted". A project-keyed `PermissionTable` and an `Approval` schema _exist_ in the SQL layer and seed `approved` on init, yet no runtime write-back to that table is wired through the live ask / reply loop. The half-built table is schema without wiring, a reminder that "allow always → persist to disk" is the part that rots if it is reached for before it is needed.
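
The way `ask()` folds several per-pattern results into one outcome, as described earlier in this section, reduces to a three-way aggregation. A minimal Rust sketch, assuming hypothetical names (`Outcome`, `aggregate`) and leaving out the bus event and the deferred wait entirely:

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Action {
    Allow,
    Deny,
    Ask,
}

/// Hypothetical outcome of one gate check across all requested patterns.
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    /// At least one pattern resolved to deny: fail without prompting.
    Denied,
    /// Every pattern resolved to allow: run silently.
    Proceed,
    /// No deny, but at least one ask: prompt the user.
    Prompt,
}

/// Any deny wins, all-allow proceeds, anything else prompts.
/// Note: an empty list proceeds under this definition.
fn aggregate(actions: &[Action]) -> Outcome {
    if actions.contains(&Action::Deny) {
        Outcome::Denied
    } else if actions.iter().all(|a| *a == Action::Allow) {
        Outcome::Proceed
    } else {
        Outcome::Prompt
    }
}
```

The ordering matters: checking deny first means a single denied pattern short-circuits before the user is ever asked about the others.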
+
+## Cross-Tool Comparison
+
+| Axis                    | Claude Code                                          | Codex                                                  | opencode                                               |
+| ----------------------- | ---------------------------------------------------- | ------------------------------------------------------ | ------------------------------------------------------ |
+| Standing posture        | mode enum (default / acceptEdits / plan / bypass)    | approval policy enum x sandbox policy enum             | none; rules carry the whole posture                    |
+| Per-tool control        | per-tool `checkPermissions` + rule maps              | exec-policy + per-tool `assess_patch_safety`           | per-tool permission key in the rule tuple              |
+| Rule grammar            | `ToolName(specifier)` strings                        | Starlark-ish prefix rules + curated safe / danger sets | `{permission, pattern, action}` tuples                 |
+| Decision precedence     | fixed: deny > ask > allow, before mode auto-allow    | two-axis decision matrix, exec rules first             | positional: last match wins, default `ask`             |
+| Bash matching           | exact / prefix / wildcard, compound-split, env-strip | known-safe allowlist + danger denylist                 | glob over the literal command string                   |
+| Where "always" persists | session memory or settings file by destination       | session cache + on-disk execpolicy amendment           | session memory only ("until restart") in this revision |
+| Sandbox coupling        | none (orthogonal sandbox feature)                    | tight: approval auto-run leans on the sandbox          | none                                                   |
+| Metadata protection     | `safetyCheck` immune to bypass (`.git`, `.claude`)   | structural read-only subpaths (`.git`, `.codex`)       | not a distinct tier                                    |
+
+Two structural facts fall out for oxide-code. First, the cleanest, least-rot-prone piece is Claude Code's: a small `allow | ask | deny` behavior enum with a fixed precedence evaluated before any mode auto-allow. Both opencode's last-match-wins and Codex's two-axis matrix express the same intent with more moving parts. Second, **without an OS sandbox, Codex's two axes collapse to one**: the approval policy _is_ the safety boundary. Codex's own code proves this, since `assess_patch_safety` falls back to ask and `render_decision_for_unmatched_command` forces prompt or forbid whenever no sandbox protection exists. Only the suspicion ports cleanly to oxide-code, since the auto-run half depends on a sandbox it does not have.
+
+## oxide-code Seam Analysis
+
+A gate slots into five existing seams. The maps below carry the `file:line` anchors a design doc would build on.
+
+### Agent-loop dispatch
+
+Tool execution happens in `dispatch_tool_call` (`agent.rs:478-494`), specifically the final `await_unless_aborted(tools.run(name, input), user_rx, pending).await` at `agent.rs:493`. The gate must intercept _before_ that line, after the parse-error short-circuit (`agent.rs:486-492`) that already returns a synthetic `ToolOutput` before running anything, which is the precedent for a synthetic "denied" output. Dispatch is sequential: `run_tool_round` (`agent.rs:417-459`) awaits each call one at a time, so at most one approval is ever outstanding and no id fan-out is needed. Token streaming is already drained before any tool runs (`stream_response` returns before the round loop is entered), so an approval wait never blocks streaming for the round.
+
+The constraint that shapes the whole design is the `select!` in `await_unless_aborted` (`agent.rs:513-545`), the only place that races the `user_rx` channel. An approval decision must ride back on that _same_ channel as a new `UserAction` variant, never a second channel the loop is not polling, or the loop blocks on a future nothing can complete. The catch-all arm at `agent.rs:531-540` currently logs and drops every action except Cancel / Quit / SubmitPrompt, so a naive decision would be swallowed; the await loop must match the new variant explicitly and continue looping the way `SubmitPrompt` buffers. Cancel must still return `TurnAbort::Cancelled` and Quit `TurnAbort::Quit` while a prompt is pending, and a queued `SubmitPrompt` must not be misread as a decision. The wait future must also be cancel-safe by drop, holding no half-recorded round, since `select!` drops it on a user action.
+
+### Config layering
+
+The config system has two wiring points, one per layer. The file layer (`config/file.rs`) adds an optional `permission` field on `FileConfig` (struct at `file.rs:22-25`) routed through `FileConfig::merge` (`file.rs:72-79`) via the `merge_section` helper (`file.rs:144-149`). The append-versus-replace decision for rule lists lives in a new `PermissionFileConfig::merge`, modeled on `ThemeFileConfig::merge` (`file.rs:127-142`), the one existing list merge that appends rather than replaces. Resolution happens in `Config::load` (`config.rs:368-485`) after the theme / compaction blocks (`config.rs:457-466`), reading `fc.permission`, applying an `OX_PERMISSION_MODE` env override with the `env::string(...).map(parse).or(file_value)` pattern that `effort` uses (`config.rs:404-408`), and adding the resolved field to `Config` (`config.rs:339-361`), `ConfigSnapshot` (`config.rs:51-69`), and `Config::snapshot` (`config.rs:488-504`). The `/config` display appends rows in `slash/config.rs:48-77`.
+
+The load-bearing constraint is trust. Project `ox.toml` is checked-in and untrusted, so a permission allowlist set there is a privilege-escalation vector exactly like the secrets that `reject_project_secrets` (`file.rs:180-204`) already blocks. Whether mode and rules are user / env-only or project-settable is an explicit decision: env > project > user > default precedence must hold (`config.rs:1-4`), `env::string` treats empty as absent so `OX_PERMISSION_MODE=""` falls through (`config.rs:382`), parse errors must propagate rather than defaulting permissive (`config.rs:364-367`), and `deny_unknown_fields` belongs on every new file struct (`file.rs:21,28,44,55,64`). Append-merge means a project can only widen what the user allowed, never silently shrink it; revocation then flows through deny rules. The `Effort` enum (`config.rs:90-151`) is the cleanest leaf-enum template for a mode enum, with `ALL` / `as_str` / `Display` / `FromStr`, and the untagged `SlotPatch` (`theme/loader.rs:93-95`) is the precedent for a "bare string or table" rule form.
+
+### Modal and event round-trip
+
+The interactive prompt is a new event / action pair in `agent/event.rs` (`AgentEvent` at lines 37-104, `UserAction` at 110-146): an `ApprovalRequested` carrying the tool-use id plus a small `Clone` preview, and an `ApprovalDecision` carrying the id so a stale decision for a dropped call is ignored. The preview payload must stay `Clone` and small, since `AgentEvent` is sent over a bounded mpsc; `ToolResultView` and `DiffChunk` are already `Clone` (`tool.rs:99,140`). On the TUI side, `handle_agent_event` (`app.rs:424-523`) gains an arm that pushes an `ApprovalModal` onto the `ModalStack`. The decision return path already exists: `Submitted(ModalAction::User(ApprovalDecision))` flows through `apply_modal_action` (`app.rs:218-225`) → `dispatch_user_action` → `forward_to_agent` (`app.rs:277-292`) → `user_tx`.
+
+`ConfirmDeleteSessionModal` (`slash/confirm.rs`) is the structural template, the existing destructive-action gate with key handling and a sticky-error field. The edit preview can be built without running the tool via `edit::synthesize_chunk(old, new)` (`edit.rs:390`, already `pub(crate)`, no file read) rendered by `diff::render` (`tui/components/chat/blocks/tool/diff.rs:15`), the same `+` / `-` gutter the chat uses; a write preview is an all-add diff from the `content` arg; bash shows the command string. The trap to guard is `clear_modals` on session swap (`app.rs:233-236`), which would silently drop a pending approval and strand the agent. Esc must return an explicit `Deny` rather than the universal-cancel `ModalAction::None`, since the blocked agent never observes `None`.
+
+### Session-scoped state
+
+Session-scoped "allow for this session" memory mirrors `FileTracker` (`file_tracker.rs`): a live in-memory map queried by the gate the same place `verify_current_content` is consulted today. The FileTracker round-trip is the template if resume survival is ever wanted, snapshotting on finish (`state.rs:181-206`) and rehydrating on resume (`handle.rs:303,381`) through the session actor (`actor.rs:135` `absorb`, new `SessionCmd` arm modeled on `ToolMetadata`). The reason to start process-local is that a session approval has no disk ground-truth to re-validate against the way a `FileSnapshot` rehashes (`file_tracker.rs:286`), so blindly re-admitting a persisted "allow bash X" on resume is a trust regression. The roadmap stance is binding here: session commands should feel reversible, and cross-session writes require an explicit user action (`docs/roadmap.md:89`, `docs/design/slash/commands.md:32`). A persisted project allowlist therefore stays out of the config merge and out of the session actor, landing later as an explicit confirmed writer using `util/fs.rs::atomic_write_private`, never a side effect of clicking "allow".
+
+### Tool risk classes
+
+Per-instance metadata lives on the `Tool` trait rather than a `match name` table (`tool.rs:163-166`, CLAUDE.md "Trait Design"), so a risk descriptor becomes a new trait method on each tool. The six tools fall into three classes. Read-only (`read`, `glob`, `grep`) never mutate and auto-allow; note `read` can open any absolute path, so confidentiality scoping is a separate future concern. Edit-class spans `edit` and `write`, with `write` strictly more dangerous: `edit` requires the file to exist and to have been read, so it cannot create files (`tool/edit.rs:189-191`), while `write` creates arbitrary files, calls `create_dir_all` for arbitrary parents, and bypasses the gate entirely for a new file (`tool/write.rs:115-125`). Execute (`bash`) is the unbounded surface, an opaque `bash -c` string with no tokenization (`tool/bash.rs:104-107`); the timeout and process-group machinery bound runtime rather than authority.
+
+Two facts constrain bash matching specifically. The command is unparsed, so prefix or substring matching is best-effort UX and never a security boundary: `ls; rm -rf` and `$()` indirection defeat naive matching, which is why Claude Code's compound-split and env-strip discipline (`shellRuleMatching.ts`) is the real cost of treating bash rules as a boundary. The deny list is the dependable lever, conservatively matching the raw command string. Path-based keys for edit / write need canonicalization before any inside-cwd predicate, since `..` and symlinks are resolved nowhere in `edit.rs` / `write.rs` today, so a raw-string cwd check is bypassable.
+
+## Implications for oxide-code
+
+The prior art points at a small, sandbox-free core: a sync, pure rule engine producing `allow | ask | deny` with fixed precedence (deny first, before any mode auto-allow), a mode enum shaped like `Effort`, and Claude Code's `ToolName(specifier)` string grammar for interoperable muscle memory. Read-only tools auto-allow, `edit` / `write` and `bash` gate by mode, and only the "ask" resolution awaits the modal. Session-scoped "allow always" stays in memory, mirroring FileTracker, and persisted project allowlists wait for an explicit confirmed action per the roadmap stance. The metadata-protection tier that both Claude Code and Codex apply to `.git/` and their own config directories is a cheap, pure path check worth carrying early.
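
The core described above is small enough to sketch. The following Rust is an illustration under stated assumptions, not a proposed oxide-code API: every name (`Behavior`, `Mode`, `Risk`, `Rule`, `Policy`, `parse_rule`, `decide`) is hypothetical, the mode set is a placeholder for a decision the design doc still has to make, specifier matching is exact only (no prefix or glob forms), escapes in rule strings are ignored, and the protected-path check works on the raw path without the canonicalization a real gate would need.

```rust
use std::ffi::OsStr;
use std::path::Path;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Behavior { Allow, Ask, Deny }

/// Placeholder mode set; the real set is an open design question.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Mode { Default, AcceptEdits, Bypass }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Risk { ReadOnly, Edit, Execute }

/// A parsed `ToolName` or `ToolName(specifier)` rule.
#[derive(Debug, PartialEq, Eq)]
struct Rule { tool: String, specifier: Option<String> }

/// Parse a rule string; `Tool`, `Tool()`, and `Tool(*)` are all tool-wide.
fn parse_rule(s: &str) -> Option<Rule> {
    let s = s.trim();
    let Some(open) = s.find('(') else {
        return (!s.is_empty()).then(|| Rule { tool: s.to_string(), specifier: None });
    };
    let inner = s[open + 1..].strip_suffix(')')?;
    let tool = &s[..open];
    if tool.is_empty() {
        return None;
    }
    let specifier = match inner {
        "" | "*" => None,
        other => Some(other.to_string()),
    };
    Some(Rule { tool: tool.to_string(), specifier })
}

struct Policy { mode: Mode, allow: Vec<Rule>, ask: Vec<Rule>, deny: Vec<Rule> }

/// Exact-match only: a tool-wide rule matches any target.
fn matches(rules: &[Rule], tool: &str, target: &str) -> bool {
    rules
        .iter()
        .any(|r| r.tool == tool && r.specifier.as_deref().map_or(true, |s| s == target))
}

/// Metadata tier: any `.git` component (raw path, not canonicalized).
fn is_protected(target: &str) -> bool {
    Path::new(target).components().any(|c| c.as_os_str() == OsStr::new(".git"))
}

/// Fixed precedence: deny, then ask, then the protected-path check,
/// all before any mode-based auto-allow.
fn decide(p: &Policy, tool: &str, risk: Risk, target: &str) -> Behavior {
    if matches(&p.deny, tool, target) {
        return Behavior::Deny;
    }
    if matches(&p.ask, tool, target) || (risk == Risk::Edit && is_protected(target)) {
        return Behavior::Ask;
    }
    if risk == Risk::ReadOnly {
        return Behavior::Allow;
    }
    match (p.mode, risk) {
        (Mode::Bypass, _) | (Mode::AcceptEdits, Risk::Edit) => Behavior::Allow,
        _ if matches(&p.allow, tool, target) => Behavior::Allow,
        _ => Behavior::Ask,
    }
}
```

Because `decide` is synchronous and pure, it can be unit-tested without the agent loop; only the `Ask` result needs the modal round-trip. Note that the protected-path branch returns `Ask` even under the placeholder `Bypass` mode, mirroring the bypass-immune safety checks in the prior art.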
+
+What deliberately stays open here, because it is design rather than research, is the chosen mode set and its defaults (whether the default flips today's unchecked behavior), the project-rule trust decision, the write-versus-edit blast-radius split, the headless / `-p` auto-allow-versus-auto-deny policy, and the exact control-flow plumbing for the approval wait. Those land in a design doc (`docs/design/tools/permissions.md`) at implementation time, which will ratify the model and phasing this survey only sketches.