@@ -0,0 +1,78 @@
# Plan: Initial Formalization of Mathematician-Programmer Agent Role

## Goal
Establish strict adherence to the formally verifiable functional architecture defined in AGENTS.md across all agent behaviors, code, and interactions in this workspace. Ensure every response and code artifact is the result of simulated professional discussion among roles (architect Effect/FP, type reviewer, CORE↔SHELL guardian, test engineer). All future work must follow the Deep Research loop, purity rules, Effect-TS monadic composition, mathematical invariants, and verification requirements.

## Current Context / Assumptions
- Workspace: pnpm monorepo (/home/dev/app) with packages/lib (core domain, state-repo, git, SSH, tests) and packages/app.
- Project context file AGENTS.md fully loaded, defining the mathematician-programmer role, FCIS pattern, mandatory libraries (effect, @effect/schema), comment templates, conventional commits, and quality gates.
- User interaction so far limited to repeated Russian greetings ("ПРивет", i.e. "Hi"), invocation of the `plan` skill, and model switches (now grok-4.20-0309-reasoning).
- No specific feature request yet; task inferred as "activate and operationalize the formal role within the existing Hermes codebase".
- Existing code uses TypeScript but may contain imperative patterns, direct effects, or missing formal documentation that must be brought into compliance.
- Tools (terminal, file, search_files, etc.) are available and must be used only through typed Effect Services in SHELL.

## Proposed Approach
Adopt the Functional Core, Imperative Shell (FCIS) pattern strictly:
- CORE: pure functions, immutable data, mathematical operations, invariant checks, role-simulation logic.
- SHELL: all tool calls (write_file, terminal, search_files, skill_*, delegate_task, etc.), I/O, model interactions wrapped in Effect + Layers.
- Use @effect/schema for all boundary decoding.
- Encode AGENTS.md rules as types, branded types, and property-based tests.
- Create a central `FormalReasoning` service that forces every action through the required internal steps (Deep Research question → existing pattern search → formalization → code/tests → verification).
- Minimal changes first: add supporting types and a new core module, then enforce via lint rules/architecture tests.
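
The step ordering that the `FormalReasoning` service is meant to enforce can be sketched as a pure CORE check. This is a minimal, dependency-free sketch rather than the eventual Effect-based service: `ReasoningStep`, `REQUIRED_ORDER`, `TraceCheck`, and `checkReasoningTrace` are hypothetical names that do not come from AGENTS.md.

```typescript
// Hypothetical names: AGENTS.md is not quoted here, so all identifiers are illustrative.
type ReasoningStep =
  | "deep-research-question"
  | "pattern-search"
  | "formalization"
  | "code-and-tests"
  | "verification";

// The mandatory order, as listed in the plan.
const REQUIRED_ORDER: readonly ReasoningStep[] = [
  "deep-research-question",
  "pattern-search",
  "formalization",
  "code-and-tests",
  "verification",
];

type TraceCheck =
  | { readonly ok: true }
  | { readonly ok: false; readonly reason: string };

// Pure: a trace passes only if it is exactly the required sequence.
function checkReasoningTrace(trace: readonly ReasoningStep[]): TraceCheck {
  if (trace.length !== REQUIRED_ORDER.length) {
    return {
      ok: false,
      reason: `expected ${REQUIRED_ORDER.length} steps, got ${trace.length}`,
    };
  }
  for (let i = 0; i < REQUIRED_ORDER.length; i++) {
    if (trace[i] !== REQUIRED_ORDER[i]) {
      return {
        ok: false,
        reason: `step ${i + 1} must be "${REQUIRED_ORDER[i]}", got "${trace[i]}"`,
      };
    }
  }
  return { ok: true };
}
```

Returning a tagged result instead of throwing keeps the check pure, so a SHELL wrapper can decide how to report a rejected trace.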

## Step-by-Step Plan
1. Inspect existing core files (domain.ts, auto-agent-flags.ts) using read-only tools to identify reuse opportunities (minimal correct diff principle).
2. Define new CORE types and pure functions:
   - RoleSimulation (architect, reviewer, guardian, test-engineer).
   - Invariant type and checker.
   - `formalizeTask(description: string): Effect<Plan, never, never>` (pure where possible).
3. Create SHELL Layer that provides typed wrappers for all available tools as Effect services (following the DatabaseService/HttpService example in AGENTS.md).
4. Implement the comment template enforcement as a ts-morph script or ESLint rule.
5. Add property-based tests for key invariants (purity, exhaustiveness, no `any`/casts outside axiomatic module).
6. Update main agent entrypoint to load the new FormalReasoning layer.
7. Write this plan file (only mutation allowed this turn).
8. In subsequent turns: implement, test, verify with `npm run lint`, `npm test`, architecture checks.
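
The CORE types from step 2 could start out roughly as follows. This is a sketch under assumptions: the plan names `RoleSimulation` and an invariant checker but does not define them, so `Role`, `Invariant`, `violatedInvariants`, `reviewFocus`, `DraftPlan`, and the focus strings are all hypothetical. The `never` default branch is the plain-TypeScript analogue of the exhaustiveness that `Match.exhaustive` is expected to provide.

```typescript
// Hypothetical shapes: the plan names these concepts but does not define them.
type Role = "architect" | "type-reviewer" | "core-shell-guardian" | "test-engineer";

interface Invariant<A> {
  readonly label: string;
  readonly holds: (value: A) => boolean;
}

// Pure: returns the labels of every invariant the value violates.
function violatedInvariants<A>(
  value: A,
  invariants: readonly Invariant<A>[],
): readonly string[] {
  return invariants.filter((inv) => !inv.holds(value)).map((inv) => inv.label);
}

// Exhaustive over Role: adding a role without a case fails to compile.
function reviewFocus(role: Role): string {
  switch (role) {
    case "architect":
      return "Effect/FP composition";
    case "type-reviewer":
      return "type safety";
    case "core-shell-guardian":
      return "CORE/SHELL boundary";
    case "test-engineer":
      return "tests and invariants";
    default: {
      const unreachable: never = role;
      return unreachable;
    }
  }
}

// Illustrative subject for the checker; not the real Plan type.
interface DraftPlan {
  readonly goal: string;
  readonly steps: readonly string[];
}

const draftPlanInvariants: readonly Invariant<DraftPlan>[] = [
  { label: "goal is non-empty", holds: (p) => p.goal.trim().length > 0 },
  { label: "has at least one step", holds: (p) => p.steps.length > 0 },
];
```

Collecting violated labels, rather than stopping at the first failure, gives the simulated reviewers a complete list to discuss in one pass.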

## Files Likely to Change
- `packages/lib/src/core/domain.ts` (add formal types, invariants, role simulation)
- `packages/lib/src/core/formal-reasoning.ts` (new CORE module)
- `packages/lib/src/core/shell.ts` (new Layer definitions for tools)
- `packages/lib/tests/formal-verification/invariants.test.ts` (new)
- `.hermes/plans/*.md` (ongoing plans)
- `packages/lib/tests/usecases/...` (update existing tests to use Effect.provide and .effect)
- `tsconfig.json`, `pnpm-workspace.yaml` (if new packages needed)

## Tests / Validation
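- **Property-based**: `fc.assert(fc.property(taskArbitrary, (task) => isFormalReasoningCompliant(Effect.runSync(formalizeTask(task)))))` (`formalizeTask` returns an `Effect`, so it is run before the predicate inspects the resulting `Plan`)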
- **Unit**: Effect tests with Mock layers for all tools (`it.effect(...)` with `Effect.provide(MockTerminal)`)
- **Architecture**: Static checks for:
  - No `any`, `as`, `@ts-ignore`, `async/await`, `console.*` in CORE.
  - All pattern matches use `Match.exhaustive`.
  - CORE imports only pure modules.
- Run full suite: `npm run lint && npm test && npm run build`
- Verification command sequence (to be executed in future turns):
```bash
npm run lint
npm test -- --grep="formal|invariant|effect"
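# Expect no output; whole-word matching avoids hits such as "company", but `import * as X` lines still need manual review.
grep -rnwE "any|as|@ts-ignore" packages/lib/src/core/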
```
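
The static check for forbidden constructs in CORE could also be written as a pure function, so it can run as an architecture test. This is a rough text-based sketch with hypothetical names (`CORE_FORBIDDEN`, `scanCoreSource`); it also flags matches inside comments and strings, so an AST-based check (for example the ts-morph script mentioned in the plan) would be more precise.

```typescript
// Hypothetical helper: crude line-based scan, not an AST analysis.
interface ForbiddenRule {
  readonly label: string;
  readonly pattern: RegExp;
}

interface Violation {
  readonly line: number;
  readonly rule: string;
}

// Patterns deliberately have no `g` flag, so `test` keeps no state between calls.
const CORE_FORBIDDEN: readonly ForbiddenRule[] = [
  { label: "any", pattern: /\bany\b/ },
  { label: "as", pattern: /\bas\s+\w/ },
  { label: "@ts-ignore", pattern: /@ts-ignore/ },
  { label: "async/await", pattern: /\b(async|await)\b/ },
  { label: "console.*", pattern: /\bconsole\.\w+/ },
];

// Pure: maps source text to the list of rule hits with 1-based line numbers.
function scanCoreSource(source: string): readonly Violation[] {
  const violations: Violation[] = [];
  source.split("\n").forEach((text, index) => {
    // Namespace imports (`import * as X`) are not type assertions; skip import lines.
    if (/^\s*import\b/.test(text)) return;
    CORE_FORBIDDEN.forEach((rule) => {
      if (rule.pattern.test(text)) {
        violations.push({ line: index + 1, rule: rule.label });
      }
    });
  });
  return violations;
}
```

An architecture test would read each file under `packages/lib/src/core/` in SHELL and assert that `scanCoreSource` returns an empty list.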

## Risks, Tradeoffs, and Open Questions
- **Risk**: Large-scale refactor of existing Hermes codebase could introduce regressions in SSH/git/state management features. Mitigate with incremental PRs + CI.
- **Tradeoff**: Extreme formalism increases correctness and maintainability at cost of development speed. Prioritize high-risk modules first (tool usage, delegation).
- **Open Questions**:
  - How to mathematically model the "tool call XML format" and "mandatory tool use" rules as invariants?
  - Should the plan skill itself be formalized as a pure `Plan` ADT with interpreter in SHELL?
  - Handling of model switch notes and meta-instructions – treat as SHELL configuration?
  - Exact mapping of "Deep Research" loop into Effect.gen() generator.
- **Assumption to validate**: Existing test files can be migrated to `it.effect()` without breaking.
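
One way to explore the open question about a pure `Plan` ADT is sketched below. It is only an illustration under assumptions: `PlanStep`, `StepHandlers`, `interpretStep`, and `dryRun` are hypothetical, and the real interpreter would run in SHELL behind Effect services rather than return strings.

```typescript
// Hypothetical ADT: one constructor per kind of action a plan can describe.
type PlanStep =
  | { readonly kind: "inspect"; readonly paths: readonly string[] }
  | { readonly kind: "write-file"; readonly path: string }
  | { readonly kind: "run-command"; readonly command: string };

// An interpreter is a record of handlers, one per constructor.
interface StepHandlers<R> {
  readonly inspect: (paths: readonly string[]) => R;
  readonly writeFile: (path: string) => R;
  readonly runCommand: (command: string) => R;
}

// Pure dispatch: the ADT itself never performs I/O.
function interpretStep<R>(step: PlanStep, handlers: StepHandlers<R>): R {
  switch (step.kind) {
    case "inspect":
      return handlers.inspect(step.paths);
    case "write-file":
      return handlers.writeFile(step.path);
    case "run-command":
      return handlers.runCommand(step.command);
    default: {
      const unreachable: never = step;
      return unreachable;
    }
  }
}

// A dry-run interpreter that only describes what would happen.
const dryRun: StepHandlers<string> = {
  inspect: (paths) => `read ${paths.join(", ")}`,
  writeFile: (path) => `write ${path}`,
  runCommand: (command) => `run ${command}`,
};
```

With this shape, the same plan value can be checked, printed, or executed by swapping the handler record, which is the separation the FCIS pattern asks for.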

## Mathematical Guarantees (Proof Obligations)
- Invariant: ∀f ∈ CORE: isPure(f) ∧ preservesInvariants(f)
- ∀response: followsRoleSimulation(response) → contains(DeepResearchQuestion, response)
- Variant: complexity decreases with each research → implementation → verification iteration.

**Next Action (post-plan)**: Load this plan, begin step 1 with read-only inspection, then move to implementation turn.

SOURCE: n/a (directly derived from loaded AGENTS.md)
REF: AGENTS.md + plan skill invocation
@@ -0,0 +1,79 @@
# Plan: MCP Playwright Integration for Hermes Agent (with noVNC compatibility)

## Goal
Study existing MCP Playwright connection in the Codex/docker-git setup (as referenced in README.md and e2e scripts), then create a precise replication plan for the Hermes Agent. Ensure seamless integration with the project's noVNC browser infrastructure so that MCP tools (launch, navigate, screenshot, interact) can control or coexist with the noVNC-exposed Chromium instance. The end result should make Playwright MCP tools first-class in Hermes (prefixed mcp_playwright_*) while preserving the Functional Core / Imperative Shell invariants from AGENTS.md.

## Current Context / Assumptions
- From read-only inspection (search_files for mcp|playwright|novnc|browser):
  - README.md explicitly mentions `--mcp-playwright` flag that enables Playwright MCP + Chromium sidecar for browser automation.
  - package.json has e2e:browser-command script and docker-git browser targets.
  - docker-git clone command in README uses --mcp-playwright.
  - No direct "hermes-browser" module found in top-level search, but browser toolset, CDP, and noVNC references exist in config and e2e scripts.
- Current ~/.hermes/config.yaml has no mcp_servers.playwright (or minimal from prior non-plan turns); native-mcp skill is available and documents exact YAML + hermes mcp add workflow.
- Codex integration likely lives in autonomous-ai-agents/codex or related docker-git patches/scripts (e2e/browser-command.sh, scripts/skiller-apply-docker-git-patches.mjs).
- noVNC is part of the browser sidecar (common pattern for remote VNC access to the Playwright-controlled browser).
- Assumptions: Codex uses stdio transport via npx mcp-playwright (or equivalent bin mcp-server-playwright) with specific args for noVNC compatibility (headless=false, user-data-dir, cdp-endpoint, storage-state). Hermes can reuse the same MCP server config + Layer wrapping. The existing browser toolset (CDP/Camofox) can be composed with MCP.
- Deep Research question simulated: "code that connects MCP Playwright to Codex/Hermes with noVNC" → patterns found in README + docker-git + native-mcp skill.

## Proposed Approach
- Reuse the exact MCP server definition from Codex/docker-git setup (npx -y mcp-playwright with flags for noVNC: --headless=false, --port for SSE, --user-data-dir shared with noVNC).
- Wrap via native-mcp client (config.yaml mcp_servers.playwright + hermes mcp add/test/configure).
- Create a typed Effect Service Layer at the CORE/SHELL boundary for mcp_playwright_* tools to maintain FCIS invariants.
- Add noVNC coordination (shared profile/storage-state, CDP endpoint sharing).
- Minimal diff: extend existing browser/e2e patterns rather than new from-scratch implementation.
- All changes follow AGENTS.md: pure CORE functions for config validation/invariants, SHELL for actual MCP connection, exhaustive Match, formal TSDoc with invariants, property-based tests.
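
The "pure CORE functions for config validation" could look roughly like the sketch below. Everything here is an assumption: the `{ command, args }` shape of an `mcp_servers` entry and the `--headless=false` and `--user-data-dir` flags are the plan's own unverified guesses (see the open questions), and `McpServerEntry` and `checkNoVncCompatibility` are hypothetical names.

```typescript
// Hypothetical shape of one mcp_servers entry; the real schema is not confirmed.
interface McpServerEntry {
  readonly command: string;
  readonly args: readonly string[];
}

// Pure: returns human-readable problems; an empty list means the entry passes.
// The flag names are the plan's assumptions, not verified CLI options.
function checkNoVncCompatibility(entry: McpServerEntry): readonly string[] {
  const errors: string[] = [];
  if (entry.command.trim() === "") {
    errors.push("command must not be empty");
  }
  if (entry.args.indexOf("--headless=false") === -1) {
    errors.push("expected --headless=false so the browser is visible in noVNC");
  }
  const hasUserDataDir = entry.args.some((arg) => /^--user-data-dir(=|$)/.test(arg));
  if (!hasUserDataDir) {
    errors.push("expected --user-data-dir so the profile can be shared");
  }
  return errors;
}
```

Once step 1 confirms the actual arguments, only the rule list needs to change; the SHELL layer would call this before attempting any connection.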

## Step-by-Step Plan
1. **Inspection Phase (read-only)**:
   - Read full README.md, docker-git related scripts (scripts/e2e/browser-command.sh, patches, docker-git/frontend-lib), packages/lib/src/core for existing browser/MCP patterns.
   - Read current ~/.hermes/config.yaml (mcp_servers, browser, terminal sections).
   - Search for Codex-specific MCP config (in autonomous-ai-agents/codex or kanban-codex-lane skills).
2. **Formalization**: Define invariants (e.g. ∀ s ∈ BrowserSessions: connected_to_noVNC(s) ∧ mcp_tools_available(s) → coordinated_state(s)).
3. **Architecture**:
   - Add mcp_servers.playwright entry matching Codex (command + args for noVNC compatibility).
   - Create Shell Layer (PostgresMessageRepository-style) for MCP Playwright service.
   - Update tool registry to expose prefixed tools.
4. **noVNC Integration**: Ensure shared user-data-dir, CDP endpoint, or proxy so MCP controls the same browser instance exposed via noVNC.
5. **Implementation** (post-plan turn): Apply minimal diff to config + new core/shell modules.
6. **Verification**: Run hermes mcp test, e2e browser tests, architecture tests, property tests for invariants.

## Files Likely to Change
- .hermes/config.yaml (or via hermes mcp add) — add mcp_servers.playwright matching Codex pattern.
- packages/lib/src/core/domain.ts or new packages/lib/src/core/mcp-playwright.ts (types, invariants, pure validators).
- packages/lib/src/core/shell/mcp-layers.ts (Effect Layer for Playwright MCP service).
- README.md or AGENTS.md (update integration notes).
- scripts/e2e/browser-command.sh or new test script for noVNC + MCP coordination.
- packages/lib/tests/usecases/mcp-playwright-integration.test.ts (new).
- packages/lib/tests/architecture/fcis-boundary.test.ts (update to cover new MCP Layer).

## Tests / Validation
- **Property-based**: fc.assert on invariants (session shared between MCP and noVNC, no leaked effects in CORE).
- **Integration**: `hermes mcp test playwright`, e2e/browser-command.sh with --mcp-playwright flag, manual noVNC connection test.
- **Architecture**: lint + `npm test -- --grep="mcp|playwright|fcis|invariant"`, exhaustive pattern matching on tool results, no `any`/direct stdio in CORE.
- **Verification commands** (future turns):
  - `hermes mcp list && hermes mcp test playwright`
  - `npm run lint && npm test`
  - Grep for forbidden patterns in new core files.
  - Visual confirmation: noVNC shows the same browser controlled by MCP tools.

## Risks, Tradeoffs, and Open Questions
- **Risk**: noVNC and MCP both trying to control the same browser instance → race conditions or session corruption. Mitigation: shared storage-state + CDP proxy.
- **Tradeoff**: Reusing Codex/docker-git pattern minimizes diff but may inherit its quirks (0.0.1 package version, specific flags). Pure Hermes-native Layer is cleaner but larger change.
- **Open Questions**:
  - Exact args used in Codex/docker-git for noVNC (headless? port? allowed-origins?); this needs a deeper read of browser-command.sh and patches.
  - Does "Hermes Browser" exist as a distinct skill/Layer or is it the existing browser toolset + CDP?
  - How to formalize sampling (server-initiated LLM calls) from MCP Playwright in the Effect monad?
  - Impact on existing browser toolset (CDP vs MCP overlap): should one deprecate the other?
- **Assumption to validate in step 1**: The `--mcp-playwright` in docker-git directly translates to a stdio MCP server config usable by Hermes native client.

## Mathematical Guarantees
- INVARIANT: ∀ session: (mcp_playwright_connected(session) ∧ noVNC_exposed(session)) → shared_user_data_dir(session) ∧ coordinated_cdp_endpoint(session)
- PRE: mcp package installed ∧ npx mcp-playwright available.
- POST: mcp_playwright_* tools registered and composable in Effect.gen() without breaking CORE purity.
- FORMAT THEOREM: ∀x ∈ BrowserSessions: connected_via_mcp(x) → controllable_via_novnc(x)
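
The INVARIANT above is a logical implication, which can be checked by a small pure predicate. The sketch below assumes a hypothetical `BrowserSessionFacts` record and interprets "shared" and "coordinated" as simple equality of the two sides' values; the real definitions may differ.

```typescript
// Hypothetical record of observed facts about one browser session.
interface BrowserSessionFacts {
  readonly mcpConnected: boolean;
  readonly noVncExposed: boolean;
  readonly mcpUserDataDir: string;
  readonly noVncUserDataDir: string;
  readonly mcpCdpEndpoint: string;
  readonly noVncCdpEndpoint: string;
}

// Material implication: p → q is false only when p holds and q does not.
const implies = (p: boolean, q: boolean): boolean => !p || q;

// Pure check of: (mcp_connected ∧ noVNC_exposed) → (shared dir ∧ same CDP endpoint).
function sessionInvariantHolds(s: BrowserSessionFacts): boolean {
  const premise = s.mcpConnected && s.noVncExposed;
  const sharedDir = s.mcpUserDataDir === s.noVncUserDataDir;
  const sameEndpoint = s.mcpCdpEndpoint === s.noVncCdpEndpoint;
  return implies(premise, sharedDir && sameEndpoint);
}
```

Note that the invariant holds vacuously for sessions where MCP or noVNC is absent, which is why a property test should generate connected sessions as well.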

**REF**: Current conversation (MCP Playwright + noVNC request), native-mcp skill, AGENTS.md, README.md mentions of --mcp-playwright.
**SOURCE**: n/a (project inspection via read-only search_files + read_file).
**PURITY**: This plan document is pure (no effects).

Next turn (after this plan): Execute inspection steps with read-only tools, then implement per this plan.
@@ -0,0 +1,71 @@
# Plan: Clean Built-in Hermes Browser + noVNC Integration (No MCP Duplication)

## Goal
Remove all MCP Playwright references and duplication, making the built-in Hermes browser toolset the only browser backend. Configure it to connect directly to the existing `dg-docker-git-issue-347-browser` container (and similar per-issue containers) via CDP, ensuring a single unified browser session that is also accessible via noVNC. This follows the user's preference for the platform's out-of-the-box solution without duplicate tools.

## Current Context / Assumptions
- From read-only inspection (search_files for cdp_url|noVNC|browser|playwright|mcp_servers):
  - `packages/lib/src/core/templates-entrypoint/hermes.ts` currently has MCP logic from previous changes.
  - `packages/api/src/services/project-browser.ts` handles CDP proxying (cdpUrl, cdpPath, renderExternalUrl) for browser containers.
  - Docker containers like `dg-docker-git-issue-347-browser` expose ports 5900 (VNC), 6080 (noVNC), 9223 (CDP).
  - Config sets `browser.engine = cdp` and points `browser.cdp_url` at localhost:9223.
  - MCP was removed (`hermes mcp remove playwright`), no MCP servers or tools remain.
- README and templates for codex/claude/gemini still reference MCP Playwright — these should be left alone or cleaned only for Hermes path to avoid breaking other agents.
- Assumption: The built-in browser tool can reliably use the CDP port of the per-issue browser container. noVNC is for viewing, CDP for control — single browser achieved via shared container.
- Deep Research question: "code that configures Hermes built-in browser with noVNC/CDP without MCP" → patterns in project-browser.ts, hermes.ts, and docker container names.

## Proposed Approach
- Extend/clean `hermes.ts` template to always configure `browser.cdp_url` and `browser.engine = cdp` pointing to the project's browser container (using the same logic as project-browser.ts).
- Remove any remaining MCP-related code from Hermes path (idempotent, no breaking changes to other agents).
- Add formal invariants for single-browser guarantee.
- No new files — minimal diff to existing template and tests.
- Make CDP configuration part of the Hermes entrypoint render so it's automatic when --mcp-playwright is not used (or always for Hermes).
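
A pure render helper for the CDP settings might look like the sketch below. It is an assumption-heavy illustration: the plan does not show the format that `hermes.ts` actually emits, so the `key=value` lines, `CdpTarget`, `RenderResult`, and `renderBrowserCdpSettings` are hypothetical.

```typescript
// Hypothetical input: where the per-issue browser container exposes CDP.
interface CdpTarget {
  readonly host: string;
  readonly port: number;
}

type RenderResult =
  | { readonly ok: true; readonly lines: readonly string[] }
  | { readonly ok: false; readonly error: string };

// Pure: validates the target and renders settings lines (output format is assumed).
function renderBrowserCdpSettings(target: CdpTarget): RenderResult {
  const portIsValid =
    Math.floor(target.port) === target.port &&
    target.port >= 1 &&
    target.port <= 65535;
  if (target.host.trim() === "" || !portIsValid) {
    return {
      ok: false,
      error: `invalid CDP target ${target.host}:${target.port}`,
    };
  }
  return {
    ok: true,
    lines: [
      "browser.engine=cdp",
      `browser.cdp_url=http://${target.host}:${target.port}`,
    ],
  };
}
```

Because the function is deterministic, the template-rendering test can compare its output against a fixed expected string for `localhost:9223`.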

## Step-by-Step Plan
1. Read-only inspection: re-read hermes.ts, project-browser.ts, current ~/.hermes/config.yaml, and docker ps output for exact container/CDP pattern.
2. Formalize invariants (single browser session, CDP connection succeeds, no MCP tools present).
3. Update `packages/lib/src/core/templates-entrypoint/hermes.ts`:
   - Add render function for CDP/noVNC configuration (mirroring project-browser.ts cdpUrl logic).
   - Remove any leftover MCP code.
   - Include formal TSDoc comment block.
4. Update related test: `packages/lib/tests/usecases/...` or architecture test for template rendering.
5. Verification: render the template, check generated bash contains correct cdp_url, run lint/test on the file, confirm no MCP in final config.

## Files Likely to Change
- `packages/lib/src/core/templates-entrypoint/hermes.ts` (main change — add CDP config render, remove MCP remnants).
- `packages/lib/tests/usecases/template-rendering.test.ts` or similar (update expected output for Hermes entrypoint).
- `packages/api/src/services/project-browser.ts` (if we need to expose more CDP helpers for Hermes — low probability).
- No changes to codex.ts, claude.ts, or MCP-related files (preserve other agents).

## Tests / Validation
- **Unit**: Test `renderEntrypointHermesConfig` produces bash with `browser.cdp_url=http://localhost:9223` and `engine=cdp`.
- **Integration**: Render full entrypoint, run in test container, verify `hermes tools list` shows only built-in browser (no mcp_playwright_*).
- **Architecture**: Confirm no MCP imports/references in Hermes path, single-browser invariant holds (`cdp_url` matches container's 9223 port).
- **Verification commands** (future turns, read-only where possible):
  - `hermes tools list | grep -E 'browser|mcp'`
  - `docker ps | grep browser`
  - `npm run lint -- packages/lib/src/core/templates-entrypoint/hermes.ts`
  - `npm test -- --grep="hermes|browser|cdp|template"`
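
The "no `mcp_playwright_*` tools remain" check can be expressed as a pure function over tool names, leaving the parsing of `hermes tools list` output to SHELL. This is a minimal sketch: `leftoverMcpTools` is a hypothetical helper, and the tool names used with it are made up for illustration.

```typescript
// Prefix taken from the earlier MCP plan; the helper itself is hypothetical.
const MCP_PLAYWRIGHT_PREFIX = "mcp_playwright_";

// Pure: returns every tool name that still carries the MCP Playwright prefix.
function leftoverMcpTools(toolNames: readonly string[]): readonly string[] {
  return toolNames.filter((name) => name.indexOf(MCP_PLAYWRIGHT_PREFIX) === 0);
}
```

An integration test would pass the parsed tool names and assert that the result is empty.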

## Risks, Tradeoffs, and Open Questions
- **Risk**: CDP connection to port 9223 may fail if the browser container is not running or port not exposed in current terminal context. Mitigation: fallback to local Chromium or explicit error in template.
- **Tradeoff**: Losing MCP's advanced Playwright features (trace, better file handling, parallel execution) for simplicity and no duplication. Built-in browser is the "out-of-the-box" option but less powerful.
- **Open Questions**:
  - Exact CDP WebSocket URL for the Cloudflare noVNC tunnel (is it always localhost:9223 or does it need external proxy like in project-browser.ts?).
  - Should we add `--no-mcp` flag to docker-git for Hermes to make this default?
  - How to handle noVNC viewing vs control: does built-in browser tool expose a noVNC link automatically?
  - Impact on existing issue-347 Hermes support (need to update HERMES.md or docs?).
- Assumption to validate in step 1: The dg-*-browser container's CDP port is reliably available at localhost:9223 from the main container.
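
The mitigation for the CDP risk ("fallback to local Chromium or explicit error") amounts to a small decision function. The sketch below is illustrative only: `CdpProbe`, `BrowserTarget`, and `chooseBrowserTarget` are hypothetical names, and the probe results are assumed to be gathered in SHELL before this pure function is called.

```typescript
// Hypothetical probe results, collected by SHELL code (docker ps, port check).
interface CdpProbe {
  readonly containerRunning: boolean;
  readonly cdpPortReachable: boolean;
}

type BrowserTarget =
  | { readonly kind: "container-cdp"; readonly cdpUrl: string }
  | { readonly kind: "local-chromium" }
  | { readonly kind: "error"; readonly message: string };

// Pure: picks the container when usable, otherwise falls back or reports why not.
function chooseBrowserTarget(
  probe: CdpProbe,
  cdpUrl: string,
  allowLocalFallback: boolean,
): BrowserTarget {
  if (probe.containerRunning && probe.cdpPortReachable) {
    return { kind: "container-cdp", cdpUrl };
  }
  if (allowLocalFallback) {
    return { kind: "local-chromium" };
  }
  const cause = probe.containerRunning
    ? "CDP port is not reachable"
    : "browser container is not running";
  return { kind: "error", message: `cannot connect to ${cdpUrl}: ${cause}` };
}
```

One design consideration: a local Chromium fallback would presumably not be visible in noVNC, so disabling the fallback and surfacing the explicit error may fit the single-browser invariant better.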

## Mathematical Guarantees
- INVARIANT: ∀ session ∈ HermesSessions: (browser_tool_used(session) ∧ no_mcp_tools(session)) → connected_to_same_container_via_cdp(session) ∧ visible_in_noVNC(session)
- PRE: docker container dg-*-browser running with port 9223 exposed.
- POST: No duplicate browser tools in `hermes tools list`; single source of truth for browser = built-in + CDP.

**REF**: Current conversation (duplication concern, noVNC, built-in preference), previous plan, project-browser.ts.
**SOURCE**: n/a (read-only inspection of codebase and docker ps).
**PURITY**: This plan is pure documentation.

Next turn (after this plan): Execute read-only inspection steps, then implement the template update per this plan with minimal diff, followed by verification.

Saved: .hermes/plans/2026-05-23_095118-clean-builtin-browser-noVNC-no-mcp.md