|
| 1 | +# Plan: Initial Formalization of Mathematician-Programmer Agent Role |
| 2 | + |
| 3 | +## Goal |
| 4 | +Establish strict adherence to the formally verifiable functional architecture defined in AGENTS.md across all agent behaviors, code, and interactions in this workspace. Ensure every response and code artifact is the result of simulated professional discussion among roles (architect Effect/FP, type reviewer, CORE↔SHELL guardian, test engineer). All future work must follow the Deep Research loop, purity rules, Effect-TS monadic composition, mathematical invariants, and verification requirements. |
| 5 | + |
| 6 | +## Current Context / Assumptions |
| 7 | +- Workspace: pnpm monorepo (/home/dev/app) with packages/lib (core domain, state-repo, git, SSH, tests) and packages/app. |
| 8 | +- Project context file AGENTS.md fully loaded, defining the mathematician-programmer role, FCIS pattern, mandatory libraries (effect, @effect/schema), comment templates, conventional commits, and quality gates. |
| 9 | +- User interaction so far limited to repeated Russian greetings ("ПРивет") and invocation of the `plan` skill + model switches (now grok-4.20-0309-reasoning). |
| 10 | +- No specific feature request yet; task inferred as "activate and operationalize the formal role within the existing Hermes codebase". |
| 11 | +- Existing code uses TypeScript but may contain imperative patterns, direct effects, or missing formal documentation that must be brought into compliance. |
| 12 | +- Tools (terminal, file, search_files, etc.) available and must be used only through typed Effect Services in SHELL. |
| 13 | + |
| 14 | +## Proposed Approach |
| 15 | +Adopt the Functional Core, Imperative Shell (FCIS) pattern strictly: |
| 16 | +- CORE: pure functions, immutable data, mathematical operations, invariant checks, role-simulation logic. |
| 17 | +- SHELL: all tool calls (write_file, terminal, search_files, skill_*, delegate_task, etc.), I/O, model interactions wrapped in Effect + Layers. |
| 18 | +- Use @effect/schema for all boundary decoding. |
| 19 | +- Encode AGENTS.md rules as types, branded types, and property-based tests. |
| 20 | +- Create a central `FormalReasoning` service that forces every action through the required internal steps (Deep Research question → existing pattern search → formalization → code/tests → verification). |
| 21 | +- Minimal changes first: add supporting types and a new core module, then enforce via lint rules/architecture tests. |
| 22 | + |
| 23 | +## Step-by-Step Plan |
| 24 | +1. Inspect existing core files (domain.ts, auto-agent-flags.ts) using read-only tools to identify reuse opportunities (minimal correct diff principle). |
| 25 | +2. Define new CORE types and pure functions: |
| 26 | + - RoleSimulation (architect, reviewer, guardian, test-engineer). |
| 27 | + - Invariant type and checker. |
| 28 | + - `formalizeTask(description: string): Effect<Plan, never, never>` (pure where possible). |
| 29 | +3. Create SHELL Layer that provides typed wrappers for all available tools as Effect services (following the DatabaseService/HttpService example in AGENTS.md). |
| 30 | +4. Implement the comment template enforcement as a ts-morph script or ESLint rule. |
| 31 | +5. Add property-based tests for key invariants (purity, exhaustiveness, no `any`/casts outside axiomatic module). |
| 32 | +6. Update main agent entrypoint to load the new FormalReasoning layer. |
| 33 | +7. Write this plan file (only mutation allowed this turn). |
| 34 | +8. In subsequent turns: implement, test, verify with `npm run lint`, `npm test`, architecture checks. |
| 35 | + |
| 36 | +## Files Likely to Change |
| 37 | +- `packages/lib/src/core/domain.ts` (add formal types, invariants, role simulation) |
| 38 | +- `packages/lib/src/core/formal-reasoning.ts` (new CORE module) |
| 39 | +- `packages/lib/src/core/shell.ts` (new Layer definitions for tools) |
| 40 | +- `packages/lib/tests/formal-verification/invariants.test.ts` (new) |
| 41 | +- `.hermes/plans/*.md` (ongoing plans) |
| 42 | +- `packages/lib/tests/usecases/...` (update existing tests to use Effect.provide and .effect) |
| 43 | +- `tsconfig.json`, `pnpm-workspace.yaml` (if new packages needed) |
| 44 | + |
| 45 | +## Tests / Validation |
| 46 | +- **Property-based**: `fc.assert(fc.property(taskArbitrary, (task) => isFormalReasoningCompliant(formalizeTask(task))))` |
| 47 | +- **Unit**: Effect tests with Mock layers for all tools (`it.effect(...)` with `Effect.provide(MockTerminal)`) |
| 48 | +- **Architecture**: Static checks for: |
| 49 | + - No `any`, `as`, `ts-ignore`, `async/await`, `console.*` in CORE. |
| 50 | + - All pattern matches use `Match.exhaustive`. |
| 51 | + - CORE imports only pure modules. |
| 52 | +- Run full suite: `npm run lint && npm test && npm run build` |
| 53 | +- Verification command sequence (to be executed in future turns): |
| 54 | + ```bash |
| 55 | + npm run lint |
| 56 | + npm test -- --grep="formal|invariant|effect" |
| 57 | + grep -r "any\|as \|ts-ignore" packages/lib/src/core/ |
| 58 | + ``` |
| 59 | + |
| 60 | +## Risks, Tradeoffs, and Open Questions |
| 61 | +- **Risk**: Large-scale refactor of existing Hermes codebase could introduce regressions in SSH/git/state management features. Mitigate with incremental PRs + CI. |
| 62 | +- **Tradeoff**: Extreme formalism increases correctness and maintainability at cost of development speed. Prioritize high-risk modules first (tool usage, delegation). |
| 63 | +- **Open Questions**: |
| 64 | + - How to mathematically model the "tool call XML format" and "mandatory tool use" rules as invariants? |
| 65 | + - Should the plan skill itself be formalized as a pure `Plan` ADT with interpreter in SHELL? |
| 66 | + - Handling of model switch notes and meta-instructions – treat as SHELL configuration? |
| 67 | + - Exact mapping of "Deep Research" loop into Effect.gen() generator. |
| 68 | +- **Assumption to validate**: Existing test files can be migrated to `it.effect()` without breaking. |
| 69 | + |
| 70 | +## Mathematical Guarantees (Proof Obligations) |
| 71 | +- Invariant: ∀f ∈ CORE: isPure(f) ∧ preservesInvariants(f) |
| 72 | +- ∀response: followsRoleSimulation(response) → contains(DeepResearchQuestion, response) |
| 73 | +- Variant: complexity decreases with each research → implementation → verification iteration. |
| 74 | + |
| 75 | +**Next Action (post-plan)**: Load this plan, begin step 1 with read-only inspection, then move to implementation turn. |
| 76 | + |
| 77 | +SOURCE: n/a (directly derived from loaded AGENTS.md) |
| 78 | +REF: AGENTS.md + plan skill invocation |
0 commit comments