PentesterFlow · Jun 10, 2026
diff --git a/AUDIT.md b/AUDIT.md
@@ -14,6 +14,28 @@ Severity legend: **HIGH** = security bypass / RCE primitive / crash / silent dat
 
 ---
 
+## Remediation status (verified against source)
+
+> Updated 2026-06-10. Each finding below was re-checked against the current tree; fixes carry an
+> inline comment citing their finding ID, and the full suite is green (`tsc --noEmit` clean, 561
+> tests passing). **35 of 39 findings are fixed.** Of the remaining 4, 3 are accepted decisions and
+> 1 (L10) was hardened; none is an open defect.
+
+- **Fixed (35):** H1, H3, H4, H5, H6, H7, H8, H9, H10 · M1–M14 (all) · L1, L2, L3, L4, L5, L7, L8,
+  L11, L12, L13, L14, L15.
+- **Accepted — won't change (3):**
+  - **H2** (DNS-rebinding SSRF) — keep permissive; reaching internal/metadata IPs is often the
+    engagement goal. See the 🟡 row in the triage table.
+  - **L9** (debug log writes unredacted tool I/O) — opt-in, `0o600`, local-only; the operator's
+    explicit choice. See the ⛔ row.
+  - **L6** (no generic high-entropy redaction fallback) — inherent to label-anchored redaction.
+- **Hardened (1):** **L10** — self-update now pins the installer to the requested release tag
+  (immutable ref) instead of mutable `main` for versioned updates, and asserts the script URL is
+  https on `raw.githubusercontent.com` before fetch. The downloaded binary was already SHA-256
+  verified fail-closed by `install.sh`. (`src/update/selfUpdate.ts`)
+
+---
+
 ## Capability-impact triage — "fix without limiting the operator"
 
 PentesterFlow's mission is to help authorized pentesters/bug-hunters/security-engineers work

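Review note on L10: the hardening described above (pin the installer to the requested release tag, then assert the URL is https on `raw.githubusercontent.com` before fetching) could look roughly like the sketch below. This is not the code in `src/update/selfUpdate.ts`; the function names, the `repo` parameter, and the `install.sh` path layout are assumptions made for illustration.

```typescript
// Hypothetical sketch: names and URL layout are assumptions, not the project's code.
const INSTALLER_HOST = 'raw.githubusercontent.com';

/** Build the installer URL for a pinned ref; fall back to `main` when no version is given. */
function installerUrl(repo: string, version?: string): string {
  const ref = version ?? 'main'; // a release tag is immutable, `main` is not
  return `https://${INSTALLER_HOST}/${repo}/${encodeURIComponent(ref)}/install.sh`;
}

/** Throw unless the URL parses, uses https, and points at the expected host. */
function assertInstallerUrl(raw: string): URL {
  const url = new URL(raw); // throws on malformed input
  if (url.protocol !== 'https:') {
    throw new Error(`installer URL must be https: ${raw}`);
  }
  if (url.hostname !== INSTALLER_HOST) {
    throw new Error(`unexpected installer host: ${url.hostname}`);
  }
  return url;
}
```

Comparing the parsed `hostname` exactly, rather than checking a string prefix, rejects lookalike hosts such as `raw.githubusercontent.com.evil.example`.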
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -4,6 +4,42 @@ All notable changes to this project are documented here. The format is based on
 [Keep a Changelog](https://keepachangelog.com/), and this project adheres to
 [Semantic Versioning](https://semver.org/).
 
+## [Unreleased]
+
+### Added
+
+- **Saved memory (`#` quick-add)** — a curated, human-readable memory layer
+  modeled on Claude Code. `#<text>` saves a durable fact (one Markdown file per
+  fact with frontmatter, under `.pentesterflow/memory/`); `#!<text>` saves it to
+  the personal scope. The fact catalog is pinned into the system prompt every
+  turn (survives compaction) and the most relevant facts are recalled in full
+  per turn, surfaced as a `recalled memory: …` line. Manage with
+  `/memory add|list|forget`. Secrets are redacted before write.
+- **Parallel tool dispatch** — independent tool calls in one step now run
+  concurrently (bounded fan-out) with results recorded in call order; recon
+  fan-outs finish in ~max(latency) instead of the sum. Single-call and
+  `load_skill` steps stay sequential. The permission prompter serializes its
+  modal so approvals still appear one at a time.
+- **LLM retry/backoff** — transient backend failures (HTTP 429 / 502 / 503 /
+  504 and connection drops) are retried with exponential backoff, honoring a
+  `Retry-After` header. Applied to the OpenAI-compatible client (Kimi, Groq,
+  OpenRouter, DeepSeek, LM Studio).
+
+### Changed
+
+- **Self-update hardening** — a pinned `pentesterflow update <version>` now
+  fetches the installer from that release tag (immutable) instead of `main`, and
+  the installer URL is asserted to be https on `raw.githubusercontent.com`
+  before fetch.
+
+### Fixed
+
+- **Redaction gaps** — connection-string query-param credentials
+  (`?password=` / `&auth=` / `&access_token=`), HTTP Digest `response=` hashes,
+  and GCP service-account `private_key_id` are now masked.
+- Closed out the internal code audit: 35 of 39 findings fixed, 3 accepted as
+  intentional, 1 hardened (see `AUDIT.md`).
+
 ## [0.2.0] - 2026-06-06
 
 Hardening, model tuning, and a transcript/status overhaul, plus Claude

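Review note on the retry/backoff entry: a minimal sketch of the behavior the changelog describes (retry on 429 / 502 / 503 / 504 and on connection drops, exponential backoff, `Retry-After` honored when present). The function names, the base delay, the cap, and the attempt limit are assumptions, not values taken from the client; jitter is omitted for brevity.

```typescript
// Hypothetical sketch: constants and names are assumptions, not the project's code.
const RETRYABLE_STATUS = new Set([429, 502, 503, 504]);

function isRetryable(status: number): boolean {
  return RETRYABLE_STATUS.has(status);
}

/** Parse Retry-After (delta-seconds or HTTP-date) into milliseconds, or null if unusable. */
function retryAfterMs(header: string | null, now: number = Date.now()): number | null {
  if (header === null) return null;
  const trimmed = header.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const at = Date.parse(trimmed);
  return Number.isNaN(at) ? null : Math.max(0, at - now);
}

/** Delay before retrying after 0-based `attempt`: the server hint wins, else base * 2^attempt, capped. */
function backoffMs(attempt: number, header: string | null, baseMs = 500, capMs = 30000): number {
  const hinted = retryAfterMs(header);
  if (hinted !== null) return hinted;
  return Math.min(baseMs * 2 ** attempt, capMs);
}

/** Run `doFetch` up to `maxAttempts` times. Any thrown error is treated as a
 *  connection drop, which is a simplification. */
async function withRetry(doFetch: () => Promise<Response>, maxAttempts = 4): Promise<Response> {
  for (let attempt = 0; ; attempt += 1) {
    const last = attempt + 1 >= maxAttempts;
    let hint: string | null = null;
    try {
      const res = await doFetch();
      if (!isRetryable(res.status) || last) return res;
      hint = res.headers.get('retry-after');
    } catch (err) {
      if (last) throw err;
    }
    await new Promise<void>((resolve) => setTimeout(resolve, backoffMs(attempt, hint)));
  }
}
```

With these assumed defaults, un-hinted delays grow 500 ms, 1 s, 2 s, and so on up to the 30 s cap, while a response carrying `Retry-After: 2` waits 2 s regardless of the attempt number.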
diff --git a/README.md b/README.md
@@ -273,10 +273,30 @@ Useful commands:
 | Command | Purpose |
 |---|---|
 | `/compact` | Summarize the current session into persistent memory. |
-| `/memory` | Show current session memory. |
+| `/memory` | Show saved facts + the session checkpoint. |
+| `/memory add <text>` | Save a durable fact (same as `#<text>`). |
+| `/memory list` | List saved facts. |
+| `/memory forget <text>` | Drop saved facts and checkpoint items matching the text. |
 | `/snapshot` | Write a redacted context snapshot immediately. |
 | `/next [objective]` | Ask for coverage-driven next steps. |
 
+### Saved memory (`#` quick-add)
+
+Type `#` followed by anything you want the agent to remember for the rest of
+this session and beyond — for example `#orders API is IDOR-prone on
+/api/orders/{id}`. Use `#!<text>` to save it to your **personal** scope instead
+of the project.
+
+- Saved facts are durable, human-readable Markdown — one file per fact with
+  frontmatter — under `./.pentesterflow/memory/` (project) and
+  `~/.pentesterflow/memory/` (personal), with a generated `MEMORY.md` index.
+- The fact catalog is pinned into the system prompt on **every** turn, so it
+  survives compaction; the facts most relevant to the current turn are recalled
+  in full automatically (you'll see a `recalled memory: …` line).
+- Secrets are redacted before a fact is written to disk.
+- Manage them with `#<text>` / `/memory add`, `/memory list`, and
+  `/memory forget <text>`.
+
 ## Burp Integration
 
 Use the companion

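Review note on the `#` quick-add section: the input rule documented above (`#<text>` saves to the project scope, `#!<text>` to the personal scope) can be sketched as a small parser. The type and function names are hypothetical and this is not the project's implementation; only the two prefixes and the two directory locations come from the README text.

```typescript
// Hypothetical sketch: names are assumptions, not the project's code.
type MemoryScope = 'project' | 'personal';

interface QuickAdd {
  scope: MemoryScope;
  text: string;
}

/** Parse a `#<text>` or `#!<text>` line; return null for anything else or for empty text. */
function parseQuickAdd(line: string): QuickAdd | null {
  if (!line.startsWith('#')) return null;
  const personal = line.startsWith('#!');
  const text = line.slice(personal ? 2 : 1).trim();
  if (text === '') return null;
  return { scope: personal ? 'personal' : 'project', text };
}

/** Map a scope to the memory directory named in the README. */
function memoryDir(scope: MemoryScope, cwd: string, home: string): string {
  const root = scope === 'personal' ? home : cwd;
  return `${root}/.pentesterflow/memory`;
}
```

For example, `parseQuickAdd('#!prefers curl over httpie')` yields the personal scope, and an input of just `#` is rejected rather than saved as an empty fact.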
diff --git a/src/agent/agent.test.ts b/src/agent/agent.test.ts
@@ -10,6 +10,7 @@ import { describe, expect, it } from 'vitest';
 import { IntelligenceStore } from '../intelligence/store.js';
 import type { Client } from '../llm/client.js';
 import type { ChatRequest, ChatResponse } from '../llm/types.js';
+import { MemoryStore } from '../memory/store.js';
 import { AlwaysAllow } from '../permission/permission.js';
 import { Store as SessionStore, newID as newSessionID } from '../session/store.js';
 import { Registry as SkillRegistry } from '../skills/registry.js';
@@ -1217,3 +1218,230 @@ describe('reconcileToolCalls (H6)', () => {
     expect(reconcileToolCalls(input)).toEqual(input);
   });
 });
+
+// ---------- E1: parallel tool dispatch ----------
+
+/** A tool that blocks until `n` instances are running at once, proving the
+ *  agent dispatched them concurrently. If they ran sequentially the first
+ *  never sees a second arrive and falls through the timeout, returning
+ *  'serial' — a fast, deterministic failure rather than a hang. */
+class BarrierTool implements Tool {
+  private count = 0;
+  private waiters: Array<(ok: boolean) => void> = [];
+  constructor(private readonly need: number) {}
+  name(): string {
+    return 'barrier';
+  }
+  description(): string {
+    return 'barrier';
+  }
+  schema(): Record<string, unknown> {
+    return { type: 'object', properties: { id: { type: 'string' } } };
+  }
+  requiresPermission(): boolean {
+    return false;
+  }
+  async run(args: Record<string, unknown>): Promise<string> {
+    this.count += 1;
+    if (this.count >= this.need) {
+      for (const w of this.waiters) w(true);
+      this.waiters = [];
+      return `parallel-ok:${String(args.id ?? '')}`;
+    }
+    const ok = await new Promise<boolean>((resolve) => {
+      this.waiters.push(resolve);
+      setTimeout(() => resolve(false), 300);
+    });
+    return `${ok ? 'parallel-ok' : 'serial'}:${String(args.id ?? '')}`;
+  }
+}
+
+/** A tool that sleeps for `delay_ms` then echoes `id` — used to verify result
+ *  order is the call order, not the completion order. */
+class DelayTool implements Tool {
+  name(): string {
+    return 'delay';
+  }
+  description(): string {
+    return 'delay';
+  }
+  schema(): Record<string, unknown> {
+    return { type: 'object', properties: { id: { type: 'string' }, delay_ms: { type: 'number' } } };
+  }
+  requiresPermission(): boolean {
+    return false;
+  }
+  async run(args: Record<string, unknown>): Promise<string> {
+    await new Promise((r) => setTimeout(r, Number(args.delay_ms ?? 0)));
+    return `done:${String(args.id ?? '')}`;
+  }
+}
+
+function toolCall(id: string, name: string, args: Record<string, unknown>) {
+  return { id, type: 'function' as const, function: { name, arguments: JSON.stringify(args) } };
+}
+
+function agentWithTools(scripted: ChatResponse[], tools: Tool[]): Agent {
+  const reg = new ToolRegistry();
+  for (const t of tools) reg.register(t);
+  return new Agent({
+    client: new FakeClient(scripted),
+    tools: reg,
+    skills: new SkillRegistry(),
+    prompter: new AlwaysAllow(),
+    store: null,
+    target: new Target(),
+  });
+}
+
+describe('Agent parallel tool dispatch (E1)', () => {
+  it('runs independent tool calls in the same step concurrently', async () => {
+    const agent = agentWithTools(
+      [
+        {
+          message: {
+            role: 'assistant',
+            content: '',
+            toolCalls: [
+              toolCall('c1', 'barrier', { id: 'a' }),
+              toolCall('c2', 'barrier', { id: 'b' }),
+            ],
+          },
+          finishReason: 'tool_calls',
+        },
+        { message: { role: 'assistant', content: 'done' }, finishReason: 'stop' },
+      ],
+      [new BarrierTool(2)],
+    );
+    const { events, sink } = collect();
+    await agent.run('go', new AbortController().signal, sink);
+    const results = events
+      .filter((e) => e.type === 'tool-result')
+      .map((e) => (e.type === 'tool-result' ? e.result : ''));
+    // Both only resolve to parallel-ok if they were in flight simultaneously.
+    expect(results).toEqual(['parallel-ok:a', 'parallel-ok:b']);
+  });
+
+  it('emits tool results in call order even when later calls finish first', async () => {
+    const agent = agentWithTools(
+      [
+        {
+          message: {
+            role: 'assistant',
+            content: '',
+            toolCalls: [
+              toolCall('c1', 'delay', { id: 'first', delay_ms: 60 }),
+              toolCall('c2', 'delay', { id: 'second', delay_ms: 0 }),
+            ],
+          },
+          finishReason: 'tool_calls',
+        },
+        { message: { role: 'assistant', content: 'done' }, finishReason: 'stop' },
+      ],
+      [new DelayTool()],
+    );
+    const { events, sink } = collect();
+    await agent.run('go', new AbortController().signal, sink);
+    const order = events
+      .filter((e) => e.type === 'tool-result')
+      .map((e) => (e.type === 'tool-result' ? e.result : ''));
+    // 'second' finishes first, but results are recorded in call order.
+    expect(order).toEqual(['done:first', 'done:second']);
+    // History keeps the same order: assistant, tool(first), tool(second).
+    const toolMsgs = agent.getHistory().filter((m) => m.role === 'tool');
+    expect(toolMsgs.map((m) => m.toolCallID)).toEqual(['c1', 'c2']);
+  });
+});
+
+// ---------- Curated memory: pinned catalog + recall + survives compaction ----------
+
+describe('Agent curated memory', () => {
+  function makeMemoryAgent(scripted: ChatResponse[]) {
+    const cwd = mkdtempSync(join(tmpdir(), 'pf-agent-mem-cwd-'));
+    const home = mkdtempSync(join(tmpdir(), 'pf-agent-mem-home-'));
+    const memoryStore = new MemoryStore({ cwd, home });
+    const tools = new ToolRegistry();
+    tools.register(new EchoTool());
+    const agent = new Agent({
+      client: new FakeClient(scripted),
+      tools,
+      skills: new SkillRegistry(),
+      prompter: new AlwaysAllow(),
+      store: null,
+      target: new Target(),
+      memoryStore,
+    });
+    return {
+      agent,
+      memoryStore,
+      cleanup: () => {
+        rmSync(cwd, { recursive: true, force: true });
+        rmSync(home, { recursive: true, force: true });
+      },
+    };
+  }
+
+  it('pins a saved fact into the system prompt immediately (next turn)', async () => {
+    const { agent, cleanup } = makeMemoryAgent([]);
+    try {
+      const fact = await agent.addMemory({ text: 'orders API IDOR on /api/orders/{id}' });
+      expect(fact).not.toBeNull();
+      const sys = agent.getHistory()[0]?.content ?? '';
+      expect(sys).toContain('Saved memory');
+      expect(sys).toContain(fact?.name ?? 'NOPE');
+    } finally {
+      cleanup();
+    }
+  });
+
+  it('recalls a relevant fact into the turn and emits a memory-recall event', async () => {
+    const { agent, cleanup } = makeMemoryAgent([
+      { message: { role: 'assistant', content: 'ok' }, finishReason: 'stop' },
+    ]);
+    try {
+      await agent.addMemory({ text: 'orders API IDOR via sequential id on /api/orders/{id}' });
+      const { events, sink } = collect();
+      await agent.run('test the orders endpoint for idor', new AbortController().signal, sink);
+      const recall = events.find((e) => e.type === 'memory-recall');
+      expect(recall && recall.type === 'memory-recall' ? recall.names.length : 0).toBeGreaterThan(
+        0,
+      );
+    } finally {
+      cleanup();
+    }
+  });
+
+  it('keeps the saved-memory catalog in the prompt after a compaction', async () => {
+    // compact() resets history to [system, summary]; the catalog must still be
+    // present in the rebuilt system prompt so the model never "forgets" it.
+    const { agent, cleanup } = makeMemoryAgent([
+      { message: { role: 'assistant', content: 'COMPACTED SUMMARY' }, finishReason: 'stop' },
+    ]);
+    try {
+      const fact = await agent.addMemory({ text: 'login OAuth redirect_uri bypass works' });
+      // History is not seeded with extra turns here; compaction runs on the current history.
+      agent.getHistory(); // no-op accessor
+      await agent.compact(new AbortController().signal, () => {});
+      const sys = agent.getHistory()[0]?.content ?? '';
+      expect(sys).toContain('Saved memory');
+      expect(sys).toContain(fact?.name ?? 'NOPE');
+    } finally {
+      cleanup();
+    }
+  });
+
+  it('forgetMemory removes a curated fact and drops it from the prompt', async () => {
+    const { agent, cleanup } = makeMemoryAgent([]);
+    try {
+      const fact = await agent.addMemory({ text: 'orders API IDOR on /api/orders/{id}' });
+      const name = fact?.name ?? 'NOPE';
+      expect(agent.listCuratedMemory()).toHaveLength(1);
+      const removed = await agent.forgetMemory('orders');
+      expect(removed).toContain(name);
+      expect(agent.listCuratedMemory()).toHaveLength(0);
+      expect(agent.getHistory()[0]?.content ?? '').not.toContain(name);
+    } finally {
+      cleanup();
+    }
+  });
+});
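
Review note on E1: the two properties the tests above check (calls in one step overlap, and results are recorded in call order rather than completion order) can both be obtained with a bounded worker pool that writes into an index-addressed array. The sketch below is an assumption about one way to do this, not the agent's dispatch code; `mapBounded` and its parameters are hypothetical names.

```typescript
// Hypothetical sketch: not the agent's dispatcher, just the ordering technique.
async function mapBounded<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item

  // Each worker claims an index synchronously, so no two workers run the same item.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next;
      next += 1;
      // Store by call index: completion order cannot reorder the output.
      results[i] = await fn(items[i] as T, i);
    }
  }

  const width = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: width }, () => worker()));
  return results;
}
```

In this sketch a rejected `fn` makes the whole call reject while the other workers keep draining the queue; a tool dispatcher would more likely catch per-call errors and record them as that call's result.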