
feat(auth): add retention and recovery doctor #89

Open
ndycode wants to merge 4 commits into git-plan/15-sync-rollback from git-plan/16-retention-doctor

Conversation

ndycode (Owner) commented Mar 12, 2026

Summary

  • Add sync-history retention with latest-pointer preservation.
  • Add doctor checks for rollback checkpoints and actionable restores.

Summary by cubic

Add retention for sync history and per‑reason auto snapshots, and extend the auth doctor with recovery and rollback checks. This clarifies rollback readiness and keeps recovery artifacts lean without losing the latest checkpoints.

  • New Features
    • Doctor checks: write‑ahead journal, rotating backups, snapshot backups, overall recovery chain, actionable named backup restores, and latest Codex CLI rollback checkpoint with clear actions when missing/invalid.
    • Auto snapshot retention: pruneAutoGeneratedSnapshots keeps the latest per reason and preserves any snapshot referenced by the latest manual rollback checkpoint; snapshotAccountStorage enforces per‑reason retention (default 3) and prunes older auto‑generated snapshots only.
    • Sync history retention: pruneSyncHistory/pruneSyncHistoryEntries cap history (default 200), keep the latest entry per kind, and keep the latest pointer aligned on disk.
    • Tests updated to cover per‑reason pruning, rollback‑referenced snapshot preservation, sync‑history pruning, and doctor JSON output.
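The sync-history retention rule above (cap the list, but never drop the newest entry of a kind) can be sketched as a pure function. The names and shapes here are illustrative stand-ins, not the PR's actual pruneSyncHistoryEntries:

```typescript
// Hypothetical shapes; the real SyncHistoryEntry carries more fields.
interface HistoryEntry {
  kind: string;
  timestamp: number; // newer entries have larger timestamps
}

// Keep at most maxEntries, but never drop the newest entry of any kind.
// The cap can be exceeded only by protected per-kind latest entries.
function pruneEntries(entries: HistoryEntry[], maxEntries: number): HistoryEntry[] {
  const newestPerKind = new Map<string, HistoryEntry>();
  for (const entry of entries) {
    const current = newestPerKind.get(entry.kind);
    if (!current || entry.timestamp > current.timestamp) {
      newestPerKind.set(entry.kind, entry);
    }
  }
  const mustKeep = new Set(newestPerKind.values());
  // Walk newest-first: fill up to the cap, then rescue protected entries.
  const sorted = [...entries].sort((a, b) => b.timestamp - a.timestamp);
  const kept = sorted.filter((entry, index) => index < maxEntries || mustKeep.has(entry));
  // Restore chronological order for serialization back to the ndjson file.
  return kept.sort((a, b) => a.timestamp - b.timestamp);
}
```

The same shape would apply with the real cap of 200; the per-kind guard is what keeps a rare kind's last entry alive under aggressive trimming.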

Written for commit 06f6ffc. Summary will update on new commits.

Note: Greptile review for oc-chatgpt-multi-auth. Cite files like lib/foo.ts:123. Confirm regression tests plus Windows concurrency/token-redaction coverage.

Greptile Summary

this PR adds sync-history retention (capped at 200, per-kind latest preserved) and extends the auth doctor with six new checks covering WAL journal, rotating backups, snapshot backups, recovery chain integrity, rollback checkpoint status, and actionable named-backup restores. the additions fit cleanly into the existing doctor/storage/sync-history architecture, but there's a critical inconsistency between the two snapshot pruning paths that can silently corrupt rollback capability.

  • bug: enforceSnapshotRetention deletes rollback-referenced snapshots — unlike pruneAutoGeneratedSnapshots, the inline pruner called by snapshotAccountStorage doesn't consult getLatestManualCodexCliRollbackSnapshotNames. after 3 auto syncs following a manual rollback checkpoint, the checkpoint snapshot is deleted, making rollback silently impossible. pruneAutoGeneratedSnapshots correctly preserves it; enforceSnapshotRetention does not.
  • windows filesystem risk — pruneAutoGeneratedSnapshots catches fs.rm errors at debug level only; a Windows antivirus EBUSY/EPERM lock will be swallowed without warning, leaving the caller with an incorrect pruned list.
  • pruneSyncHistory deadlock (flagged in prior thread) — readSyncHistory called inside withHistoryLock re-enters waitForPendingHistoryWrites, deadlocking if a concurrent appendSyncHistoryEntry is in flight.
  • redundant ternary in pruneSyncHistory serialization block — always evaluates to "\n".
  • test coverage gap — test/storage-recovery-paths.test.ts tests pruneAutoGeneratedSnapshots, but no test exercises the path where enforceSnapshotRetention prunes a rollback-referenced snapshot (the critical bug above is untested).

Confidence Score: 2/5

  • not safe to merge — enforceSnapshotRetention can silently delete active rollback checkpoints after 3+ subsequent syncs, breaking the recovery chain this PR is designed to protect
  • one confirmed logic bug (enforceSnapshotRetention missing rollback-name preservation) can silently delete a live rollback checkpoint in normal usage, plus the carry-over deadlock in pruneSyncHistory and the failing doctor test; no explicit windows concurrency reasoning or regression test accompanies the new pruning paths as required
  • lib/storage.ts (dual pruning paths with divergent invariants), lib/sync-history.ts (deadlock inside lock)

Important Files Changed

Filename Overview
lib/storage.ts adds enforceSnapshotRetention and pruneAutoGeneratedSnapshots — two pruning paths with divergent semantics; enforceSnapshotRetention doesn't preserve rollback-referenced snapshots (critical bug), and pruneAutoGeneratedSnapshots silently swallows EBUSY on Windows
lib/sync-history.ts adds pruneSyncHistoryEntries and pruneSyncHistory with correct per-kind latest preservation; deadlock when readSyncHistory is called inside withHistoryLock (flagged in prior thread); minor: redundant always-true ternary in serialization
lib/codex-manager.ts adds WAL journal, rotating backups, snapshot backups, recovery chain, rollback checkpoint, and named-backup-restores doctor checks; double existsSync TOCTOU on Windows flagged in prior thread
test/storage-recovery-paths.test.ts adds pruning tests for pruneAutoGeneratedSnapshots including rollback-reference preservation; notably absent: a test exercising enforceSnapshotRetention deleting a rollback-referenced snapshot (the uncovered bug path)
test/codex-manager-cli.test.ts new rollback checkpoint doctor test; check key predicate searches for "rollback-checkpoint" but production emits "codex-cli-rollback-checkpoint" (flagged in prior thread) — assertion will always fail

Sequence Diagram

sequenceDiagram
    participant Sync as codex-cli-sync
    participant Storage as storage.ts
    participant SyncHist as sync-history.ts
    participant FS as Filesystem

    Sync->>Storage: snapshotAccountStorage(reason="codex-cli-sync")
    Storage->>FS: createBackup(snapshotName) → snapshot C
    Storage->>Storage: enforceSnapshotRetention()
    Note over Storage: groups by reason, keeps top 3 by updatedAt<br/>⚠️ does NOT check rollback-referenced names
    Storage->>FS: unlink(old snapshots) — may delete C if >3 exist

    Sync->>SyncHist: appendSyncHistoryEntry(rollbackSnapshot=C)
    SyncHist->>FS: append to sync-history.ndjson
    SyncHist->>FS: overwrite sync-history-latest.json

    Note over Storage,SyncHist: later: pruneSyncHistory()
    SyncHist->>SyncHist: waitForPendingHistoryWrites()
    SyncHist->>SyncHist: withHistoryLock(...)
    SyncHist->>SyncHist: readSyncHistory() ← calls waitForPendingHistoryWrites() inside lock ⚠️ deadlock risk
    SyncHist->>FS: writeFile(pruned history)
    SyncHist->>FS: rewriteLatestEntry()

    Note over Storage,SyncHist: explicit call: pruneAutoGeneratedSnapshots()
    Storage->>SyncHist: getLatestManualCodexCliRollbackSnapshotNames()
    SyncHist->>FS: readSyncHistory() → returns entry with rollbackSnapshot=C
    Storage->>Storage: preserveNames.add(C)
    Storage->>FS: rm(stale snapshots) — C is preserved ✓

Review comment (lib/storage.ts, lines 1587-1629):
**`enforceSnapshotRetention` silently deletes rollback-referenced snapshots**

`enforceSnapshotRetention` keeps only the top `ACCOUNT_SNAPSHOT_RETENTION_PER_REASON` (3) snapshots per reason by `updatedAt`, but it never calls `getLatestManualCodexCliRollbackSnapshotNames`. `pruneAutoGeneratedSnapshots` correctly seeds its `preserveNames` set from that function — this one does not.

concrete scenario:
1. manual codex-cli-sync creates rollback checkpoint snapshot **C**
2. three subsequent auto syncs create **D**, **E**, **F** (same reason, same filename pattern)
3. `snapshotAccountStorage` is called again → `enforceSnapshotRetention` keeps F, E, D and **deletes C**
4. `getLatestCodexCliSyncRollbackPlan` now returns `status: "error"` / `ENOENT` — rollback is silently broken

`pruneAutoGeneratedSnapshots` would have preserved C because it holds a `preserveNames` set seeded from sync history. `enforceSnapshotRetention` has no such guard, creating an inconsistent invariant across two pruning paths operating on the same files.

the fix is to await `getLatestManualCodexCliRollbackSnapshotNames()` inside `enforceSnapshotRetention` and skip any snapshot whose name is in that set before calling `fs.unlink`.
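A minimal sketch of that guard, using hypothetical stand-in types (the real code would seed the set by awaiting getLatestManualCodexCliRollbackSnapshotNames()):

```typescript
// Hypothetical stand-in for the PR's snapshot metadata shape.
interface SnapshotMeta {
  name: string;
  updatedAt: number; // newer snapshots have larger values
}

// Split one reason-group of snapshots into keep/drop, never dropping a
// snapshot whose name is referenced by the latest manual rollback checkpoint.
function partitionForPruning(
  snapshots: SnapshotMeta[],
  retentionPerReason: number,
  rollbackReferenced: ReadonlySet<string>,
): { keep: SnapshotMeta[]; drop: SnapshotMeta[] } {
  const sorted = [...snapshots].sort((a, b) => b.updatedAt - a.updatedAt);
  const keep: SnapshotMeta[] = [];
  const drop: SnapshotMeta[] = [];
  for (const snapshot of sorted) {
    if (keep.length < retentionPerReason || rollbackReferenced.has(snapshot.name)) {
      keep.push(snapshot); // rollback-referenced snapshots are never unlinked
    } else {
      drop.push(snapshot);
    }
  }
  return { keep, drop };
}
```

With this shape, checkpoint C from the scenario above survives even after D, E, and F fill all three retention slots.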


---

Review comment (lib/sync-history.ts, lines 251-253):
**Redundant always-true ternary in serialization**

this branch is already inside `else { }` which is only reached when `prunedEntries.length > 0`, so the ternary `(prunedEntries.length > 0 ? "\n" : "")` always evaluates to `"\n"` and the dead branch is unreachable.

```suggestion
		const serialized =
			prunedEntries.map((entry) => serializeEntry(entry)).join("\n") + "\n";
```


---

Review comment (lib/storage.ts, lines 1776-1788):
**`pruneAutoGeneratedSnapshots` swallows EBUSY silently on Windows**

`fs.rm(backup.path, { force: true })` only suppresses `ENOENT`; on Windows, antivirus scanners can hold a short-lived exclusive lock, causing `EBUSY` or `EPERM`. the catch here logs only at `debug` level, so a locked file is silently skipped without any warning to the operator. `enforceSnapshotRetention` already logs at `warn` for non-ENOENT errors — `pruneAutoGeneratedSnapshots` should do the same to make Windows filesystem concurrency failures visible.
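One way to express that policy is a small classifier shared by both pruning paths; this is a sketch, not the PR's code:

```typescript
type LogLevel = "debug" | "warn";

// ENOENT just means the file is already gone, which is an expected race and
// safe to record at debug. Every other code (EBUSY or EPERM from a Windows
// antivirus lock, EACCES, EIO, ...) should surface at warn so the operator
// notices a snapshot that could not actually be pruned.
function pruneFailureLogLevel(error: unknown): LogLevel {
  const code = (error as { code?: string } | null)?.code;
  return code === "ENOENT" ? "debug" : "warn";
}
```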


Last reviewed commit: 06f6ffc

Greptile also left 1 inline comment on this PR.

Context used:

  • Rule used - What: Every code change must explain how it defend... (source)

coderabbitai bot (Contributor) commented Mar 12, 2026

📝 Walkthrough

Adds snapshot retention and pruning policies with per-reason limits, sync-history truncation with per-kind preservation, and enhanced doctor diagnostics reporting rollback checkpoint status, WAL journal presence, and recovery artifacts across the storage, sync-history, and codex-manager modules.

Changes

Cohort / File(s) Summary
Snapshot Retention & Pruning
lib/storage.ts
Added ACCOUNT_SNAPSHOT_RETENTION_PER_REASON constant, isAccountSnapshotName(), listAccountSnapshots(), and enforceSnapshotRetention() to automatically prune account snapshots per reason after creation. Introduced pruneAutoGeneratedSnapshots() with selective pruning logic that preserves manually rolled back snapshots and retains latest per reason via sync-history integration.
Sync History Pruning
lib/sync-history.ts
Introduced SYNC_HISTORY_MAX_ENTRIES cap, PrunedSyncHistory interface, and pruneSyncHistoryEntries() to compute pruned subsets preserving latest per kind. Added pruneSyncHistory() for async read-prune-write workflow with lock coordination. Exposed test hooks __resetSyncHistoryForTests(), __getLastSyncHistoryErrorForTests(), __getLastSyncHistoryPathsForTests() and cloneSyncHistoryEntry().
Doctor Diagnostics & Rollback Reporting
lib/codex-manager.ts
Integrated getLatestCodexCliSyncRollbackPlan import and added diagnostic checks for storage-journal (wal presence), rotating-backups validity, snapshot-backups validity and counts, recovery-chain aggregation, and named-backup-restores. Extended doctor flow to surface rollback checkpoint status with account counts and emit actionable recovery guidance when checkpoints are missing or invalid.
Snapshot Pruning Tests
test/storage-recovery-paths.test.ts
Added test cases verifying auto-snapshot pruning removes older auto backups while retaining newer auto/manual backups and sync-history referenced rollback snapshots. Imports and integrates sync-history test helpers for state management.
Sync History Pruning Tests
test/sync-history.test.ts, test/codex-cli-sync.test.ts
New comprehensive test suite for history pruning verifying per-kind latest preservation with aggressive trimming, on-disk latest file mirroring, and readLatestSyncHistorySync integration. Existing test expanded with pruneSyncHistory invocation and assertions.
Rollback Plan Tests
test/codex-manager-cli.test.ts
Added mock for getLatestCodexCliSyncRollbackPlan in codex-cli/sync.js and test cases verifying doctor json output reports rollback-checkpoint with severity "ok" when rollback plan is ready.

Sequence Diagrams

sequenceDiagram
    actor Client
    participant Storage
    participant Retention
    participant SyncHistory

    Client->>Storage: snapshotAccountStorage()
    Storage->>Storage: create snapshot backup
    Storage->>Retention: enforceSnapshotRetention()
    Retention->>Storage: listAccountSnapshots()
    Storage-->>Retention: all snapshots metadata
    Retention->>SyncHistory: check rollback references
    SyncHistory-->>Retention: rollback-referenced snapshots
    Retention->>Retention: filter by reason, preserve latest & rollbacks
    Retention->>Storage: delete eligible pruned snapshots
    Retention-->>Client: pruning complete
sequenceDiagram
    actor Client
    participant SyncHistory as Sync History
    participant FileSystem as Disk
    
    Client->>SyncHistory: pruneSyncHistory(maxEntries)
    SyncHistory->>FileSystem: acquire history lock
    SyncHistory->>FileSystem: read history entries
    FileSystem-->>SyncHistory: entries[]
    SyncHistory->>SyncHistory: group by kind, preserve latest per kind
    SyncHistory->>SyncHistory: apply maxEntries limit
    SyncHistory->>FileSystem: write pruned entries back
    SyncHistory->>FileSystem: rewrite latest entry file
    FileSystem-->>SyncHistory: lock released
    SyncHistory-->>Client: {removed, kept, latest}
sequenceDiagram
    actor User
    participant Doctor as Doctor Flow
    participant RollbackPlan as Rollback Check
    participant Storage as Storage Checks
    participant Output as Diagnostics
    
    User->>Doctor: codex doctor --json
    Doctor->>RollbackPlan: getLatestCodexCliSyncRollbackPlan()
    RollbackPlan-->>Doctor: plan status (ready/invalid/missing)
    alt Plan Ready
        Doctor->>Output: emit checkpoint ok with account count
    else Plan Invalid or Missing
        Doctor->>Storage: gather recovery artifacts (wal, backups, snapshots)
        Storage-->>Doctor: recovery metadata
        Doctor->>Output: emit checkpoint error with recovery guidance
    end
    Doctor->>Output: emit storage-journal, rotating-backups, snapshot-backups, recovery-chain checks
    Output-->>User: doctor report

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Concerns to flag

concurrency: lib/sync-history.ts introduces lock coordination for write operations but verify the lock acquisition and release in pruneSyncHistory() handles all code paths including error cases. check that pending write tracking doesn't leak or deadlock under concurrent prune/append scenarios.

missing regression tests: no explicit tests for concurrent snapshot pruning invocations or concurrent history pruning during sync operations. verify snapshot creation under active pruning doesn't race. also missing tests for the actual doctor json output format with all diagnostic blocks present (storage-journal, rotating-backups, snapshot-backups, recovery-chain, named-backup-restores).

windows edge cases: lib/storage.ts uses snapshot name pattern matching—verify regex handles windows path separators correctly in snapshot names. wal path computation in lib/codex-manager.ts via walPath from storagePath needs testing on windows with backslashes.

error handling rigor: lib/storage.ts:enforceSnapshotRetention() and pruneAutoGeneratedSnapshots() tolerate ENOENT but don't distinguish between file-not-found during deletion vs. other permission/io errors. logging is present but verify error messages surface root cause clearly for debugging.

sync-history test isolation: test/sync-history.test.ts and test/storage-recovery-paths.test.ts both reset and configure sync history state. ensure test cleanup (afterEach) in test/storage-recovery-paths.test.ts:line +82 properly isolates history state between test suites to avoid flakes from cross-test pollution.

Suggested labels

bug

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Docstring Coverage: ⚠️ Warning. Docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
Description check: ⚠️ Warning. PR description is incomplete and fails to address critical code review findings flagged by greptile; required sections (Risk/Rollback, validation checklist) are missing or unfilled. Resolution: complete the Risk and Rollback section with a clear mitigation plan for the enforceSnapshotRetention bug, the deadlock in pruneSyncHistory, and Windows EBUSY handling; document test coverage for concurrent/Windows scenarios and validate all three greptile findings are resolved before merge.
✅ Passed checks (1 passed)
Title check: ✅ Passed. The title follows conventional commit format with a clear type (feat) and scope (auth), and the summary concisely and accurately reflects the main additions: retention policies and recovery doctor checks.


Comment on lines +1954 to +1956
const checkpoint = payload.checks.find(
(check) => check.key === "rollback-checkpoint",
);

The test is searching for a check with key "rollback-checkpoint", but the actual implementation in lib/codex-manager.ts line 3808 uses key "codex-cli-rollback-checkpoint". This test will always fail because checkpoint will be undefined.

// Fix: Update the key to match the implementation
const checkpoint = payload.checks.find(
  (check) => check.key === "codex-cli-rollback-checkpoint",
);

Spotted by Graphite


Comment on lines +237 to +241
try {
entries = await readSyncHistory();
} catch {
entries = [];
}

Deadlock: readSyncHistory inside withHistoryLock awaits pending writes

readSyncHistory() starts with await waitForPendingHistoryWrites(). if any appendSyncHistoryEntry call arrives after pruneSyncHistory's outer waitForPendingHistoryWrites() completes but before withHistoryLock's callback begins, those write promises will be sitting in pendingHistoryWrites, waiting to acquire the same lock we currently hold. then readSyncHistory's internal waitForPendingHistoryWrites() will await those promises indefinitely → deadlock.

fix: read the history file directly inside the lock instead of delegating to readSyncHistory:

try {
    const content = await fs.readFile(paths.historyPath, "utf8");
    entries = content
        .split(/\r?\n/)
        .map((l) => l.trim())
        .filter(Boolean)
        .map((l) => JSON.parse(l) as SyncHistoryEntry);
} catch {
    entries = [];
}

this avoids the nested waitForPendingHistoryWrites call entirely while inside the lock.


Comment on lines 3611 to +3616
}
}

addCheck({
key: "storage-journal",
severity: existsSync(walPath) ? "ok" : "warn",

Double existsSync — TOCTOU race on Windows

existsSync(walPath) is evaluated twice in the same addCheck call — once for severity and once for message. on windows, antivirus scanners routinely lock or delete files between consecutive syscalls, which can yield a mismatched severity+message pair (e.g. "ok" severity with a "journal missing" message). cache the result in a boolean before constructing the check object.
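A sketch of the suggested single-probe pattern (the DoctorCheck shape and message strings are assumptions, not the real addCheck payload):

```typescript
import { existsSync } from "node:fs";

// Hypothetical check shape mirroring the doctor's addCheck payload.
interface DoctorCheck {
  key: string;
  severity: "ok" | "warn";
  message: string;
}

// Probe the filesystem exactly once so severity and message can never
// disagree, even if the file is deleted or locked between syscalls.
function storageJournalCheck(walPath: string): DoctorCheck {
  const walPresent = existsSync(walPath); // single probe, result cached
  return {
    key: "storage-journal",
    severity: walPresent ? "ok" : "warn",
    message: walPresent
      ? `write-ahead journal present at ${walPath}`
      : `write-ahead journal missing at ${walPath}`,
  };
}
```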



cubic-dev-ai bot left a comment

5 issues found across 7 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="test/sync-history.test.ts">

<violation number="1" location="test/sync-history.test.ts:36">
P2: Avoid bare `fs.rm` in test cleanup; use retry-based removal to prevent Windows EBUSY/EPERM/ENOTEMPTY flakes.</violation>
</file>

<file name="lib/storage.ts">

<violation number="1" location="lib/storage.ts:1709">
P2: `keepLatestPerReason` is implemented as a global top-N retention, so snapshots can be pruned even when they are within the latest N for their reason.</violation>
</file>

<file name="lib/sync-history.ts">

<violation number="1" location="lib/sync-history.ts:233">
P1: Prune writes are not registered in `pendingHistoryWrites`, so concurrent `readSyncHistory()` can read mid-prune and return inconsistent results.</violation>

<violation number="2" location="lib/sync-history.ts:238">
P1: Calling `readSyncHistory()` while holding `withHistoryLock` can deadlock with concurrent appends because `readSyncHistory` waits on pending writes that are themselves blocked on the same lock.</violation>
</file>

<file name="test/codex-manager-cli.test.ts">

<violation number="1" location="test/codex-manager-cli.test.ts:1955">
P2: Update the expected check key to `codex-cli-rollback-checkpoint` so this test actually matches doctor output and exercises the severity/details assertions.</violation>
</file>


): Promise<{ removed: number; kept: number; latest: SyncHistoryEntry | null }> {
const maxEntries = options.maxEntries ?? SYNC_HISTORY_MAX_ENTRIES;
await waitForPendingHistoryWrites();
return withHistoryLock(async () => {
cubic-dev-ai bot commented Mar 12, 2026

P1: Prune writes are not registered in pendingHistoryWrites, so concurrent readSyncHistory() can read mid-prune and return inconsistent results.
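The registration pattern this finding asks for can be sketched as follows; the names are illustrative stand-ins for the module's pendingHistoryWrites machinery:

```typescript
// Registry of in-flight writes so readers can await quiescence.
const pendingWrites = new Set<Promise<void>>();

// Wrap a write promise so it is visible to readers until it settles.
function trackWrite(write: Promise<void>): Promise<void> {
  const tracked = write.finally(() => pendingWrites.delete(tracked));
  pendingWrites.add(tracked);
  return tracked;
}

async function waitForPendingWrites(): Promise<void> {
  // Loop: while we were awaiting, new writes may have been registered.
  while (pendingWrites.size > 0) {
    await Promise.allSettled([...pendingWrites]);
  }
}
```

Registering the prune's own write this way would make a concurrent readSyncHistory() wait for the prune to finish instead of observing a half-written file; the caveat from the sibling deadlock finding still applies, so the prune itself must not await this registry while holding the lock.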


await ensureHistoryDir(paths.directory);
let entries: SyncHistoryEntry[] = [];
try {
entries = await readSyncHistory();
cubic-dev-ai bot commented Mar 12, 2026

P1: Calling readSyncHistory() while holding withHistoryLock can deadlock with concurrent appends because readSyncHistory waits on pending writes that are themselves blocked on the same lock.

Suggested change
entries = await readSyncHistory();
const content = await fs.readFile(paths.historyPath, "utf8");
entries = content
.split(/\r?\n/)
.map((line) => line.trim())
.filter(Boolean)
.map((line) => JSON.parse(line) as SyncHistoryEntry);

await __resetSyncHistoryForTests();
configureSyncHistoryForTests(null);
await fs
.rm(workDir, { recursive: true, force: true })
cubic-dev-ai bot commented Mar 12, 2026

P2: Avoid bare fs.rm in test cleanup; use retry-based removal to prevent Windows EBUSY/EPERM/ENOTEMPTY flakes.
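One option here is Node's built-in fs.rm retry support: when recursive is true, maxRetries/retryDelay apply a linear backoff to transient EBUSY/EMFILE/ENFILE/ENOTEMPTY/EPERM failures. Sketched below with an assumed helper name for the cleanup in question:

```typescript
import { existsSync } from "node:fs";
import { mkdtemp, rm, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

// removeWorkDir is an assumed name for the afterEach cleanup helper.
async function removeWorkDir(workDir: string): Promise<void> {
  await rm(workDir, {
    recursive: true,
    force: true, // ENOENT is not an error, so repeated cleanup stays idempotent
    maxRetries: 5, // retry transient Windows lock errors up to 5 times
    retryDelay: 100, // linear backoff, waiting 100 ms longer each attempt
  });
}

// Usage sketch: create a scratch dir, then clean it up with retries.
async function demo(): Promise<void> {
  const dir = await mkdtemp(join(tmpdir(), "retention-test-"));
  await writeFile(join(dir, "history.ndjson"), "{}\n");
  await removeWorkDir(dir);
  if (existsSync(dir)) throw new Error("cleanup failed");
}

await demo();
```

The afterEach hook could call removeWorkDir(workDir) in place of the bare rm, keeping the retry policy in one place.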


checks: Array<{ key: string; severity: string; details?: string }>;
};
const checkpoint = payload.checks.find(
(check) => check.key === "rollback-checkpoint",
cubic-dev-ai bot commented Mar 12, 2026

P2: Update the expected check key to codex-cli-rollback-checkpoint so this test actually matches doctor output and exercises the severity/details assertions.

Suggested change
(check) => check.key === "rollback-checkpoint",
(check) => check.key === "codex-cli-rollback-checkpoint",

Comment on lines +1587 to +1629
async function enforceSnapshotRetention(): Promise<void> {
const snapshots = await listAccountSnapshots();
if (snapshots.length === 0) return;

const grouped = snapshots.reduce<Record<string, NamedBackupMetadata[]>>(
(acc, snapshot) => {
const reason = getAccountSnapshotReason(snapshot.name) ?? "unknown";
acc[reason] = acc[reason] ?? [];
acc[reason].push(snapshot);
return acc;
},
{},
);

for (const [reason, entries] of Object.entries(grouped)) {
const sorted = [...entries].sort(
(a, b) => (b.updatedAt ?? 0) - (a.updatedAt ?? 0),
);
const keep = sorted.slice(0, ACCOUNT_SNAPSHOT_RETENTION_PER_REASON);
const drop = sorted.slice(ACCOUNT_SNAPSHOT_RETENTION_PER_REASON);
for (const snapshot of drop) {
try {
await fs.unlink(snapshot.path);
log.debug("Pruned snapshot backup", { path: snapshot.path });
} catch (error) {
const code = (error as NodeJS.ErrnoException).code;
if (code !== "ENOENT") {
log.warn("Failed to prune snapshot backup", {
path: snapshot.path,
reason,
error: String(error),
});
}
}
}
if (keep.length > 0) {
log.debug("Snapshot retention kept latest backups", {
reason,
kept: keep.map((entry) => entry.name),
});
}
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enforceSnapshotRetention silently deletes rollback-referenced snapshots

enforceSnapshotRetention keeps only the top ACCOUNT_SNAPSHOT_RETENTION_PER_REASON (3) snapshots per reason by updatedAt, but it never calls getLatestManualCodexCliRollbackSnapshotNames. pruneAutoGeneratedSnapshots correctly seeds its preserveNames set from that function — this one does not.

concrete scenario:

  1. manual codex-cli-sync creates rollback checkpoint snapshot C
  2. three subsequent auto syncs create D, E, F (same reason, same filename pattern)
  3. snapshotAccountStorage is called again → enforceSnapshotRetention keeps F, E, D and deletes C
  4. getLatestCodexCliSyncRollbackPlan now returns status: "error" / ENOENT — rollback is silently broken

pruneAutoGeneratedSnapshots would have preserved C because it holds a preserveNames set seeded from sync history. enforceSnapshotRetention has no such guard, creating an inconsistent invariant across two pruning paths operating on the same files.

the fix is to await getLatestManualCodexCliRollbackSnapshotNames() inside enforceSnapshotRetention and skip any snapshot whose name is in that set before calling fs.unlink.



@cubic-dev-ai cubic-dev-ai bot left a comment


1 issue found across 1 file (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="lib/storage.ts">

<violation number="1" location="lib/storage.ts:1609">
P2: Snapshot pruning uses a single `fs.unlink` attempt with no retry, so transient file-lock errors can prevent retention from actually removing old snapshots.</violation>
</file>


		const drop = sorted.slice(ACCOUNT_SNAPSHOT_RETENTION_PER_REASON);
		for (const snapshot of drop) {
			try {
				await fs.unlink(snapshot.path);

@cubic-dev-ai cubic-dev-ai bot Mar 13, 2026


P2: Snapshot pruning uses a single fs.unlink attempt with no retry, so transient file-lock errors can prevent retention from actually removing old snapshots.


<file context>
@@ -1569,6 +1584,50 @@ function buildAccountSnapshotName(
+		const drop = sorted.slice(ACCOUNT_SNAPSHOT_RETENTION_PER_REASON);
+		for (const snapshot of drop) {
+			try {
+				await fs.unlink(snapshot.path);
+				log.debug("Pruned snapshot backup", { path: snapshot.path });
+			} catch (error) {
</file context>
Suggested change
await fs.unlink(snapshot.path);
for (let attempt = 0; attempt < BACKUP_COPY_MAX_ATTEMPTS; attempt += 1) {
	try {
		await fs.unlink(snapshot.path);
		break;
	} catch (error) {
		const code = (error as NodeJS.ErrnoException).code;
		if (code === "ENOENT") {
			break;
		}
		const canRetry =
			(code === "EPERM" || code === "EBUSY") &&
			attempt + 1 < BACKUP_COPY_MAX_ATTEMPTS;
		if (canRetry) {
			await new Promise((resolve) =>
				setTimeout(resolve, BACKUP_COPY_BASE_DELAY_MS * 2 ** attempt),
			);
			continue;
		}
		throw error;
	}
}
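the retry behavior suggested above can be modeled self-contained against a deterministic flaky delete, the way a regression test would mock it. all names and constants below are hypothetical stand-ins, not the repo's `BACKUP_COPY_*` constants or its actual unlink path.

```typescript
// Illustrative model of the retry loop: bounded attempts, ENOENT treated as
// success, exponential backoff on EPERM/EBUSY, rethrow once attempts run out.
type ErrnoError = Error & { code?: string };
type UnlinkFn = (path: string) => Promise<void>;

async function unlinkWithRetry(
  unlink: UnlinkFn,
  path: string,
  maxAttempts = 3,
  baseDelayMs = 1,
): Promise<void> {
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    try {
      await unlink(path);
      return;
    } catch (error) {
      const code = (error as ErrnoError).code;
      if (code === "ENOENT") return; // already gone counts as success
      const canRetry =
        (code === "EPERM" || code === "EBUSY") && attempt + 1 < maxAttempts;
      if (!canRetry) throw error;
      // exponential backoff before the next attempt
      await new Promise((resolve) =>
        setTimeout(resolve, baseDelayMs * 2 ** attempt),
      );
    }
  }
}

// Deterministic mock: fails with EBUSY a fixed number of times, then succeeds.
function makeFlakyUnlink(failures: number): { fn: UnlinkFn; calls: () => number } {
  let calls = 0;
  const fn: UnlinkFn = async () => {
    calls += 1;
    if (calls <= failures) {
      const err: ErrnoError = new Error("resource busy");
      err.code = "EBUSY";
      throw err;
    }
  };
  return { fn, calls: () => calls };
}
```

a transient single EBUSY resolves on the second attempt; a persistent lock surfaces the EBUSY error after `maxAttempts` calls instead of being swallowed.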


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/codex-manager.ts`:
- Around line 3614-3621: The addCheck blocks that call existsSync(walPath)
should cache the result once (e.g., const walExists = existsSync(walPath)) and
use that single boolean for severity, message and details to avoid race windows
IO drift; flip the semantics so missing .wal is treated as normal/ok and
presence is treated as exceptional/warn (since lib/storage.ts deletes the
journal on successful save), apply this fix to both addCheck occurrences that
reference walPath, and add a vitest regression in test/codex-manager-cli.test.ts
asserting that storage without a .wal reports healthy/ok; ensure the change
handles potential EBUSY concurrency on Windows by reading the boolean once
rather than multiple existsSync calls.
- Around line 3799-3845: The doctor check emits different keys for one logical
check; make the key stable (e.g., always "codex-cli-rollback-checkpoint")
instead of switching to "codex-cli-rollback-checkpoint-missing" or "-invalid".
Update the logic around getLatestCodexCliSyncRollbackPlan() and the
addCheck(...) calls so they always use the same key, move state specifics into
severity (ready -> ok, missing -> error, invalid -> error) and/or into details
or a new status field, and ensure message/details include the original
isMissing/rollbackReason info; also update the affected vitest
(test/codex-manager-cli.test.ts) expectations to assert the stable key and the
new place where state is recorded.

In `@lib/storage.ts`:
- Around line 1772-1788: The code currently collects snapshots into pruned
before deletion, which reports them as removed even if fs.rm fails; change the
logic in the loop over autoSnapshots (using variables autoSnapshots, keepSet,
pruned and calling fs.rm) to perform the delete first with the same
retry/backoff strategy used in the existing retry block in lib/storage.ts (the
approach in the 277–331 block), only pushing snapshot.backup into pruned after a
successful delete, and log failures as errors (or keep debug) without marking
them pruned; also add a vitest in test/storage-recovery-paths.test.ts that mocks
fs.rm to return EBUSY on first attempts and succeed thereafter to validate the
retry behavior.
- Around line 1650-1652: The current production snapshot path calls
enforceSnapshotRetention() after createBackup; change that to call
pruneAutoGeneratedSnapshots({ keepLatestPerReason:
ACCOUNT_SNAPSHOT_RETENTION_PER_REASON }) so the rollback-aware pruner
(pruneAutoGeneratedSnapshots) is used instead of enforceSnapshotRetention;
update the code in the createBackup return path (the block that calls
createBackup) to import/use pruneAutoGeneratedSnapshots and
ACCOUNT_SNAPSHOT_RETENTION_PER_REASON, and add a vitest in
test/storage-recovery-paths.test.ts that exercises snapshotAccountStorage() to
ensure rollback checkpoints are preserved by the pruner.

In `@lib/sync-history.ts`:
- Around line 228-263: pruneSyncHistory currently calls readSyncHistory() while
inside withHistoryLock which can deadlock with pending appends; change
pruneSyncHistory to read and parse the history file directly inside the lock
(use getSyncHistoryPaths() -> paths.historyPath and fs.readFile + existing
deserialize/parsing helpers) instead of calling readSyncHistory, then proceed to
pruneSyncHistoryEntries/serializeEntry/writeFile as before; keep
rewriteLatestEntry, lastAppendPaths/lastAppendError updates. Add a vitest in
test/sync-history.test.ts that interleaves an append (the function that enqueues
pendingHistoryWrites / append logic) and pruneSyncHistory to reproduce the
ordering, and assert no deadlock and correct removed/kept/latest; also ensure
new queue behavior covers EBUSY/429 retry handling per lib/** concurrency
guidelines.

In `@test/codex-manager-cli.test.ts`:
- Around line 1954-1959: The test is looking for a check with key
"rollback-checkpoint" but the actual code path in runDoctor emits the key
"codex-cli-rollback-checkpoint"; update the assertion in the test (the
payload.checks.find call) to look for "codex-cli-rollback-checkpoint" instead so
it matches the emitted check name from runDoctor (see lib/codex-manager.ts
runDoctor emission and the test around payload.checks).

In `@test/storage-recovery-paths.test.ts`:
- Around line 605-677: Add a deterministic vitest regression that simulates a
transient Windows file-lock error during deletion: in
test/storage-recovery-paths.test.ts add a case (near the existing prune tests)
that uses a mocked fs.unlink/unlinkSync (or the async unlink used by
pruneAutoGeneratedSnapshots) which throws EBUSY/EPERM on the first call for a
specific backup path and then succeeds on subsequent calls; call
getNamedBackupsDirectoryPath() to create two snapshots and invoke
pruneAutoGeneratedSnapshots(), then assert that either the function retries and
the snapshot is eventually removed (expect existsSync(...) false) or, if
designed to retain on error, that the snapshot remains and appears in
result.kept (expect result.pruned/result.kept accordingly); ensure the mock is
deterministic and restored after the test so other tests aren’t affected.

In `@test/sync-history.test.ts`:
- Around line 152-186: Add a deterministic concurrency regression to the test by
introducing a deferred interleaving so pruneSyncHistory races a pending append:
modify the test around appendSyncHistoryEntry/pruneSyncHistory to start an
appendSyncHistoryEntry that uses a controlled deferred (promise you can resolve
from the test) to delay the actual disk write, then call pruneSyncHistory()
while that append is still pending, then resolve the deferred to let the write
finish and await both promises; assert both promises settle,
readLatestSyncHistorySync()/readSyncHistory() reflect the correct latest entry,
and the final history length is as expected. Ensure you use vitest-friendly
deferred utilities (or a small local Deferred) so the test is deterministic and
reproducible, and reference the functions appendSyncHistoryEntry,
pruneSyncHistory, readLatestSyncHistorySync, and readSyncHistory when making the
changes.
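the "small local Deferred" the prompt above mentions is a short, reusable pattern. the sketch below is a stand-in: a toy "append" gated on the deferred and a toy "prune" that drains the pending write first, in place of the real `appendSyncHistoryEntry`/`pruneSyncHistory` calls.

```typescript
// Minimal Deferred: a promise whose resolve/reject are exposed so a test can
// release a blocked async operation at a chosen point.
class Deferred<T = void> {
  promise: Promise<T>;
  resolve!: (value: T) => void;
  reject!: (reason?: unknown) => void;
  constructor() {
    this.promise = new Promise<T>((res, rej) => {
      this.resolve = res;
      this.reject = rej;
    });
  }
}

// Toy interleaving: start an append whose "disk write" is gated, start a
// prune that waits for pending writes, then release the gate.
async function demoInterleave(): Promise<string[]> {
  const gate = new Deferred();
  const order: string[] = [];
  const pendingAppend = (async () => {
    await gate.promise; // simulated delayed disk write
    order.push("append");
  })();
  const prune = (async () => {
    await pendingAppend; // prune drains pending writes before rewriting
    order.push("prune");
  })();
  gate.resolve(undefined); // release the write while prune is already waiting
  await Promise.all([pendingAppend, prune]);
  return order;
}
```

because the prune explicitly awaits the pending append, the ordering is deterministic: the append's write always lands before the prune runs, which is the property the real vitest case would assert.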

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2a0468d9-c033-43e6-847f-a3e7fec396e2

📥 Commits

Reviewing files that changed from the base of the PR and between 5444db0 and 06f6ffc.

📒 Files selected for processing (7)
  • lib/codex-manager.ts
  • lib/storage.ts
  • lib/sync-history.ts
  • test/codex-cli-sync.test.ts
  • test/codex-manager-cli.test.ts
  • test/storage-recovery-paths.test.ts
  • test/sync-history.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/storage-recovery-paths.test.ts
  • test/codex-cli-sync.test.ts
  • test/sync-history.test.ts
  • test/codex-manager-cli.test.ts
lib/**

⚙️ CodeRabbit configuration file

focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

Files:

  • lib/sync-history.ts
  • lib/codex-manager.ts
  • lib/storage.ts
🔇 Additional comments (1)
test/codex-cli-sync.test.ts (1)

982-1042: good public-api coverage for sync-history pruning.

this goes through pruneSyncHistory and readLatestSyncHistorySync instead of reaching into internals, which is the right integration boundary for test/codex-cli-sync.test.ts:982-1042.

Comment on lines +3614 to +3621
addCheck({
key: "storage-journal",
severity: existsSync(walPath) ? "ok" : "warn",
message: existsSync(walPath)
? "Write-ahead journal found"
: "Write-ahead journal missing; recovery will rely on backups",
details: walPath,
});

⚠️ Potential issue | 🟠 Major

flip the wal check semantics and snapshot the file state once.

lib/codex-manager.ts:3614-3621 and lib/codex-manager.ts:3673-3685 treat a missing .wal as degraded, but lib/storage.ts:2154-2177 deletes that journal after a successful save. that means the normal healthy state will usually show a warning here. the repeated existsSync() calls also let severity, message, and details drift if the file changes between checks on windows. please cache the booleans once and make wal presence the exceptional state. add a vitest regression in test/codex-manager-cli.test.ts for healthy storage with no wal.

suggested fix
 	setStoragePath(null);
 	const storagePath = getStoragePath();
 	const walPath = `${storagePath}.wal`;
+	const storageExists = existsSync(storagePath);
+	const walExists = existsSync(walPath);
 	const checks: DoctorCheck[] = [];
 	const addCheck = (check: DoctorCheck): void => {
 		checks.push(check);
 	};
@@
 	addCheck({
 		key: "storage-journal",
-		severity: existsSync(walPath) ? "ok" : "warn",
-		message: existsSync(walPath)
-			? "Write-ahead journal found"
-			: "Write-ahead journal missing; recovery will rely on backups",
+		severity: walExists ? "warn" : "ok",
+		message: walExists
+			? "Write-ahead journal present; a previous write may have been interrupted"
+			: "No write-ahead journal present",
 		details: walPath,
 	});
@@
 	const hasAnyRecoveryArtifact =
-		existsSync(storagePath) ||
-		existsSync(walPath) ||
+		storageExists ||
+		walExists ||
 		validRotatingBackups.length > 0 ||
 		validSnapshots.length > 0;
@@
-		details: `storage=${existsSync(storagePath)}, wal=${existsSync(walPath)}, rotating=${validRotatingBackups.length}, snapshots=${validSnapshots.length}`,
+		details: `storage=${storageExists}, wal=${walExists}, rotating=${validRotatingBackups.length}, snapshots=${validSnapshots.length}`,
 	});
As per coding guidelines, `lib/**`: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios.

Also applies to: 3673-3685


Comment on lines +3799 to +3845
const rollbackPlan = await getLatestCodexCliSyncRollbackPlan();
const rollbackReason = (rollbackPlan.reason ?? "").trim();
const rollbackSnapshotPath = rollbackPlan.snapshot?.path;
const rollbackSnapshotName = rollbackPlan.snapshot?.name;
const rollbackReasonLower = rollbackReason.toLowerCase();
if (rollbackPlan.status === "ready") {
	const count =
		rollbackPlan.accountCount ?? rollbackPlan.storage?.accounts.length;
	addCheck({
		key: "codex-cli-rollback-checkpoint",
		severity: "ok",
		message: `Latest manual Codex CLI rollback checkpoint ready (${count ?? "?"} account${count === 1 ? "" : "s"})`,
		details: rollbackSnapshotPath,
	});
} else if (rollbackPlan.snapshot) {
	const isMissing =
		rollbackReasonLower.includes("missing") ||
		rollbackReasonLower.includes("not found") ||
		rollbackReasonLower.includes("enoent");
	const key = isMissing
		? "codex-cli-rollback-checkpoint-missing"
		: "codex-cli-rollback-checkpoint-invalid";
	const recoveryAction = isMissing
		? "Action: Recreate the rollback checkpoint by running a manual Codex CLI sync with storage backups enabled."
		: "Action: Recreate the rollback checkpoint with a fresh manual Codex CLI sync before attempting rollback.";
	const detailParts = [
		rollbackSnapshotPath ?? rollbackSnapshotName,
		rollbackReason || undefined,
		recoveryAction,
	].filter(Boolean);
	addCheck({
		key,
		severity: "error",
		message: isMissing
			? "Latest manual Codex CLI rollback checkpoint is missing; rollback is blocked."
			: "Latest manual Codex CLI rollback checkpoint cannot be restored.",
		details: detailParts.join(" | ") || undefined,
	});
} else {
	addCheck({
		key: "codex-cli-rollback-checkpoint",
		severity: "warn",
		message: "No manual Codex CLI rollback checkpoint has been recorded yet",
		details:
			"Action: Run a manual Codex CLI sync with backups enabled to capture a rollback checkpoint before applying changes.",
	});
}

⚠️ Potential issue | 🟠 Major

keep the rollback checkpoint key stable across states.

lib/codex-manager.ts:3807-3844 emits different key values for one logical doctor check. that makes the json harder to automate against, and it already conflicts with the existing vitest expectation called out for test/codex-manager-cli.test.ts. keep one stable key and move ready / missing / invalid into severity, details, or a dedicated status field.

suggested fix
 	} else if (rollbackPlan.snapshot) {
 		const isMissing =
 			rollbackReasonLower.includes("missing") ||
 			rollbackReasonLower.includes("not found") ||
 			rollbackReasonLower.includes("enoent");
-		const key = isMissing
-			? "codex-cli-rollback-checkpoint-missing"
-			: "codex-cli-rollback-checkpoint-invalid";
 		const recoveryAction = isMissing
 			? "Action: Recreate the rollback checkpoint by running a manual Codex CLI sync with storage backups enabled."
 			: "Action: Recreate the rollback checkpoint with a fresh manual Codex CLI sync before attempting rollback.";
 		const detailParts = [
+			`status=${isMissing ? "missing" : "invalid"}`,
 			rollbackSnapshotPath ?? rollbackSnapshotName,
 			rollbackReason || undefined,
 			recoveryAction,
 		].filter(Boolean);
 		addCheck({
-			key,
+			key: "codex-cli-rollback-checkpoint",
 			severity: "error",
 			message: isMissing
 				? "Latest manual Codex CLI rollback checkpoint is missing; rollback is blocked."
 				: "Latest manual Codex CLI rollback checkpoint cannot be restored.",
 			details: detailParts.join(" | ") || undefined,
 		});
As per coding guidelines, `lib/**`: verify every change cites affected tests (vitest).

Comment on lines +1650 to +1652
const snapshot = await createBackup(backupName, { force });
await enforceSnapshotRetention();
return snapshot;

⚠️ Potential issue | 🔴 Critical

wire snapshot creation through the rollback-aware pruner.

lib/storage.ts:1650-1652 still calls enforceSnapshotRetention(), but the rollback-preserving logic lives in lib/storage.ts:1739-1795, not lib/storage.ts:1587-1629. because lib/codex-cli/sync.ts:190-196 records rollback checkpoints via snapshotAccountStorage(), later codex-cli-sync snapshots can delete the latest manual rollback checkpoint out from under the doctor and restore flow. please route the production path through pruneAutoGeneratedSnapshots({ keepLatestPerReason: ACCOUNT_SNAPSHOT_RETENTION_PER_REASON }) and add a vitest that exercises snapshotAccountStorage() itself in test/storage-recovery-paths.test.ts.

suggested fix
 	try {
 		const snapshot = await createBackup(backupName, { force });
-		await enforceSnapshotRetention();
+		await pruneAutoGeneratedSnapshots({
+			keepLatestPerReason: ACCOUNT_SNAPSHOT_RETENTION_PER_REASON,
+		});
 		return snapshot;
 	} catch (error) {
As per coding guidelines, `lib/**`: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest).

Comment on lines +1772 to +1788
const pruned: NamedBackupMetadata[] = [];
for (const snapshot of autoSnapshots) {
	if (keepSet.has(snapshot.name)) continue;
	pruned.push(snapshot.backup);
}

for (const backup of pruned) {
	try {
		await fs.rm(backup.path, { force: true });
	} catch (error) {
		log.debug("Failed to prune auto-generated snapshot", {
			name: backup.name,
			path: backup.path,
			error: String(error),
		});
	}
}

⚠️ Potential issue | 🟠 Major

do not report snapshots as pruned before the delete succeeds.

lib/storage.ts:1772-1788 adds every candidate to pruned before fs.rm() runs, then downgrades delete failures to a debug log. on windows, EPERM/EBUSY from antivirus or concurrent readers will leave the file on disk while callers get a false success. only append to pruned after a successful delete, and use the same retry/backoff approach you already use in lib/storage.ts:277-331. please add a vitest in test/storage-recovery-paths.test.ts that mocks an EBUSY delete.

suggested fix
-	const pruned: NamedBackupMetadata[] = [];
-	for (const snapshot of autoSnapshots) {
-		if (keepSet.has(snapshot.name)) continue;
-		pruned.push(snapshot.backup);
-	}
-
-	for (const backup of pruned) {
+	const pruned: NamedBackupMetadata[] = [];
+	const pruneCandidates = autoSnapshots
+		.filter((snapshot) => !keepSet.has(snapshot.name))
+		.map((snapshot) => snapshot.backup);
+
+	for (const backup of pruneCandidates) {
 		try {
-			await fs.rm(backup.path, { force: true });
+			await removeFileWithRetry(backup.path);
+			pruned.push(backup);
 		} catch (error) {
-			log.debug("Failed to prune auto-generated snapshot", {
+			log.warn("Failed to prune auto-generated snapshot", {
 				name: backup.name,
 				path: backup.path,
 				error: String(error),
 			});
 		}
 	}
As per coding guidelines, `lib/**`: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios.

Comment on lines +228 to +263
export async function pruneSyncHistory(
	options: { maxEntries?: number } = {},
): Promise<{ removed: number; kept: number; latest: SyncHistoryEntry | null }> {
	const maxEntries = options.maxEntries ?? SYNC_HISTORY_MAX_ENTRIES;
	await waitForPendingHistoryWrites();
	return withHistoryLock(async () => {
		const paths = getSyncHistoryPaths();
		await ensureHistoryDir(paths.directory);
		let entries: SyncHistoryEntry[] = [];
		try {
			entries = await readSyncHistory();
		} catch {
			entries = [];
		}

		const {
			entries: prunedEntries,
			removed,
			latest,
		} = pruneSyncHistoryEntries(entries, maxEntries);
		if (prunedEntries.length === 0) {
			await fs.rm(paths.historyPath, { force: true });
		} else {
			const serialized =
				prunedEntries.map((entry) => serializeEntry(entry)).join("\n") +
				(prunedEntries.length > 0 ? "\n" : "");
			await fs.writeFile(paths.historyPath, serialized, {
				encoding: "utf8",
				mode: 0o600,
			});
		}
		await rewriteLatestEntry(latest, paths);
		lastAppendPaths = paths;
		lastAppendError = null;
		return { removed, kept: prunedEntries.length, latest };
	});

⚠️ Potential issue | 🔴 Critical

avoid calling readSyncHistory() while holding the history lock.

lib/sync-history.ts:233-239 can deadlock with lib/sync-history.ts:149-180 and lib/sync-history.ts:182-211. once prune holds withHistoryLock(), a concurrent append is added to pendingHistoryWrites but cannot run until prune exits; readSyncHistory() then waits for that pending append forever. read and parse paths.historyPath directly inside the locked section instead. please add a vitest in test/sync-history.test.ts that interleaves append and prune.

suggested fix
 	return withHistoryLock(async () => {
 		const paths = getSyncHistoryPaths();
 		await ensureHistoryDir(paths.directory);
 		let entries: SyncHistoryEntry[] = [];
 		try {
-			entries = await readSyncHistory();
-		} catch {
-			entries = [];
+			const content = await fs.readFile(paths.historyPath, "utf8");
+			entries = content
+				.split(/\r?\n/)
+				.map((line) => line.trim())
+				.filter(Boolean)
+				.map((line) => JSON.parse(line) as SyncHistoryEntry);
+		} catch (error) {
+			if ((error as NodeJS.ErrnoException).code !== "ENOENT") {
+				throw error;
+			}
 		}
As per coding guidelines, `lib/**`: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/sync-history.ts` around lines 228 - 263, pruneSyncHistory currently calls
readSyncHistory() while inside withHistoryLock which can deadlock with pending
appends; change pruneSyncHistory to read and parse the history file directly
inside the lock (use getSyncHistoryPaths() -> paths.historyPath and fs.readFile
+ existing deserialize/parsing helpers) instead of calling readSyncHistory, then
proceed to pruneSyncHistoryEntries/serializeEntry/writeFile as before; keep
rewriteLatestEntry, lastAppendPaths/lastAppendError updates. Add a vitest in
test/sync-history.test.ts that interleaves an append (the function that enqueues
pendingHistoryWrites / append logic) and pruneSyncHistory to reproduce the
ordering, and assert no deadlock and correct removed/kept/latest; also ensure
new queue behavior covers EBUSY/429 retry handling per lib/** concurrency
guidelines.
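the deadlock this prompt describes can be reproduced with a minimal promise-chain mutex. this is a sketch under assumptions: `makeLock`/`withLock` are illustrative stand-ins for the library's actual `withHistoryLock`, whose real implementation may differ.

```typescript
// Sketch of a promise-chain mutex like withHistoryLock (names illustrative,
// not the library's actual implementation).
function makeLock() {
	let chain: Promise<unknown> = Promise.resolve();
	return function withLock<T>(fn: () => Promise<T>): Promise<T> {
		const run = chain.then(fn);
		chain = run.catch(() => undefined); // keep the chain alive after errors
		return run;
	};
}

// Re-entrant acquisition deadlocks: the inner withLock queues behind the
// outer call, and the outer call cannot finish until the inner one resolves.
const lockA = makeLock();
const nested = lockA(async () => lockA(async () => "inner"));
const raced = await Promise.race([
	nested,
	new Promise<string>((resolve) => setTimeout(() => resolve("deadlock"), 50)),
]);

// The fix the prompt asks for: do the file read directly inside a single
// lock acquisition instead of calling a helper that re-acquires the lock.
const lockB = makeLock();
const fixed = await lockB(async () => "read history file directly");
```

this is why reading `paths.historyPath` with `fs.readFile` inside the lock is safe while calling `readSyncHistory()` (which would acquire the same lock again) is not.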

Comment on lines +1954 to +1959
const checkpoint = payload.checks.find(
(check) => check.key === "rollback-checkpoint",
);
expect(checkpoint).toBeDefined();
expect(checkpoint?.severity).toBe("ok");
expect(checkpoint?.details).toBe("/mock/backups/rollback.json");
⚠️ Potential issue | 🟠 Major

use the emitted doctor key here.

runDoctor emits codex-cli-rollback-checkpoint, not rollback-checkpoint, so this assertion fails against the actual json payload in lib/codex-manager.ts:3799-3810 and test/codex-manager-cli.test.ts:1954-1959.

suggested fix
-		const checkpoint = payload.checks.find(
-			(check) => check.key === "rollback-checkpoint",
-		);
+		const checkpoint = payload.checks.find(
+			(check) => check.key === "codex-cli-rollback-checkpoint",
+		);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-		const checkpoint = payload.checks.find(
-			(check) => check.key === "rollback-checkpoint",
-		);
+		const checkpoint = payload.checks.find(
+			(check) => check.key === "codex-cli-rollback-checkpoint",
+		);
 		expect(checkpoint).toBeDefined();
 		expect(checkpoint?.severity).toBe("ok");
 		expect(checkpoint?.details).toBe("/mock/backups/rollback.json");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/codex-manager-cli.test.ts` around lines 1954 - 1959, The test is looking
for a check with key "rollback-checkpoint" but the actual code path in runDoctor
emits the key "codex-cli-rollback-checkpoint"; update the assertion in the test
(the payload.checks.find call) to look for "codex-cli-rollback-checkpoint"
instead so it matches the emitted check name from runDoctor (see
lib/codex-manager.ts runDoctor emission and the test around payload.checks).

Comment on lines +605 to +677
it("prunes older auto snapshots while keeping latest and manual backups", async () => {
const backupsDir = getNamedBackupsDirectoryPath();
await fs.mkdir(backupsDir, { recursive: true });
const payload = {
version: 3,
activeIndex: 0,
accounts: [{ refreshToken: "rt-a" }],
};
const olderName = "accounts-codex-cli-sync-snapshot-2026-03-10_00-00-00";
const olderPath = join(backupsDir, `${olderName}.json`);
const newerName = "accounts-codex-cli-sync-snapshot-2026-03-12_00-00-00";
const newerPath = join(backupsDir, `${newerName}.json`);
const manualPath = join(backupsDir, "manual-checkpoint.json");
await fs.writeFile(olderPath, JSON.stringify(payload), "utf-8");
await fs.writeFile(newerPath, JSON.stringify(payload), "utf-8");
await fs.writeFile(manualPath, JSON.stringify(payload), "utf-8");

const result = await pruneAutoGeneratedSnapshots();
expect(result.pruned.map((entry) => entry.name)).toContain(olderName);
expect(result.kept.map((entry) => entry.name)).toContain(newerName);
expect(existsSync(olderPath)).toBe(false);
expect(existsSync(newerPath)).toBe(true);
expect(existsSync(manualPath)).toBe(true);
});

it("retains rollback-referenced snapshots when pruning auto-generated backups", async () => {
const backupsDir = getNamedBackupsDirectoryPath();
const logsDir = join(workDir, "logs");
configureSyncHistoryForTests(logsDir);
await fs.mkdir(backupsDir, { recursive: true });
await fs.mkdir(logsDir, { recursive: true });
const payload = {
version: 3,
activeIndex: 0,
accounts: [{ refreshToken: "rt-b" }],
};
const referencedName =
"accounts-codex-cli-sync-snapshot-2026-03-11_00-00-00";
const referencedPath = join(backupsDir, `${referencedName}.json`);
const staleName = "accounts-codex-cli-sync-snapshot-2026-03-09_00-00-00";
const stalePath = join(backupsDir, `${staleName}.json`);
await fs.writeFile(referencedPath, JSON.stringify(payload), "utf-8");
await fs.writeFile(stalePath, JSON.stringify(payload), "utf-8");

await appendSyncHistoryEntry({
kind: "codex-cli-sync",
recordedAt: Date.now(),
run: {
outcome: "changed",
runAt: Date.now(),
sourcePath: storagePath,
targetPath: storagePath,
summary: {
sourceAccountCount: 0,
targetAccountCountBefore: 0,
targetAccountCountAfter: 0,
addedAccountCount: 0,
updatedAccountCount: 0,
unchangedAccountCount: 0,
destinationOnlyPreservedCount: 0,
selectionChanged: false,
},
trigger: "manual",
rollbackSnapshot: { name: referencedName, path: referencedPath },
},
});

const result = await pruneAutoGeneratedSnapshots();
expect(result.kept.map((entry) => entry.name)).toContain(referencedName);
expect(result.pruned.map((entry) => entry.name)).toContain(staleName);
expect(existsSync(referencedPath)).toBe(true);
expect(existsSync(stalePath)).toBe(false);
});
⚠️ Potential issue | 🟠 Major

add a windows ebusy/eperm prune regression.

these cases only cover successful deletes in test/storage-recovery-paths.test.ts:605-677. they will not catch the windows file-lock path where pruneAutoGeneratedSnapshots hits transient EBUSY/EPERM during unlink and either needs retries or must leave the snapshot intact. add a deterministic fs mock here that fails the first delete attempt and verifies the retry or safe-retain behavior.

As per coding guidelines, "tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/storage-recovery-paths.test.ts` around lines 605 - 677, Add a
deterministic vitest regression that simulates a transient Windows file-lock
error during deletion: in test/storage-recovery-paths.test.ts add a case (near
the existing prune tests) that uses a mocked fs.unlink/unlinkSync (or the async
unlink used by pruneAutoGeneratedSnapshots) which throws EBUSY/EPERM on the
first call for a specific backup path and then succeeds on subsequent calls;
call getNamedBackupsDirectoryPath() to create two snapshots and invoke
pruneAutoGeneratedSnapshots(), then assert that either the function retries and
the snapshot is eventually removed (expect existsSync(...) false) or, if
designed to retain on error, that the snapshot remains and appears in
result.kept (expect result.pruned/result.kept accordingly); ensure the mock is
deterministic and restored after the test so other tests aren’t affected.
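the retry-or-retain behavior this prompt asks for can be sketched deterministically. assumptions are labeled: `retryUnlink` and the flaky unlink are hypothetical helpers for illustration; the real test would mock the unlink call used by `pruneAutoGeneratedSnapshots`.

```typescript
// Hypothetical retry helper (not the library's API): retries transient
// Windows EBUSY/EPERM unlink failures with a brief backoff, and retains the
// file (returns false) rather than failing the whole prune when retries run out.
type ErrnoError = Error & { code?: string };
type Unlink = (path: string) => Promise<void>;

async function retryUnlink(unlink: Unlink, path: string, attempts = 3): Promise<boolean> {
	for (let attempt = 1; attempt <= attempts; attempt++) {
		try {
			await unlink(path);
			return true; // deleted
		} catch (error) {
			const code = (error as ErrnoError).code;
			if (code !== "EBUSY" && code !== "EPERM") throw error;
			await new Promise((resolve) => setTimeout(resolve, 10 * attempt));
		}
	}
	return false; // leave the snapshot intact rather than abort the prune
}

// Deterministic fake: the first delete attempt hits a file lock, the second succeeds.
let calls = 0;
const flakyUnlink: Unlink = async () => {
	calls += 1;
	if (calls === 1) {
		const error: ErrnoError = new Error("resource busy or locked");
		error.code = "EBUSY";
		throw error;
	}
};

const deleted = await retryUnlink(flakyUnlink, "accounts-codex-cli-sync-snapshot.json");
```

in the vitest version, the fake would be installed with `vi.spyOn` on the fs module and restored after the test so other cases are unaffected.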

Comment on lines +152 to +186
it("keeps latest entry on disk after pruning and mirrors latest file", async () => {
await appendSyncHistoryEntry({
kind: "codex-cli-sync",
recordedAt: 1,
run: createCodexRun(1, "/source-1"),
});
await appendSyncHistoryEntry({
kind: "live-account-sync",
recordedAt: 2,
reason: "watch",
outcome: "success",
path: "/watch-1",
snapshot: createLiveSnapshot(2),
});
await appendSyncHistoryEntry({
kind: "codex-cli-sync",
recordedAt: 3,
run: createCodexRun(3, "/source-2"),
});
await appendSyncHistoryEntry({
kind: "live-account-sync",
recordedAt: 4,
reason: "poll",
outcome: "error",
path: "/poll-2",
snapshot: createLiveSnapshot(4),
});

const result = await pruneSyncHistory({ maxEntries: 1 });
const latestOnDisk = readLatestSyncHistorySync();
expect(latestOnDisk).toEqual(result.latest);
const history = await readSyncHistory();
expect(history.at(-1)).toEqual(result.latest);
expect(history).toHaveLength(2);
});
⚠️ Potential issue | 🟠 Major

add a concurrent append/prune regression here.

this only exercises pruneSyncHistory() after every appendSyncHistoryEntry() has already finished in test/sync-history.test.ts:152-186, so it will stay green even if the prune path deadlocks when it races a pending history write. add a deterministic deferred interleaving that starts an append, invokes prune before the write completes, then releases the write and asserts both promises settle.

As per coding guidelines, "tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/sync-history.test.ts` around lines 152 - 186, Add a deterministic
concurrency regression to the test by introducing a deferred interleaving so
pruneSyncHistory races a pending append: modify the test around
appendSyncHistoryEntry/pruneSyncHistory to start an appendSyncHistoryEntry that
uses a controlled deferred (promise you can resolve from the test) to delay the
actual disk write, then call pruneSyncHistory() while that append is still
pending, then resolve the deferred to let the write finish and await both
promises; assert both promises settle,
readLatestSyncHistorySync()/readSyncHistory() reflect the correct latest entry,
and the final history length is as expected. Ensure you use vitest-friendly
deferred utilities (or a small local Deferred) so the test is deterministic and
reproducible, and reference the functions appendSyncHistoryEntry,
pruneSyncHistory, readLatestSyncHistorySync, and readSyncHistory when making the
changes.
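the deferred interleaving this prompt describes can be sketched as a self-contained model. this is illustrative only: the simulated store and gate stand in for the real `appendSyncHistoryEntry`/`pruneSyncHistory`; in the actual test the gate would stall the mocked disk write instead.

```typescript
// Minimal Deferred so the test controls exactly when the "disk write" finishes.
class Deferred<T = void> {
	promise: Promise<T>;
	resolve!: (value: T) => void;
	constructor() {
		this.promise = new Promise<T>((res) => (this.resolve = res));
	}
}

// Simulated history store; the gate models an append whose write has not
// reached disk yet (illustrative, not the library's implementation).
const history: number[] = [];
const gate = new Deferred();

async function appendEntry(entry: number): Promise<void> {
	await gate.promise; // write is "in flight" until the test releases it
	history.push(entry);
}

async function prune(maxEntries: number): Promise<number[]> {
	await gate.promise; // a correct prune waits out pending writes first
	return history.splice(0, Math.max(0, history.length - maxEntries));
}

// Start the append, invoke prune before the write completes, then release
// the write and assert both promises settle.
const appendPromise = appendEntry(1);
const prunePromise = prune(1);
gate.resolve();
const [, removed] = await Promise.all([appendPromise, prunePromise]);
```

because both calls await the same gate, their continuations run in registration order (append first), which keeps the interleaving deterministic across runs.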
