Skip to content

feat(auth): add backup restore manager#76

Open
ndycode wants to merge 5 commits intogit-plan/01-reset-safetyfrom
git-plan/02-backup-restore-manager
Open

feat(auth): add backup restore manager#76
ndycode wants to merge 5 commits intogit-plan/01-reset-safetyfrom
git-plan/02-backup-restore-manager

Conversation

@ndycode
Copy link
Owner

@ndycode ndycode commented Mar 12, 2026

Summary

  • Add named-backup metadata and restore assessment primitives on top of existing import/export flows.
  • Add a first-class backup restore manager with explicit confirmation.
  • Cover storage and CLI restore-manager flows.

Summary by cubic

Adds a backup restore manager to the auth flow so users can list, assess, and restore named backups with clear status, confirmation, and safer path handling. Also hardens restore paths and makes the CLI catch and report restore failures.

  • New Features

    • CLI: Adds “Recovery → Restore From Backup” menu that lists backups with account counts and “updated X” timing, requires confirmation, and returns to login on failure.
    • Storage: New APIs listNamedBackups, assessNamedBackupRestore, createNamedBackup, restoreNamedBackup, and getNamedBackupsDirectoryPath. Supports manual filenames already in the backups dir, includes metadata (created/updated, size, version), and enforces limits with importAccounts.
    • Docs: Adds global and project-scoped backup paths.
  • Safety

    • Hardened restore path resolution to contain names within the backups directory; rejects escaping names.
    • Keeps WAL journaling and rotating backups; improves error messages.
    • Tests: Add CLI restore flow and failure handling, path containment checks, and Windows-safe cleanup retries.

Written for commit 7443074. Summary will update on new commits.

note: greptile review for oc-chatgpt-multi-auth. cite files like lib/foo.ts:123. confirm regression tests + windows concurrency/token redaction coverage.

Greptile Summary

adds a first-class backup restore manager to the auth login flow: named backups are discovered by scanning ~/.codex/multi-auth/backups/, assessed against the current account limit (pre-dedup merge count shown in the confirm prompt), and restored via the existing importAccounts transaction path with a new try/catch in runAuthLogin. the transactionSnapshotContext is also patched to capture storagePath at transaction start so exportAccounts no longer blindly reuses a stale snapshot from a different storage scope.

  • lib/storage.ts — new exported apis listNamedBackups, assessNamedBackupRestore, createNamedBackup, restoreNamedBackup, getNamedBackupsDirectoryPath; storagePath guard added to exportAccounts to prevent cross-scope snapshot reuse; path traversal is fully blocked via normalizeNamedBackupFileName + assertWithinDirectory in named-backup-export.ts
  • lib/codex-manager.tsrunBackupRestoreManager batches assessments concurrently, shows a selection menu with per-backup status/count/age hints, confirms before calling restoreNamedBackup, and is wrapped in a top-level try/catch so errors return to the login loop rather than crashing the session
  • lib/ui/auth-menu.ts / lib/ui/copy.ts — "Recovery → Restore From Backup" section added above "Danger Zone"; existing menu focus-key routing updated
  • test/storage.test.ts — new named-backup tests correctly inherit the removeWithRetry afterEach; path-traversal rejection is covered; a concurrent-restore race test is missing
  • windows / token safety: assertWithinDirectory uses lstatSync+realpathSync with case-insensitive comparison on win32 and refuses symlinked backup roots; the storage mutex serialises concurrent restores; redaction policy unchanged (paths logged via log.warn, tokens not included in backup metadata)

Confidence Score: 3/5

  • mostly safe but the new storagePath deadlock path in exportAccounts needs to be addressed before merging
  • the path-traversal and windows safety story is solid; the menu error-handling is fixed; the deadlock regression in exportAccounts when storagePath differs inside an active transaction is a real (if rare) logic bug introduced in this PR; the missing concurrent-restore test leaves a concurrency guarantee unverified on windows CI
  • lib/storage.ts — exportAccounts deadlock risk at the transactionState.storagePath path-equality guard

Important Files Changed

Filename Overview
lib/storage.ts adds listNamedBackups, assessNamedBackupRestore, createNamedBackup, restoreNamedBackup, and getNamedBackupsDirectoryPath; fixes stale transaction snapshot bug via storagePath capture — but the new path-equality guard in exportAccounts creates a potential deadlock if setStoragePath is ever called while a transaction is active (theoretical on prod, realistic in windows CI tests)
lib/codex-manager.ts adds runBackupRestoreManager with batch assessment, confirm prompt, and proper try/catch around the whole menu entry; formatRelativeDateShort handles clock-skew future timestamps correctly by collapsing them to "today"
lib/ui/auth-menu.ts adds restore-backup action type and "Recovery" section to the auth menu; change is straightforward and consistent with existing menu construction patterns
test/codex-manager-cli.test.ts adds mocks for all four new storage APIs and confirm; two new integration tests cover the happy-path restore and error-recovery flows; good coverage of the new menu branch
test/storage.test.ts new named-backup tests live inside the existing import/export suite and correctly inherit the removeWithRetry afterEach; path-traversal rejection test is present; missing a concurrent-restore regression test

Sequence Diagram

sequenceDiagram
    participant User
    participant AuthMenu
    participant BackupRestoreManager
    participant Storage

    User->>AuthMenu: select "Restore From Backup"
    AuthMenu->>BackupRestoreManager: runBackupRestoreManager()
    BackupRestoreManager->>Storage: listNamedBackups()
    Storage-->>BackupRestoreManager: NamedBackupMetadata[]
    BackupRestoreManager->>Storage: loadAccounts()
    Storage-->>BackupRestoreManager: currentStorage
    BackupRestoreManager->>Storage: assessNamedBackupRestore(name, {currentStorage}) [parallel]
    Storage-->>BackupRestoreManager: BackupRestoreAssessment[]
    BackupRestoreManager->>User: select(backups with status/count/age hints)
    User-->>BackupRestoreManager: { type: "restore", assessment }
    BackupRestoreManager->>User: confirm("merge N accounts, M after dedupe?")
    User-->>BackupRestoreManager: true
    BackupRestoreManager->>Storage: restoreNamedBackup(name)
    Storage->>Storage: resolveNamedBackupRestorePath(name)
    Storage->>Storage: importAccounts(backupPath) [inside storageMutex]
    Storage-->>BackupRestoreManager: { imported, skipped, total }
    BackupRestoreManager-->>User: "Restored backup. Imported N, skipped M, total T."
Loading

Comments Outside Diff (1)

  1. lib/storage.ts, line 2597-2604 (link)

    deadlock when storagePath changes inside an active transaction

    the new path-equality guard falls through to withAccountStorageTransaction when transactionState.active === true but transactionState.storagePath !== currentStoragePath. withAccountStorageTransaction calls withStorageLock, which chains onto storageMutex. if we are already inside a withStorageLock invocation (which is how we got here with an active transaction), the new work item is queued behind the lock we are currently holding — the outer lock waits for the inner lock that can never run → deadlock.

    this matters in windows CI where tests call setStoragePathDirect between operations and have occasionally raced into cross-path export paths.

    the safer fallback is to throw or return early with an explicit error rather than silently re-entering the lock:

Fix All in Codex

Prompt To Fix All With AI
This is a comment left during a code review.
Path: lib/storage.ts
Line: 2597-2604

Comment:
**deadlock when `storagePath` changes inside an active transaction**

the new path-equality guard falls through to `withAccountStorageTransaction` when `transactionState.active === true` but `transactionState.storagePath !== currentStoragePath`. `withAccountStorageTransaction` calls `withStorageLock`, which chains onto `storageMutex`. if we are already inside a `withStorageLock` invocation (which is how we got here with an active transaction), the new work item is queued behind the lock we are currently holding — the outer lock waits for the inner lock that can never run → deadlock.

this matters in windows CI where tests call `setStoragePathDirect` between operations and have occasionally raced into cross-path export paths.

the safer fallback is to throw or return early with an explicit error rather than silently re-entering the lock:

```suggestion
	const currentStoragePath = getStoragePath();
	const storage =
		transactionState?.active &&
		transactionState.storagePath === currentStoragePath
			? transactionState.snapshot
			: transactionState?.active
				? (() => {
						throw new Error(
							"exportAccounts called inside a transaction for a different storage path",
						);
					})()
				: await withAccountStorageTransaction((current) =>
						Promise.resolve(current),
					);
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: lib/storage.ts
Line: 1775-1781

Comment:
**misleading error when a manually-named backup disappears between assess and restore**

`findExistingNamedBackupPath` returns `undefined` when the file no longer exists (was deleted between assessment and restore). the fallback `buildNamedBackupPath(name)` calls `normalizeNamedBackupFileName(name)`, which throws `"Backup filename may only contain letters, numbers, hyphens, and underscores"` for any name containing a space (e.g. `"Manual Backup"`).

the test suite explicitly covers placing a file called `"Manual Backup.json"` manually and restoring it. if that file is deleted between assessment and restore, the user sees a character-validation error instead of a "file not found" message — confusing on windows where antivirus briefly locks and then deletes files.

consider making `resolveNamedBackupRestorePath` preserve the original intent:

```typescript
async function resolveNamedBackupRestorePath(name: string): Promise<string> {
  const existingPath = await findExistingNamedBackupPath(name);
  if (existingPath) {
    return existingPath;
  }
  // only use strict normalization for programmatic names; surface a
  // clear error for manual names (e.g. containing spaces) that have gone missing
  try {
    return buildNamedBackupPath(name);
  } catch {
    const backupRoot = getNamedBackupRoot(getStoragePath());
    throw new Error(`Import file not found in ${backupRoot} for backup "${name}"`);
  }
}
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: test/storage.test.ts
Line: 1159-1166

Comment:
**missing concurrency regression test for simultaneous restore calls**

the new suite covers single-threaded assess → restore and path-traversal rejection, but there is no test that fires two concurrent `restoreNamedBackup` calls after both assessments independently pass the `wouldExceedLimit` check. `importAccounts` holds the storage lock and re-checks the limit, so both cannot succeed — but on windows, where antivirus-induced `EBUSY` on the WAL rename can cause the mutex to stall, the second restore may observe a stale snapshot and still proceed past the limit gate.

adding a test like this would confirm the transactional guard holds end-to-end:

```typescript
it("concurrent restores are serialised and only one succeeds when limit is tight", async () => {
  // seed exactly MAX_ACCOUNTS - 1 accounts so a single restore would push to the limit
  // but two concurrent restores cannot both succeed
  const [r1, r2] = await Promise.allSettled([
    restoreNamedBackup("backup-a"),
    restoreNamedBackup("backup-b"),
  ]);
  const succeeded = [r1, r2].filter((r) => r.status === "fulfilled").length;
  expect(succeeded).toBe(1);
});
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 7443074

Context used:

  • Rule used - What: Every code change must explain how it defend... (source)

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 12, 2026

Note

Currently processing new changes in this PR. This may take a few minutes, please wait...

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f8fa6ad5-4c6c-4110-af7b-d6c8d1c2f62c

📥 Commits

Reviewing files that changed from the base of the PR and between d67ca3d and 7443074.

📒 Files selected for processing (8)
  • docs/reference/storage-paths.md
  • lib/cli.ts
  • lib/codex-manager.ts
  • lib/storage.ts
  • lib/ui/auth-menu.ts
  • lib/ui/copy.ts
  • test/codex-manager-cli.test.ts
  • test/storage.test.ts
✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch git-plan/02-backup-restore-manager
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 8 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="test/storage.test.ts">

<violation number="1" location="test/storage.test.ts:316">
P2: Use retry-based cleanup instead of bare `fs.rm` in test teardown to avoid Windows EBUSY/EPERM/ENOTEMPTY flakiness.</violation>
</file>

<file name="lib/storage.ts">

<violation number="1" location="lib/storage.ts:475">
P2: Backup name normalization in path resolution breaks restore for listed backup files whose filenames contain spaces or other non-normalized characters.</violation>
</file>

<file name="test/codex-manager-cli.test.ts">

<violation number="1" location="test/codex-manager-cli.test.ts:1609">
P3: `vi.doMock` here is never unmocked. `vi.resetModules()` does not clear the mock registry, so this confirm mock will leak into later tests and can change their behavior. Add a `vi.doUnmock`/`vi.unmock` cleanup (e.g., in `afterEach`) to keep tests isolated.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment on lines +330 to +333
afterEach(async () => {
setStoragePathDirect(null);
await fs.rm(testWorkDir, { recursive: true, force: true });
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

afterEach missing windows retry for EBUSY/EPERM

the outer export/import suite's afterEach (lines 128-147) was updated in this PR to retry fs.rm on EBUSY, EPERM, and ENOTEMPTY — exactly the errors antivirus or the WAL rename logic can leave behind on windows. the new "named backups" suite's afterEach does plain fs.rm with no retry, so named-backup integration tests will flake on windows CI under the same conditions.

Suggested change
afterEach(async () => {
setStoragePathDirect(null);
await fs.rm(testWorkDir, { recursive: true, force: true });
});
afterEach(async () => {
setStoragePathDirect(null);
for (let attempt = 0; attempt < 5; attempt += 1) {
try {
await fs.rm(testWorkDir, { recursive: true, force: true });
break;
} catch (error) {
const code = (error as NodeJS.ErrnoException).code;
if (
(code !== "EBUSY" && code !== "EPERM" && code !== "ENOTEMPTY") ||
attempt === 4
) {
throw error;
}
await new Promise((resolve) =>
setTimeout(resolve, 25 * 2 ** attempt),
);
}
}
});
Prompt To Fix With AI
This is a comment left during a code review.
Path: test/storage.test.ts
Line: 330-333

Comment:
**`afterEach` missing windows retry for `EBUSY`/`EPERM`**

the outer export/import suite's `afterEach` (lines 128-147) was updated in this PR to retry `fs.rm` on `EBUSY`, `EPERM`, and `ENOTEMPTY` — exactly the errors antivirus or the WAL rename logic can leave behind on windows. the new "named backups" suite's `afterEach` does plain `fs.rm` with no retry, so named-backup integration tests will flake on windows CI under the same conditions.

```suggestion
		afterEach(async () => {
			setStoragePathDirect(null);
			for (let attempt = 0; attempt < 5; attempt += 1) {
				try {
					await fs.rm(testWorkDir, { recursive: true, force: true });
					break;
				} catch (error) {
					const code = (error as NodeJS.ErrnoException).code;
					if (
						(code !== "EBUSY" && code !== "EPERM" && code !== "ENOTEMPTY") ||
						attempt === 4
					) {
						throw error;
					}
					await new Promise((resolve) =>
						setTimeout(resolve, 25 * 2 ** attempt),
					);
				}
			}
		});
```

How can I resolve this? If you propose a fix, please make it concise.

Fix in Codex

Copy link

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4 issues found across 8 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="test/storage.test.ts">

<violation number="1" location="test/storage.test.ts:132">
P2: Replace the bare `fs.rm` in test cleanup with retry/backoff to avoid Windows CI flakes on transient file locks.</violation>
</file>

<file name="lib/storage.ts">

<violation number="1" location="lib/storage.ts:476">
P2: `resolveNamedBackupPath` allows path traversal via unsanitized backup names before directory confinement.</violation>
</file>

<file name="lib/cli.ts">

<violation number="1" location="lib/cli.ts:283">
P2: `restore-backup` is wired only in the TUI path; the fallback login prompt cannot select this new mode.</violation>
</file>

<file name="lib/codex-manager.ts">

<violation number="1" location="lib/codex-manager.ts:4171">
P2: Handle restoreNamedBackup errors so a missing/invalid backup or limit breach doesn’t crash the auth menu.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

return { mode: "deep-check" };
case "verify-flagged":
return { mode: "verify-flagged" };
case "restore-backup":
Copy link

@cubic-dev-ai cubic-dev-ai bot Mar 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: restore-backup is wired only in the TUI path; the fallback login prompt cannot select this new mode.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At lib/cli.ts, line 283:

<comment>`restore-backup` is wired only in the TUI path; the fallback login prompt cannot select this new mode.</comment>

<file context>
@@ -279,6 +280,8 @@ export async function promptLoginMode(
 				return { mode: "deep-check" };
 			case "verify-flagged":
 				return { mode: "verify-flagged" };
+			case "restore-backup":
+				return { mode: "restore-backup" };
 			case "select-account": {
</file context>
Fix with Cubic

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant