DiscourseGraphs · trangdoan982 · Feb 26, 2026 · Feb 26, 2026
diff --git a/apps/obsidian/.gitignore b/apps/obsidian/.gitignore
@@ -0,0 +1,3 @@
+e2e/test-vault*/
+e2e/test-results/
+e2e/html-report/
diff --git a/apps/obsidian/e2e/NOTES.md b/apps/obsidian/e2e/NOTES.md
@@ -0,0 +1,159 @@
+# E2E Testing for Obsidian Plugin — Notes
+
+## Approaches Considered
+
+### Option 1: Playwright `electron.launch()`
+
+The standard Playwright approach for Electron apps — point `executablePath` at the binary and let Playwright manage the process lifecycle.
+
+**Pros:**
+- First-class Playwright API — `app.evaluate()` runs code in the main process, not just renderer
+- Automatic process lifecycle management (launch, close, cleanup)
+- Access to Electron-specific APIs (e.g., `app.evaluate(() => process.env)`)
+- Well-documented, widely used for Electron testing
+
+**Cons:**
+- **Does not work with Obsidian.** Obsidian's executable is a launcher that loads an `.asar` package (`obsidian-1.11.7.asar`) and forks a new Electron process. Playwright connects to the initial process, which exits, causing `kill EPERM` and connection failures.
+- No workaround without modifying Obsidian's startup or using a custom Electron shell
+
+**Verdict:** Not viable for Obsidian.
+
+---
+
+### Option 2: CDP via `chromium.connectOverCDP()` (chosen)
+
+Launch Obsidian as a subprocess with `--remote-debugging-port=9222`, then connect via Chrome DevTools Protocol.
+
+**Pros:**
+- Works with Obsidian's forked process architecture — the debug port is inherited by the child process
+- Full access to renderer via `page.evaluate()` — Obsidian's global `app` object is available
+- Keyboard/mouse interaction works normally
+- Can take screenshots, traces, and use all Playwright assertions
+- Process is managed explicitly — clear control over startup and teardown
+
+**Cons:**
+- No main process access (can't call Electron APIs directly, only renderer-side `window`/`app`)
+- Must manually manage process lifecycle (spawn, pkill, port polling)
+- Fixed debug port (9222) means tests can't run in parallel across multiple Obsidian instances without port management
+- Port polling adds ~2-5s startup overhead
+- `pkill -f Obsidian` in setup is aggressive — kills ALL Obsidian instances, not just test ones
+
+**Verdict:** Works well for PoC. Sufficient for single-worker CI/local testing.
+
+---
+
+### Option 3: Obsidian's built-in plugin testing (not explored)
+
+Obsidian has no official testing framework. Some community approaches exist (e.g., `obsidian-jest`, hot-reload-based testing), but none are mature or maintained.
+
+**Verdict:** Not a real option today.
+
+---
+
+## What We Learned
+
+### Obsidian internals accessible via `page.evaluate()`
+- `app.plugins.plugins["@discourse-graph/obsidian"]` — check plugin loaded
+- `app.vault.getMarkdownFiles()` — list files
+- `app.vault.read(file)` — read file content
+- `app.vault.create(name, content)` — create files
+- `app.workspace.openLinkText(path, "", false)` — open a file in the editor
+- `app.commands.executeCommandById(id)` — could execute commands directly (alternative to command palette UI)
+
+### Plugin command IDs
+Commands are registered with IDs like `@discourse-graph/obsidian:create-discourse-node`. The command palette shows them as "Discourse Graph: Create discourse node".
+
+### Modal DOM structure
+The `ModifyNodeModal` renders React inside Obsidian's `.modal-container`:
+- Node type: `<select>` element (`.modal-container select`)
+- Content: `<input type="text">` (`.modal-container input[type='text']`)
+- Confirm: `<button class="mod-cta">`
+
+### Vault configuration
+Minimum config for plugin to load:
+- `.obsidian/community-plugins.json` → `["@discourse-graph/obsidian"]`
+- `.obsidian/app.json` → `{"livePreview": true}` (restricted mode must be off, but this is handled by Obsidian detecting the plugins dir)
+- Plugin files (`main.js`, `manifest.json`, `styles.css`) in `.obsidian/plugins/@discourse-graph/obsidian/`
+
+---
+
+## Proposal: Full Agentic Testing Flow
+
+### Goal
+AI coding agents (Cursor, Claude Code) can run `pnpm test:e2e` after making changes to automatically verify features work end-to-end. The test suite should be comprehensive enough to catch regressions, fast enough to run frequently, and deterministic enough to trust the results.
+
+### Phase 1: Stabilize the PoC (current state + hardening)
+
+**Isolation improvements:**
+- Use a unique temp directory per test run (`os.tmpdir()`) instead of a fixed `test-vault/` path to avoid stale state
+- Use a random debug port to allow parallel runs
+- Replace `pkill -f Obsidian` with tracking the specific child PID — parse it from `lsof -i :<port>` after launch
+- Add a global setup/teardown in Playwright config to manage the single Obsidian instance across all tests
+
+**Reliability improvements:**
+- Replace `waitForTimeout()` calls with proper waitFor conditions (e.g., `waitForSelector`, `waitForFunction`)
+- Add retry logic for CDP connection (currently fails hard on timeout)
+- Add a `test.beforeEach` that resets vault state (delete all non-config files) instead of full vault recreation
+
+### Phase 2: Expand test coverage
+
+**Core plugin features to test:**
+- Create each discourse node type (Question, Claim, Evidence, Source)
+- Verify frontmatter (`nodeTypeId`) is set correctly
+- Verify file naming conventions (e.g., `QUE - `, `CLM - `, `EVD - `, `SRC - `)
+- Open node type menu via hotkey (`Cmd+\`)
+- Discourse context view toggle
+- Settings panel opens and renders
+
+**Vault-level tests:**
+- Create multiple nodes and verify they appear in file explorer
+- Verify node format regex matching (files follow the format pattern)
+
+**Use `app.commands.executeCommandById()` as the primary way to trigger commands** — faster, more reliable, and avoids flaky command palette typing. Reserve command palette tests for testing the palette itself.
+
+### Phase 3: Agentic integration
+
+**For agents to use the tests effectively:**
+
+1. **Fast feedback loop** — Tests should complete in <30s total. Current PoC is ~14s for 2 tests, which is good. Keep Obsidian running between test files using Playwright's `globalSetup`/`globalTeardown`.
+
+2. **Clear error messages** — When a test fails, the agent needs to understand WHY. Add descriptive assertion messages:
+   ```ts
+   expect(pluginLoaded, "Plugin should be loaded — check dist/ is built and plugin ID matches manifest.json").toBe(true);
+   ```
+
+3. **Screenshot-on-failure for visual debugging** — Already configured. Consider adding `page.screenshot()` at key checkpoints even on success, so agents can visually verify state.
+
+4. **Test file organization** — One test file per feature area:
+   ```
+   e2e/tests/
+   ├── plugin-load.spec.ts        # Plugin loads, settings exist
+   ├── node-creation.spec.ts      # Create each node type
+   ├── command-palette.spec.ts    # Command palette interactions
+   ├── discourse-context.spec.ts  # Context view, relations
+   └── settings.spec.ts           # Settings panel
+   ```
+
+5. **CI integration** — Run in GitHub Actions with a macOS runner. Obsidian would need to be pre-installed on the runner (or downloaded in a setup step). This is the biggest open question — Obsidian doesn't have a headless mode, so CI would need `xvfb` or a virtual display.
+
+6. **Agent-executable test commands:**
+   ```bash
+   pnpm test:e2e              # run all tests
+   pnpm test:e2e -- --grep "node creation"  # run specific tests
+   pnpm test:e2e:ui           # interactive Playwright UI (for humans)
+   ```
+
+### Phase 4: Advanced (future)
+
+- **Visual regression testing** — Compare screenshots against baselines to catch UI regressions
+- **Obsidian version matrix** — Test against multiple Obsidian versions (download different `.asar` files)
+- **Headless mode wrapper** — Investigate running Obsidian with `--disable-gpu --headless` flags (may not work due to Obsidian's renderer requirements)
+- **Test data fixtures** — Pre-built vaults with specific node/relation configurations for testing complex scenarios
+- **Performance benchmarks** — Measure plugin load time, command execution time
+
+### Open Questions
+
+1. **CI runner setup** — How to install Obsidian on GitHub Actions macOS runners? Is there a `.dmg` download URL that's stable? Or do we cache the `.app` bundle?
+2. **Obsidian updates** — Obsidian auto-updates the `.asar`. Should tests pin a specific version? How to prevent auto-update during test runs?
+3. **Multiple vaults** — Obsidian tracks known vaults globally. Test vaults may accumulate in Obsidian's vault list. Need cleanup strategy.
+4. **Restricted mode** — The PoC doesn't explicitly disable restricted mode via config. The plugin loads because the `community-plugins.json` file is present, but a fresh Obsidian install might prompt the user to enable community plugins. Need to investigate if there's a config flag to skip this.
diff --git a/apps/obsidian/e2e/helpers/commands.ts b/apps/obsidian/e2e/helpers/commands.ts
@@ -0,0 +1,44 @@
+import type { Page } from "@playwright/test";
+
+/**
+ * Execute a command via Obsidian's internal command API (fast, reliable).
+ * Use this for most test scenarios.
+ */
+export const executeCommand = async (page: Page, commandId: string): Promise<void> => {
+  await page.evaluate((id) => {
+    // @ts-expect-error - Obsidian's global `app` is available at runtime
+    app.commands.executeCommandById(`@discourse-graph/obsidian:${id}`);
+  }, commandId);
+  await page.waitForTimeout(500);
+};
+
+/**
+ * Execute a command via the command palette UI.
+ * Use this when testing the palette interaction itself.
+ */
+export const executeCommandViaPalette = async (page: Page, commandLabel: string): Promise<void> => {
+  await page.keyboard.press("Meta+p");
+  await page.waitForTimeout(500);
+
+  await page.keyboard.type(commandLabel, { delay: 30 });
+  await page.waitForTimeout(500);
+
+  await page.keyboard.press("Enter");
+  await page.waitForTimeout(1_000);
+};
+
+/**
+ * Ensure an active editor exists by creating and opening a scratch file.
+ * Required before running editorCallback commands like "Create discourse node".
+ */
+export const ensureActiveEditor = async (page: Page): Promise<void> => {
+  await page.evaluate(async () => {
+    // @ts-expect-error - Obsidian's global `app` is available at runtime
+    const vault = app.vault;
+    const fileName = `scratch-e2e-${Date.now()}.md`;
+    const file = await vault.create(fileName, "");
+    // @ts-expect-error - Obsidian's global `app` is available at runtime
+    await app.workspace.openLinkText(file.path, "", false);
+  });
+  await page.waitForTimeout(1_000);
+};
diff --git a/apps/obsidian/e2e/helpers/modal.ts b/apps/obsidian/e2e/helpers/modal.ts
@@ -0,0 +1,44 @@
+import type { Page } from "@playwright/test";
+
+/**
+ * Wait for the ModifyNodeModal to appear.
+ */
+export const waitForModal = async (page: Page): Promise<void> => {
+  await page.waitForSelector(".modal-container", { timeout: 5_000 });
+};
+
+/**
+ * Select a node type from the dropdown in the modal.
+ */
+export const selectNodeType = async (page: Page, label: string): Promise<void> => {
+  const nodeTypeSelect = page.locator(".modal-container select").first();
+  await nodeTypeSelect.selectOption({ label });
+  await page.waitForTimeout(300);
+};
+
+/**
+ * Fill the content/title input field in the modal.
+ */
+export const fillNodeContent = async (page: Page, content: string): Promise<void> => {
+  const contentInput = page.locator(".modal-container input[type='text']").first();
+  await contentInput.click();
+  await contentInput.fill(content);
+  await page.waitForTimeout(300);
+};
+
+/**
+ * Click the Confirm button (mod-cta) in the modal.
+ */
+export const confirmModal = async (page: Page): Promise<void> => {
+  // Use force: true to bypass any suggestion/autocomplete overlay that may cover the button
+  await page.locator(".modal-container button.mod-cta").click({ force: true });
+  await page.waitForTimeout(2_000);
+};
+
+/**
+ * Click the Cancel button in the modal.
+ */
+export const cancelModal = async (page: Page): Promise<void> => {
+  await page.locator(".modal-container button:not(.mod-cta)").click();
+  await page.waitForTimeout(500);
+};
diff --git a/apps/obsidian/e2e/helpers/obsidian-setup.ts b/apps/obsidian/e2e/helpers/obsidian-setup.ts
@@ -0,0 +1,132 @@
+import fs from "fs";
+import path from "path";
+import { spawn, execSync, type ChildProcess } from "child_process";
+import { chromium, type Browser, type Page } from "@playwright/test";
+
+const OBSIDIAN_PATH = "/Applications/Obsidian.app/Contents/MacOS/Obsidian";
+const PLUGIN_ID = "@discourse-graph/obsidian";
+const DEBUG_PORT = 9222;
+
+export const createTestVault = (vaultPath: string): void => {
+  // Clean up any existing vault
+  cleanTestVault(vaultPath);
+
+  // Create vault directories
+  const obsidianDir = path.join(vaultPath, ".obsidian");
+  const pluginDir = path.join(obsidianDir, "plugins", PLUGIN_ID);
+  fs.mkdirSync(pluginDir, { recursive: true });
+
+  // Write community-plugins.json to enable our plugin
+  fs.writeFileSync(
+    path.join(obsidianDir, "community-plugins.json"),
+    JSON.stringify([PLUGIN_ID]),
+  );
+
+  // Write app.json to disable restricted mode (allows community plugins)
+  fs.writeFileSync(
+    path.join(obsidianDir, "app.json"),
+    JSON.stringify({ livePreview: true }),
+  );
+
+  // Copy built plugin files from dist/ into the vault's plugin directory
+  const distDir = path.join(__dirname, "..", "..", "dist");
+  if (!fs.existsSync(distDir)) {
+    throw new Error(
+      `dist/ directory not found at ${distDir}. Run "pnpm build" first.`,
+    );
+  }
+
+  const filesToCopy = ["main.js", "manifest.json", "styles.css"];
+  for (const file of filesToCopy) {
+    const src = path.join(distDir, file);
+    if (fs.existsSync(src)) {
+      fs.copyFileSync(src, path.join(pluginDir, file));
+    }
+  }
+};
+
+export const cleanTestVault = (vaultPath: string): void => {
+  if (fs.existsSync(vaultPath)) {
+    fs.rmSync(vaultPath, { recursive: true, force: true });
+  }
+};
+
+/**
+ * Launch Obsidian as a subprocess with remote debugging enabled,
+ * then connect via Chrome DevTools Protocol.
+ *
+ * We can't use Playwright's electron.launch() because Obsidian's
+ * executable is a launcher that loads an asar package and forks
+ * a new process, which breaks Playwright's Electron API connection.
+ */
+export const launchObsidian = async (vaultPath: string): Promise<{
+  browser: Browser;
+  page: Page;
+  obsidianProcess: ChildProcess;
+}> => {
+  // Kill any existing Obsidian instances to free the debug port
+  try {
+    execSync("pkill -f Obsidian", { stdio: "ignore" });
+    // Wait for processes to fully exit
+    await new Promise((resolve) => setTimeout(resolve, 2_000));
+  } catch {
+    // No existing Obsidian processes — that's fine
+  }
+
+  // Launch Obsidian with remote debugging port
+  const obsidianProcess = spawn(
+    OBSIDIAN_PATH,
+    [`--remote-debugging-port=${DEBUG_PORT}`, `--vault=${vaultPath}`],
+    {
+      stdio: "pipe",
+      detached: true,
+    },
+  );
+
+  obsidianProcess.stderr?.on("data", (data: Buffer) => {
+    console.log("[Obsidian stderr]", data.toString());
+  });
+  obsidianProcess.stdout?.on("data", (data: Buffer) => {
+    console.log("[Obsidian stdout]", data.toString());
+  });
+
+  // Unref so the spawned process doesn't keep the test runner alive
+  obsidianProcess.unref();
+
+  // Wait for the debug port to be ready
+  await waitForDebugPort(DEBUG_PORT, 30_000);
+
+  // Connect to Obsidian via CDP
+  const browser = await chromium.connectOverCDP(`http://localhost:${DEBUG_PORT}`);
+
+  // Get the first browser context and page
+  const context = browser.contexts()[0];
+  if (!context) {
+    throw new Error("No browser context found after connecting via CDP");
+  }
+
+  let page = context.pages()[0];
+  if (!page) {
+    page = await context.waitForEvent("page");
+  }
+
+  // Wait for Obsidian to finish initializing
+  await page.waitForLoadState("domcontentloaded");
+  await page.waitForSelector(".workspace", { timeout: 30_000 });
+
+  return { browser, page, obsidianProcess };
+};
+
+const waitForDebugPort = async (port: number, timeoutMs: number): Promise<void> => {
+  const start = Date.now();
+  while (Date.now() - start < timeoutMs) {
+    try {
+      const response = await fetch(`http://localhost:${port}/json/version`);
+      if (response.ok) return;
+    } catch {
+      // Port not ready yet, retry
+    }
+    await new Promise((resolve) => setTimeout(resolve, 500));
+  }
+  throw new Error(`Obsidian debug port ${port} did not become ready within ${timeoutMs}ms`);
+};