3 changes: 3 additions & 0 deletions apps/obsidian/.gitignore
@@ -0,0 +1,3 @@
e2e/test-vault*/
e2e/test-results/
e2e/html-report/
159 changes: 159 additions & 0 deletions apps/obsidian/e2e/NOTES.md
@@ -0,0 +1,159 @@
# E2E Testing for Obsidian Plugin — Notes

## Approaches Considered

### Option 1: Playwright `electron.launch()`

The standard Playwright approach for Electron apps — point `executablePath` at the binary and let Playwright manage the process lifecycle.

**Pros:**
- First-class Playwright API — `app.evaluate()` runs code in the main process, not just renderer
- Automatic process lifecycle management (launch, close, cleanup)
- Access to Electron-specific APIs (e.g., `app.evaluate(() => process.env)`)
- Well-documented, widely used for Electron testing

**Cons:**
- **Does not work with Obsidian.** Obsidian's executable is a launcher that loads an `.asar` package (`obsidian-1.11.7.asar`) and forks a new Electron process. Playwright connects to the initial process, which exits, causing `kill EPERM` and connection failures.
- No workaround without modifying Obsidian's startup or using a custom Electron shell

**Verdict:** Not viable for Obsidian.

---

### Option 2: CDP via `chromium.connectOverCDP()` (chosen)

Launch Obsidian as a subprocess with `--remote-debugging-port=9222`, then connect via Chrome DevTools Protocol.

**Pros:**
- Works with Obsidian's forked process architecture — the debug port is inherited by the child process
- Full access to renderer via `page.evaluate()` — Obsidian's global `app` object is available
- Keyboard/mouse interaction works normally
- Can take screenshots, traces, and use all Playwright assertions
- Process is managed explicitly — clear control over startup and teardown

**Cons:**
- No main process access (can't call Electron APIs directly, only renderer-side `window`/`app`)
- Must manually manage process lifecycle (spawn, pkill, port polling)
- Fixed debug port (9222) means tests can't run in parallel across multiple Obsidian instances without port management
- Port polling adds ~2-5s startup overhead
- `pkill -f Obsidian` in setup is aggressive — kills ALL Obsidian instances, not just test ones

**Verdict:** Works well for PoC. Sufficient for single-worker CI/local testing.

---

### Option 3: Obsidian's built-in plugin testing (not explored)

Obsidian has no official testing framework. Some community approaches exist (e.g., `obsidian-jest`, hot-reload-based testing), but none are mature or maintained.

**Verdict:** Not a real option today.

---

## What We Learned

### Obsidian internals accessible via `page.evaluate()`
- `app.plugins.plugins["@discourse-graph/obsidian"]` — check plugin loaded
- `app.vault.getMarkdownFiles()` — list files
- `app.vault.read(file)` — read file content
- `app.vault.create(name, content)` — create files
- `app.workspace.openLinkText(path, "", false)` — open a file in the editor
- `app.commands.executeCommandById(id)` — could execute commands directly (alternative to command palette UI)
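
The calls listed above can be batched into a single `page.evaluate()` round trip. The sketch below is illustrative only and is not part of this PR's helpers: `PageLike`, `VaultSnapshot`, and `snapshotVault` are hypothetical names, and `PageLike` is a hand-written structural type shaped after the one `evaluate` overload used here, so the snippet type-checks without Playwright installed.

```ts
// Minimal structural type for the single Playwright method used below
// (hypothetical; shaped after `Page.evaluate(fn, arg)`).
type PageLike = {
  evaluate<R, A>(fn: (arg: A) => R | Promise<R>, arg: A): Promise<R>;
};

type VaultSnapshot = { pluginLoaded: boolean; markdownPaths: string[] };

// Obsidian exposes `app` as a global in the renderer. It is declared loosely
// here because these are internal, untyped APIs.
declare const app: any;

// Collect plugin status and the list of markdown files in one round trip.
export const snapshotVault = (
  page: PageLike,
  pluginId: string,
): Promise<VaultSnapshot> =>
  page.evaluate((id: string): VaultSnapshot => {
    // This callback runs inside the Obsidian renderer, not in Node.
    return {
      pluginLoaded: Boolean(app.plugins.plugins[id]),
      markdownPaths: app.vault
        .getMarkdownFiles()
        .map((file: { path: string }) => file.path),
    };
  }, pluginId);
```

Returning plain data (booleans and strings) matters here, because `page.evaluate()` serializes the return value and Obsidian's `TFile` objects would not survive the trip intact.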

### Plugin command IDs
Commands are registered with IDs like `@discourse-graph/obsidian:create-discourse-node`. The command palette shows them as "Discourse Graph: Create discourse node".
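
A small helper can keep the ID convention in one place. This is a sketch under the assumption that every plugin command ID is the plugin ID, a colon, and a short name, as described above; `toCommandId` and `fromCommandId` are hypothetical names.

```ts
const PLUGIN_ID = "@discourse-graph/obsidian";

// Build the fully qualified ID that `executeCommandById` expects.
export const toCommandId = (shortName: string): string => {
  if (shortName.length === 0 || shortName.includes(":")) {
    throw new Error(`Expected a bare command name, got "${shortName}"`);
  }
  return `${PLUGIN_ID}:${shortName}`;
};

// Inverse: strip the plugin prefix, or return null for other plugins' commands.
export const fromCommandId = (commandId: string): string | null => {
  const prefix = `${PLUGIN_ID}:`;
  return commandId.startsWith(prefix) ? commandId.slice(prefix.length) : null;
};
```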

### Modal DOM structure
The `ModifyNodeModal` renders React inside Obsidian's `.modal-container`:
- Node type: `<select>` element (`.modal-container select`)
- Content: `<input type="text">` (`.modal-container input[type='text']`)
- Confirm: `<button class="mod-cta">`

### Vault configuration
Minimum config for plugin to load:
- `.obsidian/community-plugins.json` → `["@discourse-graph/obsidian"]`
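- `.obsidian/app.json` → `{"livePreview": true}` (this setting does not control restricted mode; in the PoC the plugin loaded once `community-plugins.json` and the plugin directory were present, which may not hold on a fresh Obsidian install)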
- Plugin files (`main.js`, `manifest.json`, `styles.css`) in `.obsidian/plugins/@discourse-graph/obsidian/`

---

## Proposal: Full Agentic Testing Flow

### Goal
AI coding agents (Cursor, Claude Code) can run `pnpm test:e2e` after making changes to automatically verify features work end-to-end. The test suite should be comprehensive enough to catch regressions, fast enough to run frequently, and deterministic enough to trust the results.

### Phase 1: Stabilize the PoC (current state + hardening)

**Isolation improvements:**
- Use a unique temp directory per test run (`os.tmpdir()`) instead of a fixed `test-vault/` path to avoid stale state
- Use a random debug port to allow parallel runs
- Replace `pkill -f Obsidian` with tracking the specific child PID — parse it from `lsof -i :<port>` after launch
- Add a global setup/teardown in Playwright config to manage the single Obsidian instance across all tests
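
The first two isolation items can be sketched with Node's standard library alone. This is a minimal sketch, not the PR's implementation: `createTempVaultDir` and `findFreePort` are hypothetical names, and the `obsidian-e2e-` prefix is an arbitrary choice.

```ts
import * as fs from "fs";
import * as net from "net";
import * as os from "os";
import * as path from "path";

// Create a fresh, uniquely named vault directory under the OS temp dir.
export const createTempVaultDir = (prefix = "obsidian-e2e-"): string =>
  fs.mkdtempSync(path.join(os.tmpdir(), prefix));

// Ask the OS for an unused TCP port by binding to port 0, then release it.
// There is a small race between closing this server and Obsidian binding the
// port, so callers should still poll the port after launch.
export const findFreePort = (): Promise<number> =>
  new Promise((resolve, reject) => {
    const server = net.createServer();
    server.once("error", reject);
    server.listen(0, "127.0.0.1", () => {
      const address = server.address();
      if (address === null || typeof address === "string") {
        server.close();
        reject(new Error("Could not determine the allocated port"));
        return;
      }
      const { port } = address;
      server.close(() => resolve(port));
    });
  });
```

The returned values would replace the fixed `test-vault/` path and the hard-coded `9222` when building the `--vault` and `--remote-debugging-port` arguments.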

**Reliability improvements:**
- Replace `waitForTimeout()` calls with proper waitFor conditions (e.g., `waitForSelector`, `waitForFunction`)
- Add retry logic for CDP connection (currently fails hard on timeout)
- Add a `test.beforeEach` that resets vault state (delete all non-config files) instead of full vault recreation
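
For the CDP retry item, a generic retry wrapper is probably enough. The sketch below assumes a fixed delay between attempts; `withRetry` and `RetryOptions` are hypothetical names, not existing helpers in this PR.

```ts
type RetryOptions = { attempts: number; delayMs: number };

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Run `operation` until it succeeds or `attempts` is exhausted, waiting
// `delayMs` between tries. The last error is rethrown on final failure.
export const withRetry = async <T>(
  operation: (attempt: number) => Promise<T>,
  { attempts, delayMs }: RetryOptions,
): Promise<T> => {
  let lastError: unknown = new Error("withRetry needs at least one attempt");
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      // Only sleep if another attempt will follow.
      if (attempt < attempts) await sleep(delayMs);
    }
  }
  throw lastError;
};
```

In `launchObsidian`, the connection step could then be wrapped as `withRetry(() => chromium.connectOverCDP(url), { attempts: 5, delayMs: 1_000 })`, where the attempt count and delay are placeholder values to tune.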

### Phase 2: Expand test coverage

**Core plugin features to test:**
- Create each discourse node type (Question, Claim, Evidence, Source)
- Verify frontmatter (`nodeTypeId`) is set correctly
- Verify file naming conventions (e.g., `QUE - `, `CLM - `, `EVD - `, `SRC - `)
- Open node type menu via hotkey (`Cmd+\`)
- Discourse context view toggle
- Settings panel opens and renders
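
To verify `nodeTypeId` after a node is created, a test can read the file content with `app.vault.read(file)` and pull the value out of the frontmatter. The sketch below is a deliberately naive parser that handles only flat `key: value` lines; `readFrontmatterValue` is a hypothetical helper, and a real YAML parser would be the safer choice if the frontmatter grows more complex.

```ts
// Extract a single top-level `key: value` pair from a note's YAML frontmatter.
// Deliberately naive: no nested keys, lists, or multi-line values.
export const readFrontmatterValue = (
  markdown: string,
  key: string,
): string | null => {
  const match = /^---\r?\n([\s\S]*?)\r?\n---(?:\r?\n|$)/.exec(markdown);
  if (!match) return null;
  const body = match[1] ?? "";
  for (const line of body.split(/\r?\n/)) {
    const separator = line.indexOf(":");
    if (separator === -1) continue;
    if (line.slice(0, separator).trim() !== key) continue;
    // Strip optional surrounding quotes from the value.
    return line
      .slice(separator + 1)
      .trim()
      .replace(/^(["'])(.*)\1$/, "$2");
  }
  return null;
};
```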

**Vault-level tests:**
- Create multiple nodes and verify they appear in file explorer
- Verify node format regex matching (files follow the format pattern)
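
The format check can be expressed as a pure function that tests can call on file names returned from the vault. This sketch assumes each node format is a string with a single `{content}` placeholder, such as `QUE - {content}`; that placeholder syntax is an assumption for illustration, and `formatToRegExp` and `matchesFormat` are hypothetical names.

```ts
// Escape characters that have special meaning in a regular expression.
const escapeRegExp = (text: string): string =>
  text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Turn a format such as "QUE - {content}" into an anchored RegExp.
// ASSUMPTION: formats contain a single `{content}` placeholder. A format
// without the placeholder is treated as a required prefix.
export const formatToRegExp = (format: string): RegExp => {
  const [prefix = "", suffix = ""] = format.split("{content}");
  return new RegExp(`^${escapeRegExp(prefix)}(.+)${escapeRegExp(suffix)}$`);
};

// Check a file name (with or without the `.md` extension) against a format.
export const matchesFormat = (fileName: string, format: string): boolean =>
  formatToRegExp(format).test(fileName.replace(/\.md$/, ""));
```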

**Use `app.commands.executeCommandById()` as the primary way to trigger commands** — faster, more reliable, and avoids flaky command palette typing. Reserve command palette tests for testing the palette itself.

### Phase 3: Agentic integration

**For agents to use the tests effectively:**

1. **Fast feedback loop** — Tests should complete in <30s total. Current PoC is ~14s for 2 tests, which is good. Keep Obsidian running between test files using Playwright's `globalSetup`/`globalTeardown`.

2. **Clear error messages** — When a test fails, the agent needs to understand WHY. Add descriptive assertion messages:
```ts
expect(pluginLoaded, "Plugin should be loaded — check dist/ is built and plugin ID matches manifest.json").toBe(true);
```

3. **Screenshot-on-failure for visual debugging** — Already configured. Consider adding `page.screenshot()` at key checkpoints even on success, so agents can visually verify state.

4. **Test file organization** — One test file per feature area:
```
e2e/tests/
├── plugin-load.spec.ts # Plugin loads, settings exist
├── node-creation.spec.ts # Create each node type
├── command-palette.spec.ts # Command palette interactions
├── discourse-context.spec.ts # Context view, relations
└── settings.spec.ts # Settings panel
```

5. **CI integration** — Run in GitHub Actions with a macOS runner. Obsidian would need to be pre-installed on the runner (or downloaded in a setup step). This is the biggest open question. Obsidian doesn't have a headless mode, so CI needs a real or virtual display: a GUI session on a macOS runner, or `xvfb` if the tests are later ported to a Linux runner (`xvfb` is X11-only and does not apply to macOS).

6. **Agent-executable test commands:**
```bash
pnpm test:e2e # run all tests
pnpm test:e2e -- --grep "node creation" # run specific tests
pnpm test:e2e:ui # interactive Playwright UI (for humans)
```

### Phase 4: Advanced (future)

- **Visual regression testing** — Compare screenshots against baselines to catch UI regressions
- **Obsidian version matrix** — Test against multiple Obsidian versions (download different `.asar` files)
- **Headless mode wrapper** — Investigate running Obsidian with `--disable-gpu --headless` flags (may not work due to Obsidian's renderer requirements)
- **Test data fixtures** — Pre-built vaults with specific node/relation configurations for testing complex scenarios
- **Performance benchmarks** — Measure plugin load time, command execution time

### Open Questions

1. **CI runner setup** — How to install Obsidian on GitHub Actions macOS runners? Is there a `.dmg` download URL that's stable? Or do we cache the `.app` bundle?
2. **Obsidian updates** — Obsidian auto-updates the `.asar`. Should tests pin a specific version? How to prevent auto-update during test runs?
3. **Multiple vaults** — Obsidian tracks known vaults globally. Test vaults may accumulate in Obsidian's vault list. Need cleanup strategy.
4. **Restricted mode** — The PoC doesn't explicitly disable restricted mode via config. The plugin loads because the `community-plugins.json` file is present, but a fresh Obsidian install might prompt the user to enable community plugins. Need to investigate if there's a config flag to skip this.
44 changes: 44 additions & 0 deletions apps/obsidian/e2e/helpers/commands.ts
@@ -0,0 +1,44 @@
import type { Page } from "@playwright/test";

/**
 * Execute a command via Obsidian's internal command API (fast, reliable).
 * Use this for most test scenarios.
 */
export const executeCommand = async (page: Page, commandId: string): Promise<void> => {
  await page.evaluate((id) => {
    // @ts-expect-error - Obsidian's global `app` is available at runtime
    app.commands.executeCommandById(`@discourse-graph/obsidian:${id}`);
  }, commandId);
  await page.waitForTimeout(500);
};

/**
 * Execute a command via the command palette UI.
 * Use this when testing the palette interaction itself.
 */
export const executeCommandViaPalette = async (page: Page, commandLabel: string): Promise<void> => {
  await page.keyboard.press("Meta+p");
  await page.waitForTimeout(500);

  await page.keyboard.type(commandLabel, { delay: 30 });
  await page.waitForTimeout(500);

  await page.keyboard.press("Enter");
  await page.waitForTimeout(1_000);
};

/**
 * Ensure an active editor exists by creating and opening a scratch file.
 * Required before running editorCallback commands like "Create discourse node".
 */
export const ensureActiveEditor = async (page: Page): Promise<void> => {
  await page.evaluate(async () => {
    // @ts-expect-error - Obsidian's global `app` is available at runtime
    const vault = app.vault;
    const fileName = `scratch-e2e-${Date.now()}.md`;
    const file = await vault.create(fileName, "");
    // @ts-expect-error - Obsidian's global `app` is available at runtime
    await app.workspace.openLinkText(file.path, "", false);
  });
  await page.waitForTimeout(1_000);
};
44 changes: 44 additions & 0 deletions apps/obsidian/e2e/helpers/modal.ts
@@ -0,0 +1,44 @@
import type { Page } from "@playwright/test";

/**
 * Wait for the ModifyNodeModal to appear.
 */
export const waitForModal = async (page: Page): Promise<void> => {
  await page.waitForSelector(".modal-container", { timeout: 5_000 });
};

/**
 * Select a node type from the dropdown in the modal.
 */
export const selectNodeType = async (page: Page, label: string): Promise<void> => {
  const nodeTypeSelect = page.locator(".modal-container select").first();
  await nodeTypeSelect.selectOption({ label });
  await page.waitForTimeout(300);
};

/**
 * Fill the content/title input field in the modal.
 */
export const fillNodeContent = async (page: Page, content: string): Promise<void> => {
  const contentInput = page.locator(".modal-container input[type='text']").first();
  await contentInput.click();
  await contentInput.fill(content);
  await page.waitForTimeout(300);
};

/**
 * Click the Confirm button (mod-cta) in the modal.
 */
export const confirmModal = async (page: Page): Promise<void> => {
  // Use force: true to bypass any suggestion/autocomplete overlay that may cover the button
  await page.locator(".modal-container button.mod-cta").click({ force: true });
  await page.waitForTimeout(2_000);
};

/**
 * Click the Cancel button in the modal.
 */
export const cancelModal = async (page: Page): Promise<void> => {
  await page.locator(".modal-container button:not(.mod-cta)").click();
  await page.waitForTimeout(500);
};
132 changes: 132 additions & 0 deletions apps/obsidian/e2e/helpers/obsidian-setup.ts
@@ -0,0 +1,132 @@
import fs from "fs";
import path from "path";
import { spawn, execSync, type ChildProcess } from "child_process";
import { chromium, type Browser, type Page } from "@playwright/test";

const OBSIDIAN_PATH = "/Applications/Obsidian.app/Contents/MacOS/Obsidian";
const PLUGIN_ID = "@discourse-graph/obsidian";
const DEBUG_PORT = 9222;

export const createTestVault = (vaultPath: string): void => {
  // Clean up any existing vault
  cleanTestVault(vaultPath);

  // Create vault directories
  const obsidianDir = path.join(vaultPath, ".obsidian");
  const pluginDir = path.join(obsidianDir, "plugins", PLUGIN_ID);
  fs.mkdirSync(pluginDir, { recursive: true });

  // Write community-plugins.json to enable our plugin
  fs.writeFileSync(
    path.join(obsidianDir, "community-plugins.json"),
    JSON.stringify([PLUGIN_ID]),
  );

  // Write a minimal app.json. Note: livePreview is an editor setting and
  // does not itself disable restricted mode (see Open Questions in NOTES.md).
  fs.writeFileSync(
    path.join(obsidianDir, "app.json"),
    JSON.stringify({ livePreview: true }),
  );

  // Copy built plugin files from dist/ into the vault's plugin directory
  const distDir = path.join(__dirname, "..", "..", "dist");
  if (!fs.existsSync(distDir)) {
    throw new Error(
      `dist/ directory not found at ${distDir}. Run "pnpm build" first.`,
    );
  }

  const filesToCopy = ["main.js", "manifest.json", "styles.css"];
  for (const file of filesToCopy) {
    const src = path.join(distDir, file);
    if (fs.existsSync(src)) {
      fs.copyFileSync(src, path.join(pluginDir, file));
    }
  }
};

export const cleanTestVault = (vaultPath: string): void => {
  if (fs.existsSync(vaultPath)) {
    fs.rmSync(vaultPath, { recursive: true, force: true });
  }
};

/**
 * Launch Obsidian as a subprocess with remote debugging enabled,
 * then connect via Chrome DevTools Protocol.
 *
 * We can't use Playwright's electron.launch() because Obsidian's
 * executable is a launcher that loads an asar package and forks
 * a new process, which breaks Playwright's Electron API connection.
 */
export const launchObsidian = async (vaultPath: string): Promise<{
  browser: Browser;
  page: Page;
  obsidianProcess: ChildProcess;
}> => {
  // Kill any existing Obsidian instances to free the debug port
  try {
    execSync("pkill -f Obsidian", { stdio: "ignore" });
    // Wait for processes to fully exit
    await new Promise((resolve) => setTimeout(resolve, 2_000));
  } catch {
    // No existing Obsidian processes — that's fine
  }

  // Launch Obsidian with remote debugging port
  const obsidianProcess = spawn(
    OBSIDIAN_PATH,
    [`--remote-debugging-port=${DEBUG_PORT}`, `--vault=${vaultPath}`],
    {
      stdio: "pipe",
      detached: true,
    },
  );

  obsidianProcess.stderr?.on("data", (data: Buffer) => {
    console.log("[Obsidian stderr]", data.toString());
  });
  obsidianProcess.stdout?.on("data", (data: Buffer) => {
    console.log("[Obsidian stdout]", data.toString());
  });

  // Unref so the spawned process doesn't keep the test runner alive
  obsidianProcess.unref();

  // Wait for the debug port to be ready
  await waitForDebugPort(DEBUG_PORT, 30_000);

  // Connect to Obsidian via CDP
  const browser = await chromium.connectOverCDP(`http://localhost:${DEBUG_PORT}`);

  // Get the first browser context and page
  const context = browser.contexts()[0];
  if (!context) {
    throw new Error("No browser context found after connecting via CDP");
  }

  let page = context.pages()[0];
  if (!page) {
    page = await context.waitForEvent("page");
  }

  // Wait for Obsidian to finish initializing
  await page.waitForLoadState("domcontentloaded");
  await page.waitForSelector(".workspace", { timeout: 30_000 });

  return { browser, page, obsidianProcess };
};

const waitForDebugPort = async (port: number, timeoutMs: number): Promise<void> => {
  const start = Date.now();
  while (Date.now() - start < timeoutMs) {
    try {
      const response = await fetch(`http://localhost:${port}/json/version`);
      if (response.ok) return;
    } catch {
      // Port not ready yet, retry
    }
    await new Promise((resolve) => setTimeout(resolve, 500));
  }
  throw new Error(`Obsidian debug port ${port} did not become ready within ${timeoutMs}ms`);
};