106 changes: 106 additions & 0 deletions .claude/skills/debug-e2e/SKILL.md
@@ -0,0 +1,106 @@
---
name: debug-e2e
description: Debug flaky Playwright E2E test failures from CI
---

# Debugging Flaky E2E Test Failures

Use this skill when investigating flaky Playwright E2E test failures from CI.

## Workflow

### 1. Get CI failure details

```bash
# View PR checks status
gh pr checks <PR_NUMBER>

# Get failed test logs
gh run view <RUN_ID> --log-failed 2>&1 | head -500

# Search for specific failure patterns
gh run view <RUN_ID> --log-failed 2>&1 | rg -C 30 "FAILED|Error:|Expected|Timed out"
```

### 2. Reproduce locally with repeat-each

Run the failing test multiple times to reproduce flaky behavior:

```bash
# Run specific test 20 times
npm run e2e -- test/e2e/<file>.e2e.ts --repeat-each=20 -g "<test name pattern>"

# Target specific browser if failure is browser-specific
npm run e2e -- test/e2e/<file>.e2e.ts --repeat-each=20 -g "<test name>" --project=<browser>
```

### 2.1 Calibrate test duration

Start small to get signal quickly, then scale up only if needed.

- 5-10 repeats on a single browser usually finish in a few minutes.
- 20+ repeats across all browsers can take a long time, especially for full files.

Always run repeats on a single test or at most one file. Never repeat the whole suite.
Prefer narrowing with `-g` and `--project` first, then increase `--repeat-each` once the fix looks stable.

### 3. Analyze failure artifacts

When tests fail, Playwright saves traces and error context:

```bash
# View error context (page snapshot at failure time)
cat test-results/<test-name>-<browser>/error-context.md

# Open trace viewer (interactive)
npx playwright show-trace test-results/<test-name>-<browser>/trace.zip
```

### 4. Common flakiness patterns

**React Aria NumberInput flakiness**: When `fill()` on a NumberInput doesn't stick, it's often because:

- The component re-renders after a prop change (e.g., `maxValue` changing)
- Hydration race conditions

**Fix pattern - NumberInput helper**:

```typescript
import { fillNumberInput } from './utils'

await fillNumberInput(input, 'value')
```
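
For illustration only, the retry idea behind a helper like this can be sketched as follows. This is not the repo's `fillNumberInput`, whose implementation is not shown here; the `FillableInput` interface, the `fillUntilSticks` name, and its default values are hypothetical, chosen so the sketch runs without a browser. Playwright's `Locator` exposes `fill()` and `inputValue()` with compatible shapes.

```typescript
// Hypothetical minimal interface (an assumption, not taken from the repo).
interface FillableInput {
  fill(value: string): Promise<void>
  inputValue(): Promise<string>
}

// Fill the input, read the value back, and retry if a re-render wiped it.
// Returns the number of attempts it took; throws if the value never sticks.
async function fillUntilSticks(
  input: FillableInput,
  value: string,
  maxAttempts = 5,
  delayMs = 50
): Promise<number> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await input.fill(value)
    if ((await input.inputValue()) === value) return attempt
    // Give the component a moment to settle before trying again
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs))
  }
  throw new Error(`Value "${value}" did not stick after ${maxAttempts} attempts`)
}
```

Reading the value back turns a silent failure into either a retry or an explicit error, which makes the trace easier to interpret than a later assertion failing on stale input.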

**Form hydration issues**: Wait for a field that only renders after mount:

```typescript
await expect(page.getByRole('radiogroup', { name: 'Block size' })).toBeVisible()
```

**State change timing**: When clicking changes form state, wait for visual confirmation:

```typescript
await page.getByRole('radio', { name: 'Local' }).click()
// Wait for dependent UI to update
await expect(page.getByRole('radiogroup', { name: 'Block size' })).toBeHidden()
```

### 5. Sleep as last resort

If deterministic waits don't work, use `sleep()` from `test/e2e/utils.ts`:

```typescript
import { sleep } from './utils'

await sleep(200) // Use sparingly, prefer deterministic waits
```
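
Before settling on a fixed delay, it can help to poll for the condition instead. The sketch below is a generic illustration, not code from the repo: `sleepMs` is an assumed equivalent of the `sleep` helper, and `pollUntil` with its defaults is a hypothetical name. In real specs, Playwright's own `expect.poll` or `expect(...).toPass()` usually cover this case.

```typescript
// Resolve after `ms` milliseconds (assumed equivalent of the repo's `sleep`).
const sleepMs = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms))

// Hypothetical helper: re-check a condition until it holds or the timeout expires.
async function pollUntil(
  condition: () => boolean | Promise<boolean>,
  timeoutMs = 2000,
  intervalMs = 50
): Promise<void> {
  const deadline = Date.now() + timeoutMs
  while (true) {
    if (await condition()) return
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs} ms`)
    }
    await sleepMs(intervalMs)
  }
}
```

Unlike a fixed `sleep(200)`, polling returns as soon as the condition holds and fails loudly when it never does.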

### 6. Verify fix is stable

Run 30-50 iterations or more to confirm the flakiness is resolved:

```bash
npm run e2e -- test/e2e/<file>.e2e.ts --repeat-each=50 -g "<test name>"
```

A good target is 0 failures out of 100+ runs. Test on all browsers that failed in CI.
1 change: 1 addition & 0 deletions AGENTS.md
@@ -22,6 +22,7 @@
# Testing code

- Run local checks before sending PRs: `npm run lint`, `npm run tsc`, `npm test run`, and `npm run e2ec`; pass `-- --ui` for Playwright UI mode or project/name filters like `npm run e2ec -- instance -g 'boot disk'`.
- You don't usually need to run all the e2e tests, so try to filter by filename. CI will run the full set.
- Keep Playwright specs focused on user-visible behavior—use accessible locators (`getByRole`, `getByLabel`), the helpers in `test/e2e/utils.ts` (`expectToast`, `expectRowVisible`, `selectOption`, `clickRowAction`), and close toasts so follow-on assertions aren’t blocked.
- Cover role-gated flows by logging in with `getPageAsUser`; exercise negative paths (e.g., forbidden actions) alongside happy paths as shown in `test/e2e/system-update.e2e.ts`.
- Consider `expectVisible` and `expectNotVisible` deprecated: prefer `expect().toBeVisible()` and `toBeHidden()` in new code.
2 changes: 1 addition & 1 deletion OMICRON_VERSION
@@ -1 +1 @@
c765b3539203e34f65cd402f139cf604035d5993
44e65c3b30720e20b3d3be9bab4efbf6cac0ee2c
18 changes: 15 additions & 3 deletions app/api/__generated__/Api.ts

2 changes: 1 addition & 1 deletion app/api/__generated__/OMICRON_VERSION

13 changes: 11 additions & 2 deletions app/api/__generated__/validate.ts

6 changes: 6 additions & 0 deletions app/components/StateBadge.tsx
@@ -90,3 +90,9 @@ export const DiskTypeBadge = (props: { diskType: DiskType; className?: string })
{props.diskType}
</Badge>
)

export const ReadOnlyBadge = () => (
<Badge color="neutral" className="relative">
Read only
</Badge>
)
11 changes: 6 additions & 5 deletions app/components/form/fields/DiskSizeField.tsx
@@ -12,8 +12,6 @@ import type {
ValidateResult,
} from 'react-hook-form'

import { MAX_DISK_SIZE_GiB } from '@oxide/api'

import { NumberField } from './NumberField'
import type { TextFieldProps } from './TextField'

@@ -22,6 +20,8 @@ interface DiskSizeProps<
TName extends FieldPath<TFieldValues>,
> extends TextFieldProps<TFieldValues, TName> {
minSize?: number
/** Undefined means no client-side limit (e.g., for local disks) */
maxSize: number | undefined
validate?(diskSizeGiB: number): ValidateResult
}

@@ -32,6 +32,7 @@ export function DiskSizeField<
required = true,
name,
minSize = 1,
maxSize,
validate,
...props
}: DiskSizeProps<TFieldValues, TName>) {
@@ -41,7 +42,7 @@ export function DiskSizeField<
required={required}
name={name}
min={minSize}
max={MAX_DISK_SIZE_GiB}
max={maxSize}
validate={(diskSizeGiB) => {
// Run a number of default validators
if (Number.isNaN(diskSizeGiB)) {
@@ -50,8 +51,8 @@ export function DiskSizeField<
if (diskSizeGiB < minSize) {
return `Must be at least ${minSize} GiB`
}
if (diskSizeGiB > MAX_DISK_SIZE_GiB) {
return `Can be at most ${MAX_DISK_SIZE_GiB} GiB`
if (maxSize !== undefined && diskSizeGiB > maxSize) {
return `Can be at most ${maxSize} GiB`
}
// Run any additional validators passed in from the callsite
return validate?.(diskSizeGiB)
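
As a reading aid, the default validators in the `DiskSizeField` hunks above can be sketched as a standalone function. This is a simplified assumption rather than the component's code: `validateDiskSize` is a hypothetical name, the message for the `NaN` case is a placeholder because the real one is elided in the diff, and the limit of 1000 in the usage lines is an arbitrary example value.

```typescript
// Standalone sketch of the default validators.
// Returns an error message, or undefined when the size is valid.
function validateDiskSize(
  diskSizeGiB: number,
  minSize = 1,
  maxSize?: number // undefined means no client-side upper limit
): string | undefined {
  if (Number.isNaN(diskSizeGiB)) {
    // Placeholder message: the real one is not visible in the diff
    return 'Disk size is required'
  }
  if (diskSizeGiB < minSize) {
    return `Must be at least ${minSize} GiB`
  }
  if (maxSize !== undefined && diskSizeGiB > maxSize) {
    return `Can be at most ${maxSize} GiB`
  }
  return undefined
}

console.log(validateDiskSize(2048, 1, 1000)) // "Can be at most 1000 GiB"
console.log(validateDiskSize(2048, 1, undefined)) // undefined
```

Making `maxSize` a required prop typed `number | undefined`, as the diff does, forces each call site to state explicitly whether a limit applies.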