
feat: test failure triage — group batch failures by root cause #43

Description

@SahilRakhaiya05

Problem

When many tests fail in the same project (batch test run --all, regression reruns, MCP execute), agents and humans see a flat list of 20–50 separate failures. In practice, those failures often share one underlying root cause:

  • Authentication / token expired
  • Staging environment down (network_timeout, infra)
  • Broken navigation (routing_404)
  • Shared code defect (same recommendedFixTarget.reference)
  • Corrupted test data / producer failure cascading to consumers

Today the CLI only supports per-test analysis (test failure get, test failure summary). An agent must either download many bundles or guess which test to investigate first, which wastes AI tokens, testing credits, and debug time.

Proposed solution (CLI Phase-0)

Add testsprite test failure triage --project <id>:

  1. List all failed tests (GET /tests?status=failed)
  2. Fetch lightweight failure/summary per test (no screenshots/video)
  3. Group client-side using deterministic heuristics over existing M2.1 fields:
    • shared recommendedFixTarget.reference
    • env-wide failureKind (infra, network, network_timeout, routing_404)
    • normalized rootCauseHypothesis prefix
    • singleton fallback
  4. Return clusters with representativeTestId, memberTestIds, confidence, fixPriority
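The grouping in step 3 could look roughly like the sketch below. It is an illustration under stated assumptions, not the proposed implementation: Python is used only for readability (the CLI's language is not stated here), the `testId` field name, the `rule`/`key` output fields, the six-word prefix length, and the order in which the heuristics take precedence are all hypothetical, and `confidence` / `fixPriority` are left out because the issue does not define how they are computed.

```python
import re
from collections import defaultdict

# Failure kinds treated as environment-wide, taken from the list in step 3.
ENV_WIDE_KINDS = {"infra", "network", "network_timeout", "routing_404"}


def normalize_hypothesis(text, words=6):
    """Lowercase, drop non-letters, and keep the first few words (assumed prefix length)."""
    tokens = re.sub(r"[^a-z\s]", " ", text.lower()).split()
    return " ".join(tokens[:words])


def cluster_key(summary):
    """Return (rule, key) for one failure summary; heuristics are tried in an assumed priority order."""
    target = summary.get("recommendedFixTarget") or {}
    if target.get("reference"):
        return ("fix_target", target["reference"])
    kind = summary.get("failureKind")
    if kind in ENV_WIDE_KINDS:
        return ("env_wide", kind)
    prefix = normalize_hypothesis(summary.get("rootCauseHypothesis") or "")
    if prefix:
        return ("hypothesis_prefix", prefix)
    # Singleton fallback: the test forms its own cluster.
    return ("singleton", summary["testId"])


def triage(summaries):
    """Group failure summaries into clusters, largest first, in a deterministic order."""
    groups = defaultdict(list)
    for summary in summaries:
        groups[cluster_key(summary)].append(summary["testId"])

    clusters = []
    for (rule, key), members in groups.items():
        members = sorted(members)
        clusters.append({
            "rule": rule,
            "key": key,
            "representativeTestId": members[0],  # assumed: first member by id
            "memberTestIds": members,
        })
    # Larger clusters first; ties broken by rule and key so output is stable.
    clusters.sort(key=lambda c: (-len(c["memberTestIds"]), c["rule"], c["key"]))
    return clusters
```

Because every rule is a pure function of fields already present in the summary, the same set of failures always yields the same clusters, which keeps the output diffable between runs.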

Why CLI-first

  • Uses only existing public APIs — no backend changes required
  • Immediately reduces duplicate investigation and bundle downloads
  • Provides a natural read surface for native backend clustering when it ships later

Acceptance criteria

  • testsprite test failure triage --project <id> --output json returns clustered output
  • --type, --filter, and --max-concurrency are supported
  • --dry-run returns a canned sample
  • Unit + integration tests, docs, CHANGELOG, agent skill updated
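
One way the --max-concurrency limit could bound the per-test summary fetches is a semaphore-guarded fan-out, sketched below in Python for illustration only. `fetch_all` and the `fetch_summary` callable are hypothetical names; the real HTTP call, its endpoint, and the default limit are not specified in this issue.

```python
import asyncio


async def fetch_all(test_ids, fetch_summary, max_concurrency=4):
    """Fetch one summary per test id with at most max_concurrency requests in flight.

    fetch_summary is a caller-supplied coroutine function (hypothetical here)
    that takes a test id and returns its failure summary.
    """
    limit = asyncio.Semaphore(max_concurrency)

    async def fetch_one(test_id):
        async with limit:
            return await fetch_summary(test_id)

    # gather preserves input order, so results line up with test_ids.
    return await asyncio.gather(*(fetch_one(t) for t in test_ids))
```

Passing the fetcher in as a parameter keeps the concurrency logic testable without network access, which fits the unit-test criterion above.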

Future (backend)

Native clustering API with semantic embeddings, wave/cascade graph, and --rerun-representatives orchestration.
