Benchmarks only run at publish — perf regressions accumulate invisibly between releases #1433

@carlos-alm

What happened

The v3.12.0 publish gate failed with: native 1-file rebuild +1827%, native full build +98%, wasm full build +95% vs 3.11.2. Two real regressions had accumulated across ~25 resolver PRs (Phase 8.x):

  1. JS/TS extractor grew from 5 to 15 full-tree walks per file (+135% per-file extraction)
  2. runPostNativePrototypeMethods ran unscoped on every native build (~1s flat tax, 20x on 1-file rebuilds)

None of this was visible until release time because build/query/incremental benchmarks only execute in the publish workflow (pre-publish-benchmark job). The per-PR CI runs the regression-guard test against committed history only — no fresh measurement — so the 'dev' rolling-entry path in the guard never receives data between releases.

Proposal

A lightweight per-PR perf canary, e.g.:

  • a single timed buildGraph (both engines) over a small fixed corpus on PRs that touch src/extractors/, src/domain/graph/, or crates/ — compare against the last release's committed numbers with a loose (50%?) threshold
  • or a scheduled (nightly/weekly) run of the existing benchmark scripts that appends the 'dev' entry the guard already knows how to compare
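As a rough illustration of the comparison step in the first option, here is a minimal sketch, not the repo's actual guard. The function name `compareToBaseline`, the metric keys, and the result shape are all hypothetical. It assumes timings are in milliseconds and that the baseline map holds the last release's committed numbers.

```typescript
// Hypothetical result shape; not taken from the existing regression guard.
interface CanaryResult {
  metric: string;
  baselineMs: number;
  measuredMs: number;
  deltaPct: number;   // positive means slower than the baseline
  regressed: boolean; // true when deltaPct exceeds the threshold
}

// Compare freshly measured timings against committed baseline timings.
// thresholdPct defaults to the loose 50% floated in this proposal.
function compareToBaseline(
  measured: Record<string, number>,
  baseline: Record<string, number>,
  thresholdPct: number = 50,
): CanaryResult[] {
  const results: CanaryResult[] = [];
  for (const metric of Object.keys(measured)) {
    const measuredMs = measured[metric];
    const baselineMs = baseline[metric];
    // Skip metrics that have no usable baseline instead of failing the PR.
    if (measuredMs === undefined || baselineMs === undefined || baselineMs <= 0) {
      continue;
    }
    const deltaPct = ((measuredMs - baselineMs) / baselineMs) * 100;
    results.push({
      metric,
      baselineMs,
      measuredMs,
      deltaPct,
      regressed: deltaPct > thresholdPct,
    });
  }
  return results;
}
```

A CI step could fail the job when any entry has `regressed` set, and print `deltaPct` for the rest so that slow drift stays visible even when it is under the threshold.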

Methodology note from the v3.12.0 failure: scripts/benchmark.ts measures its full build single-shot while scripts/incremental-benchmark.ts uses median-of-5 — the same metric passed in one suite and failed in the other within one CI run. Worth unifying while touching this.
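One way to unify the two scripts would be a shared median-of-N timing helper. The sketch below is an assumption about what that could look like, not code from either script. `median` and `timeMedian` are hypothetical names, and the clock is injectable so the helper does not depend on any particular runtime's timer API.

```typescript
// Median of a non-empty list of samples; does not mutate the input.
function median(samples: number[]): number {
  if (samples.length === 0) {
    throw new Error("median of empty sample set");
  }
  const sorted = samples.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]!
    : (sorted[mid - 1]! + sorted[mid]!) / 2;
}

// Run `run` sequentially `runs` times and return the median wall-clock duration.
// `now` defaults to Date.now (millisecond resolution); pass a finer clock if available.
async function timeMedian(
  run: () => Promise<void> | void,
  runs: number = 5,
  now: () => number = () => Date.now(),
): Promise<number> {
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = now();
    await run();
    samples.push(now() - start);
  }
  return median(samples);
}
```

If both suites measured the full build through the same helper, a single slow outlier run would shift neither result much, which should reduce the pass-in-one, fail-in-the-other outcome described above.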

Related: #1436 (the v3.12.0 regression fix PR)
