Bug 2032243: Use the confidence interval for the significance column #1056

Open
kala-moz wants to merge 3 commits into mozilla:main from kala-moz:use-CI-for-significance-col

kala-moz commented Jun 18, 2026

The significance column, its filter, its sort, and the expanded-row alert now all read from a single precomputed bootstrap (BCa) CI on the difference of medians, replacing the Mann-Whitney p-value's `interpretation` string as the source of truth.
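
To make the "CI excludes zero" rule concrete, here is a minimal sketch of a bootstrap CI on the difference of medians. It is not the PR's code: it uses the simpler percentile method rather than BCa (which additionally corrects for bias and skew), and `SketchCi`, `percentileBootstrapCi`, and the seeded PRNG are hypothetical names chosen for illustration.

```typescript
// Hypothetical shape; the PR's real `bootstrapCi` type may differ.
interface SketchCi {
  lower: number;
  upper: number;
  significant: boolean; // true when the interval excludes zero
}

// Small seeded PRNG (mulberry32-style) so the sketch is reproducible.
function seededRandom(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function median(values: number[]): number {
  const sorted = [...values].sort((x, y) => x - y);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Draw a same-size sample with replacement.
function resample(values: number[], rand: () => number): number[] {
  return values.map(() => values[Math.floor(rand() * values.length)]);
}

// Percentile bootstrap (NOT BCa) on median(newRuns) - median(baseRuns).
// Assumes both arrays are non-empty.
function percentileBootstrapCi(
  baseRuns: number[],
  newRuns: number[],
  iterations = 2000,
  alpha = 0.05,
  seed = 1,
): SketchCi {
  const rand = seededRandom(seed);
  const diffs: number[] = [];
  for (let i = 0; i < iterations; i++) {
    diffs.push(median(resample(newRuns, rand)) - median(resample(baseRuns, rand)));
  }
  diffs.sort((x, y) => x - y);
  const lower = diffs[Math.floor((alpha / 2) * (iterations - 1))];
  const upper = diffs[Math.ceil((1 - alpha / 2) * (iterations - 1))];
  // Significant only when the whole interval sits on one side of zero.
  return { lower, upper, significant: lower > 0 || upper < 0 };
}
```

With clearly separated samples (for example base runs around 100 and new runs around 120), every resampled difference is positive, so the interval excludes zero and the row would count as significant under this definition.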

Added `MannWhitneyResultsItem.bootstrapCi` and a `precomputeMannWhitneyCI` helper that the four data loaders (compare, over-time, subtests, subtests-over-time) call once per row. Doing it at load time keeps sort/filter/render at property-access speed.

  • Sig column: `matchesFunction` and `sortFunction` read `bootstrapCi`. Sort is "significant first, then |medianDiff| desc", written in ASC semantics so `useTableSort`'s DESC swap produces that order. The cell render uses `bootstrapCi?.significant` to pick the icon vs. `-`.
  • `renderExpandedRight` reads the same precomputed CI (with an inline fallback for callers that mount the strategy without a loader).
  • Drop the "Significance (p-value)" row from `PValCliffsDeltaComp`. The Δ-median alert already conveys CI-based significance, and showing the p-value contradicts the Sig column.
  • Update `tooltipSignificance` and `tooltipStatusMannWhitney` to match the new definition (BCa CI excludes zero, not p-value < 0.05).
  • Update tests and snapshots to reflect the new sig values and orderings; remove two `RevisionRow` tests that asserted text from the now-removed p-value row.
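
The "ASC semantics" sort described in the first bullet can be sketched as follows. This is an illustration under assumptions, not the PR's code: `SigRow`, `compareSigAsc`, and `sortDesc` are hypothetical names, the `medianDiff` field is assumed to live directly on the row, and the DESC behaviour of `useTableSort` is modelled here as a plain argument swap.

```typescript
// Hypothetical row shape for illustration only.
interface SigRow {
  label: string;
  medianDiff: number;
  bootstrapCi?: { significant: boolean };
}

// Ascending comparator: not-significant rows first, then smaller |medianDiff|.
// A missing CI is treated as not significant.
function compareSigAsc(a: SigRow, b: SigRow): number {
  const sigA = a.bootstrapCi?.significant ? 1 : 0;
  const sigB = b.bootstrapCi?.significant ? 1 : 0;
  if (sigA !== sigB) return sigA - sigB;
  return Math.abs(a.medianDiff) - Math.abs(b.medianDiff);
}

// Stand-in for the DESC swap: same comparator, arguments reversed.
function sortDesc(rows: SigRow[]): SigRow[] {
  return [...rows].sort((a, b) => compareSigAsc(b, a));
}

const exampleRows: SigRow[] = [
  { label: "A", medianDiff: 2, bootstrapCi: { significant: true } },
  { label: "B", medianDiff: 10, bootstrapCi: { significant: false } },
  { label: "C", medianDiff: -5, bootstrapCi: { significant: true } },
  { label: "D", medianDiff: 1 },
];

// Significant rows come first, largest |medianDiff| first within each group.
console.log(sortDesc(exampleRows).map((r) => r.label).join(",")); // C,A,B,D
```

Writing the comparator in ascending form keeps it composable with a generic sort hook: the hook only needs to swap the arguments to get the "significant first" order.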

Deploy Link

Browser time preview

Updates:

  1. Now considers replicates when available.
  2. Changed from symbols to letters: S for Significant, NS for Not Significant.
  3. Performance fix: the Sig column's bootstrap CI was being precomputed for every row in all four loaders before the table could render, which caused major performance problems in captured profiles. Switched to lazy compute: pay the BCa cost only when the user engages with the Sig column (filter or sort click). The first click pays the per-row cost across all rows; after that, every comparison is a cached property read. The four loaders no longer call `precomputeMannWhitneyCI`; their `.then` callbacks are removed entirely, so results flow straight through. Unused imports (`precomputeMannWhitneyCI`, `MannWhitneyResultsItem`) are dropped from each file.
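
The lazy-compute idea in item 3 can be sketched roughly like this. The names `LazyRow`, `CachedCi`, and `ensureBootstrapCi` are hypothetical, and the expensive BCa computation is replaced by a counting stand-in; the PR's actual caching mechanism may differ.

```typescript
// Hypothetical shapes; the PR's real types may differ.
interface CachedCi {
  lower: number;
  upper: number;
  significant: boolean;
}

interface LazyRow {
  baseRuns: number[];
  newRuns: number[];
  bootstrapCi?: CachedCi; // filled in on first use, then reused
}

// Compute the CI only on first access and cache it on the row,
// so later sort/filter comparisons are plain property reads.
function ensureBootstrapCi(
  row: LazyRow,
  compute: (row: LazyRow) => CachedCi,
): CachedCi {
  const cached = row.bootstrapCi ?? compute(row);
  row.bootstrapCi = cached;
  return cached;
}

// Counting stand-in for the expensive bootstrap, for illustration only.
let computeCalls = 0;
const fakeCompute = (_row: LazyRow): CachedCi => {
  computeCalls += 1;
  return { lower: -1, upper: 1, significant: false };
};

const lazyRows: LazyRow[] = [
  { baseRuns: [1, 2, 3], newRuns: [1, 2, 3] },
  { baseRuns: [4, 5, 6], newRuns: [4, 5, 6] },
];

// First engagement with the column pays the cost once per row...
lazyRows.forEach((row) => ensureBootstrapCi(row, fakeCompute));
// ...and any later engagement reuses the cached value.
lazyRows.forEach((row) => ensureBootstrapCi(row, fakeCompute));
```

In this sketch the stand-in runs once per row no matter how many times the column is sorted or filtered afterwards, which is the trade-off the update describes: a one-time cost on first interaction instead of a cost before the first render.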
Screenshot 2026-06-24 at 18 09 38

netlify Bot commented Jun 18, 2026

Deploy Preview for mozilla-perfcompare ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | 23a51da |
| 🔍 Latest deploy log | https://app.netlify.com/projects/mozilla-perfcompare/deploys/6a333e121ee1a500089e5bb3 |
| 😎 Deploy Preview | https://deploy-preview-1056--mozilla-perfcompare.netlify.app |

kala-moz requested a review from padenot June 18, 2026 00:38

netlify Bot commented Jun 23, 2026

Deploy Preview for mozilla-perfcompare ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | 23a51da |
| 🔍 Latest deploy log | https://app.netlify.com/projects/mozilla-perfcompare/deploys/6a3b0389300b3896fde23cc9 |
| 😎 Deploy Preview | https://deploy-preview-1056--mozilla-perfcompare.netlify.app |

The Sig column, its filter, its sort, and the expanded-row alert now all
read from a single precomputed bootstrap (BCa) CI on the difference of
medians, replacing the Mann-Whitney p-value's `interpretation` string as
the source of truth.
kala-moz force-pushed the use-CI-for-significance-col branch from 23a51da to 13259e7 June 23, 2026 23:05

netlify Bot commented Jun 23, 2026

Deploy Preview for mozilla-perfcompare ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | 1f4ca56 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/mozilla-perfcompare/deploys/6a3c7ec58837550008b7c3bd |
| 😎 Deploy Preview | https://deploy-preview-1056--mozilla-perfcompare.netlify.app |

kala-moz commented Jun 24, 2026

> @kala-moz I always have `-` in the significance column with this? Can you check e.g. on https://deploy-preview-1056--mozilla-perfcompare.netlify.app/compare-results?baseRev=51308dbe0e7e0810c9f6eec28ce49d34c49d6480&baseRepo=mozilla-central&newRev=64a89bfa4912122c11d6d2d381bd4c8d538b56e6&newRepo=mozilla-central&framework=13&test_version=mann-whitney-u&search=speedometer and https://deploy-preview-1056--mozilla-perfcompare.netlify.app/subtests-compare-results?baseRev=51308dbe0e7e0810c9f6eec28ce49d34c49d6480&baseRepo=mozilla-central&newRev=64a89bfa4912122c11d6d2d381bd4c8d538b56e6&newRepo=mozilla-central&framework=13&baseParentSignature=5854576&newParentSignature=5854576&test_version=mann-whitney-u, which are two recent revs on central.

Hmm, so the `-` means that it's not significant. However, that can be confusing because we also used to use `-` to signify that something is null or that there isn't any information. I think I will change the symbols here: S for Significant and NS for Not Significant.

Update: I forgot to consider the replicates. I also removed the expensive CI precompute from the loaders, since it was hurting performance.

Updated preview

Subtests preview
