Open
Description
Sample platform commit (found at the bottom of each page) : 66f057a
In raising this issue, I confirm the following (please check boxes, eg [X]):
- I have read and understood the contributors guide.
- I have checked that the bug-fix I am reporting can be replicated, or that the feature I am suggesting isn't already present.
- I have checked that the issue I'm posting isn't already reported.
- I have checked that the issue I'm posting isn't already solved and no duplicates exist in closed issues and in opened issues
- I have checked the pull requests tab for existing solutions/implementations to my issue/suggestion.
My familiarity with the project is as follows (check one, eg [X]):
- I have never visited/used the platform.
- I have used the platform just a couple of times.
- I have used the platform extensively, but have not contributed previously.
- I am an active contributor to the platform.
Summary
The recently added BaselineStatus model state still needs end-to-end integration in the CI and reporting flow.
Today the platform only partially understands never_worked:
- baseline state can exist at the model and schema level, but reporting can still misclassify failures
- baseline history should only be refreshed from trusted main-repo commit runs, not from unmerged PR or fork runs
- never_worked needs to be evaluated per platform using last_passed_on_linux and last_passed_on_windows; otherwise a test that has passed on Linux but never on Windows can be misreported on Windows
- historical migration backfill should trust known pass history and avoid over-classifying ambiguous old failures
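To illustrate the per-platform point above, here is a minimal sketch of how the check could look. Only the field names last_passed_on_linux and last_passed_on_windows come from this issue; the BaselineRecord class, the never_worked_on helper, and the commit value are hypothetical stand-ins, not the platform's actual model or API. The sketch also assumes the record has already been refreshed by a trusted run.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BaselineRecord:
    """Hypothetical per-test baseline row (not the platform's real model)."""
    # Commit hash of the last trusted passing run per platform, or None.
    last_passed_on_linux: Optional[str] = None
    last_passed_on_windows: Optional[str] = None


def never_worked_on(record: BaselineRecord, platform: str) -> bool:
    """Return True when no trusted pass is recorded for this platform."""
    if platform == "linux":
        return record.last_passed_on_linux is None
    if platform == "windows":
        return record.last_passed_on_windows is None
    raise ValueError(f"unknown platform: {platform}")


# A test that has passed on Linux only (the hash is an invented example).
record = BaselineRecord(last_passed_on_linux="abc1234")
linux_never_worked = never_worked_on(record, "linux")
windows_never_worked = never_worked_on(record, "windows")
```

With a single shared "has ever passed" flag, the Windows failure of this test would look like a regression; keeping the lookup per platform lets it be reported as never worked on Windows.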
Expected behavior
- completed main-repo commit runs refresh baseline status using full regression-result logic
- PR and fork runs do not mutate shared baseline history
- PR comments distinguish new failures, failures already failing on master, and tests that have never worked on the current platform
- migration backfill marks tests as established only when trusted pass history exists; otherwise they remain unknown until refreshed by trusted runs
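The expected behavior above could be sketched roughly as follows. All function names, parameters, and the FailureKind enum are hypothetical and chosen for illustration; the order in which the categories are checked is also an assumption, since the issue does not specify precedence. Only the "established" and "unknown" states and the three PR comment categories are taken from the text.

```python
from enum import Enum
from typing import Optional


class FailureKind(Enum):
    """The three categories a PR comment should distinguish."""
    NEW_FAILURE = "new failure"
    ALREADY_FAILING = "already failing on master"
    NEVER_WORKED = "never worked on this platform"


def may_refresh_baseline(is_main_repo: bool, is_commit_run: bool,
                         completed: bool) -> bool:
    """Only completed main-repo commit runs may update shared baseline history."""
    return is_main_repo and is_commit_run and completed


def classify_failure(last_passed_on_platform: Optional[str],
                     failing_on_master: bool) -> FailureKind:
    """Classify one failing test for the platform the PR run executed on."""
    # Assumed precedence: never-worked first, then already-failing.
    if last_passed_on_platform is None:
        return FailureKind.NEVER_WORKED
    if failing_on_master:
        return FailureKind.ALREADY_FAILING
    return FailureKind.NEW_FAILURE


def backfill_status(has_trusted_pass_history: bool) -> str:
    """Migration backfill: established only with trusted pass history."""
    return "established" if has_trusted_pass_history else "unknown"
```

In this sketch a PR or fork run simply never passes the may_refresh_baseline gate, so it can read baseline data to classify its failures but cannot write to it.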
Why this matters
This keeps CI failure signals accurate. Without it, contributors can get misleading PR comments and unmerged runs can pollute the baseline used to classify regressions.
Verification
Focused controller and regression tests cover trusted baseline refresh, no PR baseline mutation, platform-specific never-worked reporting, and existing PR comment behavior.