Skip to content

Removes Very Flaky and Less Flaky CI test jobs#5037

Merged
AntoineRichard merged 7 commits intoisaac-sim:developfrom
AntoineRichard:antoiner/remove-flaky-test-jobs
Mar 17, 2026
Merged

Removes Very Flaky and Less Flaky CI test jobs#5037
AntoineRichard merged 7 commits intoisaac-sim:developfrom
AntoineRichard:antoiner/remove-flaky-test-jobs

Conversation

@AntoineRichard
Copy link
Copy Markdown
Collaborator

Description

These tests are soon to be stable and can be run as part of the regular CI jobs.

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

These tests are now stable and can be run as part of the regular CI
jobs. Remove the dedicated flaky/slightly-flaky job definitions,
the FLAKY_TESTS and SLIGHTLY_FLAKY_TESTS lists, and all supporting
conftest.py and action.yml plumbing that segregated them.
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 16, 2026

Greptile Summary

This PR consolidates the two separate flaky-test CI tracks (test-flaky / test-slightly-flaky) into a single unified test-quarantined job, simultaneously promoting all previously-listed flaky and slightly-flaky tests back into normal CI runs (by leaving QUARANTINED_TESTS empty). The code changes are internally consistent across action.yml, build.yaml, conftest.py, and test_settings.py.

Key changes:

  • FLAKY_TESTS and SLIGHTLY_FLAKY_TESTS lists (14 + 2 entries) are removed and replaced by an empty QUARANTINED_TESTS: list[str] = [] — acting as a clean slate for future quarantining.
  • test-quarantined job is gated by a RUN_QUARANTINED_TESTS repo variable (skipped by default), with a comment in build.yaml explaining how to enable it.
  • conftest.py adds a graceful zero-exit path when quarantined_only=True but QUARANTINED_TESTS is empty, preventing a spurious CI failure.
  • combine-results correctly handles the skipped test-quarantined job via if: always() and continue-on-error: true on the artifact download step.

Main concern: All tests previously classified as FLAKY_TESTS (some with 0%–7.7% historical pass rates) are now unblocked and will run in normal CI. If they have not been fully stabilized, this may introduce intermittent failures that block unrelated PRs. The new QUARANTINED_TESTS mechanism exists precisely to handle this transitional period.

Confidence Score: 3/5

  • Safe to merge from a code-correctness standpoint, but carries a risk of introducing CI flakiness if the historically unstable tests are not yet stable.
  • All code changes are internally consistent and logically correct — argument counts, env variable names, and conditionals all line up. The main risk is operational: 14 previously quarantined tests (including some with 0%–7.7% historical pass rates) are now unblocked in normal CI. If those tests are truly stable this is a clean improvement; if any are still flaky it will add noise to every CI run on the develop branch.
  • tools/test_settings.py — the empty QUARANTINED_TESTS list means all formerly-flaky tests immediately run in normal CI; worth confirming each is genuinely stable before merging.

Important Files Changed

Filename Overview
.github/workflows/build.yaml Removes test-slightly-flaky job, renames test-flakytest-quarantined (gated by RUN_QUARANTINED_TESTS == 'true' repo variable), updates combine-results dependency list, and consolidates artifact names. The conditional skip on test-quarantined plays well with combine-results (which uses if: always() and continue-on-error: true on the artifact download).
.github/actions/run-tests/action.yml Merges two inputs (flaky-only, slightly-flaky-only) into one (quarantined-only); function signature reduced from 12 to 11 positional args and the call site updated consistently.
tools/conftest.py Replaces flaky_only/slightly_flaky_only flags with quarantined_only; adds a graceful zero-exit path when quarantined_only=True but QUARANTINED_TESTS is empty, preventing a spurious failure.
tools/test_settings.py Removes FLAKY_TESTS (14 tests, some with 0%–7.7% historical pass rates) and SLIGHTLY_FLAKY_TESTS (2 tests), replacing both with an empty QUARANTINED_TESTS list. All previously excluded tests now run in normal CI.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[PR / Push to develop] --> B[build]
    B --> C[test-isaaclab-tasks]
    B --> D[test-isaaclab-tasks-2]
    B --> E[test-general]
    B --> F[test-newton]
    B --> G[test-physx]
    B --> H[test-curobo]
    B --> I[test-environments-training]
    B --> J{RUN_QUARANTINED_TESTS\n== 'true'?}
    J -- yes --> K[test-quarantined\ncontinue-on-error: true\nQuarantined Tests only]
    J -- no --> L[test-quarantined\nskipped]
    C & D & E & F & G & H & I & K & L --> M[combine-results\nif: always]
    M --> N[Upload combined XML report]

    style K fill:#f9c74f,color:#000
    style L fill:#dee2e6,color:#555
    style J fill:#adb5bd,color:#000
Loading

Comments Outside Diff (2)

  1. tools/test_settings.py, line 78-85 (link)

    Previously flaky tests now run in normal CI

    All tests previously in FLAKY_TESTS and SLIGHTLY_FLAKY_TESTS — some with historically very low pass rates (e.g. test_rigid_object_collection_iface.py at 0%, test_environments_newton.py at ~6%, test_articulation_iface.py at ~8%) — are no longer excluded from regular CI runs. If any of them haven't been stabilized yet, this will introduce intermittent failures in normal CI jobs that could block unrelated PRs.

    The PR description says "soon to be stable", which implies the fixes may still be in flight. Consider quarantining at least the lowest-pass-rate tests until they have demonstrated stability, e.g.:

    QUARANTINED_TESTS: list[str] = [
        "test_rigid_object_collection_iface.py",
        "test_environments_newton.py",
        "test_articulation_iface.py",
    ]

    This way the new test-quarantined infrastructure (which was purpose-built for exactly this case) can be used before fully promoting them to normal CI.

  2. .github/workflows/build.yaml, line 520-531 (link)

    Fork PR check exits 1 when quarantined list is empty

    When RUN_QUARANTINED_TESTS is true but QUARANTINED_TESTS is empty, conftest.py correctly exits with code 0 — but no XML report is written. For fork PRs, the Check Test Results step then falls into the else branch and calls exit 1. The job itself won't fail (due to continue-on-error: true) and the updated message is helpful, but a cleaner approach would be to also exit 0 in this case:

    This makes the step status actually reflect the situation (nothing to test = success) rather than relying on continue-on-error to swallow the error.

Last reviewed commit: 8c801e8

Copy link
Copy Markdown
Collaborator

@pascal-roth pascal-roth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it that we have no more flaky tests but wouldnt that not be a useful tool for future tests?

@AntoineRichard
Copy link
Copy Markdown
Collaborator Author

I like it that we have no more flaky tests but wouldnt that not be a useful tool for future tests?

Oh we do very much have flaky tests. I just removed the CI jobs name Flaky X.

Copy link
Copy Markdown
Collaborator

@pascal-roth pascal-roth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it, just remove the selection of some tests that were named flaky (@flaky pytest arg still exists)

@myurasov-nv
Copy link
Copy Markdown
Member

myurasov-nv commented Mar 16, 2026

+1 to @pascal-roth. I think it would be nice to leave in the test quarantining functionality but disable it (and empty quarantined test list) for now since a magnificent effort by @AntoineRichard made or tests great again. :)

Brings tears of joy to my eyes:

image

@myurasov-nv
Copy link
Copy Markdown
Member

Regarding use of @flaky decorator - I think it just hides the problems, unless test case explicitly is meant for succeeding sometimes. In which case it probably can be re-written to attempt things few times.

@myurasov-nv
Copy link
Copy Markdown
Member

@greptile-apps

…if vars.RUN_QUARANTINED_TESTS is set to true)
@myurasov-nv
Copy link
Copy Markdown
Member

myurasov-nv commented Mar 17, 2026

@greptile-apps - error on empty QUARANTINED_TESTS is desired. If you enable vars.RUN_QUARANTINED_TESTS, why run 0 tests?

@myurasov-nv
Copy link
Copy Markdown
Member

myurasov-nv commented Mar 17, 2026

@greptile-apps

@AntoineRichard AntoineRichard merged commit 1558014 into isaac-sim:develop Mar 17, 2026
8 checks passed
myurasov-nv added a commit to myurasov-nv/IsaacLab that referenced this pull request Mar 20, 2026
# Description

These tests are soon to be stable and can be run as part of the regular
CI jobs.

## Type of change

- New feature (non-breaking change which adds functionality)

## Checklist

- [ ] I have read and understood the [contribution
guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html)
- [ ] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [ ] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

---------

Co-authored-by: Mikhail Yurasov <myurasov@nvidia.com>
daniela-hase pushed a commit to daniela-hase/IsaacLab that referenced this pull request Mar 30, 2026
# Description

These tests are soon to be stable and can be run as part of the regular
CI jobs.

## Type of change

- New feature (non-breaking change which adds functionality)

## Checklist

- [ ] I have read and understood the [contribution
guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html)
- [ ] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [ ] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

---------

Co-authored-by: Mikhail Yurasov <myurasov@nvidia.com>
hujc7 pushed a commit to hujc7/IsaacLab that referenced this pull request Mar 30, 2026
# Description

These tests are soon to be stable and can be run as part of the regular
CI jobs.

## Type of change

- New feature (non-breaking change which adds functionality)

## Checklist

- [ ] I have read and understood the [contribution
guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html)
- [ ] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [ ] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

---------

Co-authored-by: Mikhail Yurasov <myurasov@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants