fix: restore VSCode test discovery and make GPU isolation opt-in #605

Draft
planetf1 wants to merge 1 commit into generative-computing:main from planetf1:fix/issue-604-gpu-isolation

planetf1 commented Mar 7, 2026

Fix: Restore VSCode test discovery and make GPU isolation opt-in

Fixes #604

Type of PR

  • Bug Fix
  • New Feature
  • Documentation
  • Other

Description

Issue #604 reported three critical problems:

  1. Broken VSCode test discovery: pytest --collect-only would hang.
  2. Silent test skipping: 400+ tests were silently skipped due to automatic GPU isolation.
  3. Hidden failures: test failures were masked by hard pytest.exit() calls.

Solution: implemented a 4-guard architecture for opt-in GPU isolation:

  1. Test Discovery Guard - Never run isolation during --collect-only (fixes VSCode)
  2. Opt-in Guard - Only isolate when the --isolate-heavy flag is passed or CICD=1 is set
  3. Hardware Guard - Only applies to CUDA environments
  4. Single Module Guard - No isolation needed for a single module
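
For reviewers, the four guards above can be read as a chain of early returns. The following is a minimal sketch of that decision, not the code in this PR: `RunContext`, `should_isolate`, and the exact `CICD == "1"` comparison are hypothetical names and assumptions chosen for illustration.

```python
import os
from dataclasses import dataclass


@dataclass
class RunContext:
    """Hypothetical snapshot of the state the four guards inspect."""

    collect_only: bool    # True when pytest was invoked with --collect-only
    isolate_flag: bool    # True when --isolate-heavy was passed
    cuda_available: bool  # True on a CUDA-capable machine
    heavy_modules: int    # number of collected modules that want isolation


def should_isolate(ctx: RunContext, env=None) -> bool:
    """Return True only if every guard allows process isolation."""
    env = os.environ if env is None else env

    # Guard 1 (test discovery): never isolate while only collecting tests.
    if ctx.collect_only:
        return False
    # Guard 2 (opt-in): require the CLI flag or CICD=1 (assumed comparison).
    if not (ctx.isolate_flag or env.get("CICD") == "1"):
        return False
    # Guard 3 (hardware): isolation only matters on CUDA systems.
    if not ctx.cuda_available:
        return False
    # Guard 4 (single module): nothing to separate with one module or fewer.
    if ctx.heavy_modules <= 1:
        return False
    return True
```

Ordering the discovery guard first means `pytest --collect-only` returns before any environment or hardware check runs, which is the property the VSCode fix depends on.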

Changes:

  • Added --isolate-heavy CLI flag for explicit GPU process isolation
  • Added @pytest.mark.requires_gpu_isolation marker (applied to 4 heavy GPU test files)
  • Rewrote pytest_collection_finish with 4-guard architecture
  • Non-destructive execution - failures propagate, remaining tests continue
  • Removed dead code (import warnings, _collect_heavy_ram_modules())
  • Updated documentation (AGENTS.md, CONTRIBUTING.md, test/README.md, test/MARKERS_GUIDE.md)
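
As a rough illustration of the first two changes, a conftest.py can register the flag and the marker with standard pytest hooks. This is a sketch under assumptions: the help text and the marker description are invented here, and the actual wording in the PR may differ.

```python
# conftest.py (illustrative sketch, not the code from this PR)


def pytest_addoption(parser):
    """Register the opt-in flag named in this PR."""
    parser.addoption(
        "--isolate-heavy",
        action="store_true",
        default=False,
        # Help text is an assumption made for this example.
        help="Run heavy GPU test modules in separate processes (opt-in).",
    )


def pytest_configure(config):
    """Declare the marker so pytest does not warn about an unknown mark."""
    config.addinivalue_line(
        "markers",
        # Marker description is an assumption made for this example.
        "requires_gpu_isolation: module should run in its own process",
    )
```

A test file would then opt in at module level with `pytestmark = pytest.mark.requires_gpu_isolation`, and the flag's value can be read with `config.getoption("--isolate-heavy")`.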

Files: 9 files changed, 115 insertions, 75 deletions
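
The "non-destructive execution" change listed above could look roughly like the loop below: each heavy module runs in its own pytest process, a non-zero exit code is recorded rather than aborting the session, and the caller decides how to report the collected failures. `run_modules_isolated` and its `runner` parameter are hypothetical names for this sketch, which is not taken from the PR.

```python
import subprocess
import sys


def run_modules_isolated(module_paths, runner=None):
    """Run each module in its own pytest process; return the failing ones.

    `runner` is an injectable callable (path -> exit code) so the loop can be
    exercised without spawning real processes.
    """
    if runner is None:
        def runner(path):
            # A fresh interpreter per module, so GPU memory is released
            # when that process exits.
            return subprocess.run([sys.executable, "-m", "pytest", path]).returncode

    failed = []
    for path in module_paths:
        code = runner(path)
        if code != 0:
            # Record the failure and keep going so later modules still run.
            # Note: this simple check also counts pytest exit code 5
            # (no tests collected) as a failure.
            failed.append((path, code))
    return failed
```

Returning the list instead of calling `pytest.exit()` inside the loop keeps failures visible while letting the remaining modules execute.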

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code was added
  • Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation when the rest of the PR is populated)

Additional Validation:

  • Test discovery works instantly (pytest --collect-only)
  • Normal execution unaffected (no isolation overhead)
  • Hardware guard prevents isolation on non-CUDA systems
  • All code quality checks pass (ruff, mypy, pre-commit)
  • Example testing unaffected (48 examples collected)
  • Test collection with the VSCode plugin
  • Originator to test failing scenario
  • Run tests on LSF cluster

^^ DRAFT whilst verifying remaining areas

- Add --isolate-heavy CLI flag for explicit GPU isolation
- Add @pytest.mark.requires_gpu_isolation marker
- Rewrite pytest_collection_finish with 4-guard architecture
- Fix test discovery (pytest --collect-only now works instantly)
- Apply markers to all 4 heavy GPU test files
- Fix failure propagation from subprocesses
- Update documentation for new markers and flags

Fixes generative-computing#604

github-actions bot commented Mar 7, 2026

The PR description has been updated. Please fill out the template for your PR to be reviewed.

mergify bot commented Mar 7, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)(?:\(.+\))?:

Development

Successfully merging this pull request may close these issues.

Testing is generally broken
