Add backfill support and fix topology-sensitive green context tests #2031
Merged
leofang merged 1 commit into NVIDIA:main on May 6, 2026
…tests

Add `backfill` field to SMResourceOptions. When True, sets CU_DEV_SM_RESOURCE_GROUP_BACKFILL on each group's flags, allowing the driver to relax the co-scheduling constraint when assigning SMs to groups. This enables requesting arbitrary aligned SM counts that the driver would otherwise reject due to hardware topology constraints.

Fix test_two_groups and related tests: replace _aligned_half (which computed half the SMs rounded to alignment, not always a valid partition on all GPU topologies) with _safe_two_group_count (uses min_partition_size, always valid). Add dedicated test_two_groups_backfill that exercises the aggressive even-split with backfill=True.

Fixes NVIDIA#2025.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
/ok to test d703209
mdboom approved these changes on May 6, 2026
LGTM. And I confirmed the tests are passing on my local machine where I first found the issue with the tests.
Fixes #2025.
Summary
- Add `backfill` field to `SMResourceOptions` for `SMResource.split()`.
Details
Backfill support
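As a rough illustration of the `bool | Sequence[bool]` shape of the `backfill` field described in this section, the sketch below expands such an option into one flag value per group. It is not the code from this PR: `per_group_flags`, the length check, and the numeric constants are hypothetical stand-ins, and the real flag values come from the CUDA driver headers.

```python
# Placeholder values for illustration only; the real constants are defined
# by the CUDA driver and are not reproduced from this PR.
GROUP_DEFAULT = 0
GROUP_BACKFILL = 1


def per_group_flags(backfill, num_groups):
    """Expand a `bool | Sequence[bool]` option into one flag value per group.

    Hypothetical helper: a single bool applies to every group, while a
    sequence must supply exactly one bool per group.
    """
    if isinstance(backfill, bool):
        values = [backfill] * num_groups
    else:
        values = list(backfill)
        if len(values) != num_groups:
            raise ValueError(
                f"expected {num_groups} backfill values, got {len(values)}"
            )
    return [GROUP_BACKFILL if v else GROUP_DEFAULT for v in values]
```

With this sketch, `per_group_flags(True, 2)` marks both groups for backfill, while `per_group_flags([True, False], 2)` marks only the first.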
`SMResourceOptions` gains a `backfill` field (`bool | Sequence[bool]`, default `False`). When `True`, sets `CU_DEV_SM_RESOURCE_GROUP_BACKFILL` on the group's flags, allowing the driver to relax the co-scheduling constraint when assigning SMs to groups. This enables requesting arbitrary aligned SM counts that would otherwise be rejected due to hardware topology constraints.
Without backfill, `cuDevSmResourceSplit` may reject requests even when the total SM count is sufficient and the requested count is properly aligned to `coscheduledSmCount`. This is because the driver's internal assignment algorithm has additional constraints beyond simple alignment. With `backfill=True`, the driver satisfies the requested SM count but some SMs may not have the co-scheduling guarantee.
See the CUDA driver API documentation for details on `CU_DEV_SM_RESOURCE_GROUP_BACKFILL`.
Test fixes
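The difference between the two test helpers named in this section can be sketched with plain arithmetic. The function bodies below are guesses based only on the descriptions in this PR ("half the SMs rounded to alignment" and "uses `min_partition_size`"), not the actual test code, and the device numbers in the usage note are made up.

```python
def aligned_half(total_sms, alignment):
    """Half of the SMs, rounded down to a multiple of `alignment`.

    Assumed shape of the old heuristic: the result is aligned, but the
    driver may still reject it on some GPU topologies.
    """
    return (total_sms // 2 // alignment) * alignment


def safe_two_group_count(total_sms, min_partition_size):
    """Per-group SM count that relies only on the minimum partition size.

    Assumed shape of the new helper: returns None when two minimum-size
    groups do not fit on the device.
    """
    if 2 * min_partition_size > total_sms:
        return None
    return min_partition_size
```

For a hypothetical device with 132 SMs, an alignment of 8, and a minimum partition size of 8, `aligned_half(132, 8)` asks for 64 SMs per group, whereas `safe_two_group_count(132, 8)` asks for only 8. The larger request is the kind of aggressive even split that the PR moves into a dedicated test using `backfill=True`.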
Replace `_aligned_half` heuristic (computed half the SMs rounded to alignment, not always a valid partition on all GPU topologies) with `_safe_two_group_count` (uses `min_partition_size`, always valid). Add dedicated `test_two_groups_backfill` that exercises the aggressive even-split with `backfill=True`.
Validation
-- Leo's bot