Refactor/torch autocast encapsulate global state#7946
Open
nathon-lee wants to merge 15 commits into deepspeedai:master from
Conversation
This reverts commit ff88670. Co-authored-by: nathon-lee <248585198+nathon-lee@users.noreply.github.com>
Revert "fix: update 1 file reformatted." (ff88670)
This reverts commit b90aee5.
Revert accidental Muon optimizer code re-introduction from copilot PRs
Signed-off-by: nathon-lee <leejianwoo@gmail.com>
tohtana
reviewed
Apr 3, 2026
Collaborator
tohtana
left a comment
Hi @nathon-lee, thank you for opening this PR!
_autocast_state is still global and doesn't seem to support different configs for multiple engines. Did I misunderstand something?
Contributor
Author
Signed-off-by: nathon-lee <leejianwoo@gmail.com>
Contributor
Author
@copilot review
refactor: replace bare global vars in torch_autocast with _AutocastState
TORCH_AUTOCAST_INITIALIZED and TORCH_AUTOCAST_DTYPE were module-level globals mutated via `global` statements inside init_autocast_params(). This pattern is fragile: it is invisible to type checkers, prevents isolation between multiple engine instances, and makes the state harder to reset in tests.

Replace them with a private _AutocastState dataclass instance, _autocast_state. The public API (is_autocast_initialized, get_autocast_dtype) is unchanged, so no call sites are affected.
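The refactor described above might look roughly like the following sketch. This is not the actual DeepSpeed code; the field names and the use of a plain string in place of torch.dtype are assumptions made to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class _AutocastState:
    """Replaces the former TORCH_AUTOCAST_INITIALIZED / TORCH_AUTOCAST_DTYPE globals."""
    initialized: bool = False
    dtype: Optional[Any] = None  # torch.dtype in the real module


_autocast_state = _AutocastState()


def init_autocast_params(dtype: Any) -> None:
    # Mutating the dataclass instance needs no `global` statement, is visible
    # to type checkers, and lets tests reset state by replacing the instance.
    _autocast_state.initialized = True
    _autocast_state.dtype = dtype


def is_autocast_initialized() -> bool:
    return _autocast_state.initialized


def get_autocast_dtype() -> Optional[Any]:
    return _autocast_state.dtype
```

Because the getters keep their signatures, call sites need no changes; only the internal storage moves from bare globals into the dataclass.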
fix: store autocast state per-engine to support multiple engine configs
Previously, _autocast_state was a module-level singleton in torch_autocast.py. When a second DeepSpeed engine called init_autocast_params(), it would overwrite the first engine's dtype and initialized state, making it impossible to run two engines with different autocast configurations concurrently.

Fix by attaching _AutocastState directly to the engine instance (engine._autocast_state). Update is_autocast_initialized() and get_autocast_dtype() to accept an engine argument. For ZeRO optimizers (which hold no engine reference), switch from the global state query to the per-parameter has_comm_dtype() check; parameters are already stamped by their own engine inside init_autocast_params(), so isolation is automatic.