Print logic axes #2985
richjames0
left a comment
lgtm although note my pending comment
NuojCheng
left a comment
Thank you for the PR! Made some comments and suggestions. It would be nice to add a real train test in the PR description.
Thanks for the suggestion. I added a log from a real training run; please check.
fcb87c0 into AI-Hypercomputer:main
Description
This change builds on PR#2866 and PR#2926 to add functionality for printing logical axes.
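As a rough illustration of what printing logical axes involves, the sketch below resolves logical axis names to mesh axes through a rule table and formats one debug line per array. It is not the code in this PR: the rule table, the function names, the array name, and the output format are all hypothetical, and it uses plain Python rather than MaxText or JAX APIs.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical logical-to-mesh rules (NOT MaxText's actual rule set).
# Each logical axis name maps to the mesh axis it is sharded over.
ASSUMED_RULES = {
    "exp": "expert",
    "embed": "fsdp",
    "mlp": "tensor",
}


def resolve_logical_axes(
    logical_axes: Sequence[Optional[str]],
    rules: dict = ASSUMED_RULES,
) -> Tuple[Optional[str], ...]:
    """Map each logical axis to a mesh axis; None means no rule matched."""
    return tuple(rules.get(axis) if axis is not None else None for axis in logical_axes)


def format_sharding_line(name: str, shape: Sequence[int], logical_axes: Sequence[Optional[str]]) -> str:
    """Build one debug line showing shape, logical axes, and resolved mesh axes."""
    mesh_axes = resolve_logical_axes(logical_axes)
    return f"{name}: shape={tuple(shape)} logical={tuple(logical_axes)} mesh={mesh_axes}"


if __name__ == "__main__":
    # Hypothetical weight name and shape, for illustration only.
    print(format_sharding_line("moe_block/wi_0", (256, 7168, 2048), ("exp", "embed", "mlp")))
```

Showing the logical names next to the resolved mesh axes makes it easier to tell whether an unexpected sharding comes from the layer's axis annotation or from the rule table.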
Tests
Command:
python -m MaxText.train_compile MaxText/configs/base.yml compile_topology=v5p-1024 compile_topology_num_slices=1 model_name=deepseek3-671b per_device_batch_size=1 ici_tensor_parallelism=8 ici_expert_parallelism=8 log_config=false debug_sharding=true
Output example:
Log for a real train: https://paste.googleplex.com/6519526529302528
Weights with logical axes:
Activation Logical Axes: