docs(readme): update convergence table, latest news, and outdated links #2638
base: main
Conversation
Greptile Overview

Greptile Summary: This PR updates

Confidence Score: 5/5
Important Files Changed
Sequence Diagram

sequenceDiagram
    participant Author as PR Author
    participant Readme as README.rst
    participant Reader as User
    participant Ext as External Links
    Author->>Readme: "Update News/Highlights/Examples"
    Author->>Readme: "Refresh FP8 convergence table"
    Author->>Readme: "Update integration links"
    Reader->>Readme: "Read docs and follow links"
    Readme->>Ext: "Open FP8 Primer and integration URLs"
    Ext-->>Reader: "Render external pages"
1 file reviewed, 1 comment
README.rst
Outdated
    loss = out.sum()
    loss.backward()

For a tutorial with more details, see the `Quickstart Notebook <https://github.com/NVIDIA/TransformerEngine/blob/main/docs/examples/quickstart.ipynb>`_.
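For context, the two `loss` lines above are the tail of the README's PyTorch quickstart snippet. A minimal, self-contained sketch of that surrounding example (assuming the README's use of `te.Linear`, a `DelayedScaling` recipe, and `te.fp8_autocast` from the Transformer Engine PyTorch API; dimensions are illustrative):

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# Toy dimensions for a single layer.
in_features, out_features, batch = 768, 3072, 2048

# One FP8-capable linear layer and a random input on the GPU.
model = te.Linear(in_features, out_features, bias=True)
inp = torch.randn(batch, in_features, device="cuda")

# FP8 recipe controlling scaling behavior; arguments are optional.
fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.E4M3)

# Run the forward pass under FP8 autocasting, then backprop as usual.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    out = model(inp)

loss = out.sum()
loss.backward()
```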
The referenced quickstart.ipynb file does not exist in docs/examples/. The actual notebooks in that directory are fp8_primer.ipynb, advanced_optimizations.ipynb, and te_jax_integration.ipynb. Consider using one of these existing notebooks or creating the quickstart notebook before merging.
Force-pushed from 3f01d10 to 98726c5
1 file reviewed, 1 comment
README.rst
Outdated
| LLM-8B | Megatron Core | https://arxiv.org/abs/2506.08027 |
+------------+------------------+---------------------------------------------------------------------------------------------------------+
| MPT-13B | Mosaic Composer | https://www.databricks.com/blog/turbocharged-training-optimizing-databricks-mosaic-ai-stack-fp8 |
+------------+------------------+---------------------------------------------------------------------------------------------------------+
| GPT-22B | NeMo Framework | Available on request |
| MoE-16B | Megatron Core | https://arxiv.org/abs/2506.08027 |
Extra whitespace in the "Megatron Core" framework names, inconsistent with the other rows.
| LLM-8B | Megatron Core | https://arxiv.org/abs/2506.08027 |
+------------+------------------+---------------------------------------------------------------------------------------------------------+
| MPT-13B | Mosaic Composer | https://www.databricks.com/blog/turbocharged-training-optimizing-databricks-mosaic-ai-stack-fp8 |
+------------+------------------+---------------------------------------------------------------------------------------------------------+
| GPT-22B | NeMo Framework | Available on request |
| MoE-16B | Megatron Core | https://arxiv.org/abs/2506.08027 |
| LLM-8B | Megatron Core | https://arxiv.org/abs/2506.08027 |
+------------+------------------+---------------------------------------------------------------------------------------------------------+
| MPT-13B | Mosaic Composer | https://www.databricks.com/blog/turbocharged-training-optimizing-databricks-mosaic-ai-stack-fp8 |
+------------+------------------+---------------------------------------------------------------------------------------------------------+
| MoE-16B | Megatron Core | https://arxiv.org/abs/2506.08027 |
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
1 file reviewed, 2 comments
| LLama2-7B | Alibaba Pai | https://mp.weixin.qq.com/s/NQT0uKXLbXyh5031zBdeBQ |
+------------+------------------+---------------------------------------------------------------------------------------------------------+
| T5-11B | JAX/T5x | Available on request |
| LLM-8B | Megatron Core | https://arxiv.org/abs/2506.08027 |
Extra space before "Megatron Core": this row should have 2 spaces like the other rows, not 3.
| LLM-8B | Megatron Core | https://arxiv.org/abs/2506.08027 |
| LLM-8B | Megatron Core | https://arxiv.org/abs/2506.08027 |
| MPT-13B | Mosaic Composer | https://www.databricks.com/blog/turbocharged-training-optimizing-databricks-mosaic-ai-stack-fp8 |
+------------+------------------+---------------------------------------------------------------------------------------------------------+
| GPT-22B | NeMo Framework | Available on request |
| MoE-16B | Megatron Core | https://arxiv.org/abs/2506.08027 |
Extra space before "Megatron Core": this row should have 2 spaces like the other rows, not 3.
| MoE-16B | Megatron Core | https://arxiv.org/abs/2506.08027 |
| MoE-16B | Megatron Core | https://arxiv.org/abs/2506.08027 |
…t info
- Add MXFP8 and NVFP4 format support to highlights and description
- Update FP8 convergence table with MXFP8 results from arxiv paper
- Remove outdated JAX-Toolbox links and "available on request" entries
- Update Docker container versions to 26.01
- Fix DeepSpeed and Lightning integration links
- Add Nemotron 3 paper to Latest News
- Add quickstart notebook link after PyTorch example

Signed-off-by: Santosh Bhavani <santosh.bhavani@live.com>
- Replace quickstart.ipynb link with fp8_primer.ipynb (file exists)
- Fix extra whitespace in Megatron Core table rows

Signed-off-by: Santosh Bhavani <santosh.bhavani@live.com>
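Since the first commit above adds MXFP8 and NVFP4 to the supported-format highlights, here is a brief sketch of how a reader might swap FP8 recipes in the PyTorch API. `DelayedScaling` is the established per-tensor FP8 recipe; the `MXFP8BlockScaling` class name is assumed here and may vary across Transformer Engine releases, and NVFP4 support is newer still:

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

model = te.Linear(768, 3072, bias=True)
inp = torch.randn(2048, 768, device="cuda")

# Per-tensor delayed-scaling FP8 recipe (established path).
fp8_recipe = recipe.DelayedScaling(fp8_format=recipe.Format.HYBRID)

# Block-scaled MXFP8 recipe (assumed class name; needs hardware support).
# fp8_recipe = recipe.MXFP8BlockScaling()

# The same autocast context is used regardless of the recipe chosen.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    out = model(inp)
out.sum().backward()
```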
Force-pushed from 90af7f4 to d5922a3
1 file reviewed, no comments
Description
Updates the README to add missing format-support documentation, update the news section, and fix broken/outdated links.
Type of change
Changes
Checklist: