maxtext-v0.2.0
Changes
- Qwen3-Next is now supported.
- New `tpu-post-train` target in PyPI. Please also use this installation option for running `vllm_decode`. See the MaxText installation instructions for more info.
- New MaxText structure! MaxText has been restructured according to RESTRUCTURE.md. Please feel free to share your thoughts and feedback.
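The new installation option above might be used as in the following sketch; the exact package spec is an assumption based on the target name, not a verified command:

```shell
# Sketch, assuming the PyPI extra matches the tpu-post-train target name
pip install 'maxtext[tpu-post-train]'
```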
- Muon optimizer is now supported.
- DeepSeek V3.1 is now supported. Use the existing DeepSeek V3 671B configs and load a V3.1 checkpoint to use the model.
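A run reusing the V3 config with a V3.1 checkpoint might look like the sketch below; the model name, flag names, and checkpoint path are assumptions for illustration, not verified against the repo:

```shell
# Hypothetical sketch: reuse the DeepSeek V3 671B config, point it at a V3.1 checkpoint.
# model_name and load_parameters_path values are assumptions.
python3 -m MaxText.train MaxText/configs/base.yml \
  model_name=deepseek3-671b \
  load_parameters_path=gs://your-bucket/deepseek-v3.1/checkpoint
```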
- New RL and SFT Notebook tutorials are available.
- The ReadTheDocs documentation site has been reorganized.
- Multi-host support for GSPO and GRPO is now available via new RL tutorials.
- A new guide, What is Post Training in MaxText?, is now available.
- The Ironwood TPU co-designed AI stack has been announced. Read the blog post on its co-design with MaxText.
- Optimized models tiering documentation has been refreshed.
- Added Versioning. Check out our first set of release notes!
- Post-Training (SFT, RL) via Tunix is now available.
- Vocabulary tiling (PR) is now supported in MaxText! Adjust config `num_vocab_tiling` to unlock more efficient memory usage.
- The GPT-OSS family of models (20B, 120B) is now supported.
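The vocabulary tiling option could be enabled with a config override as in this sketch; the value `2` is purely illustrative:

```yaml
# Illustrative config fragment; the num_vocab_tiling value here is an assumption,
# chosen only to show the key being set.
num_vocab_tiling: 2
```

Higher tiling values trade compute scheduling granularity for a smaller peak memory footprint on the vocabulary projection; consult the PR linked above for recommended settings.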
Deprecations
- Many MaxText modules have changed locations. Core commands like `train`, `decode`, `sft`, etc. will temporarily continue to work as expected. Please update your commands to the latest file locations.
- The `install_maxtext_github_deps` installation script has been replaced with `install_maxtext_tpu_github_deps`.
- `tools/setup/setup_post_training_requirements.sh` for post-training dependency installation is deprecated in favor of pip installation.