maxtext-v0.2.0
Changes
- Qwen3-Next is now supported.
- New `tpu-post-train` target in PyPI. Please also use this installation option for running `vllm_decode`. See the MaxText installation instructions for more info.
- New MaxText structure! MaxText has been restructured according to RESTRUCTURE.md. Please feel free to share your thoughts and feedback.
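The new installation option above might be used as in the following sketch; the exact package spec is an assumption based on the target name, not a verified command:

```shell
# Sketch, assuming the PyPI extra matches the tpu-post-train target name
pip install 'maxtext[tpu-post-train]'
```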
- Muon optimizer is now supported.
- DeepSeek V3.1 is now supported. Use the existing DeepSeek V3 671B configs and load a V3.1 checkpoint to use the model.
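A run reusing the V3 config with a V3.1 checkpoint might look like the sketch below; the model name, flag names, and checkpoint path are assumptions for illustration, not verified against the repo:

```shell
# Hypothetical sketch: reuse the DeepSeek V3 671B config, point it at a V3.1 checkpoint.
# model_name and load_parameters_path values are assumptions.
python3 -m MaxText.train MaxText/configs/base.yml \
  model_name=deepseek3-671b \
  load_parameters_path=gs://your-bucket/deepseek-v3.1/checkpoint
```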
- New RL and SFT Notebook tutorials are available.
- The ReadTheDocs documentation site has been reorganized.
- Multi-host support for GSPO and GRPO is now available via new RL tutorials.
- A new guide, What is Post Training in MaxText?, is now available.
- The Ironwood TPU co-designed AI stack has been announced. Read the blog post on its co-design with MaxText.
- Optimized models tiering documentation has been refreshed.
- Added Versioning. Check out our first set of release notes!
- Post-Training (SFT, RL) via Tunix is now available.
- Vocabulary tiling (PR) is now supported in MaxText! Adjust config `num_vocab_tiling` to unlock more efficient memory usage.
- The GPT-OSS family of models (20B, 120B) is now supported.
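The vocabulary tiling option could be enabled with a config override as in this sketch; the value `2` is purely illustrative:

```yaml
# Illustrative config fragment; the num_vocab_tiling value here is an assumption,
# chosen only to show the key being set.
num_vocab_tiling: 2
```

Higher tiling values trade compute scheduling granularity for a smaller peak memory footprint on the vocabulary projection; consult the PR linked above for recommended settings.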
Deprecations
- Many MaxText modules have changed locations. Core commands like `train`, `decode`, `sft`, etc. will temporarily continue to work as expected. Please update your commands to the latest file locations.
- The `install_maxtext_github_deps` installation script has been replaced with `install_maxtext_tpu_github_deps`.
- `tools/setup/setup_post_training_requirements.sh` for post-training dependency installation is deprecated in favor of pip installation.