4 changes: 3 additions & 1 deletion .github/workflows/run_jupyter_notebooks.yml
@@ -87,11 +87,13 @@ jobs:
       - name: Run Post-Training Notebooks
         shell: bash
         env:
           PYTHONPATH: "${{ github.workspace }}/src"
           HF_TOKEN: ${{ secrets.HF_TOKEN }}
         run: |
           MAXTEXT_REPO_ROOT=$(pwd)
-          MAXTEXT_NOTEBOOKS_ROOT="$MAXTEXT_REPO_ROOT/src/MaxText/examples"
+          MAXTEXT_NOTEBOOKS_ROOT="$MAXTEXT_REPO_ROOT/src/maxtext/examples"

           shopt -s nullglob # Prevents loop from running if no files match
           for notebook in "$MAXTEXT_NOTEBOOKS_ROOT"/{sft,rl}*.ipynb; do
             filename=$(basename "$notebook")
             output_name="${filename%.ipynb}_output.ipynb"
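Aside (not part of this diff): the loop body that actually executes each notebook is collapsed above. Purely as an illustrative sketch (the real workflow step may differ), a loop of this shape is often driven with `jupyter nbconvert`:

```bash
#!/usr/bin/env bash
# Hedged sketch only: run every SFT/RL example notebook headlessly and
# write <name>_output.ipynb at the repo root. Assumes jupyter/nbconvert
# are installed in the active environment.
MAXTEXT_REPO_ROOT=$(pwd)
MAXTEXT_NOTEBOOKS_ROOT="$MAXTEXT_REPO_ROOT/src/maxtext/examples"

shopt -s nullglob  # skip the loop entirely if no notebooks match
for notebook in "$MAXTEXT_NOTEBOOKS_ROOT"/{sft,rl}*.ipynb; do
  filename=$(basename "$notebook")
  output_name="${filename%.ipynb}_output.ipynb"
  jupyter nbconvert --to notebook --execute "$notebook" \
    --output "$output_name" --output-dir "$MAXTEXT_REPO_ROOT"
done
```

The rendered `*_output.ipynb` files then double as CI artifacts showing which cell, if any, failed.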
2 changes: 1 addition & 1 deletion README.md
@@ -43,7 +43,7 @@ See our guide on running MaxText in decoupled mode, without any GCP dependencies

 * \[December 22, 2025\] [Muon optimizer](https://kellerjordan.github.io/posts/muon) is now supported.
 * \[December 10, 2025\] DeepSeek V3.1 is now supported. Use the existing [DeepSeek V3 671B](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/configs/models/deepseek3-671b.yml) configs and load a V3.1 checkpoint to use the model.
-* \[December 9, 2025\] [New RL and SFT Notebook tutorials](https://github.com/AI-Hypercomputer/maxtext/tree/main/src/MaxText/examples) are available.
+* \[December 9, 2025\] [New RL and SFT Notebook tutorials](https://github.com/AI-Hypercomputer/maxtext/tree/main/src/maxtext/examples) are available.
 * \[December 4, 2025\] The [ReadTheDocs documentation site](https://maxtext.readthedocs.io/en/latest/index.html) has been reorganized.
 * \[December 3, 2025\] Multi-host support for GSPO and GRPO is now available via [new RL tutorials](https://maxtext.readthedocs.io/en/latest/tutorials/posttraining/rl_on_multi_host.html).
 * \[November 20, 2025\] A new guide, [What is Post Training in MaxText?](https://maxtext.readthedocs.io/en/latest/tutorials/post_training_index.html), is now available.
2 changes: 1 addition & 1 deletion codecov.yml
@@ -34,7 +34,7 @@ fixes:
 ignore:
   - "src/maxtext/assets"
   - "src/MaxText/configs"
-  - "src/MaxText/examples"
+  - "src/maxtext/examples"
   - "src/MaxText/experimental"
   - "src/MaxText/inference"
   - "src/MaxText/inference_mlperf"
67 changes: 40 additions & 27 deletions docs/guides/run_python_notebook.md
@@ -19,6 +19,7 @@ Before starting, make sure you have:
 - ✅ Basic familiarity with Jupyter, Python, and Git

 **For Method 2 (Visual Studio Code) and Method 3 (Local Jupyter Lab) only:**
+
 - ✅ A Google Cloud Platform (GCP) account with billing enabled
 - ✅ TPU quota available in your region (check under IAM & Admin → Quotas)
 - ✅ `tpu.nodes.create` permission to create a TPU VM
@@ -36,16 +36,18 @@ Currently, this method only supports the **`sft_qwen3_demo.ipynb`** notebook, wh

 Before proceeding, please verify that the specific notebook you are running works reliably on the free-tier TPU resources. If you encounter frequent disconnections or resource limitations, you may need to:

-* Upgrade to a Colab Pro or Pro+ subscription for more stable and powerful TPU access.
+- Upgrade to a Colab Pro or Pro+ subscription for more stable and powerful TPU access.

-* Move to local Jupyter Lab setup method with access to a powerful TPU machine.
+- Move to the local Jupyter Lab setup method with access to a powerful TPU machine.

 ### Step 1: Choose an Example
-1.a. Visit the [MaxText examples directory](https://github.com/AI-Hypercomputer/maxtext/tree/main/src/MaxText/examples) on Github.
+
+1.a. Visit the [MaxText examples directory](https://github.com/AI-Hypercomputer/maxtext/tree/main/src/maxtext/examples) on GitHub.

 1.b. Find the notebook you want to run (e.g., `sft_qwen3_demo.ipynb`) and copy its URL.

 ### Step 2: Import into Colab
+
 2.a. Go to [Google Colab](https://colab.research.google.com/) and sign in.

 2.b. Select **File** -> **Open Notebook**.
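As a hedged aside (not part of this diff): instead of the **File** -> **Open Notebook** dialog, Colab can also open a GitHub-hosted notebook directly through its `/github/` URL scheme; the notebook path below is an assumed example:

```bash
# Illustrative only: open an example notebook straight from GitHub in Colab.
# (Use `open` instead of `xdg-open` on macOS, or simply paste the URL into a browser.)
xdg-open "https://colab.research.google.com/github/AI-Hypercomputer/maxtext/blob/main/src/maxtext/examples/sft_qwen3_demo.ipynb"
```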
@@ -63,9 +66,11 @@ Before proceeding, please verify that the specific notebook you are running work
 3.c. Click **Save**

 ### Step 4: Run the Notebook
+
 Follow the instructions within the notebook cells to install dependencies and run the training/inference.

 ## Method 2: Visual Studio Code with TPU (Recommended)
+
 Running Jupyter notebooks in Visual Studio Code (VS Code) provides a powerful, interactive environment that combines the flexibility of notebooks with the robust features of a code editor. Follow these steps to get your environment up and running.

 ### Step 1: Set Up TPU VM
@@ -75,9 +80,10 @@ In Google Cloud Console, create a standalone TPU VM:
 1.a. **Compute Engine** → **TPUs** → **Create TPU**

 1.b. Example config:
-  - **Name:** `maxtext-tpu-node`
-  - **TPU type:** Choose your desired TPU type
-  - **Runtime Version:** `tpu-ubuntu2204-base` (or other compatible runtime)
+
+- **Name:** `maxtext-tpu-node`
+- **TPU type:** Choose your desired TPU type
+- **Runtime Version:** `tpu-ubuntu2204-base` (or other compatible runtime)

 ### Step 2: SSH to TPU-VM via VS Code

@@ -86,11 +92,12 @@
 2.b. Follow the [Connect to a remote host](https://code.visualstudio.com/docs/remote/ssh#_connect-to-a-remote-host) guide to connect to your TPU-VM via VS Code.

 ### Step 3. Install Necessary Extensions on VS Code
+
 To enable notebook support, you must install two official extensions from the VS Code Marketplace:

-* Python Extension: Provides support for the Python language.
+- Python Extension: Provides support for the Python language.

-* Jupyter Extension: Enables you to create, edit, and run `.ipynb` files directly inside VS Code.
+- Jupyter Extension: Enables you to create, edit, and run `.ipynb` files directly inside VS Code.

 To install, click the `Extensions` icon on the left sidebar (or press `Ctrl+Shift+X` or `Cmd+Shift+X`), search for `Jupyter` and `Python`, and click `Install`.

@@ -99,6 +106,7 @@ To install, click the `Extensions` icon on the left sidebar (or press `Ctrl+Shif
 To execute post-training notebooks on your TPU-VM, follow the official [MaxText installation guides](https://maxtext.readthedocs.io/en/latest/tutorials/posttraining/rl.html#create-virtual-environment-and-install-maxtext-dependencies) to install MaxText and its dependencies inside a dedicated virtual environment.

 ### Step 5: Install the necessary library for Jupyter
+
 Jupyter requires a kernel to execute code. This kernel is tied to a specific Python environment. Open your terminal inside VS Code and run:

 ```bash
[...]
@@ -110,9 +118,9 @@ uv pip install ipykernel
 Before you can run the notebook, you must tell VS Code which Python environment to use.

 1. Look at the top-right corner of the notebook editor.
-2. Click `Select Kernel`.
-3. Choose Python Environments and select the virtual environment you created in Step 4.
-4. Open [available post-training notebooks in MaxText](#available-examples) inside VS Code and run the jupyter notebook cells.
+1. Click `Select Kernel`.
+1. Choose Python Environments and select the virtual environment you created in Step 4.
+1. Open the [available post-training notebooks in MaxText](#available-examples) inside VS Code and run the Jupyter notebook cells.

 ## Method 3: Local Jupyter Lab with TPU (Recommended)

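An aside, not part of this diff: if the environment does not appear in the kernel picker, you can register it as a named Jupyter kernel yourself. The venv path and kernel name below are illustrative assumptions, not repo conventions:

```bash
# Hedged sketch: expose the Step 4 virtual environment to Jupyter/VS Code.
source ~/maxtext/.venv/bin/activate       # venv path is an assumption
python -m ipykernel install --user \
  --name maxtext-venv \
  --display-name "Python (maxtext-venv)"  # label shown in the kernel picker
```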
@@ -125,12 +133,15 @@ In Google Cloud Console, create a standalone TPU VM:
 1.a. **Compute Engine** → **TPUs** → **Create TPU**

 1.b. Example config:
-  - **Name:** `maxtext-tpu-node`
-  - **TPU type:** Choose your desired TPU type
-  - **Runtime Version:** `tpu-ubuntu2204-base` (or other compatible runtime)
+
+- **Name:** `maxtext-tpu-node`
+- **TPU type:** Choose your desired TPU type
+- **Runtime Version:** `tpu-ubuntu2204-base` (or other compatible runtime)

 ### Step 2: Connect with Port Forwarding
+
 Run the following command on your local machine:
+
 > **Note**: The `--` separator before the `-L` flag is required. This tunnels the remote port 8888 to your local machine securely.

 ```bash
[...]
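The command block itself is collapsed in this diff view. As a hedged illustration only (the TPU name and zone are assumed example values, not taken from this PR), a typical tunnel from a local machine to the TPU VM looks like:

```bash
# Illustrative sketch: SSH to the TPU VM and forward its port 8888 to localhost.
# Everything after "--" is passed to ssh; -L establishes the port forward.
gcloud compute tpus tpu-vm ssh maxtext-tpu-node \
  --zone=us-central2-b \
  -- -L 8888:localhost:8888
```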
@@ -170,13 +181,15 @@ jupyter lab --ip=0.0.0.0 --port=8888 --no-browser --allow-root
 ```

 ### Step 7: Access the Notebook
+
 7.a. Look at the terminal output for a URL that looks like: `http://127.0.0.1:8888/lab?token=...`.

 7.b. Copy that URL.

 7.c. Paste it into your **local computer's browser**.
-* **Important:** If you changed the port in Step 2 (e.g., to `9999`), you must manually replace `8888` in the URL with `9999`.
-* *Example:* `http://127.0.0.1:9999/lab?token=...`
+
+- **Important:** If you changed the port in Step 2 (e.g., to `9999`), you must manually replace `8888` in the URL with `9999`.
+- *Example:* `http://127.0.0.1:9999/lab?token=...`

 7.d. Once the interface opens in your browser, click on the current kernel name (e.g., `Python 3 (ipykernel)`).

@@ -197,13 +210,13 @@

 ## Common Pitfalls & Debugging

-| Issue | Solution |
-|-------|----------|
-| ❌ TPU runtime mismatch | Check TPU runtime version matches VM image |
-| ❌ Colab disconnects | Save checkpoints to GCS or Drive regularly |
-| ❌ "RESOURCE_EXHAUSTED" errors | Use smaller batch size or v5e-8 instead of v5e-1 |
-| ❌ Firewall blocked | Ensure port 8888 open, or always use SSH tunneling |
-| ❌ Path confusion | In Colab use `/content/maxtext`; in TPU VM use `~/maxtext` |
+| Issue                          | Solution                                                    |
+| ------------------------------ | ----------------------------------------------------------- |
+| ❌ TPU runtime mismatch        | Check TPU runtime version matches VM image                  |
+| ❌ Colab disconnects           | Save checkpoints to GCS or Drive regularly                  |
+| ❌ "RESOURCE_EXHAUSTED" errors | Use smaller batch size or v5e-8 instead of v5e-1            |
+| ❌ Firewall blocked            | Ensure port 8888 open, or always use SSH tunneling          |
+| ❌ Path confusion              | In Colab use `/content/maxtext`; in TPU VM use `~/maxtext`  |

 ## Support and Resources

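One more hedged aside, not part of the diff: a quick way to check the "Firewall blocked" row is to confirm that the local end of the SSH tunnel is actually serving Jupyter on the default port 8888:

```bash
# Illustrative check: a JSON reply like {"version": "2.x"} means the tunnel
# and the Jupyter server are both up; "connection refused" points at the tunnel.
curl -s http://127.0.0.1:8888/api
```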
@@ -217,9 +230,9 @@ jupyter lab --ip=0.0.0.0 --port=8888 --no-browser --allow-root
 If you encounter issues or have improvements for this guide, please:

 1. Open an issue on the MaxText repository
-2. Submit a pull request with your improvements
-3. Share your experience in the discussions
+1. Submit a pull request with your improvements
+1. Share your experience in the discussions

----
+______________________________________________________________________

-**Happy Training! 🚀**
+**Happy Training! 🚀**
2 changes: 1 addition & 1 deletion docs/tutorials/first_run.md
@@ -75,7 +75,7 @@ In the same TPU VM where you just installed all the dependencies of MaxText, You

 #### Decoding in MaxText via notebook

-You can use [demo_decoding.ipynb](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/examples/demo_decoding.ipynb) to try out decoding on MaxText's `Llama3.1-8b` model implementation. In this notebook, we give `"I love to"` as the prompt, and the greedily sampled first output token is `" cook"`. Please remember to provide the path to your `Llama3.1-8b` checkpoint for the `load_parameters_path` argument in the config inside the notebook. You can use [to_maxtext.py](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/utils/ckpt_conversion/to_maxtext.py) to create a MaxText/Orbax checkpoint from a Huggingface checkpoint.
+You can use [demo_decoding.ipynb](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/maxtext/examples/demo_decoding.ipynb) to try out decoding on MaxText's `Llama3.1-8b` model implementation. In this notebook, we give `"I love to"` as the prompt, and the greedily sampled first output token is `" cook"`. Please remember to provide the path to your `Llama3.1-8b` checkpoint for the `load_parameters_path` argument in the config inside the notebook. You can use [to_maxtext.py](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/utils/ckpt_conversion/to_maxtext.py) to create a MaxText/Orbax checkpoint from a Hugging Face checkpoint.

 ### Run MaxText on NVIDIA GPUs

2 changes: 1 addition & 1 deletion docs/tutorials/posttraining/multimodal.md
@@ -6,7 +6,7 @@ This document provides a guide to use the multimodal functionalities in MaxText
 - **Multimodal Decode**: Inference with text+images as input.
 - **Supervised Fine-Tuning (SFT)**: Apply SFT to the model using a visual-question-answering dataset.

-We also provide a [colab](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/examples/multimodal_gemma3_demo.ipynb) for multimodal features demonstration. The following table provides a list of models and modalities we currently support:
+We also provide a [colab](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/maxtext/examples/multimodal_gemma3_demo.ipynb) demonstrating the multimodal features. The following table lists the models and modalities we currently support:

 | Models                                          | Input Modalities | Output Modalities |
 | :--------------------------------------------- | :--------------- | :---------------- |
13 changes: 0 additions & 13 deletions pedagogical_examples/__init__.py

This file was deleted.

2 changes: 1 addition & 1 deletion src/MaxText/configs/rl.yml
@@ -166,7 +166,7 @@ reasoning_start_token: '<reasoning>'
 reasoning_end_token: '</reasoning>'
 solution_start_token: '<answer>'
 solution_end_token: '</answer>'
-chat_template_path: 'src/MaxText/examples/chat_templates/gsm8k_rl.json'
+chat_template_path: 'src/maxtext/examples/chat_templates/gsm8k_rl.json'
 skip_jax_distributed_system: True

 # # TODO(@mazumdera): fix this