From 40fc378b0d1560f1644aa727d32e6e2de83a1aac Mon Sep 17 00:00:00 2001
From: Will Feng
Date: Sat, 14 Mar 2026 08:56:13 -0700
Subject: [PATCH] docs: mention HELION_BACKEND=tileir alongside ENABLE_TILE=1

The TileIR backend requires both env vars to be set. Update the table,
step-by-step instructions, and "Which should I use?" section to
consistently mention both ENABLE_TILE=1 and HELION_BACKEND=tileir.
---
 docs/helion-hackathon.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/helion-hackathon.md b/docs/helion-hackathon.md
index 73c91ec..d48ca83 100644
--- a/docs/helion-hackathon.md
+++ b/docs/helion-hackathon.md
@@ -291,11 +291,11 @@ def my_kernel(...):
 
 The B200 instances also ship with **nvtriton**, NVIDIA's extended Triton compiler that includes a **TileIR** backend — an alternative compilation pipeline that bypasses LLVM and compiles directly to CUBIN via NVIDIA's `tileiras` compiler.
 
-| | `ENABLE_TILE=0` (default) | `ENABLE_TILE=1` |
+| | `ENABLE_TILE=0` (default) | `ENABLE_TILE=1` + `HELION_BACKEND=tileir` |
 |---|---|---|
 | **Helion backend** | `triton` | `tileir` |
 
-**Step 1: Enable TileIR and autotune.** Set the env vars before importing Helion, then autotune as usual. Helion automatically adjusts the search space for the TileIR backend.
+**Step 1: Enable TileIR and autotune.** Set both `ENABLE_TILE=1` and `HELION_BACKEND=tileir` env vars before importing Helion, then autotune as usual. Helion automatically adjusts the search space for the TileIR backend.
 
 **Step 2: Hardcode the TileIR config in your submission.** Copy the best config from the autotuner output (it will include TileIR-specific fields like `num_ctas` and `occupancy`). The env vars must be set before imports:
 
@@ -320,7 +320,7 @@ def my_kernel(...):
 
 ### Which should I use?
 
-Try both `ENABLE_TILE=0` and `ENABLE_TILE=1`, with and without ACFs, then submit whichever gives the best benchmark numbers.
+Try both the default backend (`ENABLE_TILE=0`) and the TileIR backend (`ENABLE_TILE=1` + `HELION_BACKEND=tileir`), with and without ACFs, then submit whichever gives the best benchmark numbers.
 
 ## Tips
 