
Add ctas_per_cga for PAIR_CTA mode in TLX kernels#431

Closed
rafaykhurram wants to merge 1 commit into meta-recsys:main from rafaykhurram:export-D89439254

Conversation

@rafaykhurram
Contributor

Differential Revision: D89439254


meta-codesync Bot commented Dec 18, 2025

@rafaykhurram has exported this pull request. If you are a Meta employee, you can view the originating Diff in D89439254.

@meta-cla Bot added the CLA Signed label Dec 18, 2025
facebook-github-bot pushed a commit to meta-pytorch/tritonbench that referenced this pull request Dec 18, 2025
Summary:
X-link: meta-recsys/generative-recommenders#431

Add `ctas_per_cga` parameter to `triton.Config` when `PAIR_CTA` is enabled, following the pattern from D89389230 which introduces CUDA-native cluster launch semantics.

This change affects:
- `tritonbench/operators/gemm/tlx_matmul.py`: Added `ctas_per_cga=(2, 1, 1) if pairCTA else None` to the autotune config
- `generative_recommenders/ops/triton/triton_addmm.py`: Added `c.ctas_per_cga = (2, 1, 1) if pair_cta_compatible else None` in `_prune_configs_for_pair_cta`

The `ctas_per_cga` parameter enables CUDA-native cluster launch semantics (TLX way), which differs from Triton's `num_ctas` approach:
- **Triton's way** (`num_ctas`): Grid is multiplied by cluster_dims to get total CTAs
- **TLX/CUDA way** (`ctas_per_cga`): Grid equals total CTAs, ctas_per_cga regroups them into clusters
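The difference in launch semantics can be sketched as plain grid arithmetic (illustrative helper names, not Triton APIs):

```python
from math import prod

def total_ctas_num_ctas(grid, cluster_dims):
    # Triton's num_ctas approach: the launch grid is multiplied by the
    # cluster dims, so the total CTA count is grid * cluster_dims.
    return prod(grid) * prod(cluster_dims)

def total_ctas_ctas_per_cga(grid, ctas_per_cga):
    # TLX/CUDA-native approach: the grid already counts every CTA;
    # ctas_per_cga only regroups existing CTAs into clusters (CGAs),
    # so the total is unchanged.
    return prod(grid)
```

Under `ctas_per_cga=(2, 1, 1)` a kernel launched on a grid of 256 CTAs still runs 256 CTAs, just grouped into 128 two-CTA clusters, whereas `num_ctas=2` on a 128-element grid would double the launch to 256 CTAs.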

Reviewed By: rafaykhurram

Differential Revision: D89439254
@LinjianMa LinjianMa closed this Jan 24, 2026

Labels

CLA Signed, fb-exported, meta-exported
