
[Feature] Domino EP support and training optimizations for InternS1 Pro VL #1528

Open

tina-wen wants to merge 3 commits into InternLM:main from tina-wen:split_bal_loss

Conversation


@tina-wen tina-wen commented Mar 3, 2026

Description

This PR optimizes InternS1 Pro VL model training with three key changes:

  • Domino EP: add support for the domino_ep parallelism strategy
  • Fewer d2h transfers: remove redundant device-to-host copies (loss/grad_norm are fetched for logging only; no accuracy impact)
  • Layer-wise MoE loss: split the expert balance loss computation per layer to reduce peak memory

Results: performance ↑, memory ↓, accuracy unchanged
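The layer-wise balance loss change can be sketched as follows. All names here (`balance_loss`, `router_probs`, and so on) are illustrative, not XTuner's actual API; the point is the structural idea: instead of stashing every layer's router statistics and computing one global balance loss at the end, compute each layer's loss as soon as that layer runs and accumulate a scalar, so the per-layer router probabilities can be freed immediately.

```python
# Hypothetical sketch of splitting expert-balance loss per layer.
# Names are illustrative, not XTuner's API.

def balance_loss(router_probs, num_experts):
    """Aux load-balancing loss for one MoE layer (Switch-Transformer style):
    num_experts * sum_e (fraction of tokens routed to e) * (mean router
    probability of e). Minimized when routing is uniform across experts.

    router_probs: one list of per-expert probabilities per token.
    """
    num_tokens = len(router_probs)
    # Mean router probability mass assigned to each expert.
    mean_prob = [
        sum(tok[e] for tok in router_probs) / num_tokens
        for e in range(num_experts)
    ]
    # Fraction of tokens whose top-1 expert is e.
    top1 = [max(range(num_experts), key=lambda e: tok[e]) for tok in router_probs]
    frac_tokens = [top1.count(e) / num_tokens for e in range(num_experts)]
    return num_experts * sum(f * p for f, p in zip(frac_tokens, mean_prob))

def total_balance_loss(per_layer_router_probs, num_experts):
    """Layer-wise accumulation: each layer's loss is computed and added
    right away, so its router statistics need not be kept until the end."""
    total = 0.0
    for probs in per_layer_router_probs:
        total += balance_loss(probs, num_experts)
    return total / len(per_layer_router_probs)
```

With perfectly balanced routing the loss sits at its minimum of 1.0; skewed routing pushes it above 1.0, which is what the optimizer penalizes.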

```python
# Derive the EP compile config from the non-EP one: copy the dict and
# drop the MoE decoder layer's forward so it is not compiled under EP.
MOE_EP_COMPILE_CFG = MOE_NON_EP_COMPILE_CFG.copy()
MOE_EP_COMPILE_CFG.pop("xtuner.v1.module.decoder_layer.moe_decoder_layer.MoEDecoderLayer.forward")
```
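The copy-then-pop pattern above keeps the non-EP config untouched while the EP variant excludes one entry. A minimal self-contained illustration (keys shortened for readability; the real keys are fully qualified function paths in xtuner.v1):

```python
# Minimal illustration of deriving one compile config from another.
MOE_NON_EP_COMPILE_CFG = {
    "MoEDecoderLayer.forward": {"fullgraph": True},
    "MoEGate.forward": {"fullgraph": False},
}

# Shallow copy, then remove the entry that must run eagerly under EP.
# Popping from the copy does not touch MOE_NON_EP_COMPILE_CFG.
MOE_EP_COMPILE_CFG = MOE_NON_EP_COMPILE_CFG.copy()
MOE_EP_COMPILE_CFG.pop("MoEDecoderLayer.forward")
```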

```python
class _AllReduce(torch.autograd.Function):
```
