Fix rollout_is_seq metrics by excluding padded tokens in packed sequences by Hyperion-shuo · Pull Request #1788 · InternLM/xtuner

Hyperion-shuo · 2026-05-14T07:20:17Z

Summary

This PR fixes the rollout_is_seq metric computation for packed sequences.

Previously, padded positions in packed sequences were included in the rollout_is_seq statistics. As a result, some metrics were dominated by padding artifacts:

rollout_is/mismatch/rollout_is_seq_min stayed close to 0.
rollout_is/mismatch/rollout_is_seq_mean was biased downward.
rollout_is/mismatch/rollout_is_seq_max_deviation stayed close to 1.

This PR masks out padded positions so that these metrics are computed only over valid sequence tokens.

Validation

I compared TensorBoard metrics before and after the fix.

Green: before the fix
Red: after the fix

After the fix, rollout_is_seq_min and rollout_is_seq_max_deviation are no longer affected by padded tokens.

hhaAndroid · 2026-05-14T07:39:57Z

@claude review

claude · 2026-05-14T07:44:45Z

Claude: ## Summary

Clean, well-motivated fix. Excluding padded positions from per-sequence IS metrics is the right approach — the sentinel values (-inf/inf) for distributed max/min aggregation are correct.

Issues

Nit

rollout_is.py L550-562: count_value all_reduce is computed per-key inside the loop, but the validity mask is independent of key. Could hoist to avoid ~15-20 redundant NCCL all_reduce calls per step (negligible wall-clock impact but cleaner).

Verdict

APPROVE

hhaAndroid · 2026-05-15T02:49:05Z

@claude review

claude · 2026-05-15T02:53:59Z

Claude: ## Summary

Second commit properly addresses the hoisted count_value all_reduce nit from the prior review. Verified the final state:

Shape compatibility: has_catastrophic is (batch, 1) (from keepdim=True), matching seq_valid shape — no broadcasting issue.
Sentinel values (-inf/inf) correctly don't affect ReduceOp.MAX/ReduceOp.MIN.
All metric keys containing "max"/"min" (e.g., rollout_is_seq_max_deviation, mismatch_log_ppl_diff_min) are semantically max/min values — dispatch logic is sound.
count == 0 guard prevents NaN from division by zero and prevents sentinels from leaking.

Verdict

APPROVE

fix rollout_is count packed pad sequence

5f2039d

claude Bot reviewed May 14, 2026

View reviewed changes

Comment thread xtuner/v1/rl/base/rollout_is.py Outdated

fix redundant mismatch/valid all reduce ,fix is metrics lint

492b6f2

hhaAndroid approved these changes May 15, 2026

View reviewed changes

hhaAndroid merged commit 91e30dd into InternLM:main May 15, 2026
4 of 5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix rollout_is_seq metrics by excluding padded tokens in packed sequences#1788

Fix rollout_is_seq metrics by excluding padded tokens in packed sequences#1788
hhaAndroid merged 2 commits into
InternLM:mainfrom
Hyperion-shuo:ss/fix_is_metrics_pack_pad_seq

Hyperion-shuo commented May 14, 2026

Uh oh!

hhaAndroid commented May 14, 2026

Uh oh!

Uh oh!

claude Bot commented May 14, 2026

Uh oh!

hhaAndroid commented May 15, 2026

Uh oh!

claude Bot commented May 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Hyperion-shuo commented May 14, 2026

Summary

Validation

Uh oh!

hhaAndroid commented May 14, 2026

Uh oh!

Uh oh!

claude Bot commented May 14, 2026

Issues

Nit

Verdict

Uh oh!

hhaAndroid commented May 15, 2026

Uh oh!

claude Bot commented May 15, 2026

Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants