Support Qwen3.5-VL (dense + MoE) via Megatron-Bridge by demouo · Pull Request #2075 · THUDM/slime

demouo · 2026-06-14T09:25:57Z

Summary

Support Qwen3.5-VL (dense + MoE) through NVIDIA Megatron-Bridge, and resolve #2073 the standard way, by reusing the official bridges.

Changes:

Add slime_plugins/megatron_bridge/qwen3_5_vl.py to register the official Qwen35VLBridge / Qwen35VLMoEBridge by simply importing them so their @MegatronModelBridge.register_bridge decorators run.
Wire it into slime_plugins/megatron_bridge/__init__.py.
Refresh examples/geo3k_vlm/run_geo3k_qwen35.sh so a single script covers both dense (Qwen3.5-9B / 27B) and MoE (Qwen3.5-35B-A3B, ...) via MODEL_NAME, on top of the official megatron-bridge>=0.4.0 (no fork install needed).
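
The registration-by-import idea in the first item can be illustrated with a small, generic sketch. It is not Megatron-Bridge's code: `_BRIDGES`, `register_bridge`, `resolve_bridge`, and `ToyVLBridge` are hypothetical names used only to show why a decorator-based registry stays empty until the module that defines the class has been imported.

```python
from typing import Callable, Dict, Type

# Hypothetical stand-in for a bridge registry; not Megatron-Bridge's real class.
_BRIDGES: Dict[str, Type] = {}


def register_bridge(hf_architecture: str) -> Callable[[Type], Type]:
    """Class decorator that records a bridge under an architecture name."""

    def decorator(cls: Type) -> Type:
        _BRIDGES[hf_architecture] = cls
        return cls

    return decorator


def resolve_bridge(hf_architecture: str) -> Type:
    """Look up a bridge class; fails if its defining module never ran."""
    try:
        return _BRIDGES[hf_architecture]
    except KeyError:
        raise LookupError(f"no bridge registered for {hf_architecture!r}") from None


# In a real project this class would live in its own module, and the decorator
# would only run once that module is imported.
@register_bridge("ToyVLForConditionalGeneration")
class ToyVLBridge:
    pass


print(resolve_bridge("ToyVLForConditionalGeneration").__name__)
```

Under this pattern, a plugin module whose only job is to import the bridge classes is enough to make them discoverable by a later lookup.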

Motivation

Qwen3.5-VL support is missing in slime. The only existing path is the legacy text-only slime_plugins/mbridge/qwen3_5.py, which does not handle the vision encoder, the GDN + Gated-Attention hybrid layers, or M-RoPE.
NVIDIA already ships these bridges in megatron-bridge 0.4.0, so we only need to import them so that the registration decorators run before AutoBridge.from_hf_pretrained is called. This reuses NVIDIA's implementation rather than reimplementing it on the slime side.
The plugin also patches mapping_registry() to add legacy transformer_layer aliases for every mtp_model_layer mapping. This absorbs the module-naming drift between upstream Megatron-LM and megatron-bridge around MTP layers; the patch is idempotent and a no-op on newer Megatron-LM:

import copy


def _add_legacy_mtp_aliases(registry):
    """Duplicate every ``mtp.*.mtp_model_layer.*`` mapping with the legacy
    Megatron-LM name ``transformer_layer``.

    The bridge's ``mapping_registry()`` returns a fresh registry on every call
    and ``MegatronMappingRegistry.__init__`` *pre-compiles* the patterns into
    ``_compiled_patterns`` / ``_reverse_patterns``, so we cannot just append
    to ``registry.mappings``: the new entries would never be matched at
    lookup time. Instead we build a brand-new registry from the augmented
    mapping list, which lets ``__init__`` re-compile everything.
    """
    if registry is None:
        return registry
    original = list(registry.mappings)
    # Names already present, so a second pass does not add duplicate aliases.
    existing = {
        p for p in (getattr(m, "megatron_param", None) for m in original)
        if isinstance(p, str)
    }
    extra = []
    for mapping in original:
        m_param = getattr(mapping, "megatron_param", None)
        if isinstance(m_param, str) and ".mtp_model_layer." in m_param:
            legacy = m_param.replace(".mtp_model_layer.", ".transformer_layer.")
            if legacy in existing:
                continue
            # Shallow copy keeps the HF side and swaps only the Megatron name.
            alias = copy.copy(mapping)
            alias.megatron_param = legacy
            extra.append(alias)
    if not extra:
        return registry

    # Rebuild rather than append so that __init__ re-compiles the patterns.
    return registry.__class__(*original, *extra)
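
The docstring's point about pre-compiled patterns can be reproduced with a toy registry. This is a minimal sketch under stated assumptions: `ToyMapping` and `ToyRegistry` are hypothetical stand-ins, not the real `MegatronMappingRegistry`, and the parameter names are made up. It only shows that an entry appended after construction is invisible to lookup, while a rebuilt registry matches it.

```python
import re
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class ToyMapping:
    megatron_param: str
    hf_param: str


class ToyRegistry:
    """Hypothetical registry that compiles its lookup patterns once, in __init__."""

    def __init__(self, *mappings: ToyMapping) -> None:
        self.mappings: List[ToyMapping] = list(mappings)
        # "*" stands for a single numeric layer index.
        self._compiled = [
            (re.compile(re.escape(m.megatron_param).replace(r"\*", r"\d+")), m)
            for m in self.mappings
        ]

    def lookup(self, name: str) -> Optional[ToyMapping]:
        for pattern, mapping in self._compiled:
            if pattern.fullmatch(name):
                return mapping
        return None


base = ToyMapping("mtp.layers.*.mtp_model_layer.weight", "mtp.layers.*.weight")
alias = replace(
    base,
    megatron_param=base.megatron_param.replace(".mtp_model_layer.", ".transformer_layer."),
)
legacy_name = "mtp.layers.0.transformer_layer.weight"

appended = ToyRegistry(base)
appended.mappings.append(alias)      # compiled patterns are NOT refreshed
rebuilt = ToyRegistry(base, alias)   # __init__ compiles both patterns

print(appended.lookup(legacy_name))          # None
print(rebuilt.lookup(legacy_name) == alias)  # True
```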

Usage

Dense (default)

MODEL_NAME=Qwen3.5-9B  bash examples/geo3k_vlm/run_geo3k_qwen35.sh
MODEL_NAME=Qwen3.5-27B bash examples/geo3k_vlm/run_geo3k_qwen35.sh

MoE

MODEL_NAME=Qwen3.5-35B-A3B bash examples/geo3k_vlm/run_geo3k_qwen35.sh
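
How the script tells dense from MoE is not shown in this description. As a purely illustrative sketch, assuming MoE checkpoint names carry an "-A<active params>B" suffix as in the examples above, the two families could be distinguished from MODEL_NAME alone. The helper `model_family` is hypothetical and is not part of run_geo3k_qwen35.sh.

```python
import re

# Assumption: MoE checkpoints end in "-A<active params>B", e.g. "35B-A3B".
_MOE_SUFFIX = re.compile(r"-A\d+(?:\.\d+)?B$")


def model_family(model_name: str) -> str:
    """Return "moe" for names with an active-parameter suffix, else "dense"."""
    return "moe" if _MOE_SUFFIX.search(model_name) else "dense"


for name in ("Qwen3.5-9B", "Qwen3.5-27B", "Qwen3.5-35B-A3B"):
    print(name, model_family(name))
```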

Pass

AutoBridge automatically routes Qwen3.5-VL to the official bridge:

import slime_plugins.megatron_bridge   # triggers registration
from megatron.bridge import AutoBridge

bridge = AutoBridge.from_hf_pretrained("/path/to/Qwen3.5-9B", trust_remote_code=True)
print(type(bridge._model_bridge).__name__)              # Qwen35VLBridge
provider = bridge.to_megatron_provider(load_weights=False)
print(type(provider).__name__, provider.position_embedding_type, provider.vision_config is not None)
# Qwen35VLModelProvider mrope True

End-to-end: weights load through the bridge (including the vision encoder, GDN, and MoE layers), and forward, rollout, and actor-train all complete on the Qwen3.5-9B and Qwen3.5-35B-A3B examples.

Commit e9c4ab2: feat(qwen3.5-vl): wire NVIDIA megatron-bridge Qwen3.5-VL bridges (dense + MoE) with MTP-naming alias and end-to-end geo3k example
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

demouo wants to merge 1 commit into THUDM:main from demouo:support_qwen35_all_vlm_megatron_bridge
