feat(speculative): add Qwen3 dense target support for EAGLE-1/2/3 by khazic · Pull Request #2313 · NVIDIA-NeMo/Automodel

khazic · 2026-05-25T12:37:19Z

What does this PR do?

Add Qwen3 (Qwen3ForCausalLM) as a supported target for EAGLE-1 /
EAGLE-2 / EAGLE-3 training. Stacked on top of #2312 (Phi-3 support);
the actual Qwen3-specific delta is one registry entry + three example
configs + docstring updates.

Depends on #2312. Until #2312 lands, the diff shown here also
includes the Phi-3 PR's commits as a prefix. After #2312 merges, this
branch will be rebased onto main and become a 3-line incremental PR.
The Qwen3-specific commit is b3d018c8.

Changelog (Qwen3 delta only)

- components/speculative/eagle/registry.py: append
  Qwen3ForCausalLM to _DENSE_ARCHITECTURES. Qwen3 already works
  through the existing config-driven draft path: Qwen3 decouples
  head_dim from hidden_size / num_attention_heads, and the
  attention layer already reads head_dim via
  getattr(config, "head_dim", ...); attention_bias and
  mlp_bias are exposed on Qwen3Config so they are read normally.
- Add example YAMLs:
  examples/speculative/eagle{1,2,3}/qwen3_eagle{1,2,3}_perfectblend.yaml.
- Update draft / recipe docstrings to mention Qwen3 alongside Llama and Phi-3.

No code-path changes were required. The registry dispatch already
exists from #2312.
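
The allowlist check and the head_dim lookup described in the changelog can be sketched as follows. This is a minimal illustration, not the code in registry.py: the helper names `is_supported_dense_target` and `resolve_head_dim` are hypothetical, the Llama and Phi-3 class names in the tuple are assumed from the Hugging Face naming scheme, the `hidden_size // num_attention_heads` fallback is an assumption (the PR elides the getattr default), and the config values are illustrative rather than taken from a specific checkpoint.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the allowlist in registry.py; the real tuple may differ.
_DENSE_ARCHITECTURES = ("LlamaForCausalLM", "Phi3ForCausalLM", "Qwen3ForCausalLM")


def is_supported_dense_target(config) -> bool:
    """Return True if any declared architecture is on the dense allowlist."""
    archs = getattr(config, "architectures", None) or []
    return any(arch in _DENSE_ARCHITECTURES for arch in archs)


def resolve_head_dim(config) -> int:
    """Prefer an explicit head_dim; otherwise derive it (assumed fallback)."""
    head_dim = getattr(config, "head_dim", None)
    if head_dim is not None:
        return head_dim
    return config.hidden_size // config.num_attention_heads


# Illustrative config where head_dim is decoupled from hidden_size / num_attention_heads.
decoupled_cfg = SimpleNamespace(
    architectures=["Qwen3ForCausalLM"],
    hidden_size=1024,
    num_attention_heads=16,
    head_dim=128,
)
# Illustrative config without an explicit head_dim, so the fallback applies.
derived_cfg = SimpleNamespace(
    architectures=["LlamaForCausalLM"],
    hidden_size=1024,
    num_attention_heads=16,
)

print(is_supported_dense_target(decoupled_cfg), resolve_head_dim(decoupled_cfg))
print(is_supported_dense_target(derived_cfg), resolve_head_dim(derived_cfg))
```

Reading head_dim from the config first is what lets a decoupled architecture reuse the same draft path: the derived value is only used when the config does not state one.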

Verification

End-to-end smoke test on 8 x H100:

Target: Qwen/Qwen3-8B (15.26 GB, 8.19 B params, model_type=qwen3).
Dataset: PerfectBlend (200-sample slice).
EAGLE-3 over 25 optimizer steps:

2026-05-25 20:34:04 INFO Training start: start_epoch=0 num_epochs=1 batches_per_epoch=25
2026-05-25 20:34:06 INFO epoch=0 step=1  train_loss=9.846308  train_acc=0.000000
2026-05-25 20:34:06 INFO epoch=0 step=2  train_loss=8.477696  train_acc=0.029953
2026-05-25 20:34:07 INFO epoch=0 step=3  train_loss=8.218086  train_acc=0.053741
...
2026-05-25 20:34:13 INFO epoch=0 step=23 train_loss=6.182835  train_acc=0.113525
2026-05-25 20:34:13 INFO epoch=0 step=24 train_loss=6.219086  train_acc=0.098724
2026-05-25 20:34:14 INFO epoch=0 step=25 train_loss=6.175844  train_acc=0.093933
2026-05-25 20:34:14 INFO Epoch 0 done: total_batches_seen=25 global_step=25

Loss decreases 9.85 -> 6.18 over 25 steps (~37% drop), accuracy ticks
up from 0 to ~0.09. No TypeError / AttributeError at
draft construction, target load (Liger applied to model type: qwen3
without complaint), or training step.
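
The quoted drop can be rechecked from the log lines shown above. The snippet below is a small sketch that parses the first and last logged steps and computes the relative loss decrease; the regular expression and the helper names are assumptions made for this illustration and are not part of the recipe's logging code.

```python
import re

# First and last step lines copied from the smoke-test log above.
LOG = """\
2026-05-25 20:34:06 INFO epoch=0 step=1  train_loss=9.846308  train_acc=0.000000
2026-05-25 20:34:14 INFO epoch=0 step=25 train_loss=6.175844  train_acc=0.093933
"""

STEP_PATTERN = re.compile(r"step=(\d+)\s+train_loss=([\d.]+)\s+train_acc=([\d.]+)")


def parse_steps(log: str):
    """Extract (step, loss, accuracy) tuples from per-step log lines."""
    return [(int(s), float(loss), float(acc)) for s, loss, acc in STEP_PATTERN.findall(log)]


def relative_drop(first: float, last: float) -> float:
    """Fractional decrease from first to last."""
    return (first - last) / first


steps = parse_steps(LOG)
first_loss, last_loss = steps[0][1], steps[-1][1]
drop = relative_drop(first_loss, last_loss)
print(f"loss {first_loss:.2f} -> {last_loss:.2f}, relative drop {drop:.1%}")
```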

Before your PR is "Ready for review"

Pre checks:

Contributor guidelines followed
Did you write any new necessary tests? No -- end-to-end repro needs
multi-GPU + a real Qwen3 target; smoke-test evidence above.
Example configs added under examples/speculative/eagle{1,2,3}/.

copy-pr-bot · 2026-05-25T12:37:23Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

HuiyingLi · 2026-05-25T12:38:11Z

/ok to test b3d018c

HuiyingLi · 2026-05-25T12:58:17Z

/ok to test ee3bdc3

HuiyingLi · 2026-05-25T16:08:46Z

/ok to test 0be62cc

Register ``Qwen3ForCausalLM`` in the EAGLE dense draft dispatch table.

Qwen3 already works through the existing config-driven draft path: ``head_dim`` is read via ``getattr(config, "head_dim", ...)`` (Qwen3 decouples it from ``hidden_size / num_attention_heads``), and ``attention_bias`` / ``mlp_bias`` are read via ``getattr(..., False)`` so Qwen3's config exposes them correctly. No code-path changes required; just an allowlist entry plus example configs and docstrings.

- registry.py: append "Qwen3ForCausalLM" to ``_DENSE_ARCHITECTURES``.
- Add example YAMLs: ``qwen3_eagle{1,2,3}_perfectblend.yaml``.
- Update docstrings (draft modules + recipes) to mention Qwen3.

End-to-end smoke-tested on 8x H100 with Qwen/Qwen3-8B target on a PerfectBlend 200-sample slice (EAGLE-3, 25 steps): loss decreases 9.85 -> 6.18 (~37% drop), train_acc ticks up from 0 to ~0.09. No construction-time / load-time errors.

Signed-off-by: khazic <khazzz1c@gmail.com>

EAGLE-3 draft reads ACT2FN[config.hidden_act] from the target config, but EAGLE-1/2 draft hardcoded nn.SiLU(). All currently registered dense architectures (Llama / Phi-3 / Qwen3) happen to use silu, so the hardcode is correct today.

However, the dense registry is intended to grow to cover non-SiLU families next (e.g. Gemma uses gelu_pytorch_tanh). With the hardcode in place, registering such an architecture would silently mismatch the target's activation: no crash, no error, training still converges, but draft hidden states drift from target and speculative acceptance rate quietly drops with no observable symptom.

Read hidden_act from config so the draft matches the target by construction and adding new architectures stays a one-line registry change.

Signed-off-by: khazic <khazzz1c@gmail.com>
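
The mismatch this commit guards against can be illustrated with scalar versions of the two activations. The sketch below is dependency-free and only an approximation of the idea: the `ACTIVATIONS` dict is a hypothetical stand-in for a lookup table such as ACT2FN, and `draft_activation` is an invented helper name, not the draft module's actual API.

```python
import math
from types import SimpleNamespace


def silu(x: float) -> float:
    """SiLU: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def gelu_pytorch_tanh(x: float) -> float:
    """Tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x**3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Hypothetical scalar stand-in for an activation lookup keyed by config.hidden_act.
ACTIVATIONS = {"silu": silu, "gelu_pytorch_tanh": gelu_pytorch_tanh}


def draft_activation(config):
    """Pick the activation named by the target config instead of hardcoding SiLU."""
    return ACTIVATIONS[config.hidden_act]


silu_target = SimpleNamespace(hidden_act="silu")
gelu_target = SimpleNamespace(hidden_act="gelu_pytorch_tanh")

x = 1.0
hardcoded = silu(x)  # what a hardcoded SiLU draft would compute
config_driven = draft_activation(gelu_target)(x)  # what the target actually uses
print(f"hardcoded SiLU: {hardcoded:.4f}, config-driven: {config_driven:.4f}")
```

For a silu target the two paths agree exactly, which is why the hardcode was harmless for Llama, Phi-3, and Qwen3; for a gelu_pytorch_tanh target they already differ at a single input, without raising any error.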

HuiyingLi · 2026-05-26T01:59:53Z

/ok to test ef5eb6c

khazic requested review from HuiyingLi, ZhiyuLi-Nvidia, adil-a, akoumpa, athitten, hemildesai, pthombre and zyzhou5 as code owners May 25, 2026 12:37

github-actions Bot added the community-request label May 25, 2026

khazic force-pushed the khazic/feat/eagle-qwen3-support branch from b3d018c to ee3bdc3 Compare May 25, 2026 12:48

khazic force-pushed the khazic/feat/eagle-qwen3-support branch from fa4074f to 0be62cc Compare May 25, 2026 16:00

svcnvidia-nemo-ci removed the waiting-on-customer Waiting on the original author to respond label May 25, 2026

khazic added 2 commits May 26, 2026 09:56

khazic force-pushed the khazic/feat/eagle-qwen3-support branch from 0be62cc to ef5eb6c Compare May 26, 2026 01:57
