Fix for fp16/bf16 export & compile in qwen3vl & qwen3vlmoe models #980

Open
qcdipankar wants to merge 11 commits into release/v1.22.0_tmp from fp16_bug_fix_qwen3vl

Conversation

@qcdipankar
Contributor

Added fix for fp16 export in qwen3 and qwen3vl modeling files.

@qcdipankar qcdipankar self-assigned this May 12, 2026
@asmigosw
Contributor

Please convert the datatype of all nodes and IO info, except the logits, to the custom dtype. For example:

  1. final_mask = torch.ones((seq_len, seq_len), dtype=torch.float32)
  2. IOInfo(name="pixel_values", datatype=torch.float32, shape=("batch_size", 3, "image_size", "image_size")),
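
A minimal sketch of the requested change, assuming the user-selected dtype is available as a plain `torch.dtype` argument. The `IOInfo` dataclass below is a hypothetical stand-in defined only so the snippet runs on its own, and `build_vision_inputs` is an illustrative helper name, not a function from the repository:

```python
from dataclasses import dataclass
from typing import Tuple, Union

import torch


@dataclass
class IOInfo:
    # Hypothetical stand-in for the project's IOInfo class (assumption).
    name: str
    datatype: torch.dtype
    shape: Tuple[Union[int, str], ...]


def build_vision_inputs(seq_len: int, dtype: torch.dtype):
    """Create the mask and the pixel_values IO description in the given dtype."""
    # The mask follows the caller's dtype instead of a hard-coded torch.float32.
    final_mask = torch.ones((seq_len, seq_len), dtype=dtype)
    # The IO description uses the same dtype, so export sees consistent inputs.
    pixel_values = IOInfo(
        name="pixel_values",
        datatype=dtype,
        shape=("batch_size", 3, "image_size", "image_size"),
    )
    return final_mask, pixel_values


mask, io = build_vision_inputs(seq_len=4, dtype=torch.float16)
```

Passing `torch.bfloat16` instead would work the same way; the logits are left out here because the comment asks to exclude them.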

@qcdipankar qcdipankar force-pushed the fp16_bug_fix_qwen3vl branch from ef89c88 to 0a0db36 Compare May 18, 2026 10:14
@qcdipankar qcdipankar changed the title Fix for fp16 export in qwen3vl & qwen3vlmoe models Fix for fp16/bf16 export & compile in qwen3vl & qwen3vlmoe models May 19, 2026
@asmigosw
Contributor

Can you please also change the dtype below, at line 115:
self._set_cos_sin_cache(seq_len=self.original_max_seq_len, device=self.inv_freq.device, dtype=torch.get_default_dtype())

Change it to dtype=config.torch_dtype, because torch.get_default_dtype() sets the RoPE weights to the default float32.

Please make the same change in both modeling files, and check for any other dtype that is hard-coded to float32 so that it is taken from the torch_dtype passed by the user.
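
The effect of this request can be sketched as follows. This is an illustrative toy module, not the repository's rotary embedding class: `RotaryCacheSketch`, its `dim` and `base` parameters, and the `SimpleNamespace` config are assumptions made so the snippet is self-contained.

```python
from types import SimpleNamespace

import torch


class RotaryCacheSketch(torch.nn.Module):
    """Toy rotary cache whose cos/sin buffers follow config.torch_dtype."""

    def __init__(self, config, dim: int = 8, base: float = 10000.0):
        super().__init__()
        # Frequencies are computed in float32 for accuracy.
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
        self.register_buffer("inv_freq", inv_freq, persistent=False)
        self.original_max_seq_len = config.max_position_embeddings
        # The cache dtype comes from the config, not torch.get_default_dtype().
        self._set_cos_sin_cache(
            seq_len=self.original_max_seq_len,
            device=self.inv_freq.device,
            dtype=config.torch_dtype,
        )

    def _set_cos_sin_cache(self, seq_len, device, dtype):
        t = torch.arange(seq_len, device=device, dtype=torch.float32)
        freqs = torch.outer(t, self.inv_freq)
        emb = torch.cat((freqs, freqs), dim=-1)
        # Cast only the final cached tensors to the requested dtype.
        self.register_buffer("cos_cached", emb.cos().to(dtype), persistent=False)
        self.register_buffer("sin_cached", emb.sin().to(dtype), persistent=False)


# Hypothetical config object carrying the user's dtype choice.
config = SimpleNamespace(torch_dtype=torch.float16, max_position_embeddings=16)
rope = RotaryCacheSketch(config)
```

With this wiring, `rope.cos_cached.dtype` follows `config.torch_dtype`, whereas passing `torch.get_default_dtype()` would leave the cache in float32 unless the process-wide default had been changed.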

@qcdipankar qcdipankar force-pushed the fp16_bug_fix_qwen3vl branch from 4a0dbf7 to bcbdffe Compare May 19, 2026 14:43
Comment thread QEfficient/transformers/models/qwen3_vl_moe/modeling_qwen3_vl_moe.py Outdated
Contributor

@asmigosw asmigosw left a comment

LGTM

@qcdipankar qcdipankar force-pushed the fp16_bug_fix_qwen3vl branch from e55232b to d926081 Compare May 23, 2026 10:57
@quic-rishinr quic-rishinr added the 1.22 Release 1.22 candidate label May 25, 2026
@quic-rishinr quic-rishinr changed the base branch from main to release/v1.22.0_tmp May 25, 2026 17:02
qcdipankar added 11 commits May 25, 2026 22:34
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
@quic-rishinr quic-rishinr force-pushed the fp16_bug_fix_qwen3vl branch from d926081 to 374b6ed Compare May 25, 2026 17:04
3 participants