Qwen image with magcache #998

Open: quic-amitraj wants to merge 24 commits into quic:main from quic-amitraj:qwen_image_with_magcache

quic-amitraj commented May 20, 2026

Summary

This PR adds runtime MagCache support for Qwen-Image in QEfficient. In the benchmark below on AI100, it cuts end-to-end latency from 237.27 s to 175.30 s (about 1.35x faster) while preserving visual quality.

Why This Matters

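Qwen-Image denoising is transformer-heavy: in the baseline run below, the transformer accounts for 236.94 s of the 237.27 s end-to-end time. MagCache skips some of the expensive transformer executions in the later denoising steps (13 of 50 in this benchmark), which reduces inference time with minimal image drift.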

Benchmark:

Without MagCache:

[Image: qwen_image_without_cache]

With MagCache:

  • magcache_thresh=0.06
  • magcache_K=2
  • magcache_retention_ratio=0.2

[Image: qwen_image_magcache_0 6]
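
For readers unfamiliar with how the three settings above interact, here is a minimal sketch of a MagCache-style skip decision. It is an illustration under stated assumptions, not the code in this PR: the names `MagCacheState` and `should_skip` are hypothetical, the per-step magnitude ratios are made-up values, and it assumes that `magcache_thresh` bounds the accumulated error, `magcache_K` caps consecutive skips, and `magcache_retention_ratio` is the fraction of early steps that are always computed.

```python
from dataclasses import dataclass


@dataclass
class MagCacheState:
    """Running statistics since the last real transformer call (hypothetical helper)."""
    accumulated_ratio: float = 1.0
    accumulated_err: float = 0.0
    consecutive_skips: int = 0


def should_skip(step, num_steps, mag_ratio, state,
                thresh=0.06, K=2, retention_ratio=0.2):
    """Return True if the transformer call at `step` may reuse the cached residual.

    Assumed semantics: the first `retention_ratio * num_steps` steps always run;
    afterwards a step is skipped while the accumulated error stays below `thresh`
    and no more than `K` steps in a row have been skipped.
    """
    if step < int(retention_ratio * num_steps):
        return False  # early steps are always computed

    ratio = state.accumulated_ratio * mag_ratio
    err = state.accumulated_err + abs(1.0 - ratio)
    skips = state.consecutive_skips + 1

    if err < thresh and skips <= K:
        state.accumulated_ratio = ratio
        state.accumulated_err = err
        state.consecutive_skips = skips
        return True

    # A real transformer call refreshes the cache, so the statistics reset.
    state.accumulated_ratio = 1.0
    state.accumulated_err = 0.0
    state.consecutive_skips = 0
    return False


def count_executed(mag_ratios, **kwargs):
    """Count how many steps would actually run the transformer."""
    state = MagCacheState()
    num_steps = len(mag_ratios)
    return sum(
        0 if should_skip(i, num_steps, r, state, **kwargs) else 1
        for i, r in enumerate(mag_ratios)
    )


# Toy input: a constant magnitude ratio of 0.99 for all 50 steps (made-up data).
toy_ratios = [0.99] * 50
executed = count_executed(toy_ratios, thresh=0.06, K=2, retention_ratio=0.2)
print(f"executed {executed} of {len(toy_ratios)} steps")
```

The toy ratios are not measured from Qwen-Image, so the executed-step count from this sketch is not expected to match the 37 steps reported in the benchmark.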

Common setup for both runs:

  • Config: examples/diffusers/qwen_image/qwen_config.json
  • Resolution: 1664 x 928
  • Steps: 50
  • true_cfg_scale=4.0
  • max_sequence_length=128
  • seed=42

| Metric | Without MagCache | With MagCache | Improvement |
| --- | --- | --- | --- |
| Transformer total time | 236.9427 s | 174.9652 s | 1.35x faster |
| VAE decode time | 0.3301 s | 0.3342 s | ~same |
| End-to-end time | 237.2729 s | 175.2993 s | 1.35x faster |
| Transformer executed steps | 50 | 37 | 1.35x fewer executed calls |
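
The improvement figures can be re-derived directly from the reported timings. The snippet below is only a sanity check of that arithmetic; the dictionary keys and the `speedup` helper are illustrative names, not part of the PR.

```python
# Timings and step counts copied from the benchmark results.
baseline = {"transformer_s": 236.9427, "vae_s": 0.3301, "e2e_s": 237.2729, "steps": 50}
magcache = {"transformer_s": 174.9652, "vae_s": 0.3342, "e2e_s": 175.2993, "steps": 37}


def speedup(before, after):
    """Ratio of baseline to optimized value (greater than 1 means improvement)."""
    return before / after


ratios = {key: speedup(baseline[key], magcache[key]) for key in baseline}
saved_s = baseline["e2e_s"] - magcache["e2e_s"]
saved_pct = 100.0 * saved_s / baseline["e2e_s"]

for key, value in ratios.items():
    print(f"{key}: {value:.2f}x")
print(f"end-to-end time saved: {saved_s:.2f} s ({saved_pct:.1f}%)")
```

The speedup in transformer time closely tracks the reduction in executed steps (50 to 37), which is consistent with the transformer dominating the end-to-end time.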

The two outputs differ only in minor generation-level details, while end-to-end latency drops by about 62 s (roughly 26%).

Compile Configuration Used:

  • aic_num_cores=16
  • mdp_ts_num_devices=4
  • mos=1, mdts_mos=1
  • convert_to_fp16=true
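
For reproducibility, the compile settings above can be captured as a small JSON fragment. This is a sketch only: the flat key layout is an assumption and may not match the actual structure of `examples/diffusers/qwen_image/qwen_config.json`.

```python
import json

# Values copied from the list above; the flat layout is hypothetical.
compile_settings = {
    "aic_num_cores": 16,
    "mdp_ts_num_devices": 4,
    "mos": 1,
    "mdts_mos": 1,
    "convert_to_fp16": True,
}

compile_json = json.dumps(compile_settings, indent=2)
print(compile_json)
```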

qcdipankar and others added 24 commits April 21, 2026 05:14
Signed-off-by: Dipankar Sarkar <quic_dipankar@quicinc.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <quic_dipankar@quicinc.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <quic_dipankar@quicinc.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <quic_dipankar@quicinc.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <quic_dipankar@quicinc.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <quic_dipankar@quicinc.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <quic_dipankar@quicinc.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <quic_dipankar@quicinc.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
     - Refactor to inline with Diffusers design
     - Adding npi, scale factor changes

Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
Support for Wan Image to video model
Model card:  "Wan-AI/Wan2.2-I2V-A14B-Diffusers"

---------

Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <quic_dipankar@quicinc.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <quic_dipankar@quicinc.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
     - Refactor to inline with Diffusers design
     - Adding npi, scale factor changes

Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
 - Updated text encoder to return padded, fixed shaped embeddings
 - Refactored modeling changes, onnx params, qwen pipeline
 - Added pytest with dummy config

Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
  - Clean and refactor most of the Qwen image files

Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
Signed-off-by: Amit Raj <amitraj@qti.qualcomm.com>
Signed-off-by: Amit Raj <amitraj@qti.qualcomm.com>
quic-amitraj self-assigned this May 20, 2026
quic-amitraj added the Diffusers and performance labels May 20, 2026
quic-amitraj requested a review from vbaddi May 20, 2026 18:08
quic-amitraj marked this pull request as ready for review May 20, 2026 18:08