
Commit bc82599

lstein and claude committed
Add Qwen Image Edit 2511 model support
Adds full support for the Qwen Image Edit 2511 model architecture, including both the diffusers version (Qwen/Qwen-Image-Edit-2511) and GGUF quantized versions (unsloth/Qwen-Image-Edit-2511-GGUF).

Backend changes:
- Add QwenImageEdit base model type to taxonomy
- Add diffusers and GGUF model config classes with detection logic
- Add model loader for diffusers and GGUF formats
- Add 5 invocation nodes: model loader, text/vision encoder, denoise, image-to-latents, latents-to-image
- Add QwenVLEncoderField for Qwen2.5-VL vision-language encoder
- Add QwenImageEditConditioningInfo and conditioning field
- Add generation modes and step callback support
- Add 5 starter models (full diffusers + Q2_K, Q4_K_M, Q6_K, Q8_0 GGUF)

Frontend changes:
- Add graph builder for linear UI generation
- Register in canvas and generate enqueue hooks
- Update type definitions, optimal dimensions, grid sizes
- Add readiness validation, model picker grouping, clip skip config
- Regenerate OpenAPI schema

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fix: use AutoProcessor.from_pretrained to load Qwen VL processor correctly

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Agent-Logs-Url: https://github.com/lstein/InvokeAI/sessions/4d4417be-0f61-4faa-a21c-16e9ce81fec7

chore: bump diffusers==0.37.1

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Agent-Logs-Url: https://github.com/lstein/InvokeAI/sessions/38a76809-d9a3-40f1-b5b3-fb56342e8e90

fix: handle multiple reference images

feature: add text encoder selection to advanced section for Qwen Image Edit

feat: complete Qwen Image Edit pipeline with LoRA, GGUF, quantization, and UI support

Major additions:
- LoRA support: loader invocation, config detection, conversion utils, prefix constants, and LayerPatcher integration in denoise with sidecar patching for GGUF models
- Lightning LoRA: starter models (4-step and 8-step bf16), shift override parameter for the distilled sigma schedule
- GGUF fixes: correct base class (ModelLoader), zero_cond_t=True, correct in_channels (no /4 division)
- Denoise: use FlowMatchEulerDiscreteScheduler directly, proper CFG gating (skip the negative pass when cfg <= 1; sketched below), reference latent pixel-space resize
- I2L: resize reference image to generation dimensions before VAE encoding
- Graph builder: wire LoRAs via collection loader, VAE-encode reference image as latents for spatial conditioning, pass shift/quantization params
- Frontend: shift override (checkbox + slider), LoRA graph wiring, scheduler hidden for Qwen Image Edit, model switching cleanup
- Starter model bundle for Qwen Image Edit
- LoRA config registered in discriminated union (factory.py)
- Downgrade transformers requirement back to >=4.56.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
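The denoise bullet above can be illustrated with a minimal sketch. This is not the shipped invocation: FlowMatchEulerDiscreteScheduler and its shift argument are the real diffusers API, while the transformer callable and its argument layout are hypothetical stand-ins.

```python
# Illustrative sketch only, not InvokeAI node code. Shows a flow-matching
# denoise loop with the shift override (used by distilled Lightning schedules)
# and CFG gating that skips the negative pass entirely when cfg_scale <= 1.
import torch
from diffusers import FlowMatchEulerDiscreteScheduler


def denoise(transformer, latents: torch.Tensor, pos_cond, neg_cond,
            steps: int = 20, cfg_scale: float = 1.0, shift: float = 3.0) -> torch.Tensor:
    scheduler = FlowMatchEulerDiscreteScheduler(shift=shift)
    scheduler.set_timesteps(num_inference_steps=steps)
    for t in scheduler.timesteps:
        noise_pred = transformer(latents, t, pos_cond)  # hypothetical call signature
        if cfg_scale > 1.0:
            # Only pay for the unconditional pass when CFG is actually active.
            uncond_pred = transformer(latents, t, neg_cond)
            noise_pred = uncond_pred + cfg_scale * (noise_pred - uncond_pred)
        latents = scheduler.step(noise_pred, t, latents).prev_sample
    return latents
```

Skipping the unconditional pass when cfg_scale <= 1 halves the per-step transformer cost, which matters for the 4- and 8-step Lightning LoRAs that run with CFG disabled.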
1 parent f7aa5fc commit bc82599

60 files changed

Lines changed: 3174 additions & 57 deletions

(Large commits have some content hidden by default; only a subset of the 60 changed files is shown below.)

invokeai/app/api/dependencies.py

Lines changed: 2 additions & 0 deletions
```diff
@@ -50,6 +50,7 @@
     CogView4ConditioningInfo,
     ConditioningFieldData,
     FLUXConditioningInfo,
+    QwenImageEditConditioningInfo,
     SD3ConditioningInfo,
     SDXLConditioningInfo,
     ZImageConditioningInfo,
@@ -140,6 +141,7 @@ def initialize(
                 SD3ConditioningInfo,
                 CogView4ConditioningInfo,
                 ZImageConditioningInfo,
+                QwenImageEditConditioningInfo,
             ],
             ephemeral=True,
         ),
```

invokeai/app/invocations/fields.py

Lines changed: 8 additions & 0 deletions
```diff
@@ -171,6 +171,8 @@ class FieldDescriptions:
     sd3_model = "SD3 model (MMDiTX) to load"
     cogview4_model = "CogView4 model (Transformer) to load"
     z_image_model = "Z-Image model (Transformer) to load"
+    qwen_image_edit_model = "Qwen Image Edit model (Transformer) to load"
+    qwen_vl_encoder = "Qwen2.5-VL tokenizer, processor and text/vision encoder"
     sdxl_main_model = "SDXL Main model (UNet, VAE, CLIP1, CLIP2) to load"
     sdxl_refiner_model = "SDXL Refiner Main Modde (UNet, VAE, CLIP2) to load"
     onnx_main_model = "ONNX Main model (UNet, VAE, CLIP) to load"
@@ -340,6 +342,12 @@ class ZImageConditioningField(BaseModel):
     )


+class QwenImageEditConditioningField(BaseModel):
+    """A Qwen Image Edit conditioning tensor primitive value"""
+
+    conditioning_name: str = Field(description="The name of conditioning tensor")
+
+
 class ConditioningField(BaseModel):
     """A conditioning tensor primitive value"""
```
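Because the new field is plain Pydantic data holding only a name, conditioning references round-trip cleanly through graph JSON. A quick check, assuming Pydantic v2 and that the module above is importable; the conditioning name is illustrative:

```python
# Round-trip sketch: the tensor itself lives in the object store; only the
# name travels through the graph. "qwen_cond_0" is an illustrative name.
from invokeai.app.invocations.fields import QwenImageEditConditioningField

field = QwenImageEditConditioningField(conditioning_name="qwen_cond_0")
restored = QwenImageEditConditioningField.model_validate_json(field.model_dump_json())
assert restored == field
```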

invokeai/app/invocations/metadata.py

Lines changed: 4 additions & 0 deletions
```diff
@@ -166,6 +166,10 @@ def invoke(self, context: InvocationContext) -> MetadataOutput:
         "z_image_img2img",
         "z_image_inpaint",
         "z_image_outpaint",
+        "qwen_image_edit_txt2img",
+        "qwen_image_edit_img2img",
+        "qwen_image_edit_inpaint",
+        "qwen_image_edit_outpaint",
     ]
```
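These four strings extend the generation_mode Literal accepted by the metadata node. As a hypothetical illustration (the mode names are from the diff; the selection logic is not code from this commit), a graph builder might pick one like this:

```python
# Hypothetical helper: map UI state to one of the generation_mode strings
# registered above. Not part of this commit.
def qwen_image_edit_mode(has_init_image: bool, has_mask: bool, extends_canvas: bool) -> str:
    if extends_canvas:
        return "qwen_image_edit_outpaint"
    if has_mask:
        return "qwen_image_edit_inpaint"
    if has_init_image:
        return "qwen_image_edit_img2img"
    return "qwen_image_edit_txt2img"
```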

invokeai/app/invocations/model.py

Lines changed: 7 additions & 0 deletions
```diff
@@ -72,6 +72,13 @@ class GlmEncoderField(BaseModel):
     text_encoder: ModelIdentifierField = Field(description="Info to load text_encoder submodel")


+class QwenVLEncoderField(BaseModel):
+    """Field for Qwen2.5-VL encoder used by Qwen Image Edit models."""
+
+    tokenizer: ModelIdentifierField = Field(description="Info to load tokenizer submodel")
+    text_encoder: ModelIdentifierField = Field(description="Info to load text_encoder submodel")
+
+
 class Qwen3EncoderField(BaseModel):
     """Field for Qwen3 text encoder used by Z-Image models."""
```
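The commit message also records a fix to load the Qwen VL processor via AutoProcessor.from_pretrained. A minimal sketch of that call: AutoProcessor is the real transformers API, but the repo subfolder layout is an assumption, not confirmed by the diffs shown here.

```python
# Sketch of the processor-loading fix. AutoProcessor resolves the concrete
# processor class from the repo config; the "processor" subfolder is an
# assumed diffusers-style layout detail.
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained(
    "Qwen/Qwen-Image-Edit-2511",
    subfolder="processor",  # assumption about the repo layout
)
```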

invokeai/app/invocations/primitives.py

Lines changed: 12 additions & 0 deletions
```diff
@@ -24,6 +24,7 @@
     InputField,
     LatentsField,
     OutputField,
+    QwenImageEditConditioningField,
     SD3ConditioningField,
     TensorField,
     UIComponent,
@@ -473,6 +474,17 @@ def build(cls, conditioning_name: str) -> "ZImageConditioningOutput":
         return cls(conditioning=ZImageConditioningField(conditioning_name=conditioning_name))


+@invocation_output("qwen_image_edit_conditioning_output")
+class QwenImageEditConditioningOutput(BaseInvocationOutput):
+    """Base class for nodes that output a Qwen Image Edit conditioning tensor."""
+
+    conditioning: QwenImageEditConditioningField = OutputField(description=FieldDescriptions.cond)
+
+    @classmethod
+    def build(cls, conditioning_name: str) -> "QwenImageEditConditioningOutput":
+        return cls(conditioning=QwenImageEditConditioningField(conditioning_name=conditioning_name))
+
+
 @invocation_output("conditioning_output")
 class ConditioningOutput(BaseInvocationOutput):
     """Base class for nodes that output a single conditioning tensor"""
```
