[Train]: add longcat & flux full scripts by Watebear · Pull Request #1171 · ModelTC/LightX2V

Watebear · 2026-06-19T05:30:38Z

No description provided.

gemini-code-assist

Code Review

This pull request introduces training configurations and run scripts for flux2_klein and longcat_image models, alongside refactoring model loaders to extract load_transformer methods. Feedback highlights critical issues: the newly added models lack the required load_full_weights_for_resume method, which will cause crashes when resuming training. Additionally, the configuration files contain absolute local paths that hinder portability, and the run scripts should allow overriding CUDA_VISIBLE_DEVICES for better flexibility. Finally, a copy-paste error in run_longcat_image_t2i_lora.sh needs to be corrected as the GPU allocation does not match the single-process configuration.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-19T05:32:10Z

+    def load_transformer(self, model_path=None):
+        model_path = model_path or self.config["model"]["pretrained_model_name_or_path"]
+        return Flux2Transformer2DModel.from_pretrained(model_path, subfolder="transformer").to(self.device, dtype=self.running_dtype)


The BaseModel class defines load_full_weights_for_resume as a required method for resuming full training, raising NotImplementedError if not implemented. Since Flux2KleinModel does not implement this method, attempting to resume full training will crash. Implementing it using the newly parameterized load_transformer method resolves this issue.

Suggested change

def load_transformer(self, model_path=None):

model_path = model_path or self.config["model"]["pretrained_model_name_or_path"]

return Flux2Transformer2DModel.from_pretrained(model_path, subfolder="transformer").to(self.device, dtype=self.running_dtype)

def load_transformer(self, model_path=None):

model_path = model_path or self.config["model"]["pretrained_model_name_or_path"]

return Flux2Transformer2DModel.from_pretrained(model_path, subfolder="transformer").to(self.device, dtype=self.running_dtype)

def load_full_weights_for_resume(self, resume_ckpt_path):

self.transformer = self.load_transformer(resume_ckpt_path)

gemini-code-assist · 2026-06-19T05:32:10Z

+    def load_transformer(self, model_path=None):
+        model_path = model_path or self.config["model"]["pretrained_model_name_or_path"]
+        transformer = LongCatImageTransformer2DModel.from_pretrained(model_path, subfolder="transformer").to(self.device, dtype=self.running_dtype)
        attention_backend = self.config["model"].get("attention_backend", None)
        if attention_backend is not None:
-            self.transformer.set_attention_backend(attention_backend)
+            transformer.set_attention_backend(attention_backend)
+        return transformer


The BaseModel class defines load_full_weights_for_resume as a required method for resuming full training. Since LongCatImageModel does not implement this method, attempting to resume full training will crash. Implementing it using the newly parameterized load_transformer method resolves this issue.

Suggested change

def load_transformer(self, model_path=None):

model_path = model_path or self.config["model"]["pretrained_model_name_or_path"]

transformer = LongCatImageTransformer2DModel.from_pretrained(model_path, subfolder="transformer").to(self.device, dtype=self.running_dtype)

attention_backend = self.config["model"].get("attention_backend", None)

if attention_backend is not None:

self.transformer.set_attention_backend(attention_backend)

transformer.set_attention_backend(attention_backend)

return transformer

def load_transformer(self, model_path=None):

model_path = model_path or self.config["model"]["pretrained_model_name_or_path"]

transformer = LongCatImageTransformer2DModel.from_pretrained(model_path, subfolder="transformer").to(self.device, dtype=self.running_dtype)

attention_backend = self.config["model"].get("attention_backend", None)

if attention_backend is not None:

transformer.set_attention_backend(attention_backend)

return transformer

def load_full_weights_for_resume(self, resume_ckpt_path):

self.transformer = self.load_transformer(resume_ckpt_path)

gemini-code-assist · 2026-06-19T05:32:10Z

@@ -0,0 +1,92 @@
+model:
+    name: flux2_klein
+    pretrained_model_name_or_path: /data/nvme1/models/FLUX.2-Klein-9B


Avoid committing absolute local paths (e.g., /data/nvme1/...) to the repository as they are environment-specific and break portability. Please use placeholder paths like /path/to/... instead.

pretrained_model_name_or_path: /path/to/FLUX.2-Klein-9B

gemini-code-assist · 2026-06-19T05:32:10Z

@@ -0,0 +1,95 @@
+model:
+    name: longcat_image
+    pretrained_model_name_or_path: /data/nvme1/models/meituan-longcat/LongCat-Image


Avoid committing absolute local paths (e.g., /data/nvme1/...) to the repository as they are environment-specific and break portability. Please use placeholder paths like /path/to/... instead.

pretrained_model_name_or_path: /path/to/meituan-longcat/LongCat-Image

gemini-code-assist · 2026-06-19T05:32:10Z

 model:
    name: qwen_image
-    pretrained_model_name_or_path: /path/to/Qwen/Qwen-Image-2512
+    pretrained_model_name_or_path: /data/nvme1/wushuo/hf_models/models/Qwen-Image-2512


The configuration file has been updated with absolute paths specific to a local environment (/data/nvme1/...). This reduces the portability of the configuration and can cause immediate failures for other users. It is highly recommended to use placeholder paths (e.g., /path/to/...) as was done previously, allowing users to configure their own paths.

pretrained_model_name_or_path: /path/to/Qwen/Qwen-Image-2512

gemini-code-assist · 2026-06-19T05:32:11Z

@@ -0,0 +1,9 @@
+#!/bin/bash
+# full parameters train use fsdp2 by default
+export CUDA_VISIBLE_DEVICES=0,1,2,3


Allowing the user to override CUDA_VISIBLE_DEVICES via environment variables makes the script much more flexible for multi-user or multi-GPU environments.

Suggested change

export CUDA_VISIBLE_DEVICES=0,1,2,3

export CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-0,1,2,3}

gemini-code-assist · 2026-06-19T05:32:11Z

@@ -0,0 +1,8 @@
+#!/bin/bash
+
+export CUDA_VISIBLE_DEVICES=2


Allowing the user to override CUDA_VISIBLE_DEVICES via environment variables makes the script much more flexible.

Suggested change

export CUDA_VISIBLE_DEVICES=2

export CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-2}

gemini-code-assist · 2026-06-19T05:32:11Z

@@ -0,0 +1,9 @@
+#!/bin/bash
+# full parameters train use fsdp2 by default
+export CUDA_VISIBLE_DEVICES=4,5,6,7


Allowing the user to override CUDA_VISIBLE_DEVICES via environment variables makes the script much more flexible for multi-user or multi-GPU environments.

Suggested change

export CUDA_VISIBLE_DEVICES=4,5,6,7

export CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-4,5,6,7}

gemini-code-assist · 2026-06-19T05:32:11Z

+# full parameters train use fsdp2 by default
+export CUDA_VISIBLE_DEVICES=4,5,6,7
+export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True


There are a couple of issues in this script:

The comment on line 2 refers to 'full parameters train use fsdp2 by default', but this is a LoRA training script.

CUDA_VISIBLE_DEVICES is set to 4 GPUs (4,5,6,7), but --nproc_per_node is set to 1. This is likely a copy-paste error from the full training script. It should target a single GPU by default, and ideally allow user override via ${CUDA_VISIBLE_DEVICES:-...}.

Suggested change

# full parameters train use fsdp2 by default

export CUDA_VISIBLE_DEVICES=4,5,6,7

export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True

export CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-4}

export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True

[feat]: add longcat & flux full scripts

954dd99

gemini-code-assist Bot reviewed Jun 19, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Train]: add longcat & flux full scripts#1171

[Train]: add longcat & flux full scripts#1171
Watebear wants to merge 1 commit into
mainfrom
train

Watebear commented Jun 19, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	export CUDA_VISIBLE_DEVICES=0,1,2,3
	export CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-0,1,2,3}

	export CUDA_VISIBLE_DEVICES=2
	export CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-2}

	export CUDA_VISIBLE_DEVICES=4,5,6,7
	export CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-4,5,6,7}

Conversation

Watebear commented Jun 19, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant