😊 Welcome!
English | 简体中文 | 日本語
- Table of Contents
- Introduction
- Quick Start
- Video Results
- How to Use
- Model Zoo
- References
- License
VideoX-Fun is a video generation pipeline. It can be used to generate AI images and videos, and to train Diffusion Transformer baseline models and Lora models. It supports direct prediction from pre-trained baseline models to generate videos with different resolutions, durations, and FPS. It also supports users training their own baseline models and Lora models to perform specific style transformations.
Quick starts from different platforms are supported. See Quick Start for details.
New features:
- Wan2.1-Fun-V1.1 update: the 14B and 1.3B models now support Control + reference image models and camera control. The Inpaint model has also been retrained for better performance. [2025.04.25]
- Wan2.1-Fun-V1.0 update: supports I2V (image-to-video) and Control models for 14B and 1.3B, with start-frame and end-frame prediction. [2025.03.26]
- CogVideoX-Fun-V1.5 update: uploaded the I2V model and the related training and prediction code. [2024.12.16]
- Reward Lora support: Lora models are trained with reward backpropagation to optimize the generated videos and align them better with human preferences (see More Information). The new version of the control model supports different control conditions such as Canny, Depth, Pose, and MLSD. [2024.11.21]
- diffusers support: CogVideoX-Fun Control is now supported in diffusers. Thanks to a-r-r-o-w for contributing the support in this PR. See the documentation for details. [2024.10.16]
- CogVideoX-Fun-V1.1 update: retrained the i2v model and added noise to increase the motion range of the videos. Uploaded the control model training code and the Control model. [2024.09.29]
- CogVideoX-Fun-V1.0 update: initial code release. Windows and Linux are supported. The 2B and 5B models support video generation at any resolution from 256x256x49 up to 1024x1024x49. [2024.09.18]
Features:
Our UI is as follows:
DSW offers free GPU time. Each user can apply once, and the quota is valid for 3 months after applying.
Aliyun provides free GPU time through Freetier. Claim it and use it in Aliyun PAI-DSW to start CogVideoX-Fun within 5 minutes.
Our ComfyUI is as follows. See the ComfyUI README for details.
If you use Docker, make sure the graphics card driver and the CUDA environment are correctly installed on your machine.
Then run the following commands:
# Pull the image
docker pull mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easycv/torch_cuda:cogvideox_fun
# Enter the image
docker run -it -p 7860:7860 --network host --gpus all --security-opt seccomp:unconfined --shm-size 200g mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easycv/torch_cuda:cogvideox_fun
# Clone the code
git clone https://github.com/aigc-apps/VideoX-Fun.git
# Enter the VideoX-Fun directory
cd VideoX-Fun
# Download the weights
mkdir -p models/Diffusion_Transformer
mkdir -p models/Personalized_Model
# Please use the Hugging Face link or the ModelScope link to download the model.
# CogVideoX-Fun
# https://huggingface.co/alibaba-pai/CogVideoX-Fun-V1.1-5b-InP
# https://modelscope.cn/models/PAI/CogVideoX-Fun-V1.1-5b-InP
# Wan
# https://huggingface.co/alibaba-pai/Wan2.1-Fun-V1.1-14B-InP
# https://modelscope.cn/models/PAI/Wan2.1-Fun-V1.1-14B-InP
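As a rough sketch, the weights could be fetched into the folder created above with the `huggingface_hub` package. `snapshot_download` is that library's call for mirroring a repository. The helper names `local_dir_for` and `download`, and the choice to store each model under `models/Diffusion_Transformer/<model name>`, are assumptions made for illustration and are not part of VideoX-Fun.

```python
from pathlib import Path

# Repository ids taken from the Hugging Face links listed above.
REPO_IDS = [
    "alibaba-pai/CogVideoX-Fun-V1.1-5b-InP",
    "alibaba-pai/Wan2.1-Fun-V1.1-14B-InP",
]


def local_dir_for(repo_id: str, root: str = "models/Diffusion_Transformer") -> Path:
    """Map 'owner/name' to '<root>/name' (hypothetical helper)."""
    return Path(root) / repo_id.split("/")[-1]


def download(repo_id: str) -> Path:
    """Download one repository into its target folder (hypothetical helper)."""
    # Imported lazily so the path logic works without the package installed.
    from huggingface_hub import snapshot_download

    target = local_dir_for(repo_id)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target
```

Calling `download(REPO_IDS[0])` from the VideoX-Fun directory would then place the CogVideoX-Fun weights next to the other models. The download is large, so check the available disk space first.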
We have verified that this library runs in the following environments:
Windows details:
- OS: Windows 10
- python: python3.10 & python3.11
- pytorch: torch2.2.0
- CUDA: 11.8 & 12.1
- CUDNN: 8+
- GPU: Nvidia-3060 12G & Nvidia-3090 24G
Linux details:
- OS: Ubuntu 20.04, CentOS
- python: python3.10 & python3.11
- pytorch: torch2.2.0
- CUDA: 11.8 & 12.1
- CUDNN: 8+
- GPU: Nvidia-V100 16G & Nvidia-A10 24G & Nvidia-A100 40G & Nvidia-A100 80G
About 60 GB of disk space is needed to store the weights. Please check that it is available.
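The disk check can be scripted. The following is a minimal sketch using only the Python standard library. The function name `has_free_space` is hypothetical, and `"."` stands in for whichever folder will hold the weights.

```python
import shutil


def has_free_space(path: str, required_gb: float) -> bool:
    """Return True if the filesystem holding `path` has at least `required_gb` GB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3


# 60 GB is the figure quoted above.
print("enough space:", has_free_space(".", 60))
```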
We recommend placing the weights along the specified paths:
Via ComfyUI:
Put the models into the ComfyUI weights folder ComfyUI/models/Fun_Models/:
📦 ComfyUI/
├── 📂 models/
│   └── 📂 Fun_Models/
│       ├── 📂 CogVideoX-Fun-V1.1-2b-InP/
│       ├── 📂 CogVideoX-Fun-V1.1-5b-InP/
│       ├── 📂 Wan2.1-Fun-V1.1-14B-InP/
│       └── 📂 Wan2.1-Fun-V1.1-1.3B-InP/
When running your own Python file or the UI:
📦 models/
├── 📂 Diffusion_Transformer/
│   ├── 📂 CogVideoX-Fun-V1.1-2b-InP/
│   ├── 📂 CogVideoX-Fun-V1.1-5b-InP/
│   ├── 📂 Wan2.1-Fun-V1.1-14B-InP/
│   └── 📂 Wan2.1-Fun-V1.1-1.3B-InP/
├── 📂 Personalized_Model/
│   └── your trained transformer model / your trained Lora model (for UI loading)
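Before running a prediction script it can help to confirm that the folders match the layout above. The sketch below only checks that the directories exist. The function name `missing_dirs` is hypothetical, and it does not inspect the weight files themselves.

```python
from pathlib import Path

# Top-level folders from the layout above.
EXPECTED_SUBDIRS = ("Diffusion_Transformer", "Personalized_Model")


def missing_dirs(models_root: str, model_names: list[str]) -> list[str]:
    """List expected directories that do not exist under `models_root`."""
    root = Path(models_root)
    wanted = [root / sub for sub in EXPECTED_SUBDIRS]
    wanted += [root / "Diffusion_Transformer" / name for name in model_names]
    return [str(path) for path in wanted if not path.is_dir()]


for path in missing_dirs("models", ["Wan2.1-Fun-V1.1-1.3B-InP"]):
    print("missing:", path)
```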
| inp_1.mp4 | inp_2.mp4 | inp_3.mp4 | inp_4.mp4 |
| inp_5.mp4 | inp_6.mp4 | inp_7.mp4 | inp_8.mp4 |
Generic Control Video + Reference Image:
| Reference Image | Control Video | Wan2.1-Fun-V1.1-14B-Control | Wan2.1-Fun-V1.1-1.3B-Control |
|---|---|---|---|
|  | pose_control.mp4 | 14b_ref.mp4 | 1.3b_ref.mp4 |
Generic Control Video (Canny, Pose, Depth, etc.) and Trajectory Control:
| Fun-Trajectory_00003.mp4 | Fun-Trajectory-Merge_00003.mp4 | Fun_00006.mp4 |
| pose.mp4 | canny.mp4 | depth.mp4 |
| pose_out.mp4 | canny_out.mp4 | depth_out.mp4 |
| Pan Up | Pan Left | Pan Right |
|---|---|---|
| Pan_Up.mp4 | Pan_Left.mp4 | Pan_Right.mp4 |
| Pan Down | Pan Up + Pan Left | Pan Up + Pan Right |
|---|---|---|
| Pan_Down.mp4 | Pan_Left_Up.mp4 | Pan_Right_Up.mp4 |
Resolution-1024
| 00000005.mp4 | 00000006.mp4 | 00000009.mp4 | 00000010.mp4 |
Resolution-768
| 00000001.mp4 | 00000002.mp4 | 00000005.mp4 | 00000006.mp4 |
Resolution-512
| 00000036.mp4 | 00000035.mp4 | 00000034.mp4 | 00000033.mp4 |
| demo_pose.mp4 | demo_scribble.mp4 | demo_depth.mp4 |
| A young woman with beautiful clear eyes and blonde hair, wearing white clothes and twisting her body, with the camera focused on her face. High quality, masterpiece, best quality, high resolution, ultra-fine, dreamlike. | A young woman with beautiful clear eyes and blonde hair, wearing white clothes and twisting her body, with the camera focused on her face. High quality, masterpiece, best quality, high resolution, ultra-fine, dreamlike. | A young bear. |
| 00000010.mp4 | 00000011.mp4 | 00000012.mp4 |
Because Wan2.1 has a very large number of parameters, GPU memory must be saved to fit consumer-grade GPUs. Each prediction file provides a GPU_memory_mode setting, which can be set to model_cpu_offload, model_cpu_offload_and_qfloat8, or sequential_cpu_offload. This approach also applies to CogVideoX-Fun generation.
- model_cpu_offload: the entire model is moved to the CPU after use, saving some GPU memory.
- model_cpu_offload_and_qfloat8: the entire model is moved to the CPU after use, and the Transformer model is quantized to float8, saving more GPU memory.
- sequential_cpu_offload: each layer of the model is moved to the CPU after use. This is slower but saves a large amount of GPU memory.
qfloat8 may partially degrade model performance, but it saves more GPU memory. If you have enough GPU memory, model_cpu_offload is recommended.
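The trade-off between the three modes can be written as a small decision helper. This is only a sketch: the function `pick_gpu_memory_mode` is hypothetical, and the two memory estimates are values the caller has to measure or guess for their own model and resolution. No measured figures are implied here.

```python
GPU_MEMORY_MODES = (
    "model_cpu_offload",
    "model_cpu_offload_and_qfloat8",
    "sequential_cpu_offload",
)


def pick_gpu_memory_mode(free_gb: float, full_precision_gb: float, float8_gb: float) -> str:
    """Pick the least aggressive mode whose estimated peak memory fits.

    `full_precision_gb` and `float8_gb` are the caller's own estimates
    (assumptions), not values published by the project.
    """
    if free_gb >= full_precision_gb:
        return "model_cpu_offload"  # best quality when memory allows
    if free_gb >= float8_gb:
        return "model_cpu_offload_and_qfloat8"  # float8 Transformer, some quality loss
    return "sequential_cpu_offload"  # slowest, smallest footprint


# Example with made-up estimates: 16 GB free, 20 GB needed unquantized, 12 GB with float8.
print(pick_gpu_memory_mode(16, 20, 12))
```

The returned string is one of the values the prediction files accept for GPU_memory_mode.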
See the ComfyUI README for details.
- Step 1: Download the corresponding weights and place them in the models folder.
- Step 2: Use a different file for prediction depending on the weights and the prediction goal. The library currently supports CogVideoX-Fun, Wan2.1, and Wan2.1-Fun, which are distinguished by folder name inside the examples folder. The supported features differ between models, so choose accordingly. CogVideoX-Fun is used as the example below.
  - Text-to-video:
    - Modify prompt, neg_prompt, guidance_scale, and seed in examples/cogvideox_fun/predict_t2v.py.
    - Then run examples/cogvideox_fun/predict_t2v.py and wait for the result. The result is saved in the samples/cogvideox-fun-videos folder.
  - Image-to-video:
    - Modify validation_image_start, validation_image_end, prompt, neg_prompt, guidance_scale, and seed in examples/cogvideox_fun/predict_i2v.py.
    - validation_image_start is the starting image of the video, and validation_image_end is the ending image of the video.
    - Then run examples/cogvideox_fun/predict_i2v.py and wait for the result. The result is saved in the samples/cogvideox-fun-videos_i2v folder.
  - Video-to-video:
    - Modify validation_video, validation_image_end, prompt, neg_prompt, guidance_scale, and seed in examples/cogvideox_fun/predict_v2v.py.
    - validation_video is the reference video for video generation. You can try it with the following demo video: Demo Video.
    - Then run examples/cogvideox_fun/predict_v2v.py and wait for the result. The result is saved in the samples/cogvideox-fun-videos_v2v folder.
  - Controlled video generation (Canny, Pose, Depth, etc.):
    - Modify control_video, validation_image_end, prompt, neg_prompt, guidance_scale, and seed in examples/cogvideox_fun/predict_v2v_control.py.
    - control_video is the control video extracted with an operator such as Canny, Pose, or Depth. You can try it with the following demo video: Demo Video.
    - Then run examples/cogvideox_fun/predict_v2v_control.py and wait for the result. The result is saved in the samples/cogvideox-fun-videos_v2v_control folder.
- Step 3: To combine other backbones or Lora models that you trained yourself, modify examples/{model_name}/predict_t2v.py, examples/{model_name}/predict_i2v.py, and lora_path as needed.
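The script and output folder for each CogVideoX-Fun task in Step 2 can be collected in one lookup table. The sketch below assumes the scripts are launched with the current Python interpreter from the repository root. The names `COGVIDEOX_FUN_TASKS`, `build_command`, and `run_task` are hypothetical and not part of VideoX-Fun.

```python
import subprocess
import sys

# task -> (script to run, folder where results are saved), as listed in Step 2.
COGVIDEOX_FUN_TASKS = {
    "t2v": ("examples/cogvideox_fun/predict_t2v.py", "samples/cogvideox-fun-videos"),
    "i2v": ("examples/cogvideox_fun/predict_i2v.py", "samples/cogvideox-fun-videos_i2v"),
    "v2v": ("examples/cogvideox_fun/predict_v2v.py", "samples/cogvideox-fun-videos_v2v"),
    "v2v_control": (
        "examples/cogvideox_fun/predict_v2v_control.py",
        "samples/cogvideox-fun-videos_v2v_control",
    ),
}


def build_command(task: str) -> list[str]:
    """Return the command line for a task (assumes the repository root as working directory)."""
    script, _ = COGVIDEOX_FUN_TASKS[task]
    return [sys.executable, script]


def run_task(task: str) -> str:
    """Run the prediction script and return the folder that should hold the results."""
    subprocess.run(build_command(task), check=True)
    return COGVIDEOX_FUN_TASKS[task][1]
```

The prompt, seed, and other settings still have to be edited inside each script before it is run, as described in the steps above.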
For multi-GPU inference, make sure the xfuser package is installed. Installing xfuser==0.4.2 and yunchang==0.6.2 is recommended.
pip install xfuser==0.4.2 --progress-bar off -i https://mirrors.aliyun.com/pypi/simple/
pip install yunchang==0.6.2 --progress-bar off -i https://mirrors.aliyun.com/pypi/simple/
Make sure that the product of ulysses_degree and ring_degree equals the number of GPUs in use. For example, with 8 GPUs you can set ulysses_degree=2 and ring_degree=4, or ulysses_degree=4 and ring_degree=2.
ulysses_degree parallelizes after splitting across attention heads. ring_degree parallelizes after splitting across the sequence.
ring_degree has a higher communication cost than ulysses_degree, so consider both the sequence length and the number of heads in the model when setting these parameters.
Take parallel inference on 8 GPUs as an example:
- Wan2.1-Fun-V1.1-14B-InP has 40 heads. In this case ulysses_degree must be a divisor of 40 (for example 2, 4, or 8). With 8 GPUs you can therefore set ulysses_degree=8 and ring_degree=1.
- Wan2.1-Fun-V1.1-1.3B-InP has 12 heads. In this case ulysses_degree must be a divisor of 12 (for example 2 or 4). With 8 GPUs you can therefore set ulysses_degree=4 and ring_degree=2.
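The two constraints above (the product equals the GPU count, and ulysses_degree divides the head count) can be enumerated directly. The following is a minimal sketch. The function names are hypothetical, and preferring the largest ulysses_degree is only a heuristic based on the note that ring_degree has the higher communication cost. Sequence length may also matter in practice.

```python
def valid_degree_pairs(num_gpus: int, num_heads: int) -> list[tuple[int, int]]:
    """All (ulysses_degree, ring_degree) pairs that satisfy both constraints."""
    pairs = []
    for ulysses in range(1, num_gpus + 1):
        # ulysses * ring must equal num_gpus, and ulysses must divide the head count.
        if num_gpus % ulysses == 0 and num_heads % ulysses == 0:
            pairs.append((ulysses, num_gpus // ulysses))
    return pairs


def preferred_pair(num_gpus: int, num_heads: int) -> tuple[int, int]:
    """Heuristic: favour the largest ulysses_degree to keep ring_degree small."""
    return max(valid_degree_pairs(num_gpus, num_heads))


print(preferred_pair(8, 40))  # 14B model, 40 heads -> (8, 1)
print(preferred_pair(8, 12))  # 1.3B model, 12 heads -> (4, 2)
```

Both results match the settings suggested in the list above.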
Once the parameters are set, run the following command to start parallel inference:
torchrun --nproc-per-node=8 examples/wan2.1_fun/predict_t2v.py
The WebUI supports text-to-video, image-to-video, video-to-video, and controlled video generation (Canny, Pose, Depth, etc.). The library currently supports CogVideoX-Fun, Wan2.1, and Wan2.1-Fun, which are distinguished by folder name inside the examples folder. The supported features differ between models, so choose accordingly. CogVideoX-Fun is used as the example below.
- Step 1: Download the corresponding weights and place them in the models folder.
- Step 2: Run examples/cogvideox_fun/app.py to open the Gradio page.
- Step 3: Select the generation model on the page, fill in prompt, neg_prompt, guidance_scale, seed, and so on, click Generate, and wait for the result. The result is saved in the sample folder.
A complete model training pipeline should include data preprocessing and Video DiT training. The training process is similar across models, and so are the data formats:
We provide a simple demo of training a Lora model with image data. See the wiki for details.
For the complete data preprocessing pipeline for long-video segmentation, cleaning, and captioning, see the README of the video captions section.
To train a text-to-image and text-to-video generation model, arrange the dataset in the following format.
📦 project/
├── 📂 datasets/
│   ├── 📂 internal_datasets/
│       ├── 📂 train/
│       │   ├── 📄 00000001.mp4
│       │   ├── 📄 00000002.jpg
│       │   └── 📄 .....
│       └── 📄 json_of_internal_datasets.json
json_of_internal_datasets.json is a standard JSON file. The file_path values in the JSON can be relative paths, as shown below:
[
    {
      "file_path": "train/00000001.mp4",
      "text": "A group of young men in suits and sunglasses are walking down a city street.",
      "type": "video"
    },
    {
      "file_path": "train/00000002.jpg",
      "text": "A group of young men in suits and sunglasses are walking down a city street.",
      "type": "image"
    },
    .....
]
The paths can also be absolute paths, as shown below:
[
    {
      "file_path": "/mnt/data/videos/00000001.mp4",
      "text": "A group of young men in suits and sunglasses are walking down a city street.",
      "type": "video"
    },
    {
      "file_path": "/mnt/data/train/00000001.jpg",
      "text": "A group of young men in suits and sunglasses are walking down a city street.",
      "type": "image"
    },
    .....
]
If the data uses relative paths, set scripts/{model_name}/train.sh as follows.
export DATASET_NAME="datasets/internal_datasets/"
export DATASET_META_NAME="datasets/internal_datasets/json_of_internal_datasets.json"
If the data uses absolute paths, set scripts/train.sh as follows.
export DATASET_NAME=""
export DATASET_META_NAME="/mnt/data/json_of_internal_datasets.json"
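The two configurations above are consistent if the training code resolves each media file by joining DATASET_NAME with the file_path from the JSON. That joining behaviour is an assumption made for this sketch, and `resolve_media_path` is a hypothetical name. `posixpath` is used so the result is the same on every platform.

```python
import posixpath


def resolve_media_path(dataset_name: str, file_path: str) -> str:
    """Join the dataset root and a metadata file_path (assumed resolution rule)."""
    # An absolute file_path discards the root, so an empty DATASET_NAME is harmless.
    return posixpath.join(dataset_name, file_path)


# Relative-path layout: the root is prepended.
print(resolve_media_path("datasets/internal_datasets/", "train/00000001.mp4"))
# Absolute-path layout: the path is used unchanged.
print(resolve_media_path("", "/mnt/data/videos/00000001.mp4"))
```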
Then run scripts/train.sh.
sh scripts/train.sh
For details on some of the parameter settings:
- Wan2.1-Fun: see Readme Train and Readme Lora.
- Wan2.1: see Readme Train and Readme Lora.
- CogVideoX-Fun: see Readme Train and Readme Lora.
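The metadata file used for training can be generated from a mapping of file paths to captions. The sketch below produces entries with the same three keys as the JSON examples (file_path, text, type). The function names and the extension sets are assumptions for illustration. The examples only show .mp4 and .jpg files.

```python
import json
from pathlib import Path

# Assumed extension sets; only .mp4 and .jpg appear in the examples.
VIDEO_EXTS = {".mp4"}
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def build_metadata(captions: dict[str, str]) -> list[dict]:
    """Turn {file_path: caption} into a list of metadata entries."""
    entries = []
    for file_path, text in sorted(captions.items()):
        ext = Path(file_path).suffix.lower()
        if ext in VIDEO_EXTS:
            kind = "video"
        elif ext in IMAGE_EXTS:
            kind = "image"
        else:
            raise ValueError(f"unsupported file type: {file_path}")
        entries.append({"file_path": file_path, "text": text, "type": kind})
    return entries


def write_metadata(entries: list[dict], out_path: str) -> None:
    """Write the entries as UTF-8 JSON, keeping non-ASCII captions readable."""
    Path(out_path).write_text(
        json.dumps(entries, ensure_ascii=False, indent=2), encoding="utf-8"
    )
```

For example, `write_metadata(build_metadata({"train/00000001.mp4": "a caption"}), "datasets/internal_datasets/json_of_internal_datasets.json")` would create a one-entry file in the relative-path layout.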
V1.1:
| Name | Storage Size | Hugging Face | Model Scope | Description |
|---|---|---|---|---|
| Wan2.1-Fun-V1.1-1.3B-InP | 19.0 GB | 🤗Link | 😄Link | Wan2.1-Fun-V1.1-1.3B text-to-video and image-to-video weights, trained at multiple resolutions, with support for start-image and end-image prediction. |
| Wan2.1-Fun-V1.1-14B-InP | 47.0 GB | 🤗Link | 😄Link | Wan2.1-Fun-V1.1-14B text-to-video and image-to-video weights, trained at multiple resolutions, with support for start-image and end-image prediction. |
| Wan2.1-Fun-V1.1-1.3B-Control | 19.0 GB | 🤗Link | 😄Link | Wan2.1-Fun-V1.1-1.3B video control weights. Supports different control conditions such as Canny, Depth, Pose, and MLSD, control with a reference image plus control conditions, and trajectory control. Supports video prediction at multiple resolutions (512, 768, 1024), trained with 81 frames at 16 frames per second. Supports multilingual prediction. |
| Wan2.1-Fun-V1.1-14B-Control | 47.0 GB | 🤗Link | 😄Link | Wan2.1-Fun-V1.1-14B video control weights. Supports different control conditions such as Canny, Depth, Pose, and MLSD, control with a reference image plus control conditions, and trajectory control. Supports video prediction at multiple resolutions (512, 768, 1024), trained with 81 frames at 16 frames per second. Supports multilingual prediction. |
| Wan2.1-Fun-V1.1-1.3B-Control-Camera | 19.0 GB | 🤗Link | 😄Link | Wan2.1-Fun-V1.1-1.3B camera lens control weights. Supports video prediction at multiple resolutions (512, 768, 1024), trained with 81 frames at 16 frames per second. Supports multilingual prediction. |
| Wan2.1-Fun-V1.1-14B-Control-Camera | 47.0 GB | 🤗Link | 😄Link | Wan2.1-Fun-V1.1-14B camera lens control weights. Supports video prediction at multiple resolutions (512, 768, 1024), trained with 81 frames at 16 frames per second. Supports multilingual prediction. |
V1.0:
| Name | Storage Size | Hugging Face | Model Scope | Description |
|---|---|---|---|---|
| Wan2.1-Fun-1.3B-InP | 19.0 GB | 🤗Link | 😄Link | Wan2.1-Fun-1.3B text-to-video and image-to-video weights, trained at multiple resolutions, with support for start-image and end-image prediction. |
| Wan2.1-Fun-14B-InP | 47.0 GB | 🤗Link | 😄Link | Wan2.1-Fun-14B text-to-video and image-to-video weights, trained at multiple resolutions, with support for start-image and end-image prediction. |
| Wan2.1-Fun-1.3B-Control | 19.0 GB | 🤗Link | 😄Link | Wan2.1-Fun-1.3B video control weights. Supports different control conditions such as Canny, Depth, Pose, and MLSD, and trajectory control is available. Supports video prediction at multiple resolutions (512, 768, 1024), trained with 81 frames at 16 frames per second. Also supports multilingual prediction. |
| Wan2.1-Fun-14B-Control | 47.0 GB | 🤗Link | 😄Link | Wan2.1-Fun-14B video control weights. Supports different control conditions such as Canny, Depth, Pose, and MLSD, and trajectory control is available. Supports video prediction at multiple resolutions (512, 768, 1024), trained with 81 frames at 16 frames per second. Also supports multilingual prediction. |
| Name | Hugging Face | Model Scope | Description |
|---|---|---|---|
| Wan2.1-T2V-1.3B | 🤗Link | 😄Link | Wanxiang 2.1-1.3B text-to-video weights |
| Wan2.1-T2V-14B | 🤗Link | 😄Link | Wanxiang 2.1-14B text-to-video weights |
| Wan2.1-I2V-14B-480P | 🤗Link | 😄Link | Wanxiang 2.1-14B-480P image-to-video weights |
| Wan2.1-I2V-14B-720P | 🤗Link | 😄Link | Wanxiang 2.1-14B-720P image-to-video weights |
V1.5:
| Name | Storage Space | Hugging Face | Model Scope | Description |
|---|---|---|---|---|
| CogVideoX-Fun-V1.5-5b-InP | 20.0 GB | 🤗Link | 😄Link | Official image-to-video model. Predicts videos at multiple resolutions (512, 768, 1024). Trained with 85 frames at 8 frames per second. |
| CogVideoX-Fun-V1.5-Reward-LoRAs | - | 🤗Link | 😄Link | Official reward backpropagation models. They optimize the videos generated by CogVideoX-Fun-V1.5 to better match human preferences. |
V1.1:
| Name | Storage Space | Hugging Face | Model Scope | Description |
|---|---|---|---|---|
| CogVideoX-Fun-V1.1-2b-InP | 13.0 GB | 🤗Link | 😄Link | Official image-to-video model. Predicts videos at multiple resolutions (512, 768, 1024, 1280). Trained with 49 frames at 8 frames per second. Noise is added to the reference image, and the motion range is larger than in V1.0. |
| CogVideoX-Fun-V1.1-5b-InP | 20.0 GB | 🤗Link | 😄Link | Official image-to-video model. Predicts videos at multiple resolutions (512, 768, 1024, 1280). Trained with 49 frames at 8 frames per second. Noise is added to the reference image, and the motion range is larger than in V1.0. |
| CogVideoX-Fun-V1.1-2b-Pose | 13.0 GB | 🤗Link | 😄Link | Official pose-control video model. Predicts videos at multiple resolutions (512, 768, 1024, 1280). Trained with 49 frames at 8 frames per second. |
| CogVideoX-Fun-V1.1-2b-Control | 13.0 GB | 🤗Link | 😄Link | Official control video model. Predicts videos at multiple resolutions (512, 768, 1024, 1280). Trained with 49 frames at 8 frames per second. Supports various control conditions such as Canny, Depth, Pose, and MLSD. |
| CogVideoX-Fun-V1.1-5b-Pose | 20.0 GB | 🤗Link | 😄Link | Official pose-control video model. Predicts videos at multiple resolutions (512, 768, 1024, 1280). Trained with 49 frames at 8 frames per second. |
| CogVideoX-Fun-V1.1-5b-Control | 20.0 GB | 🤗Link | 😄Link | Official control video model. Predicts videos at multiple resolutions (512, 768, 1024, 1280). Trained with 49 frames at 8 frames per second. Supports various control conditions such as Canny, Depth, Pose, and MLSD. |
| CogVideoX-Fun-V1.1-Reward-LoRAs | - | 🤗Link | 😄Link | Official reward backpropagation models. They optimize the videos generated by CogVideoX-Fun-V1.1 to better match human preferences. |
(Obsolete) V1.0:
| Name | Storage Space | Hugging Face | Model Scope | Description |
|---|---|---|---|---|
| CogVideoX-Fun-2b-InP | 13.0 GB | 🤗Link | 😄Link | Official image-to-video model. Predicts videos at multiple resolutions (512, 768, 1024, 1280). Trained with 49 frames at 8 frames per second. |
| CogVideoX-Fun-5b-InP | 20.0 GB | 🤗Link | 😄Link | Official image-to-video model. Predicts videos at multiple resolutions (512, 768, 1024, 1280). Trained with 49 frames at 8 frames per second. |
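The frame counts and frame rates in the tables translate into clip lengths by simple division. The short sketch below works that out for the three training configurations mentioned. The function name `clip_seconds` is hypothetical.

```python
def clip_seconds(num_frames: int, fps: int) -> float:
    """Playback length in seconds of a clip with `num_frames` frames at `fps`."""
    return num_frames / fps


# (frames, frames per second) as stated in the model tables.
print(clip_seconds(81, 16))  # Wan2.1-Fun: 5.0625
print(clip_seconds(85, 8))   # CogVideoX-Fun-V1.5: 10.625
print(clip_seconds(49, 8))   # CogVideoX-Fun-V1.0 / V1.1: 6.125
```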
- Support Japanese.
- CogVideo: https://github.com/THUDM/CogVideo/
- EasyAnimate: https://github.com/aigc-apps/EasyAnimate
- Wan2.1: https://github.com/Wan-Video/Wan2.1/
This project is licensed under the Apache License (Version 2.0).
The CogVideoX-2B model (including its corresponding Transformers module and VAE module) is released under the Apache 2.0 License.
The CogVideoX-5B model (Transformers module) is released under the CogVideoX License.



