Training Code for SeC

This folder contains the training code for SeC, which is trained in two stages.

Dataset and Model Preparation

1. Dataset Preparation

Download the SA-V dataset and organize it in the following format. Before training, extract the video frames using the script linked here.

dataset
├── SA-V
│   ├── metavideo                   # the original SA-V dataset
│   │   ├── {video_id}_manual.json  # video annotations
│   │   ├── {video_id}.mp4
│   │   └── ...
│   ├── train                       # extracted video frames for training
│   │   ├── {video_id}
│   │   │   ├── 00000.jpg           # video frame
│   │   │   ├── 00004.jpg           # video frame
│   │   │   └── ...
│   │   ├── {video_id}
│   │   └── ...
├── sav_scene_top2k.txt
└── sav_video_obj_ids.txt
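
Before launching training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the SeC code base: the helper name `check_sav_layout` is hypothetical, and it only checks the directories and files shown in the tree, plus that each folder under `train` holds at least one `.jpg` frame.

```python
from pathlib import Path


def check_sav_layout(root):
    """Return a list of human-readable problems found under `root`.

    Hypothetical helper: it only mirrors the directory tree shown above.
    """
    root = Path(root)
    problems = []

    # Directories and files that appear in the tree above.
    for name in ("SA-V/metavideo", "SA-V/train"):
        if not (root / name).is_dir():
            problems.append(f"missing directory: {name}")
    for name in ("sav_scene_top2k.txt", "sav_video_obj_ids.txt"):
        if not (root / name).is_file():
            problems.append(f"missing file: {name}")

    # Each extracted-frame folder should contain at least one .jpg frame.
    train_dir = root / "SA-V" / "train"
    if train_dir.is_dir():
        for video_dir in sorted(p for p in train_dir.iterdir() if p.is_dir()):
            if not any(video_dir.glob("*.jpg")):
                problems.append(f"no frames in: SA-V/train/{video_dir.name}")
    return problems
```

Calling `check_sav_layout("dataset")` from the repository root returns an empty list when everything in the tree is present; otherwise each entry names one missing item.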

2. Pretrained Model Preparation

Download InternVL2_5-4B and SAM2.1-Hiera-L, and organize them in the following format.

pretrained_models
├── InternVL2_5-4B
├── sam2.1_hiera_large.pt
└── reshape.py

Then run the reshape.py script to convert the SAM weights into the format SeC expects.

Training

Stage 1: Enhanced Pixel-level Association Module

Fine-tune the SAM-based model with memory enhancement:

export PYTHONPATH=.
python training/sam2/training/train.py \
    -c "sec_sam2.1_hiera_l_finetune_maskmem22.yaml" \
    --use-cluster 0 \
    --num-gpus 8

Checkpoints and logs will be saved to work_dirs/sec_sam2/sec_sam2.1_hiera_l_finetune_maskmem22.yaml/.
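
Stage 2 needs the path to the fine-tuned SAM checkpoint produced here. As a rough sketch, the newest checkpoint file under the output directory could be located as follows. The helper name `latest_checkpoint` is hypothetical, and the `*.pt` pattern is an assumption about how the checkpoint files are named, so adjust it to whatever the training run actually writes.

```python
from pathlib import Path


def latest_checkpoint(work_dir, pattern="*.pt"):
    """Return the most recently modified file matching `pattern`, or None.

    Hypothetical helper; the "*.pt" default is an assumption, not a
    documented naming scheme.
    """
    # Search the work directory recursively for candidate checkpoint files.
    candidates = [p for p in Path(work_dir).rglob(pattern) if p.is_file()]
    if not candidates:
        return None
    # Pick the file with the latest modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

For example, `latest_checkpoint("work_dirs/sec_sam2")` returns a `Path` to the newest matching file, or `None` if training has not written any yet.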

Stage 2: Concept Guidance Module

Train the Concept Guidance Module using distributed training. Before launching, set model_path and sam2_path in training/sec/configs/sec_4b.py to the paths of the pretrained InternVL2_5-4B model and the fine-tuned SAM model from Stage 1, respectively.
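
The config edit described above might look like the excerpt below. Only the variable names model_path and sam2_path come from the instructions; both values are placeholders to replace with your own paths, and the rest of the config file is not shown.

```python
# Illustrative excerpt of the Stage 2 config. The two variable names are
# taken from the instructions above; the values are placeholders.

# Assumed location of the downloaded InternVL2_5-4B folder.
model_path = "pretrained_models/InternVL2_5-4B"

# Placeholder: point this at the fine-tuned SAM checkpoint from Stage 1.
sam2_path = "path/to/stage1_checkpoint.pt"
```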

export PYTHONPATH=.
bash tools/dist.sh train "training/sec/configs/sec_4b.py" 8

Checkpoints and logs will be saved to work_dirs/sec_4b/.
