git checkout -b feature/awesome-feature
git checkout -b fix/awesome-fix

- The `lingua` folder should contain only commonly used classes and functions, avoiding application-specific elements.
- Each model or pipeline should have its own dedicated folder within the `apps` directory.
- Each pull request should focus on a specific new feature (e.g., adding CFG support, implementing a new architecture, or introducing new configurations) and include a detailed test plan to facilitate effective code review.
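The branch names above use a `feature/` or `fix/` prefix. A minimal sketch of a check that could be run before pushing, assuming those two prefixes are the only ones in use (the guidelines do not say so explicitly); the function name `is_valid_branch_name` is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the branch name starts with
# "feature/" or "fix/" and has a non-empty suffix after the slash.
is_valid_branch_name() {
  case "$1" in
    feature/?*|fix/?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: check the currently checked-out branch (requires a git repository).
# is_valid_branch_name "$(git rev-parse --abbrev-ref HEAD)" || echo "please rename your branch"
```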
Example:
sudo adduser mczhuge
sudo usermod -aG sudo mczhuge
sudo curl -L https://juicefs.com/static/juicefs -o /usr/local/bin/juicefs && sudo chmod +x /usr/local/bin/juicefs && sudo /usr/local/bin/juicefs mount world-model /jfs
su - mczhuge
sudo chmod -R u+w $Pollux
sudo chown -R mczhuge $Pollux

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
source /home/mczhuge/miniconda3/bin/activate

Example:
ssh-keygen -t ed25519 -f ~/.ssh/mczhuge -C "mczhuge@gmail.com"
cat ~/.ssh/mczhuge.pub
chmod 600 ~/.ssh/mczhuge
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/mczhuge
ssh-keyscan -t ed25519 github.com >> ~/.ssh/known_hosts
ssh -T git@github.com

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/nvidia-fabricmanager-535_535.183.01-1_amd64.deb
sudo dpkg -i nvidia-fabricmanager-535_535.183.01-1_amd64.deb
sudo systemctl daemon-reload
sudo systemctl start nvidia-fabricmanager

sudo mkdir -p /home/mczhuge/.ssh
sudo chmod 700 /home/mczhuge/.ssh
sudo chown mczhuge:mczhuge /home/mczhuge/.ssh
sudo cp /home/ubuntu/.ssh/authorized_keys /home/mczhuge/.ssh/
sudo chmod 600 /home/mczhuge/.ssh/authorized_keys
sudo chown mczhuge:mczhuge /home/mczhuge/.ssh/authorized_keys

sudo vim /etc/ssh/sshd_config
# Add or update the following settings:
PermitRootLogin no
PubkeyAuthentication yes
PasswordAuthentication no
AllowUsers ubuntu mczhuge

sudo systemctl restart sshd

export PYTHONPATH=/jfs/mczhuge/Pollux:$PYTHONPATH
python -m apps.main.train config=apps/main/configs/pollux_v0.5.yaml
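The `export PYTHONPATH=...` line above prepends the checkout path every time it runs, so placing it in a shell profile that is sourced repeatedly makes the variable grow. A minimal sketch of an idempotent variant, assuming the same checkout path as above; the function name `prepend_pythonpath` is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: prepend a directory to PYTHONPATH only if it is not
# already listed, so re-sourcing a profile does not duplicate the entry.
prepend_pythonpath() {
  local dir="$1"
  case ":${PYTHONPATH:-}:" in
    *":${dir}:"*) ;;  # already present, nothing to do
    *) PYTHONPATH="${dir}${PYTHONPATH:+:${PYTHONPATH}}" ;;
  esac
  export PYTHONPATH
}

# Path taken from the export line above; adjust to your own checkout.
prepend_pythonpath /jfs/mczhuge/Pollux
```

Calling the function a second time with the same directory leaves `PYTHONPATH` unchanged.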
To access gated models such as Llama or FLUX.1-dev, authenticate with Hugging Face by following the steps below:
huggingface-cli login
git config --global credential.helper store

If you need access to the Wandb team, please contact Mingchen for an invitation.
- Obtain your Wandb API key from this link.
- Log in using the following command:
wandb login
- Update your configuration file (e.g., `apps/main/configs/LLAMA_Baseline_1B.yaml`) with the relevant Wandb settings:
# Update the `name` field for each new run. The `dump_dir` will be auto-generated.
name: "ImageNet_1B_BaseLine_256_Flux_LLAMA_Pre_Train_MC" # Change this for each run

Use the following command to train the model on 4 GPUs:
torchrun --standalone --nnodes 1 --nproc-per-node 4 -m apps.main.train config=apps/main/configs/pollux_v0.5.yaml
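The GPU count and config path in the command above are hard-coded. A minimal sketch of a wrapper that prints the same single-node command for a different GPU count or config file; the function name `train_cmd` and its defaults are assumptions for illustration, not part of the repository:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the single-node torchrun command for a given
# GPU count and config file. Defaults mirror the command shown above.
train_cmd() {
  local nproc="${1:-4}"
  local config="${2:-apps/main/configs/pollux_v0.5.yaml}"
  printf 'torchrun --standalone --nnodes 1 --nproc-per-node %s -m apps.main.train config=%s\n' \
    "$nproc" "$config"
}

# Print the command for 8 GPUs; inspect it, then run it with: eval "$(train_cmd 8)"
train_cmd 8
```

Printing the command instead of executing it directly makes it easy to review the exact invocation before launching a run.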