
PCP-MAE (NeurIPS 2024 Spotlight)


PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders (NeurIPS 2024 Spotlight)
Xiangdong Zhang*, Shaofeng Zhang*, and Junchi Yan

If you like our project, please give it a star ⭐ on GitHub to stay up to date.

PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders

Figure 1: Overview of the proposed PCP-MAE.

📰 News

  • πŸ’₯ Aug, 2024: PCP-MAE is available in arxiv.
  • πŸŽ‰ Sept, 2024: PCP-MAE is accepted by NeurIPS 2024 as spotlight.
  • πŸ“Œ Oct, 2024: The corresponding checkpoints are released in Google Drive and the code will be coming soon.
  • πŸ“Œ Oct, 2024: The code has been released.
  • πŸ’‘ Nov, 2024: The introduction to PCP-MAE is added.
  • πŸŽ‰ June, 2025: Our work Point-PQAE is accepted by ICCV 2025, which introduces a new paradigm for point cloud self-supervised learning.

✅ TODO List

  • Complete the introduction for the PCP-MAE project.
  • Publish the training and inference code.
  • Release the checkpoints for pre-training and finetuning.

πŸ” Introduction

In this paper, we present a motivating empirical observation: when the centers of masked patches are fed directly to the decoder, without any information from the encoder, the decoder still reconstructs the point cloud well. In other words, patch centers carry important information, and the reconstruction objective does not necessarily rely on the encoder's representations, which prevents the encoder from learning semantic representations.

In short, 2D MAE and Point-MAE differ in several aspects, as shown in the figure below, so it is inappropriate to transfer 2D MAE operations directly to the 3D domain.

Figure 2: When the encoder in Point-MAE is removed, the point cloud can still be reconstructed.

Based on this key observation, we propose a simple yet effective method, learning to Predict Centers for Point Masked AutoEncoders (PCP-MAE), which guides the model to predict these significant centers and uses the predicted centers in place of the directly provided ones.
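As a rough illustration of the idea (not the authors' code; module and dimension names are hypothetical), center prediction can be sketched as a small regression head that maps token features to 3D patch centers, whose outputs then stand in for the ground-truth centers fed to the decoder:

```python
import torch
import torch.nn as nn

class CenterPredictionHead(nn.Module):
    """Toy head that regresses masked patch centers (x, y, z) from token features.

    Hypothetical sketch: the real PCP-MAE trains center prediction with a
    dedicated objective; here we only show the shape of the computation.
    """
    def __init__(self, dim=384):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, 3))

    def forward(self, tokens):
        # tokens: (B, num_masked, dim) -> predicted centers: (B, num_masked, 3)
        return self.mlp(tokens)

head = CenterPredictionHead(dim=384)
pred_centers = head(torch.randn(2, 38, 384))  # e.g. 38 masked patches per cloud
```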

Our method has high pre-training efficiency compared to alternatives and achieves substantial improvements over Point-MAE, surpassing it by 5.50% on OBJ-BG, 6.03% on OBJ-ONLY, and 5.17% on PB-T50-RS for 3D object classification on the ScanObjectNN dataset.

Figure 3: Efficiency and performance comparison.

To ensure a fair time comparison, the Point-MAE code should be modified slightly in two ways:

  1. Add `config.dataset.train.others.whole = True` to the training to align with Point-FEMAE and our method.
  2. Replace KNN_CUDA with the `knn_point` function (refer to the official code of ReCon, Point-FEMAE, or our PCP-MAE), which uses pure torch operations, to align with Point-FEMAE and our approach. This significantly increases training speed.
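For reference, a minimal `knn_point` in pure PyTorch might look like the following (a sketch in the style of the common Pointnet2 utility, not copied from any of the repositories above):

```python
import torch

def knn_point(k, xyz, new_xyz):
    """Return indices of the k nearest neighbors of new_xyz within xyz.

    xyz:     (B, N, 3) reference points
    new_xyz: (B, M, 3) query points
    returns: (B, M, k) integer indices into xyz
    """
    dist = torch.cdist(new_xyz, xyz)              # (B, M, N) pairwise distances
    _, idx = dist.topk(k, dim=-1, largest=False)  # keep the k smallest
    return idx

# Example: each point's single nearest neighbor in its own set is itself.
pts = torch.randn(2, 128, 3)
idx = knn_point(1, pts, pts)
```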

PCP-MAE Models

| Task | Dataset | Config | Acc. | Checkpoints Download |
| --- | --- | --- | --- | --- |
| Pre-training | ShapeNet | base.yaml | N.A. | Pre-train |
| Classification | ScanObjectNN | finetune_scan_objbg.yaml | 95.52% | OBJ_BG |
| Classification | ScanObjectNN | finetune_scan_objonly.yaml | 94.32% | OBJ_ONLY |
| Classification | ScanObjectNN | finetune_scan_hardest.yaml | 90.35% | PB_T50_RS |
| Classification | ModelNet40 (1k) w/o voting | finetune_modelnet.yaml | 94.1% | ModelNet40_1K |
| Classification | ModelNet40 (1k) w/ voting | finetune_modelnet.yaml | 94.4% | ModelNet40_1K_voting |
| Part Segmentation | ShapeNetPart | segmentation | 84.9% Cls. mIoU | TBD |
| Scene Segmentation | S3DIS | semantic_segmentation | 61.3% mIoU | TBD |

| Task | Dataset | Config | 5w10s (%) | 5w20s (%) | 10w10s (%) | 10w20s (%) | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Few-shot learning | ModelNet40 | fewshot.yaml | 97.4 ± 2.3 | 99.1 ± 0.8 | 93.5 ± 3.7 | 95.9 ± 2.7 | FewShot |

The checkpoints and logs have been released on Google Drive. To fully reproduce our reported results, we recommend fine-tuning the pre-trained ckpt-300 with several different random seeds (typically 8) and recording the best performance, a protocol also adopted by peer methods (e.g., Point-MAE and ReCon). Occasionally, ckpt-275 outperforms ckpt-300, so we encourage you to fine-tune from both checkpoints.
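Such a multi-seed sweep is easy to script. A hypothetical sketch (experiment names and checkpoint paths are placeholders; the inner command follows the fine-tuning usage described in this README):

```shell
# Hypothetical 8-seed fine-tuning sweep; adjust config and ckpt paths to yours.
runs=0
for i in $(seq 1 8); do
  seed=$RANDOM
  echo "run $i with seed $seed"
  runs=$((runs + 1))
  # CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/finetune_scan_hardest.yaml \
  #   --finetune_model --exp_name hardest_seed_${seed} \
  #   --ckpts experiments/pretrain/ckpt-300.pth --seed ${seed}
done
```

Record the best test accuracy across the runs, as done by the peer methods mentioned above.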

Requirements

PyTorch >= 1.7.0 < 1.11.0; python >= 3.7; CUDA >= 9.0; GCC >= 4.9; torchvision;

# Quick Start
conda create -n pcpmae python=3.10 -y
conda activate pcpmae

# Install pytorch
conda install pytorch==2.0.1 torchvision==0.15.2 cudatoolkit=11.8 -c pytorch -c nvidia
# pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 -f https://download.pytorch.org/whl/torch_stable.html

# Install required packages
pip install -r requirements.txt
# Install the extensions
# Chamfer Distance & emd
cd ./extensions/chamfer_dist
python setup.py install --user
cd ../emd
python setup.py install --user
cd ../..
# PointNet++
pip install "git+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"

Datasets

We use ShapeNet, ScanObjectNN, ModelNet40, ShapeNetPart and S3DIS in this work. See DATASET.md for details.

Pre-training

To pretrain PCP-MAE on the ShapeNet training set, run the following command. To try different models, masking ratios, etc., first create a new config file and pass its path to --config.

CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/pretrain/base.yaml --exp_name <output_file_name>
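For instance, a new pre-training config might override fields along these lines (a hypothetical sketch; the field names follow common Point-MAE-style configs, and the actual base.yaml may differ):

```yaml
# Hypothetical override of base.yaml; check the real file for exact field names.
model:
  transformer_config:
    mask_ratio: 0.6   # fraction of patches to mask
  group_size: 32      # points per patch
  num_group: 64       # patches per point cloud
```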

Fine-tuning

To fine-tune on ScanObjectNN, run:

# Select one config from finetune_scan_objbg/objonly/hardest.yaml
CUDA_VISIBLE_DEVICES=<GPUs> python main.py --config cfgs/finetune_scan_hardest.yaml \
--finetune_model --exp_name <output_file_name> --ckpts <path/to/pre-trained/model> --seed $RANDOM


# Test with fine-tuned ckpt
CUDA_VISIBLE_DEVICES=<GPUs> python main.py --test --config cfgs/finetune_scan_hardest.yaml \
--exp_name <output_file_name> --ckpts <path/to/best/fine-tuned/model>

To fine-tune on ModelNet40, run:

CUDA_VISIBLE_DEVICES=<GPUs> python main.py --config cfgs/finetune_modelnet.yaml \
--finetune_model --exp_name <output_file_name> --ckpts <path/to/pre-trained/model> --seed $RANDOM

# Test with fine-tuned ckpt
CUDA_VISIBLE_DEVICES=<GPUs> python main.py --test --config cfgs/finetune_modelnet.yaml \
--exp_name <output_file_name> --ckpts <path/to/best/fine-tuned/model>

For voting on ModelNet40, run:

CUDA_VISIBLE_DEVICES=<GPUs> python main.py --test --config cfgs/finetune_modelnet.yaml \
--exp_name <output_file_name> --ckpts <path/to/best/fine-tuned/model> --seed $RANDOM --vote
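Voting averages the model's predictions over several augmented copies of each test cloud. A minimal sketch of the idea (hypothetical; not the repo's implementation, and the augmentation used in practice is typically random scaling/rotation):

```python
import torch

def vote(model, pc, n_votes=10, transform=lambda x: x):
    """Average softmax scores over n_votes augmented copies of pc.

    model:     callable mapping (B, N, 3) points to (B, num_classes) logits
    transform: per-vote augmentation (identity here for simplicity)
    """
    with torch.no_grad():
        scores = [torch.softmax(model(transform(pc)), dim=-1) for _ in range(n_votes)]
    return torch.stack(scores).mean(dim=0)

# Toy usage with a dummy "model" that returns constant logits.
dummy = lambda x: torch.zeros(x.shape[0], 4)
probs = vote(dummy, torch.randn(2, 1024, 3))
```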

For few-shot learning, run:

CUDA_VISIBLE_DEVICES=<GPUs> python main.py --config cfgs/fewshot.yaml --finetune_model \
--ckpts <path/to/pre-trained/model> --exp_name <output_file_name> --way <5 or 10> --shot <10 or 20> --fold <0-9> --seed $RANDOM
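Here, an N-way K-shot run samples N classes with K labeled examples each, and --fold selects one of the 10 pre-generated splits. The sampling idea, as a toy sketch (not the repo's data pipeline; names and counts are illustrative):

```python
import random

def sample_episode(items_by_class, way, shot, seed=None):
    """Pick `way` classes and `shot` support examples per chosen class."""
    rng = random.Random(seed)
    classes = rng.sample(sorted(items_by_class), way)
    return {c: rng.sample(items_by_class[c], shot) for c in classes}

# Toy dataset: 40 classes with 30 samples each (counts are illustrative).
data = {f"class_{i}": list(range(30)) for i in range(40)}
episode = sample_episode(data, way=5, shot=10, seed=0)
```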

For part segmentation on ShapeNetPart, run:

cd segmentation
python main.py --gpu <gpu_id> --ckpts <path/to/pre-trained/model> \
--log_dir <log_dir> --learning_rate 0.0002 --epoch 300 \
--root <path/to/data> \
--seed $RANDOM

For semantic segmentation on S3DIS, run:

cd semantic_segmentation
python main.py --ckpts <path/to/pre-trained/model> \
--root path/to/data --learning_rate 0.0002 --epoch 60 --gpu <gpu_id> --log_dir <log_dir>

Visualization

For simple visualization, run:

python main_vis.py --config cfgs/pretrain/base.yaml --exp_name final_vis \
--ckpts <path/to/pre-trained/model> --test

In addition to the simple visualization above, we use the PointFlowRenderer repository to render high-quality point cloud images.

Contact

If you have any questions related to the code or the paper, feel free to email Xiangdong (zhangxiangdong@sjtu.edu.cn) or Shaofeng (sherrylone@sjtu.edu.cn).

License

PCP-MAE is released under the MIT License. See the LICENSE file for more details. In addition, licensing information for the pointnet2 modules is available here.

Acknowledgements

This codebase is built upon Point-MAE, ReCon, and Pointnet2_PyTorch.

Citation

If you find our work useful in your research, please consider citing:

@article{zhang2024pcp,
  title={PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders},
  author={Zhang, Xiangdong and Zhang, Shaofeng and Yan, Junchi},
  journal={arXiv preprint arXiv:2408.08753},
  year={2024}
}

About

[NeurIPS 2024 Spotlight (Top 2.5% 🏆)] PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders
