# PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders (NeurIPS 2024 Spotlight)
Xiangdong Zhang*, Shaofeng Zhang* and Junchi Yan
- Aug 2024: PCP-MAE is available on arXiv.
- Sept 2024: PCP-MAE is accepted by NeurIPS 2024 as a spotlight.
- Oct 2024: The corresponding checkpoints are released on Google Drive; the code will follow soon.
- Oct 2024: The code has been released.
- Nov 2024: The introduction to PCP-MAE has been added.
- June 2025: Our work Point-PQAE is accepted by ICCV 2025; it introduces a new paradigm for point cloud self-supervised learning.
- Complete the introduction for the PCP-MAE project.
- Publish the training and inference code.
- Release the checkpoints for pre-training and finetuning.
In this paper, we present a motivating empirical result: when the centers of masked patches are fed directly to the decoder, without any information from the encoder, the decoder still reconstructs the masked patches well. In other words, the patch centers carry significant information, and the reconstruction objective does not necessarily rely on the encoder's representations, which prevents the encoder from learning semantic representations.
In short, 2D MAE and Point-MAE differ in several aspects, as shown in the figure below. Therefore, directly transferring 2D MAE operations to the 3D domain is inappropriate.
Based on this key observation, we propose a simple yet effective method, learning to Predict Centers for Point Masked AutoEncoders (PCP-MAE), which guides the model to predict these significant centers and uses the predicted centers in place of the directly provided ones.
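To make the idea concrete, below is a rough, self-contained sketch of a center-prediction head. The module name, the cross-attention design, the tensor shapes, and the MSE objective are illustrative assumptions for exposition only, not the official PCP-MAE implementation; please refer to the paper and the released code for the actual objective.

```python
import torch
import torch.nn as nn

class CenterPredictionHead(nn.Module):
    """Conceptual sketch (not the official PCP-MAE code): learnable mask queries
    cross-attend to encoded visible tokens and regress the 3D center of each
    masked patch."""
    def __init__(self, dim=384, num_heads=6):
        super().__init__()
        self.mask_query = nn.Parameter(torch.zeros(1, 1, dim))
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.center_head = nn.Linear(dim, 3)

    def forward(self, visible_tokens, num_masked):
        # visible_tokens: (B, V, dim) features of visible patches from the encoder
        B = visible_tokens.shape[0]
        queries = self.mask_query.expand(B, num_masked, -1)          # (B, M, dim)
        attended, _ = self.cross_attn(queries, visible_tokens, visible_tokens)
        return self.center_head(attended)                            # (B, M, 3) predicted centers

# Training signal (sketch): regress the ground-truth masked centers, then use the
# predicted centers instead of the directly provided ones during reconstruction.
# pred_centers = head(visible_tokens, num_masked)
# center_loss = torch.nn.functional.mse_loss(pred_centers, gt_masked_centers)
```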
Our method achieves high pre-training efficiency compared to alternatives and improves considerably over Point-MAE, surpassing it by 5.50% on OBJ-BG, 6.03% on OBJ-ONLY, and 5.17% on PB-T50-RS for 3D object classification on the ScanObjectNN dataset.
To ensure a fair time comparison, the code for Point-MAE should be modified slightly in two ways:
- Add "config.dataset.train.others.whole = True" to the training to align Point-FEMAE and our method.
- Replace KNN_CUDA with the knn_point function (refer to the official code of ReCon, Point-FEMAE, or our PCP-MAE), which uses pure torch operations, to align with Point-FEMAE and our approach; this significantly increases training speed. A sketch of such a function is shown below.
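For reference, a minimal torch-only knn_point-style function might look like the following; the exact signature and implementation in ReCon, Point-FEMAE, and PCP-MAE may differ, so treat this as an illustrative sketch.

```python
import torch

def knn_point(k, xyz, query_xyz):
    """Torch-only k-nearest-neighbor lookup (a sketch, not the official code).

    Args:
        k: number of neighbors per query point.
        xyz: (B, N, 3) source points.
        query_xyz: (B, M, 3) query points (e.g., patch centers).
    Returns:
        (B, M, k) indices of the k nearest source points for each query.
    """
    # Pairwise distances between queries and source points: (B, M, N)
    dist = torch.cdist(query_xyz, xyz, p=2)
    # Indices of the k smallest distances along the source dimension
    _, idx = torch.topk(dist, k, dim=-1, largest=False)
    return idx
```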
| Task | Dataset | Config | Acc. | Checkpoints Download |
|---|---|---|---|---|
| Pre-training | ShapeNet | base.yaml | N.A. | Pre-train |
| Classification | ScanObjectNN | finetune_scan_objbg.yaml | 95.52% | OBJ_BG |
| Classification | ScanObjectNN | finetune_scan_objonly.yaml | 94.32% | OBJ_ONLY |
| Classification | ScanObjectNN | finetune_scan_hardest.yaml | 90.35% | PB_T50_RS |
| Classification | ModelNet40(1k) w/o voting | finetune_modelnet.yaml | 94.1% | ModelNet40_1K |
| Classification | ModelNet40(1k) w/ voting | finetune_modelnet.yaml | 94.4% | ModelNet40_1K_voting |
| Part Segmentation | ShapeNetPart | segmentation | 84.9% Cls.mIoU | TBD |
| Scene Segmentation | S3DIS | semantic_segmentation | 61.3% mIoU | TBD |
| Task | Dataset | Config | 5w10s (%) | 5w20s (%) | 10w10s (%) | 10w20s (%) | Download |
|---|---|---|---|---|---|---|---|
| Few-shot learning | ModelNet40 | fewshot.yaml | 97.4 ± 2.3 | 99.1 ± 0.8 | 93.5 ± 3.7 | 95.9 ± 2.7 | FewShot |
The checkpoints and logs have been released on Google Drive. To fully reproduce our reported results, we recommend fine-tuning the pre-trained ckpt-300 with different random seeds (typically 8) and recording the best performance, a protocol also adopted by peer methods (e.g., Point-MAE and ReCon). Occasionally, ckpt-275 may outperform ckpt-300, so we encourage fine-tuning from both ckpt-300 and ckpt-275. A sketch of this multi-seed protocol is shown below.
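The snippet below is only a sketch of that protocol, assuming the ScanObjectNN fine-tuning command from the usage section further down; the checkpoint path, config, and experiment-name pattern are placeholders.

```python
import random
import subprocess

# Fine-tune the same pre-trained checkpoint with several random seeds and keep
# the best result afterwards. Placeholder paths; adjust to your setup.
CKPT = "path/to/ckpt-300.pth"
CONFIG = "cfgs/finetune_scan_hardest.yaml"

for _ in range(8):
    seed = random.randint(0, 2**16 - 1)
    subprocess.run([
        "python", "main.py",
        "--config", CONFIG,
        "--finetune_model",
        "--ckpts", CKPT,
        "--exp_name", f"hardest_seed{seed}",
        "--seed", str(seed),
    ], check=True)
```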
PyTorch >= 1.7.0 < 1.11.0; python >= 3.7; CUDA >= 9.0; GCC >= 4.9; torchvision;
# Quick Start
conda create -n pcpmae python=3.10 -y
conda activate pcpmae
# Install pytorch
conda install pytorch==2.0.1 torchvision==0.15.2 pytorch-cuda=11.8 -c pytorch -c nvidia
# pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 -f https://download.pytorch.org/whl/torch_stable.html
# Install required packages
pip install -r requirements.txt
# Install the extensions
# Chamfer Distance & emd
cd ./extensions/chamfer_dist
python setup.py install --user
cd ../emd
python setup.py install --user
# PointNet++
pip install "git+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"
We use ShapeNet, ScanObjectNN, ModelNet40, ShapeNetPart and S3DIS in this work. See DATASET.md for details.
To pre-train PCP-MAE on the ShapeNet training set, run the following command. To try different models, masking ratios, etc., first create a new config file and pass its path via --config.
CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/pretrain/base.yaml --exp_name <output_file_name>
Fine-tuning on ScanObjectNN, run:
# Select one config from finetune_scan_objbg/objonly/hardest.yaml
CUDA_VISIBLE_DEVICES=<GPUs> python main.py --config cfgs/finetune_scan_hardest.yaml \
--finetune_model --exp_name <output_file_name> --ckpts <path/to/pre-trained/model> --seed $RANDOM
# Test with fine-tuned ckpt
CUDA_VISIBLE_DEVICES=<GPUs> python main.py --test --config cfgs/finetune_scan_hardest.yaml \
--exp_name <output_file_name> --ckpts <path/to/best/fine-tuned/model>
Fine-tuning on ModelNet40, run:
CUDA_VISIBLE_DEVICES=<GPUs> python main.py --config cfgs/finetune_modelnet.yaml \
--finetune_model --exp_name <output_file_name> --ckpts <path/to/pre-trained/model> --seed $RANDOM
# Test with fine-tuned ckpt
CUDA_VISIBLE_DEVICES=<GPUs> python main.py --test --config cfgs/finetune_modelnet.yaml \
--exp_name <output_file_name> --ckpts <path/to/best/fine-tuned/model>
Voting on ModelNet40, run:
CUDA_VISIBLE_DEVICES=<GPUs> python main.py --test --config cfgs/finetune_modelnet.yaml \
--exp_name <output_file_name> --ckpts <path/to/best/fine-tuned/model> --seed $RANDOM --vote
Few-shot learning, run:
CUDA_VISIBLE_DEVICES=<GPUs> python main.py --config cfgs/fewshot.yaml --finetune_model \
--ckpts <path/to/pre-trained/model> --exp_name <output_file_name> --way <5 or 10> --shot <10 or 20> --fold <0-9> --seed $RANDOM
Part segmentation on ShapeNetPart, run:
cd segmentation
python main.py --gpu <gpu_id> --ckpts <path/to/pre-trained/model> \
--log_dir <log_dir> --learning_rate 0.0002 --epoch 300 \
--root <path/to/data> \
--seed $RANDOM
Semantic segmentation on S3DIS, run:
cd semantic_segmentation
python main.py --ckpts <path/to/pre-trained/model> \
--root path/to/data --learning_rate 0.0002 --epoch 60 --gpu <gpu_id> --log_dir <log_dir>
Simple visualization, run:
python main_vis.py --config cfgs/pretrain/base.yaml --exp_name final_vis \
--ckpts <path/to/pre-trained/model> --test
In addition to the simple method mentioned above for visualizing point clouds, we use the PointFlowRenderer repository to render high-quality point cloud images.
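For a quick local preview without a renderer, a plain matplotlib scatter plot is enough. The file name and the (N, 3) .npy format below are placeholder assumptions; adapt them to whatever main_vis.py saves.

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (registers the 3D projection on older matplotlib)

# Quick preview of a point cloud stored as an (N, 3) NumPy array.
points = np.load("vis/point_cloud.npy")

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.scatter(points[:, 0], points[:, 1], points[:, 2], s=1)
ax.set_axis_off()
plt.show()
```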
If you have any questions related to the code or the paper, feel free to email Xiangdong (zhangxiangdong@sjtu.edu.cn) or Shaofeng (sherrylone@sjtu.edu.cn).
PCP-MAE is released under the MIT License. See the LICENSE file for more details. In addition, the licensing information for the pointnet2 modules is available here.
This codebase is built upon Point-MAE, ReCon, and Pointnet2_PyTorch.
If you find our work useful in your research, please consider citing:
@article{zhang2024pcp,
title={PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders},
author={Zhang, Xiangdong and Zhang, Shaofeng and Yan, Junchi},
journal={arXiv preprint arXiv:2408.08753},
year={2024}
}

