This repository is a paper digest of recent advances in collaborative / cooperative / multi-agent perception for V2I / V2V / V2X autonomous driving scenarios. Papers are listed in alphabetical order of the first character.
(Talk) Robust Collaborative Perception against Communication Interruption [video], Uncertainty Quantification of Collaborative Detection for Self-Driving [video], Collaborative and Adversarial 3D Perception for Autonomous Driving [video], Vehicle-to-Vehicle Communication for Self-Driving [video], Adversarial Robustness for Self-Driving [video], 2022 1st Cooperative Perception Workshop Playback [video], Beyond-Line-of-Sight Situational Awareness Based on Swarm Collaboration [video], Cooperative Autonomous Driving: Simulation and Perception [video], Where2comm: A New Generation of Collaborative Perception That Reduces Communication Bandwidth by 100,000x [video], A Preliminary Study of V2X-Based Multi-Source Cooperative Perception [video], Crowd-Intelligent Robot Networks for Vehicle-Infrastructure Cooperation [video], IACS 2023 Collaborative Perception PhD Sharing [video], CICV 2022 Data-Driven Vehicle-Infrastructure Cooperation Session [video]
(Survey) Collaborative Perception in Autonomous Driving: Methods, Datasets and Challenges [paper], A Survey and Framework of Cooperative Perception: From Heterogeneous Singleton to Hierarchical Cooperation [paper]
(Library) OpenCOOD: Open Cooperative Detection Framework for Autonomous Driving [code] [doc], CoPerception: SDK for Collaborative Perception [code] [doc], OpenCDA: Simulation Tool Integrated with Prototype Cooperative Driving Automation [code] [doc]
(People) Runsheng Xu@UCLA [web], Yiming Li@NYU [web], Hang Qiu@Waymo [web]
(Workshop) ICRA 2023 [web], MFI 2022 [web], ITSC 2020 [web]
(Background) Current Approaches and Future Directions for Point Cloud Object Detection in Intelligent Agents [video], 3D Object Detection for Autonomous Driving: A Review and New Outlooks [paper], DACOM: Learning Delay-Aware Communication for Multi-Agent Reinforcement Learning [video], A Survey of Multi-Agent Reinforcement Learning with Communication [paper]
The results above are directly borrowed from publicly accessible papers. Since some of the results are reported by follow-up papers rather than the original ones, links to the most reliable data sources are also given. Best effort has been made to ensure that all the collected benchmark results share the same training and testing settings (where provided).
In the Joint Set evaluation, the OPV2V test split (16 scenes), OPV2V Culver City test split (4 scenes), OPV2V validation split (9 scenes), V2XSet test split (19 scenes) and V2XSet validation split (6 scenes) are combined into a much larger evaluation dataset (54 scenes in total) to allow more stable ranking. The evaluated models are trained on the union of the OPV2V and V2XSet train splits, with ego vehicle shuffling to augment the data.
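The Joint Set construction described above can be sketched as follows. This is a minimal illustration, not the repository's actual evaluation code; the split names and the `build_joint_set` helper are hypothetical, and only the per-split scene counts come from the text.

```python
# Scene counts per split, as stated in the Joint Set description above.
SPLITS = {
    "opv2v_test": 16,
    "opv2v_test_culver_city": 4,
    "opv2v_validation": 9,
    "v2xset_test": 19,
    "v2xset_validation": 6,
}

def build_joint_set(splits):
    """Combine the scenes of every split into one larger evaluation set.

    Each scene is tagged with its originating split so results can still
    be broken down per source dataset if needed.
    """
    joint = []
    for name, num_scenes in splits.items():
        joint.extend((name, idx) for idx in range(num_scenes))
    return joint

joint_set = build_joint_set(SPLITS)
print(len(joint_set))  # 54 scenes in total
```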
By default, each message is broadcast to all agents, forming a fully connected communication graph. To improve collaboration efficiency under bandwidth constraints, Who2com, When2com and Where2comm apply different strategies to prune the fully connected communication graph into a partially connected one during inference. Both fully connected and partially connected modes are evaluated here; the latter is marked in italic.
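The pruning idea can be illustrated with a toy sketch: keep a link only when a confidence score says the sender's message is useful to the receiver. This threshold rule is a simplified stand-in for the learned handshake/selection mechanisms in Who2com, When2com and Where2comm, and the `prune_comm_graph` function and its confidence matrix are purely illustrative.

```python
import numpy as np

def prune_comm_graph(confidence, threshold=0.5):
    """Prune a fully connected communication graph into a partially
    connected one.

    confidence: (N, N) matrix where confidence[i, j] scores how useful
    agent j's message is to agent i. Links above the threshold are kept;
    every agent always keeps its own features (diagonal).
    """
    adjacency = confidence > threshold
    np.fill_diagonal(adjacency, True)  # self-links are always active
    return adjacency

# Three agents: a fully connected graph would use all 9 links.
conf = np.array([[1.0, 0.9, 0.2],
                 [0.1, 1.0, 0.7],
                 [0.6, 0.3, 1.0]])
adj = prune_comm_graph(conf)
print(adj.sum())  # number of active links after pruning, self-links included
```

In the real methods the scores are learned (e.g., Where2comm uses spatial confidence maps), but the effect is the same: fewer links, less bandwidth.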
For fair comparison, all methods adopt identical one-stage training settings in ideal scenarios (i.e., no pose error or time delay), without weight fine-tuning or message compression. Extra fusion modules (e.g., down-sampling convolution layers) in the intermediate collaboration mode are simplified when not necessary, to mitigate concerns about the actual source of the performance gain. PointPillar is adopted as the backbone for all reproduced methods.
Although the reproduction process is simple and quick (a full round takes less than 2 days on only two 3090 GPUs), multiple advanced training strategies are applied, which may boost performance and make the ranking diverge from the original reports. The reproduction is intended as a straightforward and fair evaluation of representative collaborative perception methods. To learn how the official results were obtained, please refer to the papers and code collected below.
🔖Dataset and Simulator
CVPR 2023:tada::tada::tada:
V2V4Real (V2V4Real: A Large-Scale Real-World Dataset for Vehicle-to-Vehicle Cooperative Perception) [paper] [code] [project]
V2X-Seq (V2X-Seq: The Large-Scale Sequential Dataset for the Vehicle-Infrastructure Cooperative Perception and Forecasting) [paper] [code] [project]
ICRA 2023
DAIR-V2X-C Complemented (Robust Collaborative 3D Object Detection in Presence of Pose Errors) [paper] [code] [project]
STAR (Multi-Robot Scene Completion: Towards Task-Agnostic Collaborative Perception) [paper&review] [code]
Mode: Intermediate Collaboration
Dataset: V2X-Sim
Task: 2D Segmentation, 3D Detection
IJCAI 2022
IA-RCP (Robust Collaborative Perception against Communication Interruption) [paper] [code]
Mode: Intermediate Collaboration
Dataset: V2X-Sim
Task: 3D Detection
MM 2022
CRCNet (Complementarity-Enhanced and Redundancy-Minimized Collaboration Network for Multi-agent Perception) [paper] [code]
Mode: Intermediate Collaboration
Dataset: V2X-Sim
Task: 3D Detection
ICRA 2022
AttFuse (OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication) [paper] [code]
Mode: Intermediate Collaboration
Dataset: OPV2V
Task: 3D Detection
MP-Pose (Multi-Robot Collaborative Perception with Graph Neural Networks) [paper] [code]
Mode: Intermediate Collaboration
Dataset: AirSim-MAP
Task: 2D Segmentation
NeurIPS 2021:tada::tada::tada:
DiscoNet (Learning Distilled Collaboration Graph for Multi-Agent Perception) [paper&review] [code]
Mode: Early Collaboration (teacher model), Intermediate Collaboration (student model)
Dataset: V2X-Sim
Task: 3D Detection
ICCV 2021:tada::tada::tada:
Adversarial V2V (Adversarial Attacks On Multi-Agent Communication) [paper] [code]
Mode: Intermediate Collaboration
Dataset: V2V-Sim (not publicly available)
Task: Adversarial Attack
IROS 2021
MASH (Overcoming Obstructions via Bandwidth-Limited Multi-Agent Spatial Handshaking) [paper] [code]
Mode: Late Collaboration
Dataset: AirSim (simulator)
Task: 2D Segmentation
CVPR 2020:tada::tada::tada:
When2com (When2com: Multi-Agent Perception via Communication Graph Grouping) [paper] [code]
Mode: Intermediate Collaboration
Dataset: AirSim-MAP
Task: 2D Segmentation, 3D Classification
ECCV 2020:tada::tada::tada:
V2VNet (V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction) [paper] [code]
Mode: Intermediate Collaboration
Dataset: V2V-Sim (not publicly available)
Task: 3D Detection, Motion Forecasting
CoRL 2020:tada::tada::tada:
Robust V2V (Learning to Communicate and Correct Pose Errors) [paper] [code]
Mode: Intermediate Collaboration
Dataset: V2V-Sim (not publicly available)
Task: 3D Detection, Motion Forecasting
ICRA 2020
Who2com (Who2com: Collaborative Perception via Learnable Handshake Communication) [paper] [code]
Mode: Intermediate Collaboration
Dataset: AirSim-CP (has an asynchronous issue between views)
Task: 2D Segmentation