Welcome to the official repository of OpenRSD! This project proposes an open-prompt remote sensing object detection method that supports multimodal prompts and integrates multi-task detection heads, balancing accuracy and real-time requirements for single-stage and two-stage detectors.
This paper has been accepted by ICCV 2025.
- 🎯 Multimodal Prompt Support: Supports text, image, and other modal prompt inputs
- 🔄 Multiple Detection Heads: Supports alignment detection head (high real-time performance, supports large vocabulary) and fusion detection head (high accuracy)
- Requirements
- Installation
- Dataset Preparation
- Quick Start
- Project Structure
- Results
- Citation
- License
- Python >= 3.7
- PyTorch >= 1.8.0
- CUDA >= 10.2
- mmcv-full >= 1.4.0
- mmdetection
- mmrotate
- Other dependencies are listed in `requirements.txt`
git clone https://github.com/floatingstarZ/OpenRSD.git
cd OpenRSD

# Create environment only
conda create -n openrsd python=3.8 -y
# Or use environment.yml to install all dependencies (CUDA 10.2 environment)
conda env create -f environment.yml

Install PyTorch according to your CUDA version:
conda activate openrsd
# For example, CUDA 11.1
conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch

# Install the project
pip install -v -e .
# Or
python setup.py develop

All datasets can be downloaded via the following link:
Baidu Netdisk share (OpenRSD): https://pan.baidu.com/s/1QWWZOfrjAWhEbk1eQASXVQ?pwd=usnc (extraction code: usnc)
All image and annotation folders in the dataset are compressed.
`./BaiduPCS_Upload.py` is the batch upload script used in this project. It simply compresses each folder and then uploads it, preserving the original directory structure.
You can refer to this script to write a batch decompression script. The dataset path is ./data.
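As a starting point for such a batch decompression script, here is a minimal sketch. It assumes each compressed folder is a standalone `.zip` archive somewhere under `./data`; the function name `extract_all` is ours, not part of the repository:

```python
import zipfile
from pathlib import Path


def extract_all(data_root):
    """Recursively find every .zip under data_root and extract it
    next to itself, so the original directory structure is preserved
    (the inverse of the compress -> upload workflow)."""
    extracted = []
    for archive in sorted(Path(data_root).rglob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(archive.parent)
        extracted.append(str(archive))
    return extracted


if __name__ == "__main__":
    for name in extract_all("./data"):
        print("extracted:", name)
```

If the archives are `.tar.gz` instead, the same walk works with `tarfile.open(...).extractall(...)`.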
All datasets used in this project are downloaded from the internet as raw remote sensing data and processed through slicing. For specific data processing workflows and scripts, please refer to the tools/data/ directory.
python tools/train.py ${CONFIG_FILE} [optional arguments]

Alternatively, use the multi-GPU training scheduling script, which automatically detects available GPUs and allocates resources:
cd EXP_CONFIG
python multi_train_any_gpu.py -k XXX -c 90

Parameter Description:
- `-k`: Keyword filter; specify one or more keywords to select which configurations to train (e.g., `-k A08 A10` trains only models whose configuration names contain A08 or A10)
- `-c`: Starting command count, used to set the starting port number (default 10; port number = 29500 + count)
- `-d`: Allowed GPU IDs, separated by commas (e.g., `-d 0,1,2,3` uses only GPUs 0-3; defaults to all available GPUs)
- `-r`: Runner type (options: `det`/`cls`/`few`/`resume`, default `det`)
Features:
- ✅ Automatically detects GPU memory usage and intelligently allocates available GPUs
- ✅ Automatically skips already trained models (checks if checkpoint files exist)
- ✅ Supports multi-task parallel training, automatically manages GPU resources to avoid conflicts
- ✅ Supports batch training of multiple configurations without manual management
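The GPU-selection and port logic described above can be approximated as follows. This is an illustrative sketch, not the script's actual implementation; the helper names (`parse_free_gpus`, `master_port`) are ours:

```python
import subprocess


def parse_free_gpus(smi_csv, min_free_mb=20000):
    """Pick GPU indices whose free memory exceeds min_free_mb.
    smi_csv is the output of:
    nvidia-smi --query-gpu=index,memory.free --format=csv,noheader,nounits
    """
    free = []
    for line in smi_csv.strip().splitlines():
        idx, mem_free = (int(x) for x in line.split(","))
        if mem_free >= min_free_mb:
            free.append(idx)
    return free


def query_free_gpus(min_free_mb=20000):
    """Query nvidia-smi on the current machine and return usable GPU ids."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=index,memory.free",
         "--format=csv,noheader,nounits"],
        text=True)
    return parse_free_gpus(out, min_free_mb)


def master_port(count):
    # Port rule from the -c option: port = 29500 + count
    return 29500 + count
```

A scheduler can then launch one training job per free GPU, giving each job a distinct `master_port(count)` so parallel runs do not collide.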
Usage Examples:
# Train all configurations containing "A08" keyword
python multi_train_any_gpu.py -k A08 -c 10
# Train configurations with multiple keywords (A08 or A10)
python multi_train_any_gpu.py -k A08 A10 -c 10
# Only use GPUs 0-3 for training
python multi_train_any_gpu.py -k A08 -c 10 -d 0,1,2,3
# Resume training
python multi_train_any_gpu.py -k A08 -c 10 -r resume

python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]

Use the batch evaluation script to automatically evaluate models from different epochs on multiple datasets.
You can modify the `model_info` configuration at Line 206 to evaluate different models; `val_using_aux=True` uses the fusion head and `val_using_aux=False` uses the alignment head.
cd ./M_Tools/Eval_Tools
python eval_diff_epochs.py -d 0 -e 24

Parameter Description:
- `-d`: GPU ID to use (e.g., `-d 0` uses GPU 0)
- `-e`: List of epochs to evaluate; one or more may be given (e.g., `-e 24` or `-e 12 24 36`)
Features:
- ✅ Automatically evaluates on multiple remote sensing datasets (DOTA2, DIOR_R, FAIR1M, SpaceNet, Xview, HRSC2016, WHU_Mix, etc.)
- ✅ Supports batch evaluation of models at multiple epochs
- ✅ Automatically saves evaluation results (including `.pkl` prediction files and `.json` evaluation results)
- ✅ Supports evaluation using the auxiliary branch (configurable via `val_using_aux` in the script)
Usage Examples:
# Evaluate a single epoch (epoch 24)
python eval_diff_epochs.py -d 0 -e 24
# Evaluate multiple epochs (epochs 12, 24, 36)
python eval_diff_epochs.py -d 0 -e 12 24 36
# Use a different GPU
python eval_diff_epochs.py -d 1 -e 24

Notes:
- Before use, configure the model information (`model_info`) in the script, including:
  - `cfg_pth`: Configuration file path
  - `cfg_name`: Configuration name (used to construct the checkpoint path)
  - `val_using_aux`: Which branch to use for evaluation
- Evaluation results are saved in the `./results/TEST_EVAL/` directory
- Results for each epoch are saved in separate subdirectories
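Based on the fields listed above, a `model_info` entry might look like the following sketch; the path and name here are illustrative placeholders, not actual configurations from the repository:

```python
# Hypothetical model_info entry -- the field names follow the README's
# description, but the path/name values are placeholders.
model_info = dict(
    cfg_pth="M_configs/Step1_A08_Large_Pretrain/example_config.py",  # config file path
    cfg_name="example_config",   # used to construct the checkpoint path
    val_using_aux=True,          # True: fusion head, False: alignment head
)
```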
Configuration files are located in the M_configs/ directory, including:
- `Step1_A08_Large_Pretrain/`: Large-scale pretraining configurations
- `Step2_A10_Large_Pretrain_Stage3/`: Stage 3 pretraining configurations
- `Step3_A12_SelfTrain/`: Self-training configurations
- `Other/`: Other configurations
MMRotate_AD_Pub/
├── M_AD/ # Main algorithm implementation
│ ├── models/ # Model definitions
│ │ ├── detectors/ # Detectors (Flex_Rtmdet, E_Rtmdet, Hindsight_Rtmdet, etc.)
│ │ ├── dense_heads/ # Detection heads (Flex_Rrtmdet_head, E_Rrtmdet_head, etc.)
│ │ ├── backbones/ # Backbones (CSPNeXt, ViT, Swin, etc.)
│ │ ├── necks/ # Neck networks (PAFPN, Ace_fpn, etc.)
│ │ ├── roi_heads/ # ROI heads (Open_standard_roi_head, Hin_Box_Prompt_head, etc.)
│ │ ├── layers/ # Custom layers (Transformer, DINOv2 related layers)
│ │ ├── task_modules/ # Task modules (Assigner, etc.)
│ │ └── utils/ # Utility functions
│ ├── datasets/ # Dataset related
│ │ ├── samplers/ # Data samplers (multi-task samplers, etc.)
│ │ └── transforms/ # Data transforms
│ ├── engine/ # Training engine
│ │ ├── optimizers/ # Optimizers
│ │ └── runner/ # Training runners
│ ├── evaluation/ # Evaluation related
│ │ └── metrics/ # Evaluation metrics
│ └── structures/ # Data structures
│ └── bbox/ # Bounding box related
├── M_configs/ # Configuration files
│ ├── Step1_A08_Large_Pretrain/ # Large-scale pretraining configurations
│ ├── Step2_A10_Large_Pretrain_Stage3/ # Stage 3 pretraining configurations
│ ├── Step3_A12_SelfTrain/ # Self-training configurations
│ └── Other/ # Other configurations (e.g., InContext learning)
├── M_Tools/ # Tool script collection
│ ├── Eval_Tools/ # Evaluation tools
│ │ ├── eval_diff_epochs.py # Batch evaluation for different epochs
│ │ ├── auto_eval.py # Automatic evaluation script
│ │ ├── eval_cross_data.py # Cross-dataset evaluation
│ │ └── eval_configs/ # Evaluation configurations
│ └── Base_Data_infos/ # Dataset information configurations
├── EXP_CONFIG/ # Experiment configuration management
│ ├── multi_train_any_gpu.py # Multi-GPU training scheduling script
│ ├── multi_eval_any_gpu.py # Multi-GPU evaluation scheduling script
│ ├── py_cmd.py # Training command wrapper script
│ └── CONFIGS/ # Experiment configuration definitions
├── tools/ # MMDetection/MMRotate tool scripts
│ ├── train.py # Training script
│ ├── test.py # Testing script
│ ├── data/ # Data processing tools (DOTA, DIOR, FAIR1M dataset processing, etc.)
│ ├── analysis_tools/ # Analysis tools (log analysis, result analysis, etc.)
│ └── model_converters/ # Model conversion tools
├── mmdet/ # MMDetection core code
├── mmrotate/ # MMRotate core code
├── mmyolo/ # MMYOLO core code (partial functionality dependency)
├── commonlibs/ # Common utility library
├── ctlib/ # Custom utility library
├── src/ # Resource files
│ └── images/ # Image resources (method diagrams, etc.)
├── requirements.txt # Python dependency list
├── setup.py # Installation script
├── environment.yml # Conda environment configuration
└── README.md # This file
Main Directory Descriptions:
- M_AD/: Core algorithm implementation, containing all custom models, datasets, training engines, etc.
- M_configs/: Experiment configuration files, organized by training stage
- M_Tools/: Evaluation and data processing tool collection
- EXP_CONFIG/: Experiment management and scheduling scripts, supporting automatic multi-GPU scheduling
- tools/: Standard tools provided by MMDetection/MMRotate framework
For detailed experimental results and model weights, please refer to the paper. Main results include:
- Detection performance on multiple remote sensing datasets
- Accuracy and speed comparisons under different configurations
- Ablation study results
If you use this codebase in your research or wish to refer to the baseline results published here, please use the following BibTeX entry:
@inproceedings{huang2025openrsd,
title={OpenRSD: Towards open-prompts for object detection in remote sensing images},
author={Huang, Ziyue and Feng, Yongchao and Liu, Ziqi and Yang, Shuai and Liu, Qingjie and Wang, Yunhong},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={8384--8394},
year={2025}
}

This project is licensed under the Apache License 2.0.
This project is built on the following excellent open-source projects: MMDetection, MMRotate, and MMYOLO.
Thanks to all contributors and authors of related work!
Q: How do I choose a configuration file for training?
A: Select the configuration file corresponding to your training stage:
- Step1: Large-scale pretraining stage
- Step2: Stage 3 pretraining
- Step3: Self-training stage
Q: What should I do if I run out of GPU memory during training?
A: You can try the following methods:
- Reduce `batch_size`
- Reduce the input image size `img_scale`
- Use gradient accumulation
- Use fewer GPUs
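In an MMDetection/MMRotate-style config, the first three adjustments typically look like the sketch below. The values are illustrative and the exact option names depend on the base config you inherit:

```python
# Illustrative memory-saving overrides for an MMDetection/MMRotate-style
# config; adjust the values to your hardware.

# Reduce batch_size (samples per GPU)
train_dataloader = dict(batch_size=2)

# Reduce the input image size used by the resize transform
img_scale = (512, 512)

# Gradient accumulation: with accumulative_counts=4, an effective batch
# of 2 x 4 = 8 is simulated while only 2 samples sit in GPU memory.
optim_wrapper = dict(
    type="AmpOptimWrapper",   # mixed precision also reduces memory use
    accumulative_counts=4,
)
```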
Q: How do I train on a custom dataset?
A: Please follow these steps:
- Prepare the dataset; refer to the format examples in the `tools/data/` directory
- Modify the data path and number of classes in the configuration file
- Adjust training parameters as needed
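Step 2 above (data path and class count) usually amounts to overrides like the following sketch; the class names and paths are placeholders, to be matched to your own data prepared via `tools/data/`:

```python
# Hypothetical custom-dataset overrides -- class names and paths are
# placeholders, not datasets shipped with this repository.
classes = ("plane", "ship", "storage-tank")

train_dataloader = dict(
    dataset=dict(
        data_root="data/my_dataset/",       # your data path
        metainfo=dict(classes=classes),     # your class names
    )
)

# Keep the head's output size consistent with the number of classes
model = dict(bbox_head=dict(num_classes=len(classes)))
```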
Q: How do I evaluate a trained model?
A: Use the testing script:
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --eval mAP

If you have any questions or suggestions, please contact us through:
- Submit an Issue
- Send an email to the project maintainer (ziyuehuang@buaa.edu.cn)


