This repository contains the source code and datasets for our paper published in Journal of Manufacturing Systems (2026):
Designing Synthetic Active Learning for Model Refinement in Manufacturing Parts Detection
Journal of Manufacturing Systems, Volume 84, 2026, Pages 68–84
📄 Read the paper (ScienceDirect)
SAL extends our previous work on static synthetic dataset generation for manufacturing object detection.
- 📄 Previous paper: ICRA 2025, Domain Randomization for Object Detection in Manufacturing Applications Using Synthetic Data: A Comprehensive Study
- 💻 base generation pipeline: SynMfg
The previous work generates a fixed synthetic dataset before training.
SAL instead continuously generates synthetic data during training based on model weaknesses.
This code generates synthetic data from 3D models using SAL. We use two datasets to generate synthetic images and train an object detection model, which performs well on real-world data.
- Robotic Dataset: Published by Horváth et al., this dataset includes both 3D models and real images.
  - 📂 3D Models: Located in `data/Objects/Robotic/`, containing 10 `.obj` files.
  - 🖼️ Real Images: Download from Dropbox – Public Robotic Dataset. We use the `yolo_cropped_all` subset for real-image evaluation.
- SIP15-OD Dataset: Developed by us in our previous ICRA 2025 paper. It contains 15 manufacturing object 3D models across three use cases, along with 395 real images featuring 996 annotated objects taken in various manufacturing environments.
  Due to company policy, the original CAD models cannot be publicly released. However, the real-world annotated images are available via Roboflow-SIP15OD.
Below are samples of the synthetic data and their real-world counterparts from the robotic dataset, as well as the three use cases from the SIP15-OD dataset.
Figure 1. Overview of the Synthetic Active Learning (SAL) framework. The system iteratively generates synthetic data, trains the detection model, evaluates performance weaknesses, and updates generation configurations to refine model performance.
Synthetic Active Learning (SAL) is a fully automatic model refinement framework for manufacturing parts detection using only synthetic data generated with domain randomization.
Traditional domain randomization pipelines generate a fixed synthetic dataset before training. While effective, static datasets cannot adapt to performance weaknesses that emerge during training, such as poor results for specific object categories, challenging materials, object sizes, or recurring misclassification patterns.
Inspired by active learning, SAL shifts the focus from selecting real data for labeling to selecting what synthetic data to generate from an effectively unlimited variation space. The framework continuously identifies weak areas of the model and generates targeted synthetic data to improve them.
The iterative SAL pipeline consists of:
- Synthetic data generation
- Model training
- Weakness evaluation
- Generation configuration update
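The four steps above form a closed loop that can be sketched as follows. This is an illustrative sketch only: `evaluate` and `update` are hypothetical placeholders standing in for the repository's internal stages (the real loop is wired up inside `train.py`), and the toy stand-ins exist just so the sketch runs.

```python
# Minimal sketch of the SAL closed loop (hypothetical helper names;
# the repository's real refinement loop lives inside train.py).

def sal_loop(evaluate, update, config, max_loops=10, min_gain=0.01):
    """Generate/train/evaluate/update until further gains become marginal."""
    prev_score = None
    for loop in range(max_loops):
        # evaluate() stands in for: generate synthetic data, train the
        # model, and run the weakness evaluators on the result.
        score, weaknesses = evaluate(config)
        if prev_score is not None and score - prev_score < min_gain:
            break  # improvements have become marginal: stop refining
        prev_score = score
        # update() stands in for the updaters adjusting the generation config.
        config = update(config, weaknesses)
    return prev_score, loop

# Toy stand-ins so the sketch runs: mAP@50 climbs, then plateaus.
scores = iter([0.60, 0.70, 0.75, 0.755])
toy_evaluate = lambda cfg: (next(scores), ["weak_category"])
toy_update = lambda cfg, weak: {**cfg, "boosted": weak}

best, loops = sal_loop(toy_evaluate, toy_update, {})
print(best, loops)  # prints: 0.75 3
```

The stopping rule mirrors the text: the loop exits once the per-iteration gain drops below a small threshold.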
To diagnose model weaknesses, SAL introduces four custom evaluators:
- Category Performance Evaluator
- Misclassification Evaluator
- Challenging Size Evaluator
- Overall Performance Evaluator
Based on these analyses, four corresponding updaters adjust the synthetic data generation process:
- Category Distribution Updater
- Object Pairwise Updater
- Object Size Updater
- Object Material Updater
This closed-loop system automatically regenerates data targeting underperforming attributes, without requiring real images or manual intervention. The process continues until performance stabilizes and further improvements become marginal.
To enable efficient simultaneous training and generation, SAL employs a data block shifting scheme, where one data block is regenerated while the remaining blocks are used for training. This design supports continuous dataset refinement with efficient GPU utilization.
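The block shifting idea can be illustrated with a toy schedule. `shifting_schedule` is an illustrative helper, not the repository's API; it just shows which block is regenerated versus trained on at each loop.

```python
# Illustrative sketch of data block shifting: with N blocks, one block is
# regenerated each loop while the remaining N-1 blocks feed training.

def shifting_schedule(n_blocks, n_loops):
    """Return (regenerated_block, training_blocks) for each loop."""
    schedule = []
    for loop in range(n_loops):
        regen = loop % n_blocks  # block being rewritten by the generator
        train = [b for b in range(n_blocks) if b != regen]  # blocks used for training
        schedule.append((regen, train))
    return schedule

# With the default nr_dataset_segments = 4:
for regen, train in shifting_schedule(4, 4):
    print(f"regenerate block {regen}, train on {train}")
```

Because generation (Blender/CPU) and training (GPU) touch disjoint blocks, the GPU never waits on data generation.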
Across four industrial use cases from two datasets, SAL achieves:
- +2–6 percentage point improvement in mAP@50 over static training
- Significant gains in previously underperforming categories
- More balanced per-class detection performance
- Reduced need for extensive hyperparameter tuning
SAL demonstrates that synthetic data pipelines can evolve from static dataset creation to adaptive, model-driven refinement suitable for scalable industrial deployment.
- Set up the conda environment using `conda env create -f environment.yml`.
- Activate the environment using `conda activate SAL_Code`.
- Go to Blender 3.4 and download the appropriate version of Blender for your system, e.g. `blender-3.4.1-windows-x64.msi` for Windows or `blender-3.4.1-linux-x64.tar.xz` for Linux.
- Install Blender.
- Set the environment variable `BLENDER_PATH` to the Blender executable, e.g. `C:\Program Files\Blender Foundation\Blender 3.4\blender.exe` for Windows or `/user/blender-3.4.1-linux-x64/blender` for Linux.
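For example, on Linux (adjust the path to wherever you extracted the archive):

```shell
# Point BLENDER_PATH at the Blender 3.4 executable before running the pipeline.
export BLENDER_PATH=/user/blender-3.4.1-linux-x64/blender

# On Windows (PowerShell) the equivalent would be:
#   $env:BLENDER_PATH = "C:\Program Files\Blender Foundation\Blender 3.4\blender.exe"
```

Add the export to your shell profile if you want it to persist across sessions.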
Downloaded textures are put into their corresponding folders inside the data folder structure.
Synthetic_Active_Learning_Code/
└── data/
├── Background_Images/
├── Objects/
├── PBR_Textures/
└── Texture_Images/
- Go to Google Drive.
- Download all image files from the train and testval folders.
- Put all images into `data/Background_Images`.
- Go to the Flickr 8k Dataset.
- Download all image files.
- Put all images into `data/Texture_Images`.
- Run `blenderproc download cc_textures data/PBR_Textures`. This downloads textures from cc0textures.com.
- To use specific material textures such as metal, create a new folder named `data/Metal_Textures` and place only the metal textures from the `cc_textures` data there.
The preparation of the 3D models used in the pipeline is described in the objects section.
This repository uses JSON configuration files (see configs/) to control both synthetic data generation (domain randomization) and the SAL training loop (training, evaluators, and updaters).
Below we summarize the main parameters and their default values used in our experiments.
| Parameter | Description | Default |
|---|---|---|
| **Scene** | | |
| `background_texture_type` | Background texture type. `1`: no texture, `2`: random images from BG-20L. | `2` |
| `total_distracting_objects` | Maximum number of distractors in the scene. | `10` |
| **Object characteristics** | | |
| `max_objects` | Maximum number of objects per scene. `-1` includes all objects and empty background images. | `-1` |
| `multiple_of_same_object` | Allow multiple instances of the same object in one scene. | `TRUE` |
| `object_weights` | Sampling weights for object categories. `[]` means uniform distribution. | `[]` |
| `nr_objects_weights` | Sampling weights for number of objects. `[]` means uniform distribution. | `[]` |
| `object_rotation_x` | Min–max rotation angle around the x-axis (degrees). | 0–360 |
| `object_rotation_y` | Min–max rotation angle around the y-axis (degrees). | 0–360 |
| `object_distance_scale` | Min–max distance ratio between objects. `0.53` prevents overlap. | 0.53–1.0 |
| `objects_texture_type` | Object texture type. `1`: RGB, `2`: image, `3`: PBR, `0`: random. | `3` |
| **Camera** | | |
| `camera_zoom` | Min–max camera zoom. | 0.1–0.7 |
| `camera_theta` | Min–max azimuth angle (degrees). | 0–360 |
| `camera_phi` | Min–max polar angle (degrees, max 90). | 0–60 |
| `camera_focus_point_x` | Min–max shift for focus point x. | 0–0.5 |
| `camera_focus_point_y` | Min–max shift for focus point y. | 0–0.5 |
| `camera_focus_point_z` | Min–max shift for focus point z. | 0–0.5 |
| **Illumination** | | |
| `light_count_auto` | Automatically set light count based on scene size. | `1` |
| `light_energy` | Min–max light energy. | 5–150 |
| `light_color_red` | Min–max red channel value. | 0–255 |
| `light_color_green` | Min–max green channel value. | 0–255 |
| `light_color_blue` | Min–max blue channel value. | 0–255 |
| **Post-processing** | | |
| `vertical_flip` | Probability of vertical flip augmentation. | 0.2 |
| `horizontal_flip` | Probability of horizontal flip augmentation. | 0.2 |
| `blur` | Probability of blur augmentation. | 0.2 |
| `to_gray` | Probability of grayscale conversion. | 0.2 |
| `clahe` | Probability of applying CLAHE. | 0.2 |
| `random_brightness_contrast` | Probability of brightness/contrast adjustment. | 0.2 |
| `random_gamma` | Probability of random gamma correction. | 0.2 |
| `image_compression` | Probability of image compression augmentation. | 0.2 |
| `crop_and_pad` | Probability of crop-and-pad augmentation. | 0.2 |
| `multiplicative_noise` | Probability of multiplicative noise augmentation. | 0.2 |
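As a sketch, a generation config using the defaults above could look like the following. The exact nesting and any additional keys in `configs/config-sample-generation.json` may differ; treat this flat layout as an assumption, not the file's real schema.

```json
{
  "background_texture_type": 2,
  "total_distracting_objects": 10,
  "max_objects": -1,
  "multiple_of_same_object": true,
  "object_weights": [],
  "object_rotation_x": [0, 360],
  "object_distance_scale": [0.53, 1.0],
  "objects_texture_type": 3,
  "camera_zoom": [0.1, 0.7],
  "camera_phi": [0, 60],
  "light_energy": [5, 150],
  "vertical_flip": 0.2
}
```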
| Parameter | Description | Default |
|---|---|---|
| **Training** | | |
| `model` | Base model checkpoint (e.g., `yolov8n.pt`, `yolov8s.pt`). | `yolov8m.pt` |
| `training_dataset_size` | Training dataset size (generates more if needed). | 10500 |
| `validation_dataset_size` | Validation dataset size (`-1` uses all). | 2500 |
| `evaluation_dataset_size` | Evaluation dataset size (`-1` uses all). | `-1` |
| `epochs` | Number of training epochs. | 2000 |
| `num_workers` | Data loader workers. | 4 |
| `batch_size` | Batch size. | 16 |
| `dropout` | Dropout probability. | 0.0 |
| `img_size` | Input image size. | 720 |
| `learning_rate` | Learning rate. | 0.001 |
| `weight_decay` | Weight decay. | 0.0005 |
| `postprocess_iou_thres` | IOU threshold used in post-processing. | 0.3 |
| `postprocess_conf_thres` | Confidence threshold used in post-processing. | 0.01 |
| `nr_dataset_segments` | Number of dataset segments (data blocks). | 4 |
| `training_stop_patience` | Patience (epochs) before stopping training. | 2000 |
| `evaluation_patience` | Patience (epochs) for evaluation. | 100 |
| `evaluation_backoff` | Backoff (epochs) before re-evaluating. | 100 |
| `early_stop_backoff` | Backoff (epochs) for early stopping after new data. | 80 |
| `configuration_update_ratio` | Ratio of new data generated using the updated configuration. | 0.7 |
| **Evaluators** | | |
| `map_conf_thres` | Confidence threshold for mAP computation. | 0.25 |
| `confusion_matrix.iou_thres` | IOU threshold for confusion matrix samples. | 0.45 |
| `confusion_matrix.conf_thres` | Confidence threshold for confusion matrix samples. | 0.25 |
| `incorrect_evaluator.iou_thres` | IOU threshold for incorrect evaluator. | 0.45 |
| `incorrect_evaluator.conf_thres` | Confidence threshold for incorrect evaluator. | 0.25 |
| `confusion_evaluator.iou_thres` | IOU threshold for confusion evaluator. | 0.45 |
| `confusion_evaluator.conf_thres` | Confidence threshold for confusion evaluator. | 0.25 |
| `size_evaluator.iou_thres` | IOU threshold for size evaluator. | 0.45 |
| `size_evaluator.conf_thres` | Confidence threshold for size evaluator. | 0.25 |
| **Updaters** | | |
| `size_configuration.enabled` | Enable object size updater. | True |
| `size_configuration.nr_objects_range` | Object count range used during configuration update. | [2, 12] |
| `size_configuration.camera_zoom_min_range` | Range for updating minimum camera zoom. | [0.05, 0.6] |
| `size_configuration.camera_zoom_max_range` | Range for updating maximum camera zoom. | [0.3, 0.75] |
| `size_configuration.min_object_distance` | Range for updating minimum object distance scale. | [0.5, 1.0] |
| `size_configuration.max_object_distance` | Range for updating maximum object distance scale. | [0.5, 1.0] |
| `class_configuration.enabled` | Enable category distribution updater. | True |
| `pair_configuration.enabled` | Enable object pairwise updater. | True |
| `metal_configuration.enabled` | Enable material updater. | True |
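A corresponding sketch for the training side, again with the nesting assumed rather than taken from the real sample file (dotted names in the table suggest nested objects, which is how they are rendered here):

```json
{
  "model": "yolov8m.pt",
  "training_dataset_size": 10500,
  "epochs": 2000,
  "batch_size": 16,
  "img_size": 720,
  "learning_rate": 0.001,
  "nr_dataset_segments": 4,
  "configuration_update_ratio": 0.7,
  "map_conf_thres": 0.25,
  "size_configuration": { "enabled": true, "nr_objects_range": [2, 12] },
  "class_configuration": { "enabled": true }
}
```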
To start the SAL training process, run `python train.py --config configs/config-sample-continuous.json`.
To run training, three datasets need to be specified in the dataset YAML file: training, validation, and evaluation. If the paths contain fewer samples than the configuration file specifies, more will be generated. After checking that there are enough samples, the datasets that will be continuously updated are copied to a working directory. This enables multiple training instances to run simultaneously and preserves the original dataset.
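For illustration, such a dataset YAML might follow the common Ultralytics layout shown below. The key names (in particular how the evaluation split is declared) and all paths are assumptions here; check the sample configs for the schema `train.py` actually expects.

```yaml
# Hypothetical dataset YAML (Ultralytics-style); paths, class names, and
# the evaluation/test keys are placeholders, not the repository's schema.
path: data/datasets/robotic
train: train/images        # synthetic, continuously regenerated
val: val/images            # synthetic validation split
test: test/images          # real images, used by test.py
names:
  0: part_a
  1: part_b
```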
The resulting model is saved in the parent save folder (`continuous_runs` by default) along with other graphs and metrics. Although the model uses the Ultralytics architecture, it is not saved in the Ultralytics format; to convert it back, use the conversion script.
The resulting SAL model can be tested using our own test implementation, which produces metrics similar to those of Ultralytics. The results are saved to the specified parent run folder. The "test" dataset in the data YAML is used for these tests. Alongside the metrics, predictions and ground truth from the test set are provided.
Run `python test.py --config configs/config-sample-test.json` to start the testing process.
There is also an option to use Ultralytics-based testing ("validation"). For this process, the SAL model is converted to an Ultralytics model. The terminal output of the Ultralytics validation process is captured into a text file placed in the validation folder, which is renamed using the model run name and the dataset name. Multiple model paths ("weights") and dataset YAMLs ("dataset_yamls") can be provided; all provided model paths are tested on all datasets. Be aware that this runs on the "val" images specified in the dataset YAML.
Run `python ultralytics_val.py --config configs/config-sample-ultralytics-val.json` to start the Ultralytics testing process.
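Since the text above mentions "weights" and "dataset_yamls" lists, a minimal config could look like the following sketch (the paths are hypothetical examples, and any other required keys are omitted):

```json
{
  "weights": [
    "continuous_runs/train1/weights/best.pt",
    "continuous_runs/train2/weights/best.pt"
  ],
  "dataset_yamls": [
    "configs/robotic_data.yaml"
  ]
}
```

With this config, both models would be validated on the one dataset, since every weights path is crossed with every dataset YAML.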
To make the model easier to use elsewhere, it can be converted to an Ultralytics model, which can then be used as normal.
Run `python convert_to_ultralytics.py --sal_model_path continuous_runs/train1/weights/best.pt --save_model_path converted_model.pt` to start the Ultralytics conversion process.
Run `python Generation/Blender/generation_main.py --config configs/config-sample-generation.json` to start the generation.
We compare static training with Synthetic Active Learning (SAL) on the robotics use case. Results are shown in terms of training dynamics and real-world evaluation performance.
Static training (above): training loss decreases smoothly on a fixed synthetic dataset.
SAL training (below): loss fluctuates as new synthetic data is introduced in each refinement loop.
The fluctuations in SAL are expected. Each time new targeted synthetic data is generated, the loss temporarily increases before decreasing again as the model adapts. This behavior reflects the continuous dataset refinement process.
Static training (above): mAP@50 on the real test dataset.
SAL training (below): mAP@50 on the real test dataset.
When evaluated on real data (never seen during training), SAL consistently outperforms static training.
- Higher overall mAP@50
- Clear improvements in previously underperforming categories
- More balanced per-class performance
The robotic dataset is from Horváth et al., including their `.obj` files and real images accessed from their GitLab repository. We thank them for their great work!
We also thank previous works on domain randomization for industrial applications, including Tobin et al., Eversberg and Lambrecht, and Horváth et al.
We acknowledge the contributions of the YOLOv8 model from Ultralytics, which we used for training our model.
If you find this work useful for your research, please consider citing:
@article{ZHU202668,
  title = {Designing Synthetic Active Learning for Model Refinement in Manufacturing Parts Detection},
  author = {Zhu, Xiaomeng and Henningsson, Jacob and Mårtensson, Pär and Hanson, Lars and Björkman, Mårten and Maki, Atsuto},
  journal = {Journal of Manufacturing Systems},
  volume = {84},
  pages = {68--84},
  year = {2026},
  doi = {10.1016/j.jmsy.2025.11.023}
}

For the static domain randomization pipeline, please cite our ICRA 2025 work (see the Related Publications section above) and refer to the corresponding GitHub repository: SynMfg








