When training the model on a custom dataset, I consistently encounter the following error around epoch 9. The setup uses Python 3.11 and a single RTX 3090 GPU.
From my investigation, this issue appears in DETR-based models and related detectors, but I have not been able to find a definitive solution. The number of classes is correctly set to num_classes + 1, accounting for the background class.
tensor([[[nan, nan, nan, nan],
...
[nan, nan, nan, nan]],
Epoch 9: 45%|████▍ | 460/1029 [07:40<10:23, 1.09s/it, loss=34.6598][rank0]: Traceback (most recent call last):
File ".../train_deim_dev.py", line 64, in <module>
trainer.fit(
File ".../site-packages/deimkit/trainer.py", line 428, in fit
train_stats = train_one_epoch(
File ".../site-packages/deimkit/engine/solver/det_engine.py", line 72, in train_one_epoch
loss_dict = criterion(outputs, targets, **metas)
File ".../site-packages/deimkit/engine/deim/deim_criterion.py", line 276, in forward
indices = self.matcher(outputs_without_aux, targets)['indices']
File ".../site-packages/deimkit/engine/deim/matcher.py", line 101, in forward
cost_giou = -generalized_box_iou(box_cxcywh_to_xyxy(out_bbox), box_cxcywh_to_xyxy(tgt_bbox))
File ".../site-packages/deimkit/engine/deim/box_ops.py", line 56, in generalized_box_iou
assert (boxes1[:, 2:] >= boxes1[:, :2]).all()
AssertionError
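
For context on why the assertion would fire here: the tensor printed just before the traceback is all NaN, and any comparison involving NaN evaluates to False, so a check of the form `(boxes[:, 2:] >= boxes[:, :2]).all()` fails as soon as a single box is NaN. The sketch below reproduces that behavior in plain Python, without PyTorch. The helper names `cxcywh_to_xyxy` and `corners_ordered` are hypothetical stand-ins written for illustration, not the functions in `box_ops.py`.

```python
def cxcywh_to_xyxy(box):
    """Convert one [cx, cy, w, h] box to [x1, y1, x2, y2]."""
    cx, cy, w, h = box
    return [cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h]


def corners_ordered(boxes_xyxy):
    """Plain-Python analogue of (boxes[:, 2:] >= boxes[:, :2]).all()."""
    return all(x2 >= x1 and y2 >= y1 for x1, y1, x2, y2 in boxes_xyxy)


nan = float("nan")

# Illustrative values only, not taken from the actual run.
good = [cxcywh_to_xyxy([0.5, 0.5, 0.2, 0.2])]
bad_nan = [cxcywh_to_xyxy([nan, nan, nan, nan])]
bad_negative_w = [cxcywh_to_xyxy([0.5, 0.5, -0.1, 0.2])]

print(corners_ordered(good))            # True
print(corners_ordered(bad_nan))         # False: NaN >= NaN is False
print(corners_ordered(bad_negative_w))  # False: x2 < x1 when w < 0
```

Both a NaN box and a box with negative width fail the same check, so the assertion alone does not say which of the two happened; the printed NaN tensor is what points toward the first case.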
🧠 Notes
The bounding boxes are in [cx, cy, w, h] format and should be valid (values > 0).

Any guidance or suggestions for debugging this would be appreciated!
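
To make the "boxes should be valid" claim checkable, here is a minimal sketch of a pre-training scan over the annotation boxes. It assumes each box is a plain list of four numbers in [cx, cy, w, h] order; `find_invalid_boxes` and the sample data are hypothetical and not part of deimkit. It only covers the targets, so it cannot rule out NaN values produced by the model itself.

```python
import math


def find_invalid_boxes(boxes):
    """Return (index, reason) pairs for boxes that are not valid [cx, cy, w, h]."""
    problems = []
    for i, box in enumerate(boxes):
        if len(box) != 4:
            problems.append((i, "expected 4 values"))
        elif not all(math.isfinite(v) for v in box):
            problems.append((i, "non-finite value"))
        elif box[2] <= 0 or box[3] <= 0:
            problems.append((i, "non-positive width or height"))
    return problems


# Made-up sample: one valid box, one zero-width box, one box containing NaN.
sample = [
    [0.5, 0.5, 0.2, 0.3],
    [0.4, 0.4, 0.0, 0.1],
    [0.1, 0.2, float("nan"), 0.1],
]

print(find_invalid_boxes(sample))
# [(1, 'non-positive width or height'), (2, 'non-finite value')]
```

If a scan like this comes back empty for the whole dataset, the annotations are less likely to be the source, and the NaN values would more plausibly originate in the forward or backward pass.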