
fix: slice numpy array values in custom_data per row in CSVSink #2199

Merged
Borda merged 7 commits into roboflow:develop from
farukalamai:fix/csv-json-sink-custom-data-array-slicing
Apr 13, 2026

Conversation

@farukalamai
Contributor

Before submitting
  • Self-reviewed the code
  • Updated documentation, follow Google-style
  • Added docs entry for autogeneration (if new functions/classes)
  • Added/updated tests
  • All tests pass locally

Description

Fixes a bug in CSVSink and JSONSink where passing a numpy array as a
custom_data value wrote the entire array on every row instead of the
per-detection scalar value.

Type of Change

  • 🐛 Bug fix (non-breaking change which fixes an issue)

Motivation and Context

When users pass computed per-detection values like detections.area via
custom_data, each row should receive its own scalar — not the whole array.

# Before (broken): every row got the full array
with sv.CSVSink("out.csv") as sink:
    sink.append(detections, custom_data={"area": detections.area})
# area column: [400.0, 400.0] on every row ❌

# After (fixed): each row gets its own value
# area column: 400.0, 400.0 ✅

The root cause was row.update(custom_data) inside the per-detection loop,
which blindly wrote the whole value. The fix applies the same per-index
slicing logic that detections.data already uses correctly.
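The per-index dispatch described above can be sketched in isolation (a simplified illustration based on this description, not the exact sink code; `build_rows` is a hypothetical name):

```python
import numpy as np

def build_rows(n_detections, custom_data):
    # Simplified sketch: one output row per detection. Non-empty ndarray
    # values contribute one element per row; anything else (scalars,
    # strings, dicts) is written as-is on every row.
    rows = []
    for i in range(n_detections):
        row = {}
        for key, value in custom_data.items():
            if isinstance(value, np.ndarray) and value.ndim > 0:
                row[key] = value[i]  # per-detection element
            else:
                row[key] = value     # broadcast unchanged
        rows.append(row)
    return rows

rows = build_rows(2, {"area": np.array([400.0, 625.0]), "frame": 7})
# each row gets its own area; frame is repeated on every row
```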

Closes #1397

Changes Made

  • src/supervision/detection/tools/csv_sink.py — slice numpy array values in custom_data per detection row
  • src/supervision/detection/tools/json_sink.py — same fix
  • tests/detection/test_csv.py — added test case for numpy array in custom_data

Testing

  • I have tested this code locally
  • I have added unit tests that prove my fix is effective or that my feature works
  • All new and existing tests pass

Google Colab (optional)

Colab link:

Screenshots/Videos (optional)

Additional Notes

The fix is backward compatible — scalar values in custom_data (e.g.
{"frame_number": 42}) continue to work as before, written as-is on every
row.

@farukalamai farukalamai requested a review from SkalskiP as a code owner April 3, 2026 20:38
@Borda Borda requested a review from Copilot April 8, 2026 12:27
@Borda Borda changed the title from "fix: slice numpy array values in custom_data per row in CSVSink and J…" to "fix: slice numpy array values in custom_data per row in CSVSink" Apr 8, 2026
@codecov

codecov bot commented Apr 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 78%. Comparing base (72fc49f) to head (43da77a).
⚠️ Report is 1 commit behind head on develop.

Additional details and impacted files
@@           Coverage Diff           @@
##           develop   #2199   +/-   ##
=======================================
  Coverage       78%     78%           
=======================================
  Files           63      63           
  Lines         7972    7979    +7     
=======================================
+ Hits          6248    6257    +9     
+ Misses        1724    1722    -2     

Contributor

Copilot AI left a comment


Pull request overview

Fixes incorrect serialization of per-detection custom_data in CSVSink/JSONSink when users pass numpy arrays (previously the full array was written on every row), aligning output with expected “one value per detection row” behavior.

Changes:

  • Update CSVSink.parse_detection_data() to slice custom_data numpy arrays per detection row.
  • Update JSONSink.parse_detection_data() to slice custom_data numpy arrays per detection row.
  • Add a unit test ensuring CSVSink slices numpy-array custom_data per row.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File Description
src/supervision/detection/tools/csv_sink.py Slice numpy-array custom_data per detection row when producing CSV rows.
src/supervision/detection/tools/json_sink.py Apply analogous per-row slicing for numpy-array custom_data when producing JSON rows.
tests/detection/test_csv.py Add regression test covering numpy-array custom_data in CSVSink.

@Borda Borda added waiting for author bug Something isn't working labels Apr 8, 2026
Borda and others added 5 commits April 13, 2026 11:21
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
The else branch added by the original fix used hasattr(value, "__getitem__")
to decide whether to slice custom_data values per detection. This incorrectly
indexes dicts by integer 0, raising KeyError when custom_data contains dict
values (e.g. {"metadata": {"sensor_id": 101}}).

Non-ndarray types (dicts, scalars, lists used as a single value per detection)
should be written as-is. Only np.ndarray values require per-row indexing.
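The failure mode this commit fixes can be reproduced in isolation (an illustrative snippet, not the sink code): a dict passes a `hasattr(value, "__getitem__")` capability check, but indexing it with an integer raises `KeyError`, while an explicit ndarray check leaves it alone.

```python
import numpy as np

metadata = {"sensor_id": 101}

# Buggy predicate: dicts support __getitem__, so they were sent down
# the slicing path...
assert hasattr(metadata, "__getitem__")
try:
    metadata[0]  # ...and integer-indexing a dict raises KeyError
    raised = False
except KeyError:
    raised = True
assert raised

# Fixed predicate: only genuine numpy arrays are sliced per row.
def should_slice(value):
    return isinstance(value, np.ndarray) and value.ndim > 0

assert not should_slice(metadata)          # dict: written as-is
assert not should_slice(42)                # scalar: written as-is
assert should_slice(np.array([1.0, 2.0]))  # ndarray: sliced per row
```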

[resolve roboflow#1] /review finding by sw-engineer (report: .temp/output-review-fix-csv-json-sink-custom-data-array-slicing-2026-04-13.md): CSVSink else branch KeyError:0 on dict custom_data at csv_sink.py:153

---
Co-authored-by: Claude Code <noreply@anthropic.com>
Adds a parametrized case to test_json_sink verifying that np.ndarray
values in custom_data are sliced per detection row (not written as the
full array on every row). Mirrors the existing CSV counterpart added
by the original PR.

[resolve roboflow#3] /review finding by qa-specialist (report: .temp/output-review-fix-csv-json-sink-custom-data-array-slicing-2026-04-13.md): Missing JSONSink test for numpy array custom_data slicing

---
Co-authored-by: Claude Code <noreply@anthropic.com>
Exercises all three dispatch branches of the custom_data handler
simultaneously: np.ndarray values are sliced per detection row while
scalar values (int, str, etc.) are broadcast as-is to every row.

[resolve roboflow#4] /review finding by qa-specialist (report: .temp/output-review-fix-csv-json-sink-custom-data-array-slicing-2026-04-13.md): Missing test for mixed-type custom_data

---
Co-authored-by: Claude Code <noreply@anthropic.com>
@Borda
Member

Borda commented Apr 13, 2026

Thanks for the fix @farukalamai! I've pushed 3 follow-up commits to your branch:

  1. fix: restore non-ndarray custom_data passthrough in CSVSink — the new else branch used hasattr(value, "__getitem__") to decide whether to slice custom_data values per detection. This accidentally indexed dicts by integer 0, raising KeyError: 0 for the existing "Complex Data" test case (custom_data={"metadata": {"sensor_id": 101}} etc.). Non-ndarray types should be written as-is; only np.ndarray values need per-row indexing. Changed to row[key] = value in the else branch.

  2. test: add JSONSink test for numpy array custom_data slicing — symmetric test for JSONSink to match the CSV coverage you added.

  3. test: add mixed-type custom_data test (ndarray + scalar together) — exercises all three dispatch paths at once: ndarray sliced per row + scalar broadcast to every row.

All pre-commit hooks pass and 15/15 tests pass.

Extract a private _slice_value(value, i) static method into both
CSVSink and JSONSink that centralises the per-row ndarray dispatch:
0-d ndarray -> value as-is, n-d ndarray -> value[i], anything else
-> value as-is. Both parse_detection_data methods now call this helper
for detections.data and custom_data, removing the duplicated isinstance
chains and eliminating the dead `hasattr(__getitem__)` else-branch
(detections.data values are always ndarrays via convert_data).

Drop the pure-ndarray-only test_detections_array_custom_data.csv case
from test_csv_sink; the mixed test (ndarray + scalar together) is a
strict superset that exercises all three dispatch paths simultaneously.

---
Co-authored-by: Claude Code <noreply@anthropic.com>
@Borda Borda merged commit e514142 into roboflow:develop Apr 13, 2026
24 checks passed
@Borda Borda mentioned this pull request Apr 17, 2026

Labels

bug Something isn't working waiting for author


Development

Successfully merging this pull request may close these issues.

Save detection area with CSVSink

3 participants