Add read-time per-frame metadata#1733

Draft
PaulHax wants to merge 23 commits into main from frame-metadata-readtime

@PaulHax PaulHax commented Jul 1, 2026

Collaborator

Summary

  • Implements v1 per-frame metadata as read-only media telemetry for image-sequence datasets.
  • Reads a co-located .txt or .csv sidecar at view time.
  • Matches rows to frames by image filename.
  • Shows the active frame's values in the Frame Metadata section of the Media Metadata panel.

The sidecar file remains the source of truth. This PR does not import telemetry into annotations, does not write a derived frame_metadata.json, and does not add frame metadata to exports.

How To Use

  1. Create a .txt or .csv file with a header row and one row per image.
  2. Include at least one column containing image filenames. The column can have any name; DIVE finds it by matching the values against the dataset images.
  3. Include at least one metadata column beyond the filename column, for example timestamp, latitude, longitude, depth, altitude, or other telemetry fields.
  4. Place the file next to the imagery:
    • Single-camera image sequence: put the sidecar in the dataset folder beside the images.
    • Multicam image sequence: put one shared sidecar at the multicam parent folder, or put one sidecar inside each camera folder.
  5. Open the dataset and the Media Metadata panel. The Frame Metadata section updates as the playhead moves.

Example:

image_file timestamp latitude longitude water_depth altitude
img_0001.tif 15:40:56 46.575870 -124.603094 192.80 2.78
img_0002.tif 15:41:04 46.575912 -124.603080 193.10 2.70

Each metadata row must include the image filename it describes. Rows that do not match an image are ignored. The metadata file does not need a special name.
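
The filename join described above could be sketched roughly as follows. This is a minimal illustration, not DIVE's implementation: the function names are hypothetical, and the basename-only normalization is an assumption standing in for DIVE's real image-key behavior.

```python
import os
from typing import Dict, List, Optional


def normalize(name: str) -> str:
    # Assumption: compare by basename only; DIVE's real image-key rules may differ.
    return os.path.basename(name.strip().replace("\\", "/"))


def find_filename_column(
    header: List[str], rows: List[List[str]], images: List[str]
) -> Optional[int]:
    """Return the index of the column whose values match the most image names."""
    image_names = {normalize(name) for name in images}
    hits = [
        sum(1 for row in rows if col < len(row) and normalize(row[col]) in image_names)
        for col in range(len(header))
    ]
    best = max(hits, default=0)
    # No match at all, or a tie between columns: skip rather than guess.
    if best == 0 or hits.count(best) > 1:
        return None
    return hits.index(best)


def join_rows_to_frames(
    header: List[str], rows: List[List[str]], images: List[str]
) -> Dict[int, Dict[str, str]]:
    """Map frame index -> {field: raw string}; rows without a matching image are ignored."""
    col = find_filename_column(header, rows, images)
    if col is None:
        return {}
    frame_of = {normalize(name): index for index, name in enumerate(images)}
    joined: Dict[int, Dict[str, str]] = {}
    for row in rows:
        if col >= len(row):
            continue
        frame = frame_of.get(normalize(row[col]))
        if frame is None:
            continue
        # Keep every column except the filename column, in source order.
        joined[frame] = {h: v for i, (h, v) in enumerate(zip(header, row)) if i != col}
    return joined
```

Because the column is found by value rather than by name, a header such as `image_file`, `frame_name`, or `port_image` would all work in this sketch.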

For multicam, a shared sidecar can use one filename column per camera:

port_image starboard_image timestamp latitude longitude water_depth
port_0001.tif starboard_0001.tif 15:40:56 46.575870 -124.603094 192.80
port_0002.tif starboard_0002.tif 15:41:04 46.575912 -124.603080 193.10

Implemented Approach

Source Contract

  • Delimited sidecars: Added Python and TypeScript parsers for .txt and .csv sidecars that have a header row, with delimiter sniffing for commas, tabs, or whitespace.
  • Filename joins: Match metadata rows to image frames by filename value, with filename normalization matching DIVE's image-key behavior.
  • Raw display values: Preserve source field order through parsing, serving, and display, and keep all values as raw strings.
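
A minimal sketch of the source contract above, assuming a simple sniffing order (comma, then tab, then whitespace). The names `sniff_delimiter` and `parse_sidecar` are hypothetical and the sniffing rule is an assumption; the parsers in this PR may decide differently.

```python
import csv
import io
import re
from typing import Dict, List, Tuple


def sniff_delimiter(header_line: str) -> str:
    # Assumption: prefer explicit delimiters, then fall back to whitespace runs.
    if "," in header_line:
        return ","
    if "\t" in header_line:
        return "\t"
    return " "


def parse_sidecar(text: str) -> Tuple[List[str], List[Dict[str, str]]]:
    """Parse a delimited sidecar into (header, records) with raw string values."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return [], []
    delimiter = sniff_delimiter(lines[0])
    if delimiter == " ":
        table = [re.split(r"\s+", line.strip()) for line in lines]
    else:
        table = list(csv.reader(io.StringIO("\n".join(lines)), delimiter=delimiter))
    header = [name.strip() for name in table[0]]
    # dicts preserve insertion order, so fields stay in source column order,
    # and no value is converted to a number or date.
    records = [dict(zip(header, (value.strip() for value in row))) for row in table[1:]]
    return header, records
```

Keeping values as strings means a latitude such as `46.575870` is displayed exactly as written in the file, trailing zero included.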

Source Discovery

  • Co-located files: Read candidate sidecars from the dataset folder, multicam parent folder, or camera child folder.
  • Safety checks: Reject VIAME annotation CSVs, ignore unrelated text files and bare image lists, and skip ambiguous matches instead of guessing.
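
The VIAME rejection could look something like the sketch below. It combines the two signals named in the commit messages further down: the `# 1: Detection or Track-id` comment header, and a first non-comment row that is itself a detection. The row-shape test (integer id, integer frame number, four numeric bounding-box values) is an assumption about what "is itself a detection" means, and `looks_like_viame_csv` is a hypothetical name, not the `is_viame_csv` function in this PR.

```python
import csv
import io

VIAME_HEADER_PREFIX = "# 1: Detection or Track-id"


def _is_int(value: str) -> bool:
    try:
        int(value.strip())
        return True
    except ValueError:
        return False


def _is_number(value: str) -> bool:
    try:
        float(value.strip())
        return True
    except ValueError:
        return False


def looks_like_viame_csv(text: str) -> bool:
    """Return True when the text looks like a VIAME annotation CSV."""
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith(VIAME_HEADER_PREFIX):
            return True
        if stripped.startswith("#"):
            continue
        # First non-comment row decides. A text header row fails the integer
        # checks, so VIAME-shaped telemetry led by a header stays telemetry.
        row = next(csv.reader(io.StringIO(stripped)))
        if len(row) < 7:
            return False
        return (
            _is_int(row[0])
            and _is_int(row[2])
            and all(_is_number(value) for value in row[3:7])
        )
    return False
```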

Read Contract

  • Web endpoint: Added GET /dive_dataset/:id/frame_metadata?startFrame=<n>&endFrame=<n> with inclusive, non-negative frame-window bounds.
  • Desktop parity: Added matching desktop loadFrameMetadata support so web and desktop return the same { cameras: { cameraName: { frame: values } } } shape.
  • No source state: Return an empty cameras map when no usable frame metadata source is present.
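
To make the read contract concrete, here is a small sketch of how an inclusive frame window might be cut from already-joined records into the `{ cameras: { cameraName: { frame: values } } }` shape. The function name is hypothetical, and rejecting `endFrame < startFrame` is an assumption; the description only states that bounds are inclusive and non-negative.

```python
from typing import Dict

FrameRecords = Dict[int, Dict[str, str]]


def frame_window(
    cameras: Dict[str, FrameRecords], start_frame: int, end_frame: int
) -> dict:
    """Return records with start_frame <= frame <= end_frame for every camera."""
    if start_frame < 0 or end_frame < 0:
        raise ValueError("frame bounds must be non-negative")
    if end_frame < start_frame:
        # Assumption: an inverted window is treated as a client error.
        raise ValueError("endFrame must be >= startFrame")
    return {
        "cameras": {
            camera: {
                frame: values
                for frame, values in records.items()
                if start_frame <= frame <= end_frame
            }
            for camera, records in cameras.items()
        }
    }
```

With no usable source, `cameras` is empty and the result is `{"cameras": {}}`, matching the no-source state above.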

Camera Routing

  • Single camera: Return single-camera records under the singleCam camera key.
  • Multicam: Route parent-level and per-camera sidecars by matching each camera's own image filenames, so shared files with columns such as port_image and starboard_image bind correctly.
  • Collisions: If two distinct records target the same camera/frame, omit that frame rather than resolving by precedence.
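
The collision rule can be illustrated with a short sketch, assuming each source has already been reduced to a `{frame: values}` mapping for one camera. The helper name is hypothetical. Python dict equality ignores key order, which lines up with the order-independent comparison mentioned in the commit messages below.

```python
from typing import Dict, Iterable, Set

FrameRecords = Dict[int, Dict[str, str]]


def merge_camera_records(sources: Iterable[FrameRecords]) -> FrameRecords:
    """Merge records for one camera, omitting frames with conflicting records."""
    merged: FrameRecords = {}
    collided: Set[int] = set()
    for records in sources:
        for frame, values in records.items():
            if frame in collided:
                continue
            if frame not in merged:
                merged[frame] = values
            elif merged[frame] != values:
                # Two distinct records for the same frame: drop it entirely
                # instead of picking a winner by precedence.
                del merged[frame]
                collided.add(frame)
    return merged
```

Identical records that arrive from two sidecars, even with columns in a different order, compare equal here and are kept once.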

Client Display

  • Windowed cache: Added a bounded client cache around the playhead so the UI fetches frame metadata windows instead of holding the full dataset in client state.
  • Stale response guard: Ignore older frame-metadata responses when a newer request has already been issued.
  • Panel placement: Display the active frame's values in the existing Media Metadata panel, where the Frame Metadata section follows the playhead and active camera.
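
The windowed cache and stale response guard can be sketched as below. The client code in this PR is TypeScript; Python is used here only to keep the sketches in this description in one language. The class name, the `radius` parameter, and the request-id scheme are all assumptions for illustration.

```python
from typing import Dict, Optional, Tuple

FrameRecords = Dict[int, Dict[str, str]]


class FrameMetadataWindow:
    """Holds one window of frame metadata around the playhead."""

    def __init__(self, radius: int = 50) -> None:
        self._radius = radius
        self._latest_request = 0
        self._start = -1
        self._end = -1
        self._records: FrameRecords = {}

    def needs_fetch(self, frame: int) -> bool:
        return not (self._start <= frame <= self._end)

    def begin_request(self, frame: int) -> Tuple[int, int, int]:
        """Return (request_id, start_frame, end_frame) for a new fetch."""
        self._latest_request += 1
        start = max(0, frame - self._radius)
        return self._latest_request, start, frame + self._radius

    def apply_response(
        self, request_id: int, start: int, end: int, records: FrameRecords
    ) -> bool:
        # Stale guard: ignore responses to anything but the newest request.
        if request_id != self._latest_request:
            return False
        # Replacing the window keeps the cache bounded to one window's worth.
        self._start, self._end, self._records = start, end, dict(records)
        return True

    def get(self, frame: int) -> Optional[Dict[str, str]]:
        return self._records.get(frame)
```

A caller would check `needs_fetch` as the playhead moves, call `begin_request`, issue the fetch, and hand the result to `apply_response`, which drops it if a newer request has been issued in the meantime.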

Persistence and Export

  • Read-only: Keep frame metadata out of annotation, attribute, and dataset metadata stores.
  • No persistence/export: Do not import telemetry into annotations, write a derived frame_metadata.json, maintain a field registry, or add frame metadata to exports.
  • Docs: Added documentation for the source contract, architecture, and UI behavior.

V1 Boundaries

  • Supported: image-sequence datasets with co-located .txt or .csv sidecars.
  • Not included: editing frame metadata, importing it into annotations, exporting it, embedded EXIF/KLV, video telemetry, charting, training integration, manual out-of-folder source selection, or server-side parsed-source caching.

Related Issues

PaulHax added 23 commits June 30, 2026 16:38
The headerless image-match fallback in is_viame_csv could misclassify a
VIAME-shaped telemetry table (filename in column 1, leading integer
columns) as an annotation CSV and reject it. DIVE's VIAME exports always
carry the '# 1: Detection or Track-id' comment header, so key detection
on that header and drop the now-dead imageMap fallback.

Fix correctness regressions in the telemetry-vs-VIAME detection path:

- Recognize headerless VIAME CSVs. Detection required the DIVE
  "# 1: Detection or Track-id" comment header, so a headerless VIAME
  annotation CSV was misclassified as frame metadata and silently
  dropped on import. Also treat a file as VIAME when its first
  non-comment row is itself a detection, which keeps VIAME-shaped
  telemetry (led by a text header) as telemetry.
- Decode web sidecar files leniently so a single non-UTF-8 .txt/.csv
  no longer 500s the whole frame_metadata route.
- Parse desktop sidecars with csv-parse relax mode so a bare quote
  character no longer throws, matching Python's csv.reader.
- Compare multicam records order-independently so identical records in
  a different column order are not wrongly flagged as a collision,
  matching the server's dict comparison.

Add regression tests for headerless VIAME rejection (server + desktop)
and bare-quote parsing (desktop).
