[HAI] Wire protocol / API payload schema for label semantics #1903

@rfloca

Sibling issue to #1868. This issue covers the wire-protocol / API-payload structure for label semantics; #1868 covers the terminology/content side. The schema designed here consumes whatever terminology decisions are reached in #1868.

Background

The discussion in #1868 made clear that "what semantic info accompanies a label" and "how that info is structured on the wire" are two distinct layers, shaped by different stakeholders and different design constraints. This issue is the home for the second.

The motivation for a structured wire representation is that CSV, the format preferred for human editing, is flat and cannot represent multiplicity natively. This matters for two reasons: DICOM's Anatomic Region Sequence and Primary Anatomic Structure Sequence can have multiplicity > 1, and identifying multiple coding schemes per concept is a core requirement. A structured representation (JSON, XML, or similar) is therefore needed for machine-to-machine use; the human-editable CSV view can be kept as a surface on top.
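
To make the multiplicity point concrete, here is a minimal Python sketch. The field names (`anatomicRegion`, `regionScheme`, and so on) and the codes are placeholders assumed for illustration; they are not a proposed schema and not real terminology codes.

```python
import csv
import io
import json

# Hypothetical entry: field names and codes are placeholders, not real terminology.
entry = {
    "name": "lesion",
    "anatomicRegion": [  # two items, as a DICOM sequence with multiplicity > 1 would carry
        {"codingScheme": "SCHEME_A", "codeValue": "A-001", "codeMeaning": "Region one"},
        {"codingScheme": "SCHEME_A", "codeValue": "A-002", "codeMeaning": "Region two"},
    ],
}

# A structured format round-trips the list unchanged.
restored = json.loads(json.dumps(entry))

# A flat CSV row has one cell per column, so only one region fits unless
# an ad hoc delimiter convention is layered on top.
header = ["name", "regionScheme", "regionValue", "regionMeaning"]
first = entry["anatomicRegion"][0]
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerow([entry["name"], first["codingScheme"], first["codeValue"], first["codeMeaning"]])
rows = list(csv.reader(io.StringIO(buffer.getvalue())))

print(len(restored["anatomicRegion"]))  # regions surviving the JSON round trip
print(len(rows) - 1)                    # data rows in the CSV, holding a single region
```

Any CSV convention that does carry both regions (repeated rows, packed cells) would have to be part of the lossless conversion contract rather than something CSV gives for free.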

In scope

  • A schema for label-semantics records: per-entry structure carrying (codingScheme, codeValue, codeMeaning) triplets, the category/type/region 5-tuple, multiplicity, and protocol/algorithm identification.
  • Separation of addressing (how integer codes in segmentation arrays map to entries) from semantic content (what each entry means). The two should be cleanly decoupled so the schema is portable across different consumers.
  • Capability discovery: how a model declares which terminology features it supports.
  • Lossless conversion contract between a human-editable representation (Slicer-style labels.csv) and this wire format.
  • REST API surface for exposing semantic information from a MONAI Label server.
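
As a starting point for discussion, the first two bullets could be sketched roughly as below. This is a strawman under stated assumptions, not a proposed schema: every class and field name (`Code`, `Concept`, `LabelEntry`, `Address`, `LabelSemantics`, `entryId`, `arrayValue`) is hypothetical, the codes are placeholders, and the remaining members of the 5-tuple as well as protocol/algorithm identification are omitted for brevity.

```python
from dataclasses import asdict, dataclass, field
import json


@dataclass(frozen=True)
class Code:
    """One (codingScheme, codeValue, codeMeaning) triplet."""
    codingScheme: str
    codeValue: str
    codeMeaning: str


@dataclass
class Concept:
    """A primary code plus equivalent codes from other schemes."""
    primary: Code
    equivalents: list[Code] = field(default_factory=list)


@dataclass
class LabelEntry:
    """Semantic content only: no array values appear here."""
    entryId: str
    category: Concept
    type: Concept
    regions: list[Concept] = field(default_factory=list)  # multiplicity allowed


@dataclass
class Address:
    """Addressing only: maps one array value to an entry."""
    arrayValue: int
    entryId: str


@dataclass
class LabelSemantics:
    entries: list[LabelEntry]
    addressing: list[Address]


def resolve(doc: LabelSemantics, array_value: int) -> LabelEntry:
    """Look up the entry that a segmentation array value refers to."""
    by_id = {e.entryId: e for e in doc.entries}
    for address in doc.addressing:
        if address.arrayValue == array_value:
            return by_id[address.entryId]
    raise KeyError(array_value)


# Placeholder codes, not real terminology.
organ = LabelEntry(
    entryId="organ-1",
    category=Concept(Code("SCHEME_A", "A-100", "Example category")),
    type=Concept(
        Code("SCHEME_A", "A-200", "Example organ"),
        equivalents=[Code("SCHEME_B", "B-200", "Example organ")],
    ),
)
doc = LabelSemantics(entries=[organ], addressing=[Address(arrayValue=1, entryId="organ-1")])
wire = json.dumps(asdict(doc), indent=2)
```

In this sketch a consumer that uses a different addressing scheme can replace the `addressing` list without touching `entries`. Addressing is a list of pairs rather than an integer-keyed object because JSON object keys are always strings.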

Out of scope

Inputs to consider (not authoritative — starting points)

  • Slicer's labels.csv format.
  • DCMQI schemas and OpenAnatomy patterns.
  • MITK's EUCAIM.json conventions (working example of multi-scheme handling).
  • DICOM equivalent-codes pattern per DICOM PS3.3 §8.9.
  • The HAIG API Standardisation Proposal (April 2026), in particular the semantic_id_dictionary field in the Prompting Contract, as an example downstream consumer.

Open design questions

  • Required vs. optional. The HAIG draft marks the dictionary field as Required for its benchmarking context, whereas #1868's framing is closer to "optional with capability discovery". This needs an explicit position.
  • Addressing vs. semantics. Should integer keys (or any addressing scheme) be part of an entry, or wrapped externally by the caller? Implications for portability across consumers.
  • Instance-level semantics. Whether the schema needs to express instance-ID rules per class, or whether instance is purely an output-array addressing concern.
  • Multiplicity. Which fields admit multiplicity (DICOM allows it for Anatomic Region Sequence and Primary Anatomic Structure Sequence), and how should it be encoded without ambiguity?
  • Front-end vs. server translation. Where does the human-readable → coded-entry mapping happen? Different clients have different capabilities.
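
To illustrate what the "optional with capability discovery" position could look like from a client's side, here is a small Python sketch. It is one possible reading, not a decision: the keys `terminologyCapabilities` and `labelSemantics`, the capability names, and the function `semantics_if_supported` are all hypothetical.

```python
from typing import Optional


def semantics_if_supported(model_info: dict, needed: set) -> Optional[dict]:
    """Return the semantic payload only if the model declares every needed feature.

    Both dictionary keys used here are hypothetical placeholders.
    """
    declared = set(model_info.get("terminologyCapabilities", []))
    if not needed.issubset(declared):
        return None  # the client falls back to plain label names
    return model_info.get("labelSemantics")


# A model that predates the schema declares nothing and still works.
legacy_model = {"name": "legacy"}

# A model that opts in declares its features next to the payload.
coded_model = {
    "name": "coded",
    "terminologyCapabilities": ["multi-scheme", "region-multiplicity"],
    "labelSemantics": {"entries": []},
}

print(semantics_if_supported(legacy_model, {"multi-scheme"}))
print(semantics_if_supported(coded_model, {"multi-scheme"}))
```

Under the "Required" position the fallback branch would instead be a validation error, which is the practical difference the explicit position has to settle.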

Initial next steps

  • Survey existing schemas (DCMQI, OpenAnatomy, MITK conventions) so we don't reinvent.
  • Draft a strawman JSON schema for review.
  • Produce a structured response to the HAIG semantic_id_dictionary proposal from this issue's perspective, surfacing the integration questions above for the HAIG working group.
