Problem
The current data model requires schema changes (and migrations) every time a dataset/extraction needs new fields. We want to support arbitrary, user-defined schemas without per-shape tables.
Scope
This issue covers the foundation only; it does not migrate existing entities.
Add two new resources
1. schemas — versioned, immutable schema definitions
- Columns: id, name, version, definition (JSONB, opaque blob), workspace_id, created_at
- Endpoints: GET/POST /api/v1/schemas, GET /api/v1/schemas/{id}/versions
- Schemas are immutable once published; new versions get new ids.
2. payloads — generic data carriers
- Columns: id, schema_ref (FK to a specific schema version), data (JSONB), meta (JSONB), created_at
- Endpoints: GET/POST /api/v1/payloads, GET /api/v1/payloads/{id}
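To make the two resource shapes concrete, here is a rough sketch using plain Python dataclasses rather than the real ORM models. The column names come from the lists above; the Python types (string ids, integer version), the helper name publish_new_version, and the frozen dataclasses are assumptions for illustration only.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


def _now() -> datetime:
    return datetime.now(timezone.utc)


def _new_id() -> str:
    return str(uuid.uuid4())


@dataclass(frozen=True)  # frozen mirrors "immutable once published"
class Schema:
    name: str
    version: int  # assumed to be an integer counter
    definition: dict[str, Any]  # JSONB column; opaque to the storage layer
    workspace_id: str
    id: str = field(default_factory=_new_id)
    created_at: datetime = field(default_factory=_now)


@dataclass(frozen=True)
class Payload:
    schema_ref: str  # id of one specific schema version
    data: dict[str, Any]  # JSONB column
    meta: dict[str, Any] = field(default_factory=dict)  # JSONB column
    id: str = field(default_factory=_new_id)
    created_at: datetime = field(default_factory=_now)


def publish_new_version(previous: Schema, definition: dict[str, Any]) -> Schema:
    """Hypothetical helper: build a new row (new id) instead of mutating the old one."""
    return Schema(
        name=previous.name,
        version=previous.version + 1,
        definition=definition,
        workspace_id=previous.workspace_id,
    )
```

Because a payload's schema_ref points at one version's id, publishing a new version this way leaves existing payloads attached to the version they were validated against.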
Architectural rules
- One payload table for all schemas. JSONB. No table-per-schema.
- Validation lives at the write boundary, once: handlers call a single validate(payload, schema) and trust payload.data thereafter.
- SchemaEngine interface (validate, describe_fields, default_value) with one default implementation. Pluggable for future engines.
- API returns generic Payload shape; clients narrow at the edge.
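One possible shape for the SchemaEngine interface and the single write-boundary check is sketched below. Only the three method names come from this issue. The signatures, the SimpleFieldEngine class, the create_payload handler, the ValidationError type, and the {"fields": {...}} definition format are all hypothetical, since the definition blob is intentionally opaque and its format is left to the engine.

```python
from __future__ import annotations

from typing import Any, Protocol


class SchemaEngine(Protocol):
    """Assumed signatures; the issue only fixes the three method names."""

    def validate(self, data: dict[str, Any], definition: dict[str, Any]) -> list[str]: ...
    def describe_fields(self, definition: dict[str, Any]) -> list[dict[str, Any]]: ...
    def default_value(self, definition: dict[str, Any]) -> dict[str, Any]: ...


# Hypothetical type vocabulary for the illustrative definition format.
_TYPES: dict[str, Any] = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


class SimpleFieldEngine:
    """Illustrative default engine for {"fields": {name: {"type", "required", "default"}}}."""

    def validate(self, data: dict[str, Any], definition: dict[str, Any]) -> list[str]:
        errors: list[str] = []
        fields = definition.get("fields", {})
        for name, spec in fields.items():
            if name not in data:
                if spec.get("required", False):
                    errors.append(f"{name}: required field is missing")
                continue
            value = data[name]
            # bool is a subclass of int in Python, so exclude it from numeric fields.
            is_stray_bool = isinstance(value, bool) and spec["type"] != "boolean"
            if is_stray_bool or not isinstance(value, _TYPES[spec["type"]]):
                errors.append(f"{name}: expected {spec['type']}")
        for name in data:
            if name not in fields:
                errors.append(f"{name}: unknown field")
        return errors

    def describe_fields(self, definition: dict[str, Any]) -> list[dict[str, Any]]:
        return [
            {"name": name, "type": spec["type"], "required": spec.get("required", False)}
            for name, spec in definition.get("fields", {}).items()
        ]

    def default_value(self, definition: dict[str, Any]) -> dict[str, Any]:
        return {
            name: spec["default"]
            for name, spec in definition.get("fields", {}).items()
            if "default" in spec
        }


class ValidationError(ValueError):
    def __init__(self, errors: list[str]) -> None:
        super().__init__("; ".join(errors))
        self.errors = errors


def create_payload(
    engine: SchemaEngine, definition: dict[str, Any], data: dict[str, Any]
) -> dict[str, Any]:
    """Write boundary: validate once here; downstream code trusts the returned data."""
    errors = engine.validate(data, definition)
    if errors:
        raise ValidationError(errors)
    return data
```

Returning a list of error strings rather than a bare boolean is a design choice in this sketch: it lets the POST handler report every problem in one response. A future engine only needs to provide the same three methods to be swapped in.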
Out of scope (separate follow-up issues)
- Migrating existing Dataset/Record/Question to the new model.
- Indexed projections for queryable JSONB fields.
- Frontend <SchemaForm> / <SchemaView> (separate frontend issue).
Acceptance criteria
- Alembic migration adds both tables.
- CRUD endpoints work; OpenAPI documents them.
- SchemaEngine interface has at least one implementation, with tests for validate, describe_fields, and default_value.
- Integration test: create schema → create payload referencing schema → validation rejects bad payload, accepts good one.