[backend] Add schemas + payloads tables for arbitrary-schema data model #209

@JonnyTran

Problem

The current data model requires schema changes (and migrations) every time a dataset/extraction needs new fields. We want to support arbitrary, user-defined schemas without per-shape tables.

Scope

This issue covers the foundation only; it does not migrate existing entities.

Add two new resources

1. schemas — versioned, immutable schema definitions

  • Columns: id, name, version, definition (JSONB, opaque blob), workspace_id, created_at
  • Endpoints: GET/POST /api/v1/schemas, GET /api/v1/schemas/{id}/versions
  • Schemas are immutable once published; new versions get new ids.

2. payloads — generic data carriers

  • Columns: id, schema_ref (FK to a specific schema version), data (JSONB), meta (JSONB), created_at
  • Endpoints: GET/POST /api/v1/payloads, GET /api/v1/payloads/{id}
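A minimal sketch of the two tables, to make the column lists above concrete. It uses in-memory SQLite with JSON stored as TEXT as a stand-in for PostgreSQL JSONB, so it runs without a database server; the real change would be an Alembic migration. The string UUID ids, the `UNIQUE (workspace_id, name, version)` constraint, and the helper names `connect`, `publish_schema`, and `create_payload` are assumptions for illustration, not part of this issue.

```python
import json
import sqlite3
import uuid

# Hypothetical DDL: TEXT columns holding JSON stand in for PostgreSQL JSONB.
DDL = """
CREATE TABLE schemas (
    id           TEXT PRIMARY KEY,
    name         TEXT NOT NULL,
    version      INTEGER NOT NULL,
    definition   TEXT NOT NULL,           -- JSONB in PostgreSQL; opaque to the database
    workspace_id TEXT NOT NULL,
    created_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    UNIQUE (workspace_id, name, version)  -- assumed: one row per published version
);
CREATE TABLE payloads (
    id         TEXT PRIMARY KEY,
    schema_ref TEXT NOT NULL REFERENCES schemas (id),  -- pins one exact schema version
    data       TEXT NOT NULL,
    meta       TEXT NOT NULL DEFAULT '{}',
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def connect() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(DDL)
    return conn


def publish_schema(conn: sqlite3.Connection, workspace_id: str, name: str, definition: dict) -> str:
    """Insert a new version as a new row with a new id; existing rows are never updated."""
    (latest,) = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM schemas WHERE workspace_id = ? AND name = ?",
        (workspace_id, name),
    ).fetchone()
    schema_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO schemas (id, name, version, definition, workspace_id) VALUES (?, ?, ?, ?, ?)",
        (schema_id, name, latest + 1, json.dumps(definition), workspace_id),
    )
    return schema_id


def create_payload(conn: sqlite3.Connection, schema_ref: str, data: dict, meta=None) -> str:
    """Store a payload row that references one specific schema version."""
    payload_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO payloads (id, schema_ref, data, meta) VALUES (?, ?, ?, ?)",
        (payload_id, schema_ref, json.dumps(data), json.dumps(meta or {})),
    )
    return payload_id
```

Publishing the same schema name twice yields two rows with distinct ids and versions 1 and 2, and a payload whose `schema_ref` matches no schema row is rejected by the foreign key.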

Architectural rules

  • One payload table for all schemas. JSONB. No table-per-schema.
  • Validation happens exactly once, at the write boundary: handlers call a single validate(payload, schema) and trust payload.data thereafter.
  • SchemaEngine interface (validate, describe_fields, default_value) with one default implementation. Pluggable for future engines.
  • API returns generic Payload shape; clients narrow at the edge.
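One way the rules above could fit together, sketched with the standard library only. The three method names come from this issue; everything else is an assumption made for illustration: the `Schema` and `Payload` dataclasses, the `ValidationError` type, the `SimpleFieldEngine` class, the `create_payload` handler, and above all the `{"fields": {...}}` definition format, since the issue treats `definition` as an opaque blob.

```python
from dataclasses import dataclass, field
from typing import Any, Protocol


class ValidationError(ValueError):
    """Raised at the write boundary when payload data does not fit its schema."""


@dataclass(frozen=True)
class Schema:
    id: str
    name: str
    version: int
    definition: dict[str, Any]


@dataclass
class Payload:
    schema_ref: str
    data: dict[str, Any]
    meta: dict[str, Any] = field(default_factory=dict)


class SchemaEngine(Protocol):
    """Pluggable interface; any object with these three methods qualifies."""

    def validate(self, payload: Payload, schema: Schema) -> None: ...
    def describe_fields(self, schema: Schema) -> list[dict[str, Any]]: ...
    def default_value(self, schema: Schema) -> dict[str, Any]: ...


_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


class SimpleFieldEngine:
    """Hypothetical default engine for definitions shaped like
    {"fields": {"title": {"type": "string", "required": True, "default": ""}}}."""

    def validate(self, payload: Payload, schema: Schema) -> None:
        if payload.schema_ref != schema.id:
            raise ValidationError("payload references a different schema version")
        fields = schema.definition.get("fields", {})
        unknown = set(payload.data) - set(fields)
        if unknown:
            raise ValidationError(f"unknown fields: {sorted(unknown)}")
        for name, spec in fields.items():
            if name not in payload.data:
                if spec.get("required", False):
                    raise ValidationError(f"missing required field: {name}")
                continue
            value = payload.data[name]
            # bool is a subclass of int in Python, so reject it for non-boolean fields.
            wrong_bool = isinstance(value, bool) and spec["type"] != "boolean"
            if wrong_bool or not isinstance(value, _TYPES[spec["type"]]):
                raise ValidationError(f"field {name!r} must be of type {spec['type']}")

    def describe_fields(self, schema: Schema) -> list[dict[str, Any]]:
        return [
            {"name": name, "type": spec["type"], "required": spec.get("required", False)}
            for name, spec in schema.definition.get("fields", {}).items()
        ]

    def default_value(self, schema: Schema) -> dict[str, Any]:
        fields = schema.definition.get("fields", {})
        return {name: spec["default"] for name, spec in fields.items() if "default" in spec}


def create_payload(engine: SchemaEngine, schema: Schema, payload: Payload, store: list) -> Payload:
    """Write-boundary handler: the single place where validation runs."""
    engine.validate(payload, schema)
    store.append(payload)  # readers may trust payload.data from here on
    return payload
```

Because `SchemaEngine` is a `Protocol`, a future engine (for example one backed by JSON Schema) would only need to provide the same three methods; handlers such as `create_payload` would not change.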

Out of scope (separate follow-up issues)

  • Migrating existing Dataset/Record/Question to the new model.
  • Indexed projections for queryable JSONB fields.
  • Frontend <SchemaForm> / <SchemaView> (separate frontend issue).

Acceptance criteria

  • Alembic migration adds both tables.
  • The GET/POST endpoints listed above work; OpenAPI documents them.
  • SchemaEngine interface has at least one impl with tests for validate/describe/default.
  • Integration test: create schema → create payload referencing schema → validation rejects bad payload, accepts good one.
