Rule: when chart data stays inline vs. requires a dataset (expressibility boundary) #2502

@os-zhuang

Why

Charts can carry their data query inline (self-contained: objectName + aggregate + filter) or reference a named dataset. We need one mechanical, checkable rule so an author — especially the build agent — deterministically picks the right one. Without it we get drift both ways: datasets created for trivial queries (over-abstraction, hidden coupling) and inline queries that quietly can't express what's asked (under-abstraction).

Decision principle

The choice is made purely on expressibility: can the inline chart-query engine express this query? It is not made on reuse, on "a dashboard filter drives it", or on "it might change later".

"Should this be a governed / canonical metric" is a separate human promotion (save as dataset), layered on top — never something the agent infers from the query. The agent's rule is expressibility only.

The rule (draft — details below to standardize)

Default inline. A dataset is required only when the query cannot be expressed inline.

Inline envelope (what a chart's own query may hold)

  • exactly one object (+ its direct lookup fields for filtering / labels)
  • filters: field/op/value, and/or compound — all resolvable on that object (or a direct lookup)
  • group by: 0..N fields on that object
  • aggregate: one function (count/sum/avg/min/max) over one field
  • sort + limit

Fits entirely inside this → inline; do not create a dataset.
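
As a minimal sketch, the envelope above could be written down as a type. Only `objectName`, `aggregate` and `filter` are named in this issue; every other identifier here (`InlineChartQuery`, `FieldFilter`, `combine`, `filtersStayOnObject`, the dotted lookup-path convention) is a hypothetical illustration, not the ObjectChart / analytics.query contract. The one-hop default mirrors the tentative "one hop inline" suggestion under the details to standardize and is not a decided rule.

```typescript
// Hypothetical types: only objectName, aggregate and filter come from the issue text.
type AggregateFn = "count" | "sum" | "avg" | "min" | "max";

interface FieldFilter {
  field: string; // a field on the object, or a dotted direct lookup such as "owner.name" (assumed convention)
  op: string; // the operator vocabulary is not specified in the issue
  value: unknown;
}

interface CompoundFilter {
  combine: "and" | "or";
  filters: Filter[];
}

type Filter = FieldFilter | CompoundFilter;

interface InlineChartQuery {
  objectName: string; // exactly one object
  filter?: Filter;
  groupBy: string[]; // 0..N fields
  aggregate: { fn: AggregateFn; field: string }; // one function over one field
  sort?: { field: string; direction: "asc" | "desc" };
  limit?: number;
}

// Collect every field path that a filter tree touches.
function filterFields(filter: Filter): string[] {
  if ("combine" in filter) {
    return filter.filters.reduce<string[]>(
      (paths, child) => paths.concat(filterFields(child)),
      []
    );
  }
  return [filter.field];
}

// Number of lookup hops in a dotted path: "status" has 0, "owner.name" has 1.
function lookupHops(path: string): number {
  return path.split(".").length - 1;
}

// True when every filter field is on the object or within maxHops lookups of it.
function filtersStayOnObject(query: InlineChartQuery, maxHops = 1): boolean {
  const paths = query.filter ? filterFields(query.filter) : [];
  return paths.every((path) => lookupHops(path) <= maxHops);
}

// Example: count of open tasks per status, filtered through one direct lookup.
const openTasksByStatus: InlineChartQuery = {
  objectName: "Task",
  filter: {
    combine: "and",
    filters: [
      { field: "closed", op: "eq", value: false },
      { field: "owner.team", op: "eq", value: "Support" },
    ],
  },
  groupBy: ["status"],
  aggregate: { fn: "count", field: "id" },
  sort: { field: "status", direction: "asc" },
  limit: 20,
};
```

A filter on a path such as `project.account.industry` has two hops and would fail `filtersStayOnObject` under the default, which is the case the filter-depth question below still has to settle.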

Dataset triggers (any one → dataset)

  1. Join across objects that changes grain — mixes rows/fields from >1 object, beyond "aggregate a child by a parent field". e.g. revenue (Invoice) by industry (Account).
  2. Computed / derived column — a field that must be expressed, not stored (margin = revenue - cost; a CASE bucket).
  3. Aggregate of an aggregate (multi-level) — e.g. average of per-account totals.
  4. Window / sequential — rolling N-day, running total, rank, period-over-period.
  5. Pivot / reshape — rows→columns (matrix), or a grain group-by can't produce.
  6. Union of sources — one series combining rows from two objects.
  7. A parameter that reshapes the query (grain / window / join), not one that plugs into a WHERE (that is a dashboard filter, covered by #2501, "Dashboard-level filters (date / region) driving multiple charts").
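
The "any one → dataset" rule can be sketched as a small classifier. This is an illustration under assumptions: `QueryNeed`, its fields, and the trigger labels are hypothetical names invented here, and how a real query gets reduced to these flags is exactly what the pinned envelope has to define.

```typescript
// Hypothetical trigger labels, one per numbered trigger above.
type Trigger =
  | "grain-changing-join"
  | "computed-column"
  | "aggregate-of-aggregate"
  | "window"
  | "pivot"
  | "union"
  | "reshaping-parameter";

// Hypothetical description of what a chart needs from its query.
interface QueryNeed {
  joinChangesGrain: boolean;
  computedColumns: string[]; // e.g. ["margin"]
  aggregateLevels: number; // 1 = a plain aggregate, 2+ = aggregate of an aggregate
  windowOps: string[]; // e.g. ["running_total"]
  pivot: boolean;
  unionedObjects: string[]; // objects whose rows are combined into one series
  parameters: { name: string; affects: "where" | "grain" | "window" | "join" }[];
}

// Return every trigger the need fires; an empty list means it fits inline.
function datasetTriggers(need: QueryNeed): Trigger[] {
  const fired: Trigger[] = [];
  if (need.joinChangesGrain) fired.push("grain-changing-join");
  if (need.computedColumns.length > 0) fired.push("computed-column");
  if (need.aggregateLevels > 1) fired.push("aggregate-of-aggregate");
  if (need.windowOps.length > 0) fired.push("window");
  if (need.pivot) fired.push("pivot");
  if (need.unionedObjects.length > 1) fired.push("union");
  // A parameter that only plugs into a WHERE is a dashboard filter, not a trigger.
  if (need.parameters.some((p) => p.affects !== "where")) fired.push("reshaping-parameter");
  return fired;
}

// Default inline; dataset only when at least one trigger fires.
function placement(need: QueryNeed): "inline" | "dataset" {
  return datasetTriggers(need).length === 0 ? "inline" : "dataset";
}

// Revenue (Invoice) by industry (Account): the join changes grain.
const revenueByIndustry: QueryNeed = {
  joinChangesGrain: true,
  computedColumns: [],
  aggregateLevels: 1,
  windowOps: [],
  pivot: false,
  unionedObjects: [],
  parameters: [],
};

// Open tasks by status, driven by a dashboard region filter: stays inline.
const tasksByStatus: QueryNeed = {
  joinChangesGrain: false,
  computedColumns: [],
  aggregateLevels: 1,
  windowOps: [],
  pivot: false,
  unionedObjects: [],
  parameters: [{ name: "region", affects: "where" }],
};
```

Note that reuse and "might change later" have no field in `QueryNeed`, so they cannot influence the result, which matches the decision principle.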

Explicitly NOT triggers (stay inline)

Details to standardize (the point of this issue)

The rule above is the shape; the edges need pinning to the real engine:

  • Pin the inline envelope to what ObjectChart / analytics.query actually supports — this is the source of truth, not the prose above. Enumerate: relationship rollups, multiple series, multiple group-by fields, HAVING-style filters on the aggregate, date-bucketing (by month/quarter).
  • Is a direct parent→child relationship aggregate inline or a join trigger? (e.g. "task count per project" over a lookup.) Decide and document; it's the most common ambiguous case.
  • Computed column boundary — display formatting / unit conversion (still inline?) vs. a real derived expression (dataset). Where's the line.
  • Filter depth on lookups — one hop inline; multi-hop → dataset?
  • Promotion flow — inline → save as dataset: how the chart switches from an inline query to datasetRef, and whether it's reversible.
  • Encoding — write the rule into the objectstack-ui skill (datasets/analytics section) so the build agent reads it (same pattern as the react-blocks contract).
  • Optional enforcement — an os build lint that flags (a) a dataset created for an inline-expressible query, and (b) an inline query using a construct the engine can't express (should have been a dataset).
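
The two lint findings in the last bullet could look roughly like the sketch below. Everything here is hypothetical: `ChartUnderLint`, the construct names, and `lintChart` are invented for illustration, and the set of inline constructs is a placeholder for whatever the pinned engine envelope turns out to be.

```typescript
type Finding = "over-abstraction" | "under-abstraction";

// Hypothetical lint input: where the chart's data lives and which constructs its query uses.
interface ChartUnderLint {
  name: string;
  source: "inline" | "dataset";
  constructs: string[]; // e.g. ["filter", "groupBy", "window"]
}

// Placeholder envelope; the real list must come from the engine, not from this sketch.
const INLINE_CONSTRUCTS: ReadonlyArray<string> = [
  "filter",
  "groupBy",
  "aggregate",
  "sort",
  "limit",
];

function lintChart(chart: ChartUnderLint): Finding | null {
  const needsDataset = chart.constructs.some(
    (construct) => INLINE_CONSTRUCTS.indexOf(construct) === -1
  );
  // (a) a dataset created for a query that the inline engine could express
  if (chart.source === "dataset" && !needsDataset) return "over-abstraction";
  // (b) an inline query that uses a construct the engine cannot express
  if (chart.source === "inline" && needsDataset) return "under-abstraction";
  return null;
}

const trivialDataset: ChartUnderLint = {
  name: "Open tasks by status",
  source: "dataset",
  constructs: ["filter", "groupBy", "aggregate"],
};

const inlineRunningTotal: ChartUnderLint = {
  name: "Cumulative revenue",
  source: "inline",
  constructs: ["aggregate", "window"],
};
```

A dataset that a human deliberately promoted to a governed metric would be reported as over-abstraction by this check as written, so a real lint would likely need some way to mark promoted datasets as exempt.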

Acceptance

Given a data need, the build agent deterministically picks inline vs dataset per the pinned envelope; a lint (if built) flags both over- and under-abstraction; the ambiguous cases above have a documented answer.

Related: #2501 (dashboard-level filters — the "parameter that plugs into a WHERE" side of trigger #7).
