feat(observability): per-request performance timing → Server-Timing (perf-tuning mode)

## Problem

We currently cannot answer "where did a request's time go?" from inside the platform. When the cloud control-plane data API feels slow (e.g. a `/api/v1/data/sys_environment` console request measured at ~360ms of *row-count-independent* server time on top of network), external black-box probing **cannot** break it down. The cost is split across auth/session resolution, org-scope `beforeFind` hooks, the `find` + `count` SQL, and serialization — all invisible from outside.

Performance analysis should be a **basic platform capability**: a "perf-tuning mode" where an operator can see, per request, exactly how the time was spent — without standing up an external tracing backend.

## What already exists (reuse, don't reinvent)

- `MetricsRegistry` (counter/histogram/gauge) — `packages/observability/src/contracts.ts:40`
- `instrumentRouteHandler` already times **every** HTTP request and assigns `X-Request-Id` — `packages/runtime/src/observability/instrument.ts:52` (emits `http_request_duration_ms`)
- `HookMetricsRecorder` already times **every** hook execution — `packages/objectql/src/hook-metrics.ts:41`
- `ObservabilityServicePlugin` registers the metrics service — `packages/runtime/src/observability/observability-service-plugin.ts:71`

## What's missing (this card)

1. **No per-request `Server-Timing` response header.** Metrics only flow to aggregate OTLP; nothing comes back to the browser, so a developer can't open DevTools → Network and see the breakdown for *the request they just made*.
2. **The SQL driver does not time individual queries.** `packages/plugins/driver-sql/src/sql-driver.ts` never wires knex `.on('query'|'query-response')`, so we have no per-query duration or per-request query **count** — the single most useful number for diagnosing "N sequential round-trips".
3. **No request-scoped collector.** No `AsyncLocalStorage` threads a "current request" through `find`/`count`/hook/driver, so per-query timings can't be attributed back to the originating HTTP request.

## Proposed design

A request-scoped **timing collector**, gated, that emits a standard `Server-Timing` header.

- **Collector**: an `AsyncLocalStorage<TimingCollector>` in `@objectstack/runtime` (or `@objectstack/observability`). When inactive, `record()` is a no-op → **zero overhead** when the mode is off.
- **Entry/exit**: `instrumentRouteHandler` (`instrument.ts:52`) opens the scope at request start; at response it serializes the collected spans into a `Server-Timing` header. The dispatcher result already supports `headers` — `http-dispatcher.ts` (`HttpDispatcherResult.response.headers`).
- **SQL spans**: wire knex `.on('query')` + `.on('query-response')` in `SqlDriver` (constructor, `sql-driver.ts:523`), keyed by `__knexQueryUid`, recording `{ sql, durationMs }` into the active collector. Surfaces total DB time + **query count** + slowest query.
- **Hook spans**: feed the existing `HookMetricsRecorder` durations (`hook-metrics.ts`) into the collector too — splits "hook time" from "DB time".
- **Auth/session span**: time the identity/session resolution in the dispatcher (`http-dispatcher.ts` executionContext resolution) — the prime suspect for the unexplained overhead.
- **Header shape** (RFC: `Server-Timing`):
  ```
  Server-Timing: auth;dur=42, db;dur=210;desc="6 queries", hooks;dur=18, serialize;dur=7, total;dur=355
  ```

## Gating

- Off by default. Enabled per-request via a debug header (`X-OS-Debug-Timing: 1`) **requiring admin/service identity**, or globally via an env flag (`OS_PERF_TIMING=1`) for an environment under investigation.
- Optional richer JSON (per-query SQL list, slowest query) only for admin/service — never leak SQL text to ordinary users.

## Open-core boundary

This is a **mechanism** → lives in the framework (the request path, driver, engine all do). Policy (who may see it, whether it's exposed in the cloud console as a "perf panel") is **cloud-side**. Mirrors the existing observability split (mechanism in framework, exporter/gating in cloud — see `docs/design/observability.md`).

## Acceptance criteria

- [ ] With the mode on, a `/api/v1/data/<object>` response carries a `Server-Timing` header decomposing total into at least: auth, db (with query count), hooks, serialize.
- [ ] Per-query SQL timing attributed to the originating request (works under concurrency — ALS, not globals).
- [ ] Zero measurable overhead when the mode is off (no ALS scope, no-op `record()`).
- [ ] Gated to admin/service identity (or env flag); SQL text never exposed to non-admins.
- [ ] Unit test: collector accumulates + serializes; integration test: header present on a real find and reflects query count.

## Why now

This is the tool that finally breaks down the ~360ms control-plane data-path overhead we've been chasing by external probing (see cloud perf investigation). It's both the diagnostic we need *and* a permanent platform capability.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(observability): per-request performance timing → Server-Timing (perf-tuning mode) #2408

Problem

What already exists (reuse, don't reinvent)

What's missing (this card)

Proposed design

Gating

Open-core boundary

Acceptance criteria

Why now

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

feat(observability): per-request performance timing → Server-Timing (perf-tuning mode) #2408

Description

Problem

What already exists (reuse, don't reinvent)

What's missing (this card)

Proposed design

Gating

Open-core boundary

Acceptance criteria

Why now

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions