You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
improvement(logs): move per-block progress markers to Redis to cut write amplification (#5248)
* improvement(logs): move per-block progress markers to Redis to cut write amplification
Per-block lastStartedBlock/lastCompletedBlock markers were persisted via a
jsonb_set UPDATE on workflow_execution_logs on every block start and complete
(~2N UPDATEs per run) — the heaviest write query in the DB. These are live
progress breadcrumbs with no DB-polling consumer (live progress comes from the
executor over WebSocket); their only durable value is a breadcrumb folded into
the final record.
Behind the redis-progress-markers flag, markers now live in Redis during the run
and are folded into the single terminal UPDATE at completion, dropping per-run
row UPDATEs from ~2N+1 to 1.
- New progress-markers module: HASH execution:progress:{id}, atomic Lua
monotonic-guard writes preserving the existing <= ordering, reservation-aligned
TTL backstop, graceful no-op when Redis is unavailable
- Deterministic GC: cleared at every terminal/pause boundary; TTL covers crashes
- Flag resolved once per logging session so a run never mixes write paths
- Fold markers into the completion record (Redis wins, falls back to row markers)
- Merge live markers for in-flight detail reads
- Extract shared getExecutionReservationTtlMs so marker and admission-slot TTLs
share one source of truth
* fix(logs): SQL fallback when Redis marker write fails, fold markers on force-fail, validate marker shape
Addresses review feedback on the redis-progress-markers PR:
- persistLast* now falls back to the jsonb_set UPDATE when Redis is unavailable or the write fails (setLast* returns whether it persisted), so a marker is never dropped when the flag is on without a healthy Redis.
- markExecutionAsFailed folds live Redis markers into execution_data before clearing, so the last-started/last-completed breadcrumb survives the force-fail path.
- getProgressMarkers validates marker shape (rebuilds from typed fields), so a stale or wrong-shaped Redis value can never reach API consumers.
* chore(logs): convert inline marker comments to TSDoc
* fix(logs): preserve markers when the completion read fails
getProgressMarkers now returns null on a Redis read error (vs {} for genuinely empty). completeWorkflowExecution and markExecutionAsFailed skip clearProgressMarkers when the read returns null, so a transient read error at completion no longer wipes markers that are still durably in Redis — the TTL reclaims them instead.
* fix(logs): resolve marker store split-brain by latest-timestamp-wins + drain on force-fail
- When a Redis marker write falls back to SQL, Redis and the row can each hold a marker for a different block; reads/folds previously preferred Redis unconditionally and could pick a stale value. Now the completion fold, the in-flight detail read, and the force-fail fold all pick the marker with the later timestamp (pickLatestStartedMarker/pickLatestCompletedMarker; markExecutionAsFailed uses a monotonic SQL guard).
- markAsFailed now drains pending per-block marker writes (not just the completion promise) before folding, so a force-fail racing onBlockStart/onBlockComplete still captures the latest breadcrumb.
* fix(logs): harden Lua marker guard against non-table decoded values
Guard the monotonic-check index with type(decoded) == 'table' so a corrupted Redis field that decodes to a non-table (e.g. a number) can't error the eval; our write path only ever stores JSON objects, so this is defense-in-depth.
* perf(logs): skip completion Redis read/clear when markers went to SQL
completeWorkflowExecution now takes readProgressMarkers (the session's resolved marker mode); when the flag is off it skips the per-completion HGETALL+DEL entirely instead of probing a key that was never written. Sticky to the session so it stays flip-safe (an execution that wrote to Redis always folds+clears Redis). Non-session callers default to true (safe read-and-fold). Also hardened the Lua guard with type(decoded)=='table'.
Copy file name to clipboardExpand all lines: apps/sim/lib/core/config/env.ts
+1Lines changed: 1 addition & 0 deletions
Original file line number
Diff line number
Diff line change
@@ -77,6 +77,7 @@ export const env = createEnv({
77
77
TABLE_SNAPSHOT_CACHE: z.boolean().optional(),// Mount tables into sandboxes by reference via a version-keyed CSV snapshot in object storage instead of draining the whole table into web-process heap
78
78
PII_REDACTION: z.boolean().optional(),// Redact PII from workflow logs via configurable Data Retention rules (Presidio at the logger persist choke point) and expose the Data Retention config UI
79
79
TRIGGER_EU_REGION: z.boolean().optional(),// Route Trigger.dev runs to eu-central-1 instead of the default us-east-1 (fallback for the trigger-eu-region flag when AppConfig is not the source of truth)
80
+
REDIS_PROGRESS_MARKERS: z.boolean().optional(),// Write per-block live progress markers to Redis instead of jsonb_set UPDATEs on workflow_execution_logs (fallback for the redis-progress-markers flag when AppConfig is not the source of truth)
80
81
81
82
// Table feature limits (per plan). Apply when billing is disabled (free tier defaults) or for billed plans.
82
83
FREE_TABLES_LIMIT: z.number().optional(),// Max user tables per workspace on free tier (default: 5)
0 commit comments