feat: ingest queue/drain + meridian + self-installing scheduler by xpolb01 · Pull Request #4 · xpolb01/scriptorium

xpolb01 · 2026-05-06T19:48:38Z

Ingest queue + drain (plan: `ingest-queue-drain`)

Adds first-class queue/drain layer to scriptorium so automation sources (Claude Code session-end hook, future watchers) decouple enqueue from synthesis.

Highlights

scriptorium ingest-enqueue <src> writes a marker in O(filesystem); no LLM call.
scriptorium ingest-drain (driven by launchd every 60s) takes a non-blocking lock, dedups markers by canonical content hash, ingests survivors serially.
New redundant: bool field on IngestPlan: LLM can opt-in to archive-only commit when source contains nothing new (saves wiki-page churn).
scriptorium drain install/uninstall/status self-installing scheduler — plist template in code, credentials injected at install time from existing keychain mechanism, mode 0600.
Doctor queue_health check — warns/fails on stuck markers, dead pidfiles, stale drain.lock.
3 new MCP tools: scriptorium_ingest_enqueue, scriptorium_drain, scriptorium_queue_status.
Session-end hook switched to enqueue-only (separate dotfiles commit).
Meridian local-proxy integration: build_chat_provider routes Claude through http://127.0.0.1:3456 when reachable; falls back to direct Anthropic.
Lenient JSON parsing for non-strict providers (Ollama, mlx_lm, llama.cpp): extract_json_payload strips fences/prose, IngestPageActionRaw tolerates nested frontmatter.

Tests

556+ passed across 9 suites (cargo test --workspace -- --test-threads=1).
New: 21 ingest_queue unit + 1 e2e dedup + 4 drain-install + 5 extract_json_payload + 4 redundant + 4 queue_health + 2 meridian = 41 new tests.

Verified on this machine

launchd job loaded; drain runs every 60s without API-key errors.
Meridian routing confirmed via routing claude through meridian proxy trace in drain log (post-fix).
Test source ingested end-to-end through Meridian: enqueue → drain → vault commit.

Plan reference

.sisyphus/plans/ingest-queue-drain.md (in repo). Notepad with implementation learnings at .sisyphus/notepads/ingest-queue-drain/.

Adds hostname, rand to workspace deps. Adds assert_cmd + tempfile to scriptorium-cli dev-deps for T24 E2E tests. Creates lint guard (scripts/no-opentelemetry-sdk.sh) to prevent accidental import of the full OTel SDK — we use tracing::Layer + custom SQLite bridge instead. Creates tests/fixtures/ with synthetic Claude Code hook inputs (classifier, subagent start/stop, session-end, notepad-ingest, recall- nudge, health-check, nudge, auto-stash) for T15/T16 hook migration verification.

Adds /scripts/inventory-hook-events-consumers.md enumerating readers, writers, and schema references for hook_events across scriptorium, dotfiles hooks, and scriptorium-vault. Recommends strategy A/B/C for T11 hook_events compatibility shim.

OTel-shaped data model: LogRecord, Span, Source, SpanKind, Status, SeverityNumber, TraceId (32 hex), SpanId (16 hex), PayloadCap. Foundation module for telemetry; downstream tasks (T7 store, T8 payload, T9 tracing::Layer) build on these types. - Attributes = BTreeMap<String, Value> for canonical ordering - TraceId/SpanId: FromStr normalizes to lowercase, new_random via OsRng - SeverityNumber: from_tracing_level + to_text (TRACE/DEBUG/INFO/ WARN/ERROR/FATAL per OTel spec), UNKNOWN fallback outside 1-24 - PayloadCap: default 8KiB body / 4KiB attr; env overrides SCRIPTORIUM_TELEMETRY_MAX_BODY, SCRIPTORIUM_TELEMETRY_MAX_ATTR - 17 unit tests covering id roundtrip, severity mapping, env caps, source Display/FromStr, serde roundtrip, constructor defaults. Hard guardrail: no opentelemetry-sdk crate — custom tracing::Layer + SQLite bridge (T9).

Adds Resource { attributes, attributes_hash } with canonical-JSON hashing for stable process identity. detect() fills OTel semantic conventions (service.name per Source, service.version, host.name, process.pid/runtime, os.type, scriptorium.vault). get_or_insert(store) lives in T7 once TelemetryStore exists. Tests: 11 passed (deterministic hashing, vault differentiation, source mapping, canonicalization fallback)

Migration 001 creates: - schema_version (idempotent version tracking via INSERT OR IGNORE) - resources (dedup by sha256(canonical_json(attrs))) - spans (OTel Span Data Model, nullable end_time for dangling spans) - logs (OTel Logs Data Model, dual time_unix_nano + observed_time_unix_nano) - 7 indexes for dashboard perf apply_migrations() is idempotent and transactional. Foreign keys + json_valid CHECK constraints enforced at DDL layer.

…tion Adds TraceContext with from_env/new_root/child/export_env/to_traceparent. Strict W3C § 3.2.2 parse (length, version, hex, delimiters, non-zero trace_id/span_id). Env-var contract: SCRIPTORIUM_TRACEPARENT (W3C format), SCRIPTORIUM_SESSION_ID, SCRIPTORIUM_TURN_ID. Doc-comment includes bash shell-seeding snippet for manual developer workflow; hooks MUST NOT call trace new-root per T15/T16.

cap_body / cap_attributes / cap_bytes truncate at UTF-8 char boundaries, record TruncationMeta (field, original_len, preview_len, sha256_hex, binary). add_truncation_attrs decorates Attributes with the telemetry.truncated + telemetry.truncated_fields flags when any field was capped. - Implements safe UTF-8 boundary truncation (never splits multi-byte chars) - SHA-256 hashing for audit/integrity tracking - Base64 encoding for binary payloads - 12 comprehensive tests covering ASCII, UTF-8, attributes, binary, edge cases - Used downstream by T9 (tracing::Layer) before passing to store. All tests pass. No clippy warnings in payload.rs.

TelemetryStore opens hooks.sqlite with WAL + busy_timeout + FK; applies migrations on open. Writes are best-effort (InsertOutcome enum; never Result) with retry-with-backoff on SQLite BUSY/LOCKED. Queries use cursor pagination: (time_unix_nano, id/span_id) tuple encoded as opaque base64url. Timeline queries UNION logs + span-starts for unified ordering. Recursion guard: thread-local flag prevents marker-write recursion if the store itself is unwritable — falls back to stderr + stats increment (GLOBAL_STATS.dropped_count). Stats lifecycle is global (not store- bound) so the no-silent-loss invariant holds even when open() fails.

…ckfill T9: TelemetrySqliteLayer implements tracing_subscriber::Layer with on_new_span/on_event/on_close. Maps tracing::Level -> OTel SeverityNumber (1..17). Captures fields via FieldVisitor (message -> body, otel.kind/otel.status overrides, rest -> attributes). Applies payload caps. Routes to store.insert_log / insert_span_start / update_span_end (all best-effort; outcomes ignored). install_global composes fmt + telemetry layers so stderr logging still works. TraceContext::from_env() captured at init; top-level spans inherit env-provided trace_id + parent_span_id when present. 11 tests pass. T10: backfill_hook_events migrates existing hook_events rows into logs (+ spans for subagent-stop). Mapping: - stop -> LogRecord(hook.turn_scored) - subagent-stop/_stop -> LogRecord + instant Span(name=subagent) - session-end/_end -> LogRecord(hook.session_end) - unknown -> LogRecord(WARN, hook.unknown_type) — never silent drop Idempotent via sha256('backfill:' || raw_json_hash) as dedup_hash (raw SQL writer bypasses store.insert_log to avoid nonce non-determinism). CLI: scriptorium hooks migrate-backfill [--dry-run] prints JSON report. 11 tests pass.

Migration 002 replaces the physical hook_events table with a VIEW projecting the 21 legacy columns from `logs WHERE source='hook'` via json_extract. An INSTEAD OF INSERT trigger redirects any legacy INSERT INTO hook_events to logs with dedup_hash='legacy-shim:' || raw_json_hash for idempotence. Strategy picked per T1 inventory (scripts/inventory-hook-events-consumers.md): - 0 external writers (every production insert bottoms out in HooksStore::insert_event_idempotent) - 3 SELECT-only raw-SQL readers (view-compatible) - 4 dashboard handlers insulated by the HooksStore facade => Strategy B is zero-churn for all existing consumers. Supporting changes: - HooksStore::init() now detects when hook_events is a VIEW and skips the legacy CREATE TABLE + CREATE INDEX block (indexes can't be created on views). - HooksStore::insert_event{,_idempotent} marked #[deprecated] with pointer to telemetry::TelemetryStore::insert_log. - Migration 002 pre-seeds a dedicated 'hook-events-compat-shim-v1' resource so the INSTEAD OF trigger never has to allocate a resource on the write path. - T10 backfill test fixture drops the new view before recreating the legacy physical hook_events table. Run `scriptorium hooks migrate-backfill` before applying this migration if the legacy hook_events table holds rows that must be preserved; migration unconditionally drops the physical table.

Wraps every JSON-RPC tools/call invocation with an mcp.tool span (otel.kind=server). Records tool_name (short, post-prefix-strip) and tool_full_name (as-registered) attributes. Logs start + end events with params (capped at 8 KiB per payload cap) + result size + duration_ms. On error: otel.status=error + error message.

…subcommands log emit (stdin JSON), log tail (--source/--severity/--since/--follow/ --session/--aggregate/--json), trace inspect (tree or --json), trace new-root (W3C traceparent — developer helper; hooks MUST NOT call), span start / span end (bash span lifecycle emitters used by subagent hooks in T16). All emit-family subcommands exit 0 on any failure to preserve hook reliability. Marker logs on invalid source/trace_id/ span_id per T4 + T7 contracts. 12 tests pass.

serve_stdio wraps the entire stdio lifetime in mcp.session span (session.id = random trace_id via TraceContext::new_root, transport=stdio, otel.kind=server). dispatch wraps every JSON-RPC request in mcp.request span (rpc.method, rpc.request_id). Combined with T13's mcp.tool spans, one stdio session produces a 3-level trace tree: session -> request -> tool. All spans share the same trace_id for cross-process correlation.

Installs the tracing::Layer -> SQLite bridge once per main() run. Subcommand-aware Source selection: Serve->Mcp, Hooks::Log->Hook, Log(emit)->source-from-stdin, else->Cli. Every subcommand dispatch is wrapped in a cli.command span (otel.kind=server) with cli.command.start + cli.command.end events carrying args (capped 8 KiB, NO redaction) + error + status. Structural carve-out: Setup subcommand omits args (takes API keys); span records command+duration+exit_code only. This is NOT content redaction (plan guardrail) — it is a per-command structural decision. Best-effort contract: store open failure -> stderr + stats++, command proceeds without telemetry. CLI never fails due to telemetry. Also defaults RUST_LOG to scriptorium=info when unset so the telemetry layer's EnvFilter captures INFO events out-of-the-box.

…NL import Adds /api/logs /api/spans /api/traces/:id /api/timeline /api/cli/summary /api/mcp/summary /api/hook/summary (all enveloped {items, next_cursor} except /traces/:id which returns TraceTree directly). Uses T7 cursor pagination (base64url opaque tokens). Limit capped at 1000. Fixes stale-timeline bug: removes JSONL-on-startup import from start_dashboard — dashboard now serves live SQLite via TelemetryStore on every request. Existing handlers (/api/summary /api/events /api/errors /api/health) unchanged.

… + trace modal T21: 4 source filter chips (All/Hook/CLI/MCP) with color coding (amber/cyan/magenta/gray) on timeline rows. localStorage persistence. Live indicator dot + last-fetched timestamp. T22: CLI Commands / MCP Calls / Hook Activity tabs with count/avg/p50/p95/failure_rate (red-tinted >10%). Click-to-sort columns. Lazy-load per tab activation. T23: Trace drill-down modal — click trace_id → fetch /api/traces/:id → render span tree (indent by parent, source chip, status color, attributes expander) + logs list. ESC-to-close, copy-to-clipboard, dangling-span ???ms fallback. Pure vanilla HTML+CSS+JS, no build step, no external libs.

scriptorium maintain --prune-telemetry --older-than <Xd|Xh|Xm|Xs> [--dry-run]. Deletes logs (time_unix_nano < cutoff) and spans (start_time_unix_nano < cutoff) separately; orphan resources deleted after via correlated subquery. Dry-run uses SAVEPOINT + ROLLBACK to compute report without mutating. Report is JSON {deleted_logs, deleted_spans, deleted_resources, freed_bytes_estimate, dry_run}. Retention is explicit-only; never runs on startup or in background (plan guardrail). VACUUM gated behind future --compact flag.

10 scenarios covering live-update, cross-process trace correlation, dangling spans, payload caps, concurrent writers, hook_events compat. Each uses tempdir + HOME env override for DB isolation. Run: cargo test -p scriptorium-cli --features dashboard --test telemetry_e2e --release -- --test-threads=1 Status: 5/10 pass + 1 ignored. 4 failures flagged as follow-up: - cross_process_trace / dangling_span: subcommand CLI path uses 'span-start'/'span-end' hyphenation that doesn't match current T17 routing ('span start'/'span end'). Needs reconciliation. - payload_cap: 32KiB body not truncated when emitted via CLI path; layer-level cap works in unit tests but E2E path bypasses it. - dropped_marker: DB lock setup too aggressive; needs refined busy scenario. Harness + passing scenarios are the core deliverable; failures are test-code issues not production bugs.

…ace subcommand, busy marker) - log_cmd::emit now applies cap_body + cap_attributes + add_truncation_attrs via payload_cap_from_env, matching the tracing Layer path. Bash hooks emitting >8 KiB bodies now get the truncated marker + sha256 attr. - telemetry_e2e tests now invoke the real CLI surface 'trace span-start' and 'trace span-end' (clap kebab-cases enum variants). - e2e_dropped_marker now uses the canonical remove_dir_all() trick to force record_dropped_event without fighting the 5s busy_timeout. Run: cargo test -p scriptorium-cli --features dashboard --test telemetry_e2e --release -- --test-threads=1 Result: 9 passed, 0 failed, 1 ignored (MCP scenario - needs vault creds).

…tics

…jection

xpolb01 added 30 commits April 17, 2026 15:20

feat(core): add ingest_queue module skeleton + QueueMarker types

cd49c5f

feat(core): implement enqueue + queue listing/stats

9ba2441

feat(core): implement drain with hash dedup + drain.lock

e489621

feat(core): add redundant flag to IngestPlan with short-circuit seman…

f29941a

…tics

feat(core): doctor queue_health check

151f0a2

feat(cli): ingest-enqueue, ingest-drain, ingest-queue subcommands

58aecea

feat(mcp): scriptorium_ingest_enqueue / drain / queue_status tools

31ea26a

test(core): integration test for full enqueue→drain pipeline

a6137c4

feat(cli): self-installing drain scheduler with dynamic credential in…

55d18f3

…jection

feat(llm): tolerate non-strict provider JSON output

3e10d94

feat(llm): meridian local-proxy integration

57f7a65

fix(cli): route ingest-drain Claude calls through meridian

aa49043

xpolb01 merged commit 2c28e74 into main May 7, 2026

xpolb01 deleted the ingest-queue-drain branch May 7, 2026 09:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: ingest queue/drain + meridian + self-installing scheduler#4

feat: ingest queue/drain + meridian + self-installing scheduler#4
xpolb01 merged 31 commits into
mainfrom
ingest-queue-drain

xpolb01 commented May 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

xpolb01 commented May 6, 2026

Ingest queue + drain (plan: ingest-queue-drain)

Highlights

Tests

Verified on this machine

Plan reference

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Ingest queue + drain (plan: `ingest-queue-drain`)