feat: semantic layer compiler (Slice 1) #22

Open
kosminus wants to merge 1 commit into main from feat/semantic-layer-compiler

@kosminus

Owner

What

Adds a semantic layer compiler. It introspects an operational database (schema, column statistics, view definitions, query logs), runs deterministic inference, and proposes draft semantic-layer objects: inferred join paths, metrics, value dictionaries, glossary entities, and refusal boundaries (PII masking, tenant row filters, dead tables, fan-out warnings). Each draft carries evidence and a confidence score, and is reviewed through a new Compiler page before anything is created.

Why

QueryWise's hardest cold-start problem is building the semantic layer: a new customer connects an operational DB and faces an empty glossary, no metrics, and no dictionaries. Operational schemas are hostile in ways warehouses aren't (no declared FKs, int-coded statuses, soft deletes, tenant columns), and that hostility is exactly the signal a compiler can mine. Crucially, the generated guardrails (fan-out warnings, PII masking, dead tables) prevent the most common classes of silently-wrong text-to-SQL answers.
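To make the fan-out case concrete, here is a minimal sketch of the double-counting a fan-out warning is meant to prevent. The `orders` / `order_items` tables and their rows are hypothetical and are not part of this PR or the opsdb fixture; SQLite is used only so the example is self-contained.

```python
import sqlite3


def fanout_demo() -> tuple[float, float]:
    """Return (naive_sum, correct_sum) for a parent measure joined 1:N."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one order has many items, the measure lives on the parent.
    conn.executescript(
        """
        CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL);
        CREATE TABLE order_items (id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT);
        INSERT INTO orders VALUES (1, 100.0), (2, 50.0);
        INSERT INTO order_items VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 1, 'c'), (4, 2, 'd');
        """
    )
    # Joining to the child table repeats each order row once per item,
    # so the parent measure is summed three times for order 1.
    naive = conn.execute(
        "SELECT SUM(o.total) FROM orders o JOIN order_items i ON i.order_id = o.id"
    ).fetchone()[0]
    # Aggregating the parent table on its own gives the intended figure.
    correct = conn.execute("SELECT SUM(total) FROM orders").fetchone()[0]
    conn.close()
    return naive, correct


if __name__ == "__main__":
    naive, correct = fanout_demo()
    print(f"naive={naive} correct={correct}")
```

The query runs without error and returns a plausible-looking number, which is why this class of mistake stays silent unless the prompt carries explicit guidance.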

Changes

Engine (backend/app/semantic_compiler/ — self-contained, no FastAPI/ORM imports, standalone-CLI extractable)

  • Collectors: catalog (via existing connector introspection), pg_stats/CHECK IN-lists/enums/unique indexes, pg_get_viewdef, pg_stat_statements — each degrades to empty when unavailable and the run records which sources answered
  • sqlmeta.py: sqlglot analysis (join pairs, aggregates, GROUP BY, WHERE) with graceful degradation, mirroring lineage_service
  • Inference: join paths without FKs (naming convention ~0.45 + value-overlap probe +0.35 + log co-occurrence +0.15; a failed probe kills the candidate), dictionaries (enum/CHECK/lookup-table label probing/most_common_vals, handling negative n_distinct), view→metric extraction (aggregates/dimensions/canonical filters), recurring log aggregates, dead tables, tenant scoping (call-weighted log confirmation required), PII (name + sampled value shape), fan-out warnings (1:N parent-measure double-counting)
  • Output hard-capped per kind with a confidence threshold — 40 good drafts beat 400 mediocre ones
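The join-path scoring described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the engine's code: the weights come from the bullet above (with the naming weight given there only as approximately 0.45), while the additive combination, the `JoinCandidate` type, and the `score_join` name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Weights quoted in the PR description; how they are combined is assumed here.
NAMING_BASE = 0.45      # naming-convention match (approximate per the description)
OVERLAP_BONUS = 0.35    # value-overlap probe succeeded
LOG_BONUS = 0.15        # tables co-occur in logged queries


@dataclass
class JoinCandidate:
    child_column: str
    parent_column: str
    naming_match: bool
    overlap_probe: Optional[bool] = None  # None means the probe did not run
    seen_in_logs: bool = False


def score_join(candidate: JoinCandidate) -> Optional[float]:
    """Return a confidence score, or None when the candidate is rejected."""
    # A failed value-overlap probe kills the candidate outright.
    if candidate.overlap_probe is False:
        return None
    score = 0.0
    if candidate.naming_match:
        score += NAMING_BASE
    if candidate.overlap_probe:
        score += OVERLAP_BONUS
    if candidate.seen_in_logs:
        score += LOG_BONUS
    return round(score, 2)


if __name__ == "__main__":
    strong = JoinCandidate("orders.customer_id", "customers.id", True, True, True)
    rejected = JoinCandidate("orders.customer_id", "customers.id", True, False, True)
    print(score_join(strong), score_join(rejected))
```

Treating a failed probe as a veto rather than a penalty keeps a name-only coincidence from surviving once the data contradicts it.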

LLM annotation (app/llm/agents/semantic_annotator.py): names/describes only — output merges onto naming fields, structurally unable to invent tables/joins/values; runs complete without a provider
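One way to get the "structurally unable to invent" property is to merge LLM output through an allowlist of naming fields. The sketch below assumes findings and annotations are plain dicts and that `name` and `description` are the naming fields; both are illustrative assumptions, and `merge_annotation` is a hypothetical helper, not a function from `semantic_annotator.py`.

```python
# Hypothetical allowlist: only these keys may be overwritten by the LLM.
NAMING_FIELDS = frozenset({"name", "description"})


def merge_annotation(finding: dict, annotation: dict) -> dict:
    """Copy the finding and apply only allowlisted, non-empty string fields."""
    merged = dict(finding)
    for key, value in annotation.items():
        # Structural keys (tables, joins, values) are ignored whatever the LLM returns.
        if key in NAMING_FIELDS and isinstance(value, str) and value.strip():
            merged[key] = value.strip()
    return merged


if __name__ == "__main__":
    finding = {"name": "sum_total", "description": "", "table": "orders"}
    annotation = {"name": "Customer Lifetime Value", "table": "made_up_table"}
    print(merge_annotation(finding, annotation))
```

Because the merge ignores everything outside the allowlist, an empty annotation leaves the finding unchanged, which matches runs completing without a provider.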

Staging review flow (migration 013, compilation_service.py)

  • New compilation_runs / compilation_findings tables; findings become real semantic objects only on explicit accept (draft metrics/glossary feed the query-pipeline context builder today, so unreviewed output stays out)
  • Accept dispatches through existing creation paths (embedding + lineage), landing as status='draft' for normal certification; data policies created disabled; fan-out guidance becomes a knowledge document (injected into SQL prompts via existing RAG)
  • Accepted findings are name-keyed and rematerialized after every re-introspection (introspect_and_cache wipes the schema cache, cascading to inferred relationships and dictionary entries); cached_relationships gains origin/confidence/cardinality/evidence
  • Background job (semantic_compilation) with in-memory progress, registered for both in-process and arq backends

API + frontend

  • /connections/{id}/compilation/runs (+ get), /compilation/findings (+ /accept, /dismiss, /bulk)
  • New Compiler page: run button, live progress banner, findings grouped by kind with confidence bars and expandable evidence, per-kind bulk accept/dismiss

Test fixture + eval

  • opsdb: hostile operational schema in the sample-db container (no FKs, tenant_id, soft deletes, lookup tables, business-logic views, dead customers_bak); pg_stat_statements now preloaded; run_ops_workload.py populates query logs. Init scripts only apply on a fresh volume, so run docker compose down -v first
  • eval_compiler_ifrs9.py: scores recovery of the IFRS 9 seed metadata with declared FKs hidden — relationships 5/5 recall @ 100% precision, dictionary 79% recall / 89% precision, glossary table-coverage 10/10, confidence calibrated (0.81 correct vs 0.60 incorrect)
  • 26 new unit tests (no DB/LLM); full suite 256 passing; frontend tsc/ESLint/build clean
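For readers unfamiliar with how recall and precision figures like the ones above are computed, here is a minimal sketch over sets of relationship edges. It is not taken from `eval_compiler_ifrs9.py`; the `precision_recall` helper and the example edges are hypothetical.

```python
def precision_recall(predicted: set, expected: set) -> tuple[float, float]:
    """Precision = correct / predicted, recall = correct / expected."""
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical edges, written as (child column, parent column) pairs.
    expected = {("loans.customer_id", "customers.id"), ("payments.loan_id", "loans.id")}
    predicted = {("loans.customer_id", "customers.id"), ("loans.branch", "customers.id")}
    print(precision_recall(predicted, expected))
```

Reporting both numbers matters here because the output is capped per kind: a compiler can reach perfect precision by proposing very little, and recall exposes that.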

Verification

Live run against opsdb (LLM off): all 9 inferred joins correct with zero spurious edges; dictionaries with labels probed from lookup tables; tenant row-filter draft; dead-table and fan-out findings. Accept flows verified for every kind, including rematerialization after re-introspect. LLM-annotated run produced grounded names ("Customer Lifetime Value") without touching structure.

🤖 Generated with Claude Code

Attacks the cold-start problem: point QueryWise at an operational DB
with an empty semantic layer and get reviewable draft objects with
evidence and confidence scores.

- Engine (app/semantic_compiler/): self-contained collectors
  (catalog, pg_stats/CHECK/enums, view definitions, pg_stat_statements)
  + deterministic inference: join paths without FKs (naming +
  value-overlap probe + log co-occurrence), value dictionaries,
  view/log metric extraction, dead tables, tenant scoping, PII,
  fan-out warnings. LLM pass names/describes only — never invents.
- Staging review flow: compilation_runs/compilation_findings
  (migration 013); findings become semantic objects only on accept,
  landing as status='draft'; policies created disabled. Accepted
  findings are name-keyed and rematerialize after re-introspection.
- API (/connections/{id}/compilation/*) + Compiler page (progress,
  findings grouped by kind with evidence, bulk accept/dismiss).
- opsdb fixture: hostile operational schema + pg_stat_statements
  workload script; eval harness scores recovery of the IFRS 9 seed
  (relationships 5/5 @ 100% precision with FKs hidden).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>