Migrate to elide toolkit; build engine resource model + orchestrator #287

Merged: martsokha merged 15 commits into main from migrate/elide on Jun 26, 2026

@martsokha martsokha commented Jun 24, 2026

Summary

Pivots the runtime off the in-tree nvisy-{core,context,pattern,ner,llm,codec,toolkit} crates and onto the upstream elide toolkit, then builds the full engine — multi-tenant resource registry, two-phase run orchestrator, request-shaped analyzer + override surface — on top of it. By the time this PR lands the workspace builds end-to-end on elide; the server runs against the new engine surface; the CLI parses the example config.

Workspace shape on this branch

  • elide is the toolkit. elide, elide-core, elide-ner, elide-ocr, elide-stt come from upstream; the runtime never re-derives recognition, modality, codec, redaction, or provenance primitives.
  • nvisy-core slims to {error, health, policy, plan, schema}. Wire shapes + policy vocabulary; everything else (primitives, modality markers, recognition contracts) is elide_core::*.
  • nvisy-engine is the runtime brain. Layout:
    • analyzer/ — compile per-request AnalyzerParams into elide::detection::Analyzer<M> per modality (text / tabular / image / audio), wiring pattern + NER + LLM recognizers and OCR / STT enrichers.
    • anonymizer/ — compile Policy sets into elide::redaction::Anonymizer<M> per modality. Pattern-matches degenerate Predicate::LabelOneOf / Predicate::TagOneOf shapes onto elide's indexed with_label / with_tag fast paths; composites fall through to with_catalog_predicate. Single PolicyHashAlgorithm → Sha2Algorithm conversion via From impl (no duplicate helpers).
    • keyspace/ holds the policy, context, and file resource families as extension traits on the shared RegistryHandle. Versioned blobs keyed (actor_id, resource_id, version); files split metadata + bytes so list_files is cheap.
    • runs/ — two-phase lifecycle: start mints a UUIDv7 run, persists per-doc inputs, fans the analyzer out under futures::buffer_unordered with a tokio::time::timeout, lands AwaitingReview. apply resolves policies, layers reviewer overrides as high-precedence per-entity rules in front of the policy chain (decorator pattern, first-match-wins gives them priority), writes redacted bytes back through FileRegistry, lands Applied / PartiallyApplied. Plus get, get_doc, list, cancel, delete, override_entity.
    • EngineHandle bundles RegistryHandle + FormatRegistry so handlers + orchestrator take a single dep.
  • elide-bento is a new local crate carrying vendor-specific BentoML NER + OCR backends.
  • nvisy-server runs on the new engine surface — every handler reaches state through EngineHandle extension traits; files are first-class inputs/outputs via FileLineage::RedactedFrom.
  • nvisy-cli parses Nvisy.toml (example shipped, parser test green), wires the server.

Policy + request shapes

  • UUID identity end-to-end. Policy and Rule are UUID-keyed; reviewer overrides and policy rules both stamp Attribution { policy_id, reason } so every redaction traces to the exact rule that fired.
  • Composable Predicate. One field per rule, not a Label/Tag/Predicate three-way kind enum. Indexed fast paths preserved for the degenerate shapes.
  • AnalyzerParams is map-by-kind. Recognizers: pattern at-most-one (Option), ner + llm lists. Enrichers: language, ocr, stt each at-most-one. Per-recognizer labels removed — scope + label_catalog is the sole source of truth.
  • AnalyzerOverrides is the per-request layer. ScalarOverride { Inherit, Replace, Remove } for single-value fields; CollectionOverride { Inherit, Replace, Patch { extend, remove } } for lists with selector-keyed patches. Folds onto the deployment default — clients say only what they want different.
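
A minimal sketch of how the two override enums could fold onto a deployment default. Only the variant names come from this PR; the `fold` methods, the `Named` trait, and keying `Patch.remove` by item name are assumptions made for illustration, not the shipped API.

```rust
/// Per-request override for a single-value field (variant names from the PR).
#[derive(Debug, Clone, PartialEq)]
pub enum ScalarOverride<T> {
    Inherit,
    Replace(T),
    Remove,
}

/// Per-request override for a list field (variant names from the PR).
#[derive(Debug, Clone, PartialEq)]
pub enum CollectionOverride<T> {
    Inherit,
    Replace(Vec<T>),
    Patch { extend: Vec<T>, remove: Vec<String> },
}

/// Hypothetical trait: how a patch selector identifies an item.
pub trait Named {
    fn name(&self) -> &str;
}

// Plain strings name themselves, which keeps the sketch self-contained.
impl Named for String {
    fn name(&self) -> &str {
        self
    }
}

impl<T> ScalarOverride<T> {
    /// Fold the override onto the deployment default (hypothetical helper).
    pub fn fold(self, default: Option<T>) -> Option<T> {
        match self {
            ScalarOverride::Inherit => default,
            ScalarOverride::Replace(value) => Some(value),
            ScalarOverride::Remove => None,
        }
    }
}

impl<T: Named> CollectionOverride<T> {
    /// Fold the override onto the deployment default list (hypothetical helper).
    pub fn fold(self, default: Vec<T>) -> Vec<T> {
        match self {
            CollectionOverride::Inherit => default,
            CollectionOverride::Replace(items) => items,
            CollectionOverride::Patch { extend, remove } => {
                // Drop the selected defaults first, then append the extensions.
                let mut out: Vec<T> = default
                    .into_iter()
                    .filter(|item| !remove.iter().any(|n| n.as_str() == item.name()))
                    .collect();
                out.extend(extend);
                out
            }
        }
    }
}
```

With this shape a client that sends `Inherit` everywhere gets the deployment default back unchanged, which matches the "say only what they want different" intent.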

Pairs with upstream

  • nvisycom/elide#93 adds Anonymizer::with_catalog_predicate so closures see the per-anonymizer LabelCatalog (unblocks Predicate::TagOneOf inside composites).
  • Upstream split landed during this branch: elide-detection, elide-redaction, elide-orchestration, elide-lingua joined the workspace; Anonymizer / Analyzer moved to elide::redaction::* / elide::detection::*; Hash / HashAlgorithm renamed to Sha2Hash / Sha2Algorithm behind a sha2 feature.

Commits (newest first)

  • aa2180bb deny: unblock cargo deny check all (publish = workspace; allow 0BSD)
  • 034c93b2 docs: fix rustdoc intra-doc links across nvisy-{core,engine}, elide-bento
  • 8598d9a9 deny: allow elide git source; drop three unused license entries
  • 8e81274e core,server,engine: map-by-kind plan, request overrides, upstream elide bump
  • 989cbc7c engine,server: EngineHandle, file-bytes lifecycle, server port
  • 00483228 style: cargo fmt across nvisy-core, nvisy-engine, elide-bento
  • 82ba6941 engine: EngineHandle bundle; resources as extension traits
  • 002ba1a0 workspace: drop local elide-fake; consume from upstream
  • ad7e9418 core,engine: engine resource model + run orchestrator
  • 786f4ad8 core,engine,bento: full operator set, analyzer plan, attribution wiring
  • dc167145 core,engine: pivot to schema-types-as-wire, conversions in core
  • 1033d9aa core: redesign policy module on elide-core types
  • 7b848254 workspace: delete superseded crates; add elide-bento; slim nvisy-core
  • cb4c7445 deps: add elide as upstream toolkit; wire engine to elide deps

Test plan

  • cargo build --workspace green
  • cargo clippy --workspace --all-targets -- -D warnings green
  • RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps green
  • cargo test -p nvisy-cli example_toml_parses passes
  • cargo deny check all exits 0 (warnings remain for upstream LGPL-3.0 SPDX deprecation and transitive crate version duplicates — neither fails the check)

Follow-ups

  • E4.1 / E4.2 — delete crates/nvisy-{core,context,pattern,ner,llm,codec,toolkit} legacy directories and elide-fake once nothing references them
  • E4.3 — merge migrate/elide into main

🤖 Generated with Claude Code

martsokha and others added 6 commits June 22, 2026 16:18
Workspace gains `elide`, `elide-core`, `elide-llm` as git deps tracking
`nvisycom/elide`'s main branch. `nvisy-engine`/`nvisy-server`/`nvisy-cli`
drop the per-modality `rich` feature (gone from elide; collapsed into
parent modalities with sub-handlers) and the LLM provider toggles
(`openai`/`anthropic`/`google`/`bento`) — all provider backends now
enabled by default through elide-llm.

Engine's manifest now consumes elide+elide-core+elide-llm in place of
the local `nvisy-{core,context,pattern,ner,llm,codec,toolkit}` crates.
Those local crates remain on disk and as workspace path-deps so the
not-yet-migrated consumers (server/cli/fake/ocr/stt/toolkit) keep
parsing; each leaves the workspace alongside its consumer's migration.

Engine source still imports `nvisy_*` paths and will not compile until
the import rewire pass lands (E3.2c).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…sy-core

Bulk teardown of the runtime's toolkit half now that elide ships
upstream equivalents.

Deleted:
  - nvisy-{pattern,llm,ner,ocr,stt}: superseded by elide-{pattern,llm,
    ner,ocr,stt} on nvisycom/elide main
  - nvisy-codec: superseded by elide-codec
  - nvisy-context: superseded by elide-context
  - nvisy-toolkit: superseded by the elide umbrella crate (Analyzer +
    Anonymizer + deduplication layers + operators all ship there)

Renamed:
  - nvisy-fake -> elide-fake (runtime-owned extension over elide types;
    source still uses nvisy_core paths and will be reworked in its own
    pass)

Created:
  - elide-bento: shared BentoML HTTP client wrapper for elide backends
    (per E0.3 plan; minimal boilerplate -- BentoClient + BentoParams +
    BentoError; per-modality backends compose from this in the consuming
    crates)

Slimmed nvisy-core to {error, health, policy}:
  - dropped entity/extraction/modality/primitive/recognition/redaction
    (all re-exported from elide-core at consumer sites)
  - moved nvisy-engine/src/policy/ -> nvisy-core/src/policy/ (policy is
    the runtime's public governance contract; engine consumes it)
  - added elide-core as nvisy-core's only upstream dep so Policy types
    reference elide_core::entity::Label directly

State of the migration: workspace parses; elide-bento is the only crate
that compiles end-to-end. nvisy-core/engine/server/cli/elide-fake source
still imports deleted nvisy-* paths and will be redesigned crate-by-
crate on top of elide's Analyzer/Anonymizer/Orchestrator surface.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
nvisy-core trimmed to {service, policy, schema, context-gated-out}.
Policy is now the runtime's serializable governance spec, structurally
intact from before but rewired to reference elide vocabulary:

  - Label / LabelRef / LabelCatalog from elide_core::entity (renamed
    from EntityLabel*)
  - ConfidenceThreshold from elide_core::primitive
  - Modality + per-modality types (Text/Tabular/Image/Audio) from
    elide_core::modality
  - OperatorId from elide_core::redaction
  - HashAlgorithm mirrored locally (elide's enum is not serializable
    upstream by design; nvisy-core's wire spec owns its vocabulary)

PolicyModality replaces DocumentModality: a runtime-side extension of
elide_core::modality::Modality that pairs each modality with its
serializable redaction spec enum (TextRedaction, ImageRedaction, …).

TextRedaction::Redact renamed to Erase aligning with elide's operator
vocabulary (commit 3598798). No legacy alias — wire format is the
elide vocabulary.

Encrypt operator dropped from the wire enum: reversible AES-256-GCM
needs raw key material, can't safely live in declarative config.
Deployments register custom encrypt operators by OperatorId.

JsonSchema derives kept on policy types. Elide stays free of schemars
(schema generation is an HTTP concern the toolkit doesn't model);
nvisy-core proxies elide types via #[schemars(with = "...")] pointing
at lightweight proxy structs in nvisy-core::schema. Lifted schema to a
top-level module so anything in nvisy-core embedding elide types can
reuse it.

Service module added at crates/nvisy-core/src/service/{mod,error,health}
grouping the runtime's error + healthcheck concerns under one roof.

Context module gated out (// pub mod context) until it gets its own
redesign pass on elide types — its old body referenced deleted
nvisy_core::entity/primitive paths and missed the jiff dep.

nvisy-core now compiles standalone against elide + elide-core.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…core

nvisy-core wire shapes now embed `*Schema` types directly (no
`#[schemars(with = "...")]` overrides). schema.rs gains round-trip
`From`/`Into` impls so engine consumes nvisy-core specs as elide
types via `.into()`. The dual model (embed elide + proxy for
schema) is gone; one type owns each field.

- LabelSchema, OperatorIdSchema, PointSchema, BoundingBoxSchema,
  PolygonSchema, TimeSpanSchema, LanguageTagSchema in
  nvisy_core::schema, each with From/Into to its elide-core
  counterpart (LanguageTag uses TryFrom for parse failures).
- Policy.labels: Vec<LabelSchema>; selector confidence is f32;
  selector labels and tags become Vec<String> (HipStr/LabelRef are
  internal vocabulary, not wire vocabulary).
- TextRedaction/ImageRedaction/AudioRedaction/TabularRedaction
  `Custom { id }` carries OperatorIdSchema.
- PolicyModality trait removed: engine reads `redactions.text` etc.
  directly per modality; the projection trait was dispatch sugar
  not worth a trait hierarchy.
- EntitySelector::matches removed: matching runs in engine where
  the catalog + entity live; nvisy-core only owns the spec.
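
As a rough illustration of the round-trip conversions described above, the sketch below pairs a wire-side `*Schema` type with a stand-in for its elide-core counterpart. The field layouts, the `InvalidLanguageTag` error, and the toy validation rule are assumptions; the real elide-core types and their parsing are not shown in this PR.

```rust
use std::convert::TryFrom;

/// Wire-side shape (the real one would also derive serde + JsonSchema).
#[derive(Debug, Clone, PartialEq)]
pub struct BoundingBoxSchema {
    pub x: f32,
    pub y: f32,
    pub width: f32,
    pub height: f32,
}

/// Stand-in for the elide-core type; its real fields are an assumption here.
#[derive(Debug, Clone, PartialEq)]
pub struct BoundingBox {
    pub x: f32,
    pub y: f32,
    pub width: f32,
    pub height: f32,
}

impl From<BoundingBoxSchema> for BoundingBox {
    fn from(w: BoundingBoxSchema) -> Self {
        BoundingBox { x: w.x, y: w.y, width: w.width, height: w.height }
    }
}

impl From<BoundingBox> for BoundingBoxSchema {
    fn from(b: BoundingBox) -> Self {
        BoundingBoxSchema { x: b.x, y: b.y, width: b.width, height: b.height }
    }
}

/// Wire-side language tag: just a string until it is parsed.
#[derive(Debug, Clone, PartialEq)]
pub struct LanguageTagSchema(pub String);

/// Stand-in for the validated internal tag.
#[derive(Debug, Clone, PartialEq)]
pub struct LanguageTag {
    pub tag: String,
}

/// Hypothetical error carrying the rejected input.
#[derive(Debug, Clone, PartialEq)]
pub struct InvalidLanguageTag(pub String);

impl TryFrom<LanguageTagSchema> for LanguageTag {
    type Error = InvalidLanguageTag;

    fn try_from(wire: LanguageTagSchema) -> Result<Self, Self::Error> {
        // Toy check only: the primary subtag must be 2-3 ASCII letters.
        // Real BCP 47 parsing is considerably stricter.
        let primary = wire.0.split('-').next().unwrap_or("");
        let ok = (2..=3).contains(&primary.len())
            && primary.chars().all(|c| c.is_ascii_alphabetic());
        if ok {
            Ok(LanguageTag { tag: wire.0 })
        } else {
            Err(InvalidLanguageTag(wire.0))
        }
    }
}
```

Infallible pairs get `From` in both directions so the engine can call `.into()`; only the conversion that can fail on user input uses `TryFrom`.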

Engine teardown:

- Wiped crates/nvisy-engine/src/{core,detection,document,modality,
  redaction}. Old detection/redaction/document machinery is
  superseded by elide's Analyzer/Anonymizer/Orchestrator/Report.
- Salvaged registry/{composite_key, fjall_ext, paged} as
  multi-tenant storage primitives (no upstream equivalent;
  CompositeKey actor scoping is genuine engine value).
- Engine src/ skeleton now: lib.rs, registry/, policy_compile.rs
  placeholder for the upcoming compile pass.
- Engine Cargo.toml stripped to its actual deps: nvisy-core,
  elide+elide-core+elide-llm, derive_more, uuid, bytes, fjall.
- Old engine tests deleted (rebuilt as new modules land).

nvisy-core also gains:

- src/source.rs (ContentSource) restored from git history as a
  top-level module — content lineage tracking is service-level,
  not entity-vocabulary.
- src/service/{mod,error,health} groups the runtime's error +
  healthcheck concerns.

`cargo check -p nvisy-core` green; `cargo check -p elide-bento`
green; `cargo machete` reports no unused deps in nvisy-core. Engine
will be filled in module-by-module on top of the new compile seam.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…n wiring

nvisy-core:
  - schema/ becomes a folder module split by elide type: label,
    operator, geometry (Point/BoundingBox/Polygon), time, language,
    color, waveform. Each carries round-trip From/Into to its
    elide-core counterpart so engine consumes specs as elide types
    via `.into()`.
  - policy/redaction expands to the full operator catalogue elide
    ships: Text + Tabular get Erase/Keep/Mask/Replace/Hash/
    Pseudonymize/Encrypt (Tabular adds DropRow/DropColumn); Image
    gets Erase/Keep/Blur/Pixelate/Blackbox; Audio gets Erase/Keep/
    Silence/Beep. `Custom` escape hatches removed -- every wire
    operator is predefined and surfaces here when it lands in elide.
  - new plan/ module sibling to policy/: AnalyzerSpec ties together
    RecognizerSpec (Pattern/Ner/Llm with inline backend configs),
    EnricherSpec (Language/Ocr), DeduplicationSpec (calibrate +
    fusion strategy + resolution strategy + min_confidence), and
    ScopeSpec (languages + jurisdictions). Pure data; JsonSchema
    derives across.
  - ContentSource restored as top-level `source` module; context
    module re-enabled with elide-core primitive imports + schema
    proxies (BoundingBoxSchema, TimeSpanSchema, LanguageTagSchema,
    PolygonSchema, PointSchema) embedded as field types.

nvisy-engine:
  - registry/ salvaged from the pre-rebuild engine: CompositeKey
    actor scoping + fjall_ext utility wrappers + paged. Higher-
    level stores (content/audit/run) get redesigned alongside the
    request/result types they hold.
  - anonymizer/ folder module + per-modality compile_text/tabular/
    image/audio. Walks policies in precedence order; rules attach
    operators via the shared selector::attach helper (single-label/
    tag fast-path, predicate fallback). Stateful operators
    (Pseudonymize, Encrypt) reject with a clear "infrastructure
    not wired" error until vault + KeyProvider plumbing lands.
  - analyzer/ folder module + per-modality compile_*. Pattern
    (always Enhanced-wrapped, modality-generic across
    TextRecognizable), NER (Mock or Bento), LLM (Mock today; real
    providers reject pending credential wiring). Image gets the
    OCR enricher path so a Layout can be stamped before
    recognition runs. Tabular/Audio reject LLM (no upstream
    LlmModality impl).
  - .because(...) on every attached rule: per-rule attribution is
    `Attribution::new("{policy}#{rule}")`; per-policy fallback is
    `"{policy}#<default>"`. Engine threads it via selector::
    {rule_attribution, default_attribution}. Note: this conflates
    PolicyDecisionRef-shaped engine provenance into elide's
    author-facing Attribution slot -- the semantics need a real
    pass before this ships (TBD).

elide-bento:
  - dropped speculative BentoClient + BentoParams wrappers --
    backends now cache `bentoml::Endpoint` directly + clone per
    call to layer `x-request-id`.
  - BentoError moves to pub(crate); every public surface returns
    `Result<_, elide_core::Error>`; internal `?` keeps working via
    the existing From impl.
  - ner/ and ocr/ each split into mod.rs + request.rs + response.rs.
    `WireNerResponse::decode` and `WireOcrResponse::decode` are
    methods; `post_recognize` is a `BentoNer` method.
  - elide-bento depends on elide-ner + elide-ocr directly (it
    implements their backend traits); the umbrella reach is for
    consumers wiring recognizers, not impls.

Engine + bento + core all compile clean and machete-clean. elide-fake
deferred (source still uses pre-rework `nvisy_core::*` paths).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
End-to-end engine over the elide toolkit: persisted policies and
contexts as versioned resources, a run lifecycle that owns analyze
and apply per-document, and a policy module collapsed onto one
composable predicate per rule.

**nvisy-core policy redesign.** `Rule` carries one `predicate:
Predicate` instead of the old `RuleKind` three-way split
(Label/Tag/Predicate). One shape on the wire; the engine
recognises degenerate single-label / single-tag predicates at
compile time and routes them back through elide's `with_label`
and `with_tag` fast paths. Composite predicates (`All`, `Any`,
`Not`) over `LabelOneOf`, `TagOneOf`, `Confidence`, `CoRef`
compose freely — `TagOneOf` inside `All` now evaluates correctly
because the closure receives the per-anonymizer `LabelCatalog`
(elide change in nvisycom/elide#93). `DocumentPredicate` (label /
metadata gating at the doc level) lives alongside in
`policy/document.rs`. Identity is UUID-keyed end to end — every
`Policy`, every `Rule`, every reviewer override lands in the
redaction's `Attribution` so audits trace back to the exact rule
that fired.
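
The routing described above might look roughly like the following. This is a sketch under stated assumptions: the `Confidence { min }` shape, the `AttachPath` enum, the `Entity` struct, and the map-based `Catalog` are hypothetical stand-ins, and `CoRef` is left out because its semantics are not described here.

```rust
use std::collections::HashMap;

/// Composable predicate; variant names follow the PR, payload shapes are assumed.
#[derive(Debug, Clone, PartialEq)]
pub enum Predicate {
    LabelOneOf(Vec<String>),
    TagOneOf(Vec<String>),
    Confidence { min: f32 },
    All(Vec<Predicate>),
    Any(Vec<Predicate>),
    Not(Box<Predicate>),
}

/// Hypothetical description of which attach path a rule would take.
#[derive(Debug, PartialEq)]
pub enum AttachPath<'a> {
    Label(&'a str), // would route to the indexed with_label path
    Tag(&'a str),   // would route to the indexed with_tag path
    Composite,      // would fall through to with_catalog_predicate
}

/// Recognise the degenerate single-label / single-tag shapes.
pub fn classify(p: &Predicate) -> AttachPath<'_> {
    match p {
        Predicate::LabelOneOf(labels) if labels.len() == 1 => AttachPath::Label(&labels[0]),
        Predicate::TagOneOf(tags) if tags.len() == 1 => AttachPath::Tag(&tags[0]),
        _ => AttachPath::Composite,
    }
}

/// Minimal entity stand-in.
pub struct Entity {
    pub label: String,
    pub confidence: f32,
}

/// Stand-in for the label catalog: label name -> tags it carries.
pub type Catalog = HashMap<String, Vec<String>>;

/// Evaluate a predicate; tag checks need the catalog, which is why the
/// composite closure has to receive it.
pub fn evaluate(p: &Predicate, e: &Entity, catalog: &Catalog) -> bool {
    match p {
        Predicate::LabelOneOf(labels) => labels.iter().any(|l| *l == e.label),
        Predicate::TagOneOf(tags) => catalog
            .get(&e.label)
            .map_or(false, |have| have.iter().any(|t| tags.contains(t))),
        Predicate::Confidence { min } => e.confidence >= *min,
        Predicate::All(ps) => ps.iter().all(|q| evaluate(q, e, catalog)),
        Predicate::Any(ps) => ps.iter().any(|q| evaluate(q, e, catalog)),
        Predicate::Not(q) => !evaluate(q, e, catalog),
    }
}
```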

**nvisy-engine registry.** `RegistryHandle` opens the fjall
database and pre-opens six keyspaces — policies, contexts,
run_headers, run_docs, run_artifacts, run_inputs. Cheaply cloneable
(`Arc`-backed). Keys are `CompositeKey(actor, id)`, `TripleKey(actor,
run, doc)`, or `VersionedKey(actor, id, semver)` depending on the
resource shape.

**Resources.** `policies::{put, get, latest, list, delete}` and
`contexts::*` are symmetric: immutable per `(actor, id, version)`;
duplicate writes return `Conflict`; lookups by `(id, version)` or
the latest version via a prefix range scan.
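
To make the resource semantics concrete, here is a small in-memory sketch. It swaps fjall for a `BTreeMap` and models semver as a `(major, minor, patch)` tuple; `VersionedStore`, `PutError`, and the method signatures are hypothetical and only mirror the behaviour described (immutable per key, `Conflict` on duplicates, latest via a range scan over the `(actor, id)` prefix).

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

/// (major, minor, patch); tuple ordering matches version ordering here.
pub type Version = (u64, u64, u64);
/// (actor, id, version), mirroring the VersionedKey shape from the PR.
pub type VersionedKey = (String, String, Version);

#[derive(Debug, PartialEq)]
pub enum PutError {
    Conflict,
}

/// In-memory stand-in for one versioned keyspace.
#[derive(Default)]
pub struct VersionedStore {
    rows: BTreeMap<VersionedKey, Vec<u8>>,
}

impl VersionedStore {
    /// Versions are immutable: writing an existing key is a conflict.
    pub fn put(
        &mut self,
        actor: &str,
        id: &str,
        version: Version,
        blob: Vec<u8>,
    ) -> Result<(), PutError> {
        match self.rows.entry((actor.to_string(), id.to_string(), version)) {
            Entry::Occupied(_) => Err(PutError::Conflict),
            Entry::Vacant(slot) => {
                slot.insert(blob);
                Ok(())
            }
        }
    }

    /// Latest version = last row inside the (actor, id) prefix range.
    pub fn latest(&self, actor: &str, id: &str) -> Option<(Version, &[u8])> {
        let lo = (actor.to_string(), id.to_string(), (0, 0, 0));
        let hi = (actor.to_string(), id.to_string(), (u64::MAX, u64::MAX, u64::MAX));
        self.rows
            .range(lo..=hi)
            .next_back()
            .map(|((_, _, v), blob)| (*v, blob.as_slice()))
    }
}
```

Because the actor is the leading key component, one tenant's scan never touches another tenant's rows.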

**Run orchestrator.** `runs::start(handle, formats, actor, batch)`
mints a UUIDv7 run, persists every input's bytes plus a Queued per-
doc row, writes the run header in `Analyzing`, then fans the
analyzer out per document under `futures::buffer_unordered` with
a hard `tokio::time::timeout`. Each per-doc task decodes via the
codec, picks the modality from `handle.is::<M>()`, compiles the
modality-specific analyzer from the `AnalyzerSpec`, recognises
entities via `analyze_stream`, persists them as `EntityRecord<M>`,
and transitions the row to `AwaitingReview` (or `Failed{reason}`
/ `TimedOut`). When the fan-out drains the header flips to
`AwaitingReview`.

`runs::apply` resolves every referenced policy, fans per-doc
anonymise out the same way, layers reviewer overrides as
high-precedence per-entity rules in front of the policy chain
(decorator pattern — first-match-wins gives overrides priority
without rewriting the policy set), filters policies by
`applies_when` against the merged descriptor + per-request
metadata, runs `Anonymizer::anonymize(&mut handle, &mut entities)`,
encodes back to bytes, writes the redacted artifact to
`run_artifacts`, and transitions the header to `Applied` or
`PartiallyApplied`. `runs::override_entity` lets reviewers stamp a
`RuleAction` onto a single recognised entity by id.
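
The override layering can be sketched as a first-match-wins lookup over two borrowed slices. Everything below is illustrative: `Rule`, `override_rule`, `label_rule`, and `resolve` are hypothetical names, ids are plain integers instead of UUIDs, and the real pipeline attaches rules to elide's `Anonymizer` rather than scanning a list.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum RuleAction {
    Erase,
    Keep,
    Mask,
}

pub struct Entity {
    pub id: u32, // UUID in the real engine; an integer keeps the sketch short
    pub label: String,
}

pub struct Rule {
    /// Who is responsible for the decision (policy#rule or reviewer override).
    pub attribution: String,
    pub applies: Box<dyn Fn(&Entity) -> bool>,
    pub action: RuleAction,
}

/// A reviewer override is just a rule that matches exactly one entity id.
pub fn override_rule(entity_id: u32, action: RuleAction) -> Rule {
    Rule {
        attribution: format!("override#{}", entity_id),
        applies: Box::new(move |e: &Entity| e.id == entity_id),
        action,
    }
}

/// A policy rule matching every entity that carries a given label.
pub fn label_rule(attribution: &str, label: &str, action: RuleAction) -> Rule {
    let label = label.to_string();
    Rule {
        attribution: attribution.to_string(),
        applies: Box::new(move |e: &Entity| e.label == label),
        action,
    }
}

/// Overrides are chained in front of the policy rules, so the first match
/// wins and the policy slice is only borrowed, never cloned or rewritten.
pub fn resolve<'a>(overrides: &'a [Rule], policies: &'a [Rule], e: &Entity) -> Option<&'a Rule> {
    overrides.iter().chain(policies.iter()).find(|r| (r.applies)(e))
}
```

An entity without an override falls through to the policy chain untouched, which is the property the decorator arrangement is meant to preserve.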

The four per-modality `compile_*` and per-modality `attach_*`
helpers under `engine::anonymizer/` keep the analyzer- and
anonymizer-compile surfaces split for clarity; the apply pipeline
reaches the `attach_policies_*` and `attach_override_*` entry
points directly so it can layer overrides before policies without
cloning the policy set.

Workspace `cargo check`/`build`/`clippy` are green on
`nvisy-engine`; pre-existing `elide-fake` breakage against the
old toolkit nvisy-core API surface is tracked separately under
the E4.1 deletion of legacy crates.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@martsokha martsokha self-assigned this Jun 24, 2026
@martsokha martsokha added labels Jun 24, 2026: feat (request for or implementation of a new feature); refactor (code restructuring without behavior change); core (content model, errors, shared types); engine (redaction engine, pipeline runtime, orchestration, configuration); architecture (architectural decision records and cross-cutting design issues)
martsokha and others added 8 commits June 25, 2026 00:55
Upstream nvisycom/elide#99 added elide-fake as a peer crate behind
the `fake` feature flag on the `elide` umbrella, re-exporting
`Fake` from `elide::anonymizer::operators::Fake` alongside the
other built-in operators. The local copy was a port of the same
crate kept here while the upstream shape settled; with that done
there is no reason to maintain two copies.

Removes the workspace member entry + dependency declaration and
deletes `crates/elide-fake/`. No runtime crate consumed it
locally, so no callsite migration is needed; future consumers
enable `elide`'s `fake` feature and reach `elide::anonymizer::
operators::Fake`. `cargo check -p nvisy-engine` stays green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two related cleanups to the engine's entry-point surface.

**EngineHandle.** The orchestrator entry points used to take a
`(handle: &RegistryHandle, formats: &FormatRegistry)` pair, which
threaded down into every fan-out helper — `start`, `apply`, and
the per-doc and per-fanout helpers each tripped clippy's
`too_many_arguments` (8 / 9 / 8 args respectively). `EngineHandle`
bundles both into one cheaply-cloneable handle (the persistence
registry plus an `Arc<FormatRegistry>`). Two constructors:
`open(path)` defaults to elide's `FormatRegistry::with_builtin()`
for production; `with_formats(path, registry)` lets tests inject
fake codecs. `runs::start/apply/override_entity` now take
`&EngineHandle`; the single-call `fanout_analyze` / `fanout_apply`
helpers got inlined since their abstraction value was thin.

Three `too_many_arguments` warnings cleared; the engine surface
is now what the server module will consume verbatim.

**Resources as extension traits.** `policies::{put,get,latest,
list,delete}` and `contexts::{put,...}` were free functions that
took `&RegistryHandle` as the first arg — the universal sign that
they should be methods on the handle. Converted to two extension
traits, `PolicyRegistry` and `ContextRegistry`, both implemented
on `RegistryHandle`. Methods are namespaced (`get_policy`,
`put_context`) so they don't collide when both traits are in
scope and they read naturally at the call site. Native async fn
in traits (Rust 1.95+) — no `async_trait` macro, matching the
project style.

Call sites change from
`crate::policies::get(handle, actor, id, ver).await?` to
`handle.get_policy(actor, id, ver).await?`. The only existing
internal call site (`runs::orchestrate::resolve_policies`) is
updated; external consumers (when the server module lands) bring
the trait into scope with
`use nvisy_engine::{PolicyRegistry, ContextRegistry};`.
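
For readers unfamiliar with the pattern, a stripped-down sketch of resource families as extension traits on one handle follows. It is synchronous, drops the version argument, and stores strings in a shared map; the real traits are async and backed by fjall, so treat every signature here as an assumption.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Stand-in for the engine handle: cheap to clone, shared state behind an Arc.
#[derive(Clone, Default)]
pub struct RegistryHandle {
    rows: Arc<Mutex<HashMap<String, String>>>,
}

impl RegistryHandle {
    fn put(&self, key: String, body: &str) {
        self.rows.lock().unwrap().insert(key, body.to_string());
    }

    fn get(&self, key: &str) -> Option<String> {
        self.rows.lock().unwrap().get(key).cloned()
    }
}

/// Policy family; method names carry the resource so traits never collide.
pub trait PolicyRegistry {
    fn put_policy(&self, actor: &str, id: &str, body: &str);
    fn get_policy(&self, actor: &str, id: &str) -> Option<String>;
}

/// Context family, symmetric with the policy family.
pub trait ContextRegistry {
    fn put_context(&self, actor: &str, id: &str, body: &str);
    fn get_context(&self, actor: &str, id: &str) -> Option<String>;
}

impl PolicyRegistry for RegistryHandle {
    fn put_policy(&self, actor: &str, id: &str, body: &str) {
        self.put(format!("policies/{}/{}", actor, id), body);
    }

    fn get_policy(&self, actor: &str, id: &str) -> Option<String> {
        self.get(&format!("policies/{}/{}", actor, id))
    }
}

impl ContextRegistry for RegistryHandle {
    fn put_context(&self, actor: &str, id: &str, body: &str) {
        self.put(format!("contexts/{}/{}", actor, id), body);
    }

    fn get_context(&self, actor: &str, id: &str) -> Option<String> {
        self.get(&format!("contexts/{}/{}", actor, id))
    }
}
```

With both traits in scope, `handle.get_policy(..)` and `handle.get_context(..)` sit side by side on the same value without ambiguity.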

`cargo build -p nvisy-engine` and `cargo clippy -p nvisy-engine`
are clean; the two remaining warnings are pre-existing unused-
code lints on `FjallKeyspaceExt` + `CompositeKey::actor_id`
unrelated to this slice.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Mechanical formatting reflows, no behavior change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Engine: cleaner core surface.

- EngineHandle bundles RegistryHandle + Arc<FormatRegistry>; orchestrator entry points take &EngineHandle. Cleared the too_many_arguments clippy lints; fanout helpers inlined into start/apply.
- runs::list, runs::cancel, runs::delete fill the remaining lifecycle gaps. list range-scans run_headers by actor; cancel marks Failed{reason:"cancelled"} from Analyzing/AwaitingReview only; delete cascades across run_docs only (files survive).
- FileRegistry + files_metadata/files_content keyspaces. Symmetric with PolicyRegistry/ContextRegistry.
- Files-as-input/output: DocumentInput is just a file_id; RunDocument carries input_file_id + output_file_id. runs::apply writes redacted bytes via FileRegistry::put_file with FileLineage::RedactedFrom { run_id, source_file_id }. run_artifacts and run_inputs keyspaces removed.
- Resource modules collapsed into nvisy_engine::keyspace (policy, file, context submodules) — one place for user-owned resource CRUD. RunRegistry stays in runs::persist as pub(crate); engine-managed state is a different contract from user-owned keyspaces.
- nvisy-core gains FileMetadata + FileLineage.
- Stream closures cleaned to capture owned values (cloned AnalyzerSpec, Arc<Vec<Policy>>, owned HashMap) so start/apply futures compose with axum's handler Send + 'static bounds.
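
The cancel rule above amounts to a guarded state transition. A minimal sketch, assuming a flat `RunStatus` enum and a hypothetical `InvalidTransition` error for the states the commit message does not cover (what the engine actually returns there is not stated):

```rust
/// Run header states named in this PR.
#[derive(Debug, Clone, PartialEq)]
pub enum RunStatus {
    Analyzing,
    AwaitingReview,
    Applied,
    PartiallyApplied,
    Failed { reason: String },
}

/// Hypothetical error for a cancel attempted from a terminal state.
#[derive(Debug, PartialEq)]
pub struct InvalidTransition {
    pub from: RunStatus,
}

/// Cancel is only legal while the run is still in flight or awaiting review.
pub fn cancel(status: &RunStatus) -> Result<RunStatus, InvalidTransition> {
    match status {
        RunStatus::Analyzing | RunStatus::AwaitingReview => Ok(RunStatus::Failed {
            reason: "cancelled".to_string(),
        }),
        other => Err(InvalidTransition { from: other.clone() }),
    }
}
```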

Server: ported end-to-end onto the new engine surface.

- ServiceState wraps EngineHandle.
- Routes: /files, /policies, /contexts, /detections, /redactions, /health. /detections/{id} returns the full run (header + every per-doc body inline) — no /documents/{doc} subroutes. /redactions returns one file id per input (download via /files/{id}/content).
- Response wrappers with full OpenAPI fidelity for run bodies: RunResponse, RunDocumentDto, DocBodyDto, per-modality entity records (TextEntityRecordDto / TabularEntityRecordDto / ImageEntityRecordDto / AudioEntityRecordDto) + per-modality locations. Provenance + recognized_range deferred (audit-internal).
- Handlers wire through the engine's trait surface (PolicyRegistry, FileRegistry, ContextRegistry, runs::start/apply/cancel/delete/list).

cargo build + clippy clean on engine + server. Two pre-existing engine warnings on FjallKeyspaceExt + CompositeKey helpers unchanged. Workspace check is blocked only by nvisy-cli (E3.4, deliberately out of scope).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…am elide bump

Plan layer (nvisy-core): RecognizerParams + EnricherParams now carry per-kind
slots (pattern at-most-one; ner/llm/enrichers as Options or Vecs by kind), every
*Spec renamed to *Params, and Stt joins the enricher set. Per-recognizer `labels`
removed — scope + label_catalog is the sole source of truth.

Server: per-request analyzer overrides (AnalyzerOverrides) layer onto the
deployment default. ScalarOverride / CollectionOverride enums (Inherit /
Replace / Remove / Patch) thread through every field; nested Recognizer /
EnricherOverrides match the plan shape; selector types key collection patches
by name.

Engine: analyzer + anonymizer compile paths consume the new shape; the
PolicyHashAlgorithm → Sha2Algorithm conversion lives in nvisy-core via From,
collapsing the two duplicate helpers.

Upstream elide bumped to 70e6e8bf: imports relocated to elide::redaction::*,
elide::detection::*, elide_core::operator::*, elide_core::entity::provenance::*;
backend modules in elide-ocr/elide-stt are now private (use the crate root);
Hash / HashAlgorithm operators renamed to Sha2Hash / Sha2Algorithm behind a
sha2 feature.

Dead-code sweep surfaced by clippy: FjallKeyspaceExt trait, the unused
CompositeKey accessors, Page::from_paged, and the manual Default impls on
*Override (replaced with derive + #[default]).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
cargo-deny rejected every elide crate (`source-not-allowed`) — the new
upstream split surfaced `elide-detection`, `elide-orchestration`,
`elide-redaction`, `elide-lingua` on top of the existing graph and the
`[sources].allow-git` list was empty. Allow the single elide repo URL;
every other git source still trips `unknown-git = "deny"`.

While here, drop `OpenSSL` / `bzip2-1.0.6` / `CDLA-Permissive-2.0` from
`[licenses].allow` — no crate in the graph carries them, so
`unused-allowed-license = "warn"` was flagging them every run.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ento

The Docs CI job runs with `-D warnings`; stale intra-doc links from
recent renames (E3.S keyspace collapse, E3.R-a Files-as-input/output,
upstream elide split) and the new bento OCR/NER modules tripped it.
Local rustdoc on this commit is clean.

- nvisy-core: `elide::redaction::operators::Hash` → `Sha2Hash`; drop
  the dangling `KeyProvider` link (the engine takes an `AesKey` at
  construction; no key-provider abstraction exists upstream).
- nvisy-engine/anonymizer: link `Anonymizer` via reference form;
  drop dangling links to private `text` / `tabular` / `image` /
  `audio` submodules and to the renamed `TextBacked` /
  `PolicyDecisionRef` symbols.
- nvisy-engine/keyspace: disambiguate `[`file`]` (also a macro) with
  `[mod@file]` form; same for `policy` / `context`. Qualify the
  `list_files` link with its registry trait.
- nvisy-engine/runs: doc API surface is `get` / `get_doc`, not
  `get_header` / `get_artifact`; correct module-doc reference list.
  Drop the dangling `RunRegistry::get_run` and `TripleKey` links to
  private items.
- elide-bento: drop module-doc references to private `request` /
  `response` submodules; only `BentoNer` / `BentoOcr` are public.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- nvisy-engine: align with the rest of the workspace by inheriting
  `publish = { workspace = true }` (= `publish = false` from the
  root). Was `publish = true`, which made cargo-deny treat it as a
  publishable crate and reject every `path = "..."` workspace dep
  with the `wildcard` lint.
- deny.toml: allow `0BSD` (BSD Zero Clause) — pulled in via
  `varint-rs` under `lsm-tree` under `fjall`. Strictly more
  permissive than MIT; OSI-approved.

`cargo deny check all` now reports `advisories ok, bans ok, licenses ok,
sources ok` locally.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@martsokha martsokha merged commit b473164 into main Jun 26, 2026
6 checks passed
@martsokha martsokha deleted the migrate/elide branch June 26, 2026 18:19