feat(store): persistent SQLite event log + JSON-file artifact store by dgenio · Pull Request #232 · dgenio/contextweaver

dgenio · 2026-05-16T16:59:13Z

Lands the store-only slice of the persistent-backends group (#42, #174,
#223). Two new protocol-conformant backends + an EventLog lifecycle
contract refinement; zero behavior change for existing in-memory users.

#223 — SqliteEventLog + shared _sqlite_base.py

store/_sqlite_base.py: connection helper (WAL, foreign_keys=ON, parent
dir auto-create) + apply_migrations() driven by a
_contextweaver_schema_version table. The other 3 SQLite stores in
epic [epic] SQLite persistent stores #174 reuse this.
store/sqlite_event_log.py: implements every EventLog protocol method
against a single-process SQLite file. Append-only writes via append()
(duplicate ids raise DuplicateItemError); insertion order preserved
by an auto-increment ordinal column; indexes on kind and parent_id.
Round-trips every ContextItem field (metadata + artifact_ref via
JSON columns).
[sqlite] extras placeholder in pyproject.toml (stdlib sqlite3 only;
placeholder exists so a future aiosqlite variant can opt in without
changing the install surface).

#42 — JsonFileArtifactStore

store/json_file_artifacts.py: filesystem-backed ArtifactStore.
Layout {base_dir}/{handle}.data + {base_dir}/{handle}.json. Auto-
creates base_dir; rejects handles containing path separators / ..
/ null bytes. Re-instantiating recovers refs from existing metadata
files.
store/artifacts.py: extracts the drilldown selector dispatch
(head / lines / json_keys / rows) to a module-private
_apply_selector() helper so both backends share one implementation.
Behaviour preserved bit-for-bit; existing tests untouched.

EventLog lifecycle (Mode B addition for clean SQLite cleanup)

store/protocols.py: EventLog protocol now requires close(),
enter, exit.
store/event_log.py: InMemoryEventLog.close() is a no-op; existing
callers continue to work without changes. with SqliteEventLog(path) as log: is the recommended idiom for the new backend.

Tests: 63 new (32 SqliteEventLog, 29 JsonFileArtifactStore, 2 in-memory
lifecycle). New modules at 87–100% coverage. Full suite 1011 passed,
8 skipped.

Verification
ruff format --check src/ tests/ examples/ scripts/ - clean
ruff check src/ tests/ examples/ scripts/ - clean
mypy src/ - 0 issues / 67 files
pytest --cov=contextweaver -q - 1011 passed, 8 skipped
make example, make demo, make scorecard-check - clean
make llms-check - up to date

Closes #42
Closes #223
Refs #174

https://claude.ai/code/session_01ADTmGUqM66tqnGMqrefy4e

Lands the store-only slice of the persistent-backends group (#42, #174, #223). Two new protocol-conformant backends + an EventLog lifecycle contract refinement; zero behavior change for existing in-memory users. #223 — SqliteEventLog + shared _sqlite_base.py - store/_sqlite_base.py: connection helper (WAL, foreign_keys=ON, parent dir auto-create) + apply_migrations() driven by a _contextweaver_schema_version table. The other 3 SQLite stores in epic #174 reuse this. - store/sqlite_event_log.py: implements every EventLog protocol method against a single-process SQLite file. Append-only writes via append() (duplicate ids raise DuplicateItemError); insertion order preserved by an auto-increment ordinal column; indexes on kind and parent_id. Round-trips every ContextItem field (metadata + artifact_ref via JSON columns). - [sqlite] extras placeholder in pyproject.toml (stdlib sqlite3 only; placeholder exists so a future aiosqlite variant can opt in without changing the install surface). #42 — JsonFileArtifactStore - store/json_file_artifacts.py: filesystem-backed ArtifactStore. Layout {base_dir}/{handle}.data + {base_dir}/{handle}.json. Auto- creates base_dir; rejects handles containing path separators / .. / null bytes. Re-instantiating recovers refs from existing metadata files. - store/artifacts.py: extracts the drilldown selector dispatch (head / lines / json_keys / rows) to a module-private _apply_selector() helper so both backends share one implementation. Behaviour preserved bit-for-bit; existing tests untouched. EventLog lifecycle (Mode B addition for clean SQLite cleanup) - store/protocols.py: EventLog protocol now requires close(), __enter__, __exit__. - store/event_log.py: InMemoryEventLog.close() is a no-op; existing callers continue to work without changes. `with SqliteEventLog(path) as log:` is the recommended idiom for the new backend. Tests: 63 new (32 SqliteEventLog, 29 JsonFileArtifactStore, 2 in-memory lifecycle). New modules at 87–100% coverage. Full suite 1011 passed, 8 skipped. Verification ruff format --check src/ tests/ examples/ scripts/ - clean ruff check src/ tests/ examples/ scripts/ - clean mypy src/ - 0 issues / 67 files pytest --cov=contextweaver -q - 1011 passed, 8 skipped make example, make demo, make scorecard-check - clean make llms-check - up to date Closes #42 Closes #223 Refs #174 https://claude.ai/code/session_01ADTmGUqM66tqnGMqrefy4e

Copilot

Pull request overview

Adds persistent store backends for the store layer while preserving the existing in-memory implementations.

Changes:

Adds SqliteEventLog with shared SQLite connection/migration scaffolding.
Adds JsonFileArtifactStore and shares artifact drilldown selector logic.
Extends EventLog lifecycle support with close() and context manager methods, plus tests and docs/changelog updates.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
`src/contextweaver/store/_sqlite_base.py`	Adds reusable SQLite connection and migration helpers.
`src/contextweaver/store/sqlite_event_log.py`	Adds SQLite-backed `EventLog` implementation.
`src/contextweaver/store/json_file_artifacts.py`	Adds filesystem-backed artifact persistence.
`src/contextweaver/store/artifacts.py`	Extracts shared drilldown selector helper.
`src/contextweaver/store/event_log.py`	Adds no-op lifecycle methods to in-memory event log.
`src/contextweaver/store/protocols.py`	Extends `EventLog` protocol with lifecycle methods.
`src/contextweaver/store/__init__.py`	Re-exports new store backends.
`tests/test_store_sqlite_event_log.py`	Adds SQLite event log coverage.
`tests/test_store_json_file_artifacts.py`	Adds JSON-file artifact store coverage.
`tests/test_store_event_log.py`	Adds in-memory event log lifecycle tests.
`pyproject.toml`	Adds placeholder `[sqlite]` optional extra.
`CHANGELOG.md`	Documents new persistent stores and lifecycle contract.
`AGENTS.md`	Updates agent-facing module map for new store backends.

dgenio · 2026-05-16T22:07:52Z

+        clauses: list[str] = []
+        params: list[Any] = []
+        if kinds is not None:
+            if not kinds:
+                return []
+            placeholders = ",".join("?" for _ in kinds)
+            clauses.append(f"kind IN ({placeholders})")
+            params.extend(k.value for k in kinds)
+        where = f"WHERE {' AND '.join(clauses)}" if clauses else ""
+        rows = conn.execute(
+            f"SELECT * FROM event_log {where} ORDER BY ordinal ASC",
+            tuple(params),
+        ).fetchall()
+        items = [_row_to_item(r) for r in rows]
+        if since is not None:
+            items = items[since:]


Good catch — addressed in c56fbd0. SqliteEventLog.query now pulls all rows ordered by insertion, then applies since as a positional slice, then the kinds filter, then limit — exactly mirroring InMemoryEventLog.query. The previous SQL-side kind IN (...) filter inverted the order on mixed-kind logs.

The existing test_query_combined_filters was updated (it had been pinning the buggy ["u3"] output; the correct in-memory-matching value is ["u2"]), and a new test_query_kinds_since_matches_in_memory_semantics pins backend parity directly: it runs four representative (kinds, since, limit) cases against both InMemoryEventLog and SqliteEventLog mirrors and asserts byte-identical results.

Resolving thread.

Generated by Claude Code

dgenio · 2026-05-16T22:07:56Z

+    def _meta_path(self, handle: str) -> Path:
+        return self._base_dir / f"{handle}{_META_SUFFIX}"
+
+    def _data_path(self, handle: str) -> Path:
+        return self._base_dir / f"{handle}{_DATA_SUFFIX}"


Good catch — addressed in c56fbd0. Centralised the validation: _meta_path and _data_path now call _validate_handle before constructing the Path, so every public method that resolves a handle (get, ref, exists, delete, metadata, drilldown) is safe against path traversal — not just put. The explicit call in put was redundant after this and was removed.

Regression coverage:

test_read_path_methods_reject_unsafe_handles — parametrised over five methods × six unsafe-handle shapes (foo/bar, .., ., a\b, x\x00y, "")

test_drilldown_rejects_unsafe_handle — covers the drilldown path explicitly

Resolving thread.

Generated by Claude Code

dgenio · 2026-05-16T22:08:00Z

+
+    def _require_conn(self) -> sqlite3.Connection:
+        if self._conn is None:
+            raise RuntimeError("SqliteEventLog is closed")


Good catch — addressed in c56fbd0. Added StoreClosedError(ContextWeaverError) to contextweaver.exceptions and replaced the bare RuntimeError raised by _require_conn with it. The close() docstring now points to the new exception class.

test_use_after_close_raises now asserts (a) the specific StoreClosedError is raised and (b) it remains catchable via the ContextWeaverError umbrella, so callers can use a single except ContextWeaverError: clause as AGENTS.md suggests.

Resolving thread.

Generated by Claude Code

dgenio · 2026-05-16T22:08:03Z

+            try:
+                raw = json.loads(meta.read_text(encoding="utf-8"))
+                refs.append(ArtifactRef.from_dict(raw))
+            except (json.JSONDecodeError, KeyError, ValueError) as exc:


Good catch — addressed in c56fbd0. TypeError is now in the except clause, alongside the existing json.JSONDecodeError, KeyError, ValueError. The docstring documents the full failure-mode taxonomy so future maintainers know which exceptions are deliberately swallowed.

test_list_refs_skips_non_mapping_json covers the three wrong-shape cases ([], null, a bare string); all three are skipped silently while a well-formed good.json entry still surfaces.

Resolving thread.

Generated by Claude Code

github-actions · 2026-05-16T17:02:33Z

Benchmark delta (vs `main`)

Soft regression feedback only — this comment never blocks the PR.
Latency budget: ⚠️ when head > base × 1.3. Accuracy budget: ⚠️ when head < base - 1pp.

Routing summary (single backend × catalog sizes)

size	recall@k (head Δ vs base)	MRR (head Δ vs base)	p99 (ms)
50	✅ 0.5649 (+0.0000)	✅ 0.4978 (+0.0000)	✅ 0.511 (base 0.463)
83	✅ 0.3825 (+0.0000)	✅ 0.3242 (+0.0000)	✅ 0.742 (base 0.876)
1000	✅ 0.1475 (+0.0000)	✅ 0.1456 (+0.0000)	✅ 41.246 (base 31.897)

Per-backend × per-size matrix

backend	size	recall@k (Δ)	MRR (Δ)	p99 (ms)
bm25	100	✅ 0.3825 (+0.0000)	✅ 0.3399 (+0.0000)	✅ 6.417 (base 5.642)
bm25	500	✅ 0.2250 (+0.0000)	✅ 0.2165 (+0.0000)	✅ 29.594 (base 27.538)
bm25	1000	✅ 0.1575 (+0.0000)	✅ 0.1525 (+0.0000)	✅ 87.867 (base 78.368)
fuzzy	100	✅ 0.0000 (+0.0000)	✅ 0.0000 (+0.0000)	✅ 0.000 (base 0.000)
fuzzy	500	✅ 0.0000 (+0.0000)	✅ 0.0000 (+0.0000)	✅ 0.000 (base 0.000)
fuzzy	1000	✅ 0.0000 (+0.0000)	✅ 0.0000 (+0.0000)	✅ 0.000 (base 0.000)
tfidf	100	✅ 0.3825 (+0.0000)	✅ 0.3220 (+0.0000)	✅ 1.034 (base 0.872)
tfidf	500	✅ 0.2325 (+0.0000)	✅ 0.2314 (+0.0000)	✅ 9.614 (base 8.660)
tfidf	1000	✅ 0.1475 (+0.0000)	✅ 0.1456 (+0.0000)	✅ 36.472 (base 30.071)

Context pipeline (per scenario)

scenario	tokens	dropped	dedup
large_catalog	1514 (base 1514, Δ+0)	0 (base 0, Δ+0)	0 (base 0, Δ+0)
long_conversation	2548 (base 2548, Δ+0)	0 (base 0, Δ+0)	0 (base 0, Δ+0)
short_conversation	496 (base 496, Δ+0)	0 (base 0, Δ+0)	0 (base 0, Δ+0)
stress_conversation	6651 (base 6651, Δ+0)	7 (base 7, Δ+0)	4 (base 4, Δ+0)

Numbers come from make benchmark / make benchmark-matrix.
Latency is hardware-dependent — treat the markers as a rough guide.
See benchmarks/scorecard.md for the full picture.

Addresses all 4 Copilot review comments on the persistent-store PR. 1. **SqliteEventLog.query filter order mismatch with InMemoryEventLog** (Copilot #1). The SQL path applied `kinds` before `since`, while the in-memory path applies `since` to the full insertion-ordered log *before* filtering by kind. On mixed-kind logs this gave different results for the same `(kinds, since, limit)` triple. Switched to pull-all + slice-by-since + filter-by-kind + slice-by-limit, matching the in-memory semantics byte-for-byte. The existing test_query_combined_filters expected the buggy ordering and was updated; a new test_query_kinds_since_matches_in_memory_semantics pins parity across four representative (kinds, since, limit) cases against an InMemoryEventLog mirror. 2. **JsonFileArtifactStore path traversal** (Copilot #2). Handle validation was only enforced inside `put()`, leaving `get`, `ref`, `exists`, `delete`, `metadata`, and `drilldown` vulnerable to `..`, path separators, and absolute paths (read / delete outside `base_dir`). Centralised the validation inside `_meta_path` and `_data_path`, so every public method that resolves a handle is now safe. Parametrised regression test `test_read_path_methods_reject_unsafe_handles` covers all five read / mutate methods × six unsafe-handle shapes; a separate `test_drilldown_rejects_unsafe_handle` covers the drilldown path. 3. **Bare RuntimeError on store-closed** (Copilot #3). AGENTS.md requires public-facing errors to come from `contextweaver.exceptions`. Added `StoreClosedError(ContextWeaverError)` to the exceptions module and replaced the `RuntimeError` raised by `_require_conn` with it. The lifecycle docstring on `close()` now references the new exception. `test_use_after_close_raises` updated to assert both the specific exception and that it remains catchable as `ContextWeaverError`. 4. **list_refs crashes on wrong-shape JSON** (Copilot #4). `ArtifactRef.from_dict()` raises `TypeError` (not `ValueError`) when given a non-mapping JSON value such as `[]`, `null`, or a bare string. `list_refs` only caught `json.JSONDecodeError | KeyError | ValueError`, so a wrong-shape file crashed the whole listing. Added `TypeError` to the except clause and documented the failure-mode taxonomy on the method. `test_list_refs_skips_non_mapping_json` covers all three shapes. Verification: - ruff format/lint clean - mypy strict — 0 issues / 67 source files - pytest -q — 1047 passed, 5 skipped (+36 new tests over baseline) - make example + make demo clean - make scorecard-check + make llms-check clean

Copilot AI review requested due to automatic review settings May 16, 2026 16:59

Copilot started reviewing on behalf of dgenio May 16, 2026 16:59 View session

Copilot AI reviewed May 16, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(store): persistent SQLite event log + JSON-file artifact store#232

feat(store): persistent SQLite event log + JSON-file artifact store#232
dgenio wants to merge 2 commits into
mainfrom
claude/triage-issues-7Pd1o

dgenio commented May 16, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

dgenio May 16, 2026

Uh oh!

dgenio May 16, 2026

Uh oh!

dgenio May 16, 2026

Uh oh!

dgenio May 16, 2026

Uh oh!

github-actions Bot commented May 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

dgenio commented May 16, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

dgenio May 16, 2026

Choose a reason for hiding this comment

Uh oh!

dgenio May 16, 2026

Choose a reason for hiding this comment

Uh oh!

dgenio May 16, 2026

Choose a reason for hiding this comment

Uh oh!

dgenio May 16, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented May 16, 2026

Benchmark delta (vs main)

Routing summary (single backend × catalog sizes)

Per-backend × per-size matrix

Context pipeline (per scenario)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Benchmark delta (vs `main`)