feat(store): persistent SQLite event log + JSON-file artifact store#232
feat(store): persistent SQLite event log + JSON-file artifact store#232dgenio wants to merge 2 commits into
Conversation
Lands the store-only slice of the persistent-backends group (#42, #174, #223). Two new protocol-conformant backends + an EventLog lifecycle contract refinement; zero behavior change for existing in-memory users. #223 — SqliteEventLog + shared _sqlite_base.py - store/_sqlite_base.py: connection helper (WAL, foreign_keys=ON, parent dir auto-create) + apply_migrations() driven by a _contextweaver_schema_version table. The other 3 SQLite stores in epic #174 reuse this. - store/sqlite_event_log.py: implements every EventLog protocol method against a single-process SQLite file. Append-only writes via append() (duplicate ids raise DuplicateItemError); insertion order preserved by an auto-increment ordinal column; indexes on kind and parent_id. Round-trips every ContextItem field (metadata + artifact_ref via JSON columns). - [sqlite] extras placeholder in pyproject.toml (stdlib sqlite3 only; placeholder exists so a future aiosqlite variant can opt in without changing the install surface). #42 — JsonFileArtifactStore - store/json_file_artifacts.py: filesystem-backed ArtifactStore. Layout {base_dir}/{handle}.data + {base_dir}/{handle}.json. Auto- creates base_dir; rejects handles containing path separators / .. / null bytes. Re-instantiating recovers refs from existing metadata files. - store/artifacts.py: extracts the drilldown selector dispatch (head / lines / json_keys / rows) to a module-private _apply_selector() helper so both backends share one implementation. Behaviour preserved bit-for-bit; existing tests untouched. EventLog lifecycle (Mode B addition for clean SQLite cleanup) - store/protocols.py: EventLog protocol now requires close(), __enter__, __exit__. - store/event_log.py: InMemoryEventLog.close() is a no-op; existing callers continue to work without changes. `with SqliteEventLog(path) as log:` is the recommended idiom for the new backend. Tests: 63 new (32 SqliteEventLog, 29 JsonFileArtifactStore, 2 in-memory lifecycle). New modules at 87–100% coverage. Full suite 1011 passed, 8 skipped. Verification ruff format --check src/ tests/ examples/ scripts/ - clean ruff check src/ tests/ examples/ scripts/ - clean mypy src/ - 0 issues / 67 files pytest --cov=contextweaver -q - 1011 passed, 8 skipped make example, make demo, make scorecard-check - clean make llms-check - up to date Closes #42 Closes #223 Refs #174 https://claude.ai/code/session_01ADTmGUqM66tqnGMqrefy4e
There was a problem hiding this comment.
Pull request overview
Adds persistent store backends for the store layer while preserving the existing in-memory implementations.
Changes:
- Adds
SqliteEventLogwith shared SQLite connection/migration scaffolding. - Adds
JsonFileArtifactStoreand shares artifact drilldown selector logic. - Extends
EventLoglifecycle support withclose()and context manager methods, plus tests and docs/changelog updates.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
src/contextweaver/store/_sqlite_base.py |
Adds reusable SQLite connection and migration helpers. |
src/contextweaver/store/sqlite_event_log.py |
Adds SQLite-backed EventLog implementation. |
src/contextweaver/store/json_file_artifacts.py |
Adds filesystem-backed artifact persistence. |
src/contextweaver/store/artifacts.py |
Extracts shared drilldown selector helper. |
src/contextweaver/store/event_log.py |
Adds no-op lifecycle methods to in-memory event log. |
src/contextweaver/store/protocols.py |
Extends EventLog protocol with lifecycle methods. |
src/contextweaver/store/__init__.py |
Re-exports new store backends. |
tests/test_store_sqlite_event_log.py |
Adds SQLite event log coverage. |
tests/test_store_json_file_artifacts.py |
Adds JSON-file artifact store coverage. |
tests/test_store_event_log.py |
Adds in-memory event log lifecycle tests. |
pyproject.toml |
Adds placeholder [sqlite] optional extra. |
CHANGELOG.md |
Documents new persistent stores and lifecycle contract. |
AGENTS.md |
Updates agent-facing module map for new store backends. |
| clauses: list[str] = [] | ||
| params: list[Any] = [] | ||
| if kinds is not None: | ||
| if not kinds: | ||
| return [] | ||
| placeholders = ",".join("?" for _ in kinds) | ||
| clauses.append(f"kind IN ({placeholders})") | ||
| params.extend(k.value for k in kinds) | ||
| where = f"WHERE {' AND '.join(clauses)}" if clauses else "" | ||
| rows = conn.execute( | ||
| f"SELECT * FROM event_log {where} ORDER BY ordinal ASC", | ||
| tuple(params), | ||
| ).fetchall() | ||
| items = [_row_to_item(r) for r in rows] | ||
| if since is not None: | ||
| items = items[since:] |
There was a problem hiding this comment.
Good catch — addressed in c56fbd0. SqliteEventLog.query now pulls all rows ordered by insertion, then applies since as a positional slice, then the kinds filter, then limit — exactly mirroring InMemoryEventLog.query. The previous SQL-side kind IN (...) filter inverted the order on mixed-kind logs.
The existing test_query_combined_filters was updated (it had been pinning the buggy ["u3"] output; the correct in-memory-matching value is ["u2"]), and a new test_query_kinds_since_matches_in_memory_semantics pins backend parity directly: it runs four representative (kinds, since, limit) cases against both InMemoryEventLog and SqliteEventLog mirrors and asserts byte-identical results.
Resolving thread.
Generated by Claude Code
| def _meta_path(self, handle: str) -> Path: | ||
| return self._base_dir / f"{handle}{_META_SUFFIX}" | ||
|
|
||
| def _data_path(self, handle: str) -> Path: | ||
| return self._base_dir / f"{handle}{_DATA_SUFFIX}" |
There was a problem hiding this comment.
Good catch — addressed in c56fbd0. Centralised the validation: _meta_path and _data_path now call _validate_handle before constructing the Path, so every public method that resolves a handle (get, ref, exists, delete, metadata, drilldown) is safe against path traversal — not just put. The explicit call in put was redundant after this and was removed.
Regression coverage:
test_read_path_methods_reject_unsafe_handles— parametrised over five methods × six unsafe-handle shapes (foo/bar, .., ., a\b, x\x00y, "")test_drilldown_rejects_unsafe_handle— covers the drilldown path explicitly
Resolving thread.
Generated by Claude Code
|
|
||
| def _require_conn(self) -> sqlite3.Connection: | ||
| if self._conn is None: | ||
| raise RuntimeError("SqliteEventLog is closed") |
There was a problem hiding this comment.
Good catch — addressed in c56fbd0. Added StoreClosedError(ContextWeaverError) to contextweaver.exceptions and replaced the bare RuntimeError raised by _require_conn with it. The close() docstring now points to the new exception class.
test_use_after_close_raises now asserts (a) the specific StoreClosedError is raised and (b) it remains catchable via the ContextWeaverError umbrella, so callers can use a single except ContextWeaverError: clause as AGENTS.md suggests.
Resolving thread.
Generated by Claude Code
| try: | ||
| raw = json.loads(meta.read_text(encoding="utf-8")) | ||
| refs.append(ArtifactRef.from_dict(raw)) | ||
| except (json.JSONDecodeError, KeyError, ValueError) as exc: |
There was a problem hiding this comment.
Good catch — addressed in c56fbd0. TypeError is now in the except clause, alongside the existing json.JSONDecodeError, KeyError, ValueError. The docstring documents the full failure-mode taxonomy so future maintainers know which exceptions are deliberately swallowed.
test_list_refs_skips_non_mapping_json covers the three wrong-shape cases ([], null, a bare string); all three are skipped silently while a well-formed good.json entry still surfaces.
Resolving thread.
Generated by Claude Code
Benchmark delta (vs
|
| size | recall@k (head Δ vs base) | MRR (head Δ vs base) | p99 (ms) |
|---|---|---|---|
| 50 | ✅ 0.5649 (+0.0000) | ✅ 0.4978 (+0.0000) | ✅ 0.511 (base 0.463) |
| 83 | ✅ 0.3825 (+0.0000) | ✅ 0.3242 (+0.0000) | ✅ 0.742 (base 0.876) |
| 1000 | ✅ 0.1475 (+0.0000) | ✅ 0.1456 (+0.0000) | ✅ 41.246 (base 31.897) |
Per-backend × per-size matrix
| backend | size | recall@k (Δ) | MRR (Δ) | p99 (ms) |
|---|---|---|---|---|
| bm25 | 100 | ✅ 0.3825 (+0.0000) | ✅ 0.3399 (+0.0000) | ✅ 6.417 (base 5.642) |
| bm25 | 500 | ✅ 0.2250 (+0.0000) | ✅ 0.2165 (+0.0000) | ✅ 29.594 (base 27.538) |
| bm25 | 1000 | ✅ 0.1575 (+0.0000) | ✅ 0.1525 (+0.0000) | ✅ 87.867 (base 78.368) |
| fuzzy | 100 | ✅ 0.0000 (+0.0000) | ✅ 0.0000 (+0.0000) | ✅ 0.000 (base 0.000) |
| fuzzy | 500 | ✅ 0.0000 (+0.0000) | ✅ 0.0000 (+0.0000) | ✅ 0.000 (base 0.000) |
| fuzzy | 1000 | ✅ 0.0000 (+0.0000) | ✅ 0.0000 (+0.0000) | ✅ 0.000 (base 0.000) |
| tfidf | 100 | ✅ 0.3825 (+0.0000) | ✅ 0.3220 (+0.0000) | ✅ 1.034 (base 0.872) |
| tfidf | 500 | ✅ 0.2325 (+0.0000) | ✅ 0.2314 (+0.0000) | ✅ 9.614 (base 8.660) |
| tfidf | 1000 | ✅ 0.1475 (+0.0000) | ✅ 0.1456 (+0.0000) | ✅ 36.472 (base 30.071) |
Context pipeline (per scenario)
| scenario | tokens | dropped | dedup |
|---|---|---|---|
| large_catalog | 1514 (base 1514, Δ+0) | 0 (base 0, Δ+0) | 0 (base 0, Δ+0) |
| long_conversation | 2548 (base 2548, Δ+0) | 0 (base 0, Δ+0) | 0 (base 0, Δ+0) |
| short_conversation | 496 (base 496, Δ+0) | 0 (base 0, Δ+0) | 0 (base 0, Δ+0) |
| stress_conversation | 6651 (base 6651, Δ+0) | 7 (base 7, Δ+0) | 4 (base 4, Δ+0) |
Numbers come from make benchmark / make benchmark-matrix.
Latency is hardware-dependent — treat the markers as a rough guide.
See benchmarks/scorecard.md for the full picture.
Addresses all 4 Copilot review comments on the persistent-store PR. 1. **SqliteEventLog.query filter order mismatch with InMemoryEventLog** (Copilot #1). The SQL path applied `kinds` before `since`, while the in-memory path applies `since` to the full insertion-ordered log *before* filtering by kind. On mixed-kind logs this gave different results for the same `(kinds, since, limit)` triple. Switched to pull-all + slice-by-since + filter-by-kind + slice-by-limit, matching the in-memory semantics byte-for-byte. The existing test_query_combined_filters expected the buggy ordering and was updated; a new test_query_kinds_since_matches_in_memory_semantics pins parity across four representative (kinds, since, limit) cases against an InMemoryEventLog mirror. 2. **JsonFileArtifactStore path traversal** (Copilot #2). Handle validation was only enforced inside `put()`, leaving `get`, `ref`, `exists`, `delete`, `metadata`, and `drilldown` vulnerable to `..`, path separators, and absolute paths (read / delete outside `base_dir`). Centralised the validation inside `_meta_path` and `_data_path`, so every public method that resolves a handle is now safe. Parametrised regression test `test_read_path_methods_reject_unsafe_handles` covers all five read / mutate methods × six unsafe-handle shapes; a separate `test_drilldown_rejects_unsafe_handle` covers the drilldown path. 3. **Bare RuntimeError on store-closed** (Copilot #3). AGENTS.md requires public-facing errors to come from `contextweaver.exceptions`. Added `StoreClosedError(ContextWeaverError)` to the exceptions module and replaced the `RuntimeError` raised by `_require_conn` with it. The lifecycle docstring on `close()` now references the new exception. `test_use_after_close_raises` updated to assert both the specific exception and that it remains catchable as `ContextWeaverError`. 4. **list_refs crashes on wrong-shape JSON** (Copilot #4). `ArtifactRef.from_dict()` raises `TypeError` (not `ValueError`) when given a non-mapping JSON value such as `[]`, `null`, or a bare string. `list_refs` only caught `json.JSONDecodeError | KeyError | ValueError`, so a wrong-shape file crashed the whole listing. Added `TypeError` to the except clause and documented the failure-mode taxonomy on the method. `test_list_refs_skips_non_mapping_json` covers all three shapes. Verification: - ruff format/lint clean - mypy strict — 0 issues / 67 source files - pytest -q — 1047 passed, 5 skipped (+36 new tests over baseline) - make example + make demo clean - make scorecard-check + make llms-check clean
Lands the store-only slice of the persistent-backends group (#42, #174,
#223). Two new protocol-conformant backends + an EventLog lifecycle
contract refinement; zero behavior change for existing in-memory users.
#223 — SqliteEventLog + shared _sqlite_base.py
dir auto-create) + apply_migrations() driven by a
_contextweaver_schema_version table. The other 3 SQLite stores in
epic [epic] SQLite persistent stores #174 reuse this.
against a single-process SQLite file. Append-only writes via append()
(duplicate ids raise DuplicateItemError); insertion order preserved
by an auto-increment ordinal column; indexes on kind and parent_id.
Round-trips every ContextItem field (metadata + artifact_ref via
JSON columns).
placeholder exists so a future aiosqlite variant can opt in without
changing the install surface).
#42 — JsonFileArtifactStore
Layout {base_dir}/{handle}.data + {base_dir}/{handle}.json. Auto-
creates base_dir; rejects handles containing path separators / ..
/ null bytes. Re-instantiating recovers refs from existing metadata
files.
(head / lines / json_keys / rows) to a module-private
_apply_selector() helper so both backends share one implementation.
Behaviour preserved bit-for-bit; existing tests untouched.
EventLog lifecycle (Mode B addition for clean SQLite cleanup)
enter, exit.
callers continue to work without changes.
with SqliteEventLog(path) as log:is the recommended idiom for the new backend.Tests: 63 new (32 SqliteEventLog, 29 JsonFileArtifactStore, 2 in-memory
lifecycle). New modules at 87–100% coverage. Full suite 1011 passed,
8 skipped.
Verification
ruff format --check src/ tests/ examples/ scripts/ - clean
ruff check src/ tests/ examples/ scripts/ - clean
mypy src/ - 0 issues / 67 files
pytest --cov=contextweaver -q - 1011 passed, 8 skipped
make example, make demo, make scorecard-check - clean
make llms-check - up to date
Closes #42
Closes #223
Refs #174
https://claude.ai/code/session_01ADTmGUqM66tqnGMqrefy4e