Skip to content

Commit 42156bf

Browse files
authored
Add v1 wire codec, conformance harness, and lifecycle-hardened clients (#3)
1 parent 346f6ff commit 42156bf

50 files changed

Lines changed: 3726 additions & 63 deletions

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

.gitignore

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -53,3 +53,9 @@ Thumbs.db
5353
*.log
5454
.env
5555
.env.local
56+
57+
# Internal planning hub (specs/plans) — not published
58+
localdocs/
59+
60+
# Local review/feature worktrees
61+
.worktrees/

CHANGELOG.md

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,27 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
66

77
## [Unreleased]
88

9+
## [1.1.0] - 2026-06-09
10+
11+
### Added
12+
- `atomicmemory.contract.v1`: a wire codec for the v1 provider contract's deliberately mixed-case encoding (`Memory.createdAt`/`updatedAt` and `SearchResult.rankingScore` are camelCase on the wire; `version_id`, `observed_at`, and retrieval-receipt fields are snake_case). Encode/decode helpers cover `Memory`, `Provenance`, `SearchResult`, `SearchResultPage`, `SearchRequest`, and ingest payloads (`IngestInput`, `IngestResult`). Dates follow the contract's ISO-8601 UTC millisecond `Z` form (`_to_iso_z`, equivalent to TS `toISOString()`). Naive datetimes in encode paths are assumed UTC. `encode_ingest_input` fails closed on the Python-ahead `content_class` field (no place in the v1 `additionalProperties: false` schemas; TS contract alignment is a recorded follow-up). Explicit-null `version_id` in `SearchResult` normalizes to absent on re-encode, matching the TS optional declaration. `encode_search_request` uses `by_alias=True` so Python-keyword-safe combinator field names (`and_`/`or_`/`not_`) emit their wire aliases; a recursive `_jsonify` walk converts any `datetime` operands in filter trees to the toISOString form. In-process models and provider mappers are unchanged.
13+
- Vendored the TS SDK's versioned v1 wire contract (JSON Schemas, cross-provider conformance corpus, and CONTRACT.md) under `contract/`, with explicit provenance in `contract/VENDORED.json` and a documented refresh script (`scripts/refresh_contract.py`, never run in CI). A pytest conformance harness proves corpus fixtures decode into the Python models (directly for snake-on-wire types, through the codec for the mixed-case search response) and that SDK emissions validate against the vendored draft-2020-12 schemas, with the TS suite's negative cases mirrored against both schemas and Pydantic. The `capabilities-descriptor` case is schema-only (no Python model in this release — recorded follow-up).
14+
- `atomicmemory.contract` re-exports `v1` as a specialty import surface; deliberately not re-exported from the package root to keep the root namespace focused on the core provider API.
15+
- `AsyncProviderFactory` now accepts factories that return an `Awaitable[AsyncProviderRegistration]`, enabling lazy or async provider construction during `AsyncMemoryService.initialize()`.
16+
- `MemoryService.initialize()` and `AsyncMemoryService.initialize()` raise `ConfigError` when the configured default provider has no registered factory, making a misconfigured default an immediate, explicit error rather than a silent no-op.
17+
18+
### Changed
19+
- `content_class` is now accepted on **every** ingest mode (`text`, `messages`, and `verbatim`), not just `verbatim`, and is forwarded to core for all modes. Extraction-based ingests (`text`/`messages`) can now satisfy a core running the default `RAW_CONTENT_POLICY=reject`. Still never defaulted — omitting it leaves the field off the wire and a reject-policy core fails closed.
20+
- Both clients' `initialize()` is now concurrency-safe and idempotent: concurrent callers share a single initialization run (the first caller's registry wins), and the completed outcome — success or failure — is captured in loop-independent state for `AsyncMemoryClient`.
21+
- A failed `initialize()` is sticky: retrying re-raises the original error from any caller; resolve the cause and construct a new client rather than retrying on the same instance.
22+
- `AsyncMemoryClient.initialize()` shields each waiter from cancellation so that one waiter's timeout or cancellation never cancels the shared run for other concurrent callers.
23+
- `AsyncMemoryClient.close()` during a pending initialization cancels the shared run; staged providers are torn down by the service's atomic-initialize cleanup, any concurrent `initialize()` waiter receives `CancelledError`, and the client ends in the not-initialized state without recording a sticky error.
24+
- Both `MemoryService` and `AsyncMemoryService` stage provider registrations atomically: factories and provider `initialize()` calls run against a local staging area, and the maps are replaced only after every provider succeeds; on any failure, already-staged providers are torn down best-effort before the original error re-raises.
25+
- `MemoryService.close()` and `AsyncMemoryService.close()` are best-effort: every provider gets a chance to close regardless of earlier failures, maps are cleared in a `finally` block, and the first failure is re-raised after all providers have been given the chance to close.
26+
27+
### Fixed
28+
- `atomicmemory.__version__` reported `1.0.0` while package metadata said `1.0.1`; all version sources now agree at `1.1.0`, guarded by a regression test that will fail if they drift again.
29+
930
## [1.0.1] - 2026-05-14
1031

1132
### Changed

README.md

Lines changed: 47 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -32,13 +32,13 @@ pip install 'atomicmemory[embeddings]' # + sentence-transformers for local
3232

3333
## Quick start
3434

35-
Prerequisite: start `atomicmemory-core` first. Follow the [Core Quickstart](https://docs.atomicstrata.ai/quickstart) if you do not already have a backend at `http://localhost:3050`.
35+
Prerequisite: start `atomicmemory-core` first. Follow the [Core Quickstart](https://docs.atomicstrata.ai/quickstart) if you do not already have a backend at `http://localhost:17350`.
3636

3737
```python
3838
from atomicmemory import AtomicMemoryClient
3939

4040
with AtomicMemoryClient({
41-
"apiUrl": "http://localhost:3050",
41+
"apiUrl": "http://localhost:17350",
4242
"apiKey": "server-api-key",
4343
"userId": "demo",
4444
}) as client:
@@ -72,7 +72,7 @@ from atomicmemory import AsyncAtomicMemoryClient
7272

7373
async def main() -> None:
7474
async with AsyncAtomicMemoryClient({
75-
"apiUrl": "http://localhost:3050",
75+
"apiUrl": "http://localhost:17350",
7676
"apiKey": "server-api-key",
7777
"userId": "demo",
7878
}) as client:
@@ -131,6 +131,50 @@ The `client.storage` namespace mirrors the TypeScript SDK's direct storage API:
131131

132132
Every storage request sends `Authorization: Bearer <apiKey>` and `X-AtomicMemory-User-Id`. The SDK never sends the legacy `?user_id=` URL parameter.
133133

134+
## v1 wire contract
135+
136+
`atomicmemory.contract.v1` is the wire codec for the v1 provider-contract encoding. The wire form is deliberately mixed-case — `Memory.createdAt`/`updatedAt` and `SearchResult.rankingScore` are camelCase; `version_id`, `observed_at`, and retrieval-receipt fields are snake_case — as pinned by the vendored `contract/CONTRACT.md`. This module is the only place that mapping lives; in-process models and provider mappers are unchanged.
137+
138+
```python
139+
from atomicmemory.contract import v1
140+
141+
# decode a wire search response (e.g. from a cross-SDK provider call)
142+
wire_page = {
143+
"results": [
144+
{
145+
"memory": {
146+
"id": "mem_1",
147+
"content": "I prefer aisle seats on flights.",
148+
"scope": {"user": "demo"},
149+
"kind": "fact",
150+
"createdAt": "2026-05-30T12:00:00.000Z",
151+
},
152+
"score": 0.91,
153+
"rankingScore": 0.87,
154+
}
155+
],
156+
"retrieval": {
157+
"embedding_model": "text-embedding-x",
158+
"embedding_model_version": "1",
159+
"embedding_dimensions": 1536,
160+
"query_text": "deploy gate",
161+
"candidate_ids": ["mem_1"],
162+
"trace_id": "trace-1",
163+
},
164+
}
165+
166+
page = v1.decode_search_result_page(wire_page)
167+
for hit in page.results:
168+
print(hit.memory.content, hit.score) # snake_case in-process models
169+
170+
# re-encode to the exact v1 wire form (millisecond-precision UTC datetimes)
171+
wire_out = v1.encode_search_result_page(page)
172+
```
173+
174+
Two behaviors to know: naive datetimes passed to encode functions are assumed UTC (bare `astimezone()` would shift by the host's UTC offset); `encode_ingest_input` rejects models carrying `content_class` with a clear error because the v1 schemas have `additionalProperties: false` and no such field — this is a Python-ahead field pending TS contract alignment.
175+
176+
This is NOT the AtomicMemory core HTTP API. That boundary stays in the provider mappers. The import path is `atomicmemory.contract` — deliberately not re-exported from the package root to keep the root namespace focused on the core provider API.
177+
134178
## Development
135179

136180
```bash

atomicmemory/__init__.py

Lines changed: 26 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -23,11 +23,25 @@
2323
RateLimitError,
2424
ValidationError,
2525
)
26+
from atomicmemory.memory.capability_profiles import (
27+
CapabilityGap,
28+
CapabilityProfile,
29+
capability_gaps,
30+
satisfies_profile,
31+
)
2632
from atomicmemory.memory.filters import FieldFilter, FieldFilterOp, FilterExpr
33+
from atomicmemory.memory.meta_fact_filter import (
34+
DEFAULT_META_FACT_PATTERNS,
35+
MetaFactFilterConfig,
36+
filter_meta_facts,
37+
is_meta_fact,
38+
resolve_meta_fact_patterns,
39+
)
2740
from atomicmemory.memory.types import (
2841
Capabilities,
2942
CapabilitiesExtensions,
3043
CapabilitiesRequiredScope,
44+
ContentClass,
3145
ContextPackage,
3246
GraphEdge,
3347
GraphNode,
@@ -52,6 +66,7 @@
5266
PackageRequest,
5367
Profile,
5468
Provenance,
69+
RetrievalReceipt,
5570
Scope,
5671
SearchRequest,
5772
SearchResult,
@@ -87,6 +102,7 @@
87102
)
88103

89104
__all__ = [
105+
"DEFAULT_META_FACT_PATTERNS",
90106
"ArtifactHead",
91107
"ArtifactInUseError",
92108
"ArtifactMetadata",
@@ -103,7 +119,10 @@
103119
"Capabilities",
104120
"CapabilitiesExtensions",
105121
"CapabilitiesRequiredScope",
122+
"CapabilityGap",
123+
"CapabilityProfile",
106124
"ConfigError",
125+
"ContentClass",
107126
"ContextPackage",
108127
"DeleteArtifactOptions",
109128
"DeleteArtifactPolicy",
@@ -133,6 +152,7 @@
133152
"Message",
134153
"MessageIngest",
135154
"MessageRole",
155+
"MetaFactFilterConfig",
136156
"NetworkError",
137157
"NotInitializedError",
138158
"PackageFormat",
@@ -146,6 +166,7 @@
146166
"PutManagedInput",
147167
"PutPointerInput",
148168
"RateLimitError",
169+
"RetrievalReceipt",
149170
"Scope",
150171
"SearchRequest",
151172
"SearchResult",
@@ -163,4 +184,9 @@
163184
"VerificationResult",
164185
"VerifyArtifactOptions",
165186
"__version__",
187+
"capability_gaps",
188+
"filter_meta_facts",
189+
"is_meta_fact",
190+
"resolve_meta_fact_patterns",
191+
"satisfies_profile",
166192
]

atomicmemory/_version.py

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,7 @@
1-
"""Version metadata for the atomicmemory Python SDK."""
1+
"""Version metadata for the atomicmemory Python SDK.
22
3-
__version__ = "1.0.0"
3+
Exports:
4+
__version__: The current package version string (PEP 440).
5+
"""
6+
7+
__version__ = "1.1.0"

atomicmemory/client/async_memory_client.py

Lines changed: 91 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
11
"""AsyncMemoryClient — async facade for the V3 memory layer.
22
3+
Port of `atomicmemory-sdk/src/client/memory-client.ts` (async variant).
34
Mirrors :class:`atomicmemory.client.memory_client.MemoryClient` with
45
``async def`` for every I/O method and a ``__aenter__`` / ``__aexit__``
56
context manager. Dict coercion + Pydantic-error wrapping is identical
@@ -8,6 +9,8 @@
89

910
from __future__ import annotations
1011

12+
import asyncio
13+
import contextlib
1114
from dataclasses import dataclass
1215
from types import TracebackType
1316
from typing import Any
@@ -17,11 +20,13 @@
1720
import atomicmemory.providers.hindsight
1821
import atomicmemory.providers.mem0 # noqa: F401
1922
from atomicmemory.client.memory_client import (
23+
MemoryProviderConfigs,
2024
_coerce_ingest,
2125
_coerce_list_request,
2226
_coerce_package,
2327
_coerce_ref,
2428
_coerce_search,
29+
_pick_first_provider_key,
2530
)
2631
from atomicmemory.core.errors import ConfigError, NotInitializedError
2732
from atomicmemory.memory.provider import BaseAsyncMemoryProvider
@@ -42,8 +47,6 @@
4247
)
4348
from atomicmemory.providers.atomicmemory.async_handle_impl import AsyncAtomicMemoryHandle
4449

45-
MemoryProviderConfigs = dict[str, Any]
46-
4750

4851
@dataclass
4952
class AsyncProviderStatus:
@@ -62,7 +65,7 @@ class AsyncMemoryClient:
6265
6366
Example:
6467
>>> async with AsyncMemoryClient(
65-
... providers={"atomicmemory": {"api_url": "http://localhost:3050"}}
68+
... providers={"atomicmemory": {"api_url": "http://localhost:17350"}}
6669
... ) as memory:
6770
... await memory.initialize()
6871
... await memory.ingest({"mode": "text", "content": "hi", "scope": {"user": "u1"}})
@@ -88,18 +91,82 @@ def __init__(
8891
)
8992
)
9093
self._initialized = False
94+
self._init_error: Exception | None = None
95+
self._init_task: asyncio.Task[None] | None = None
9196

9297
async def initialize(self, registry: AsyncProviderRegistry | None = None) -> None:
98+
"""Initialize all configured providers. Idempotent and concurrency-safe.
99+
100+
Concurrent calls on one event loop share a single initialization run
101+
(the first call's ``registry`` wins). The COMPLETED outcome — success
102+
or the original failure — is captured into loop-independent state, so
103+
a failed initialization is sticky from any loop: retrying re-raises
104+
the original error; construct a new client after resolving the cause.
105+
An instance is bound to the event loop of its first ``initialize()``
106+
while initialization is still PENDING — awaiting a pending run from a
107+
different loop is unsupported. ``close()`` after a SUCCESSFUL
108+
lifecycle returns the client to the uninitialized state.
109+
"""
93110
if self._initialized:
94111
return
95-
await self._service.initialize(registry if registry is not None else default_async_registry)
112+
if self._init_error is not None:
113+
raise self._init_error
114+
if self._init_task is None:
115+
self._init_task = asyncio.ensure_future(self._run_initialize(registry))
116+
self._init_task.add_done_callback(_mark_retrieved)
117+
task = self._init_task
118+
try:
119+
# shield: cancelling ONE waiter (e.g. wait_for timeout) must not
120+
# cancel the shared run for everyone — promises aren't cancellable
121+
# in TS, so unshielded awaiting would NOT be lifecycle parity.
122+
await asyncio.shield(task)
123+
finally:
124+
if task.done():
125+
self._init_task = None
126+
127+
async def _run_initialize(self, registry: AsyncProviderRegistry | None) -> None:
128+
"""Execute the shared initialization run; capture errors into sticky state.
129+
130+
CancelledError is BaseException and never caught here, so cancellation
131+
never becomes sticky. A cancelled task's ``_init_task`` slot is cleared
132+
by a surviving waiter's ``finally`` once the task is done, or by
133+
``close()``; either path lets a later call start fresh.
134+
"""
135+
try:
136+
await self._service.initialize(registry if registry is not None else default_async_registry)
137+
except Exception as exc:
138+
self._init_error = exc
139+
raise
96140
self._initialized = True
97141

98142
async def close(self) -> None:
143+
"""Close providers; safe to call multiple times.
144+
145+
Closing while an initialization is PENDING cancels that run: staged
146+
providers are torn down by the service's atomic-initialize cleanup,
147+
any concurrent initialize() waiter receives CancelledError, and the
148+
client ends not-initialized (no sticky error is recorded for
149+
cancellation). After a SUCCESSFUL lifecycle, close() returns the
150+
client to the uninitialized state. A FAILED initialization remains
151+
sticky — close() does not reset it.
152+
"""
153+
task = self._init_task
154+
if task is not None:
155+
if not task.done():
156+
task.cancel()
157+
with contextlib.suppress(Exception, asyncio.CancelledError):
158+
await task
159+
# Always clear, even when already done: a run whose waiters were
160+
# all cancelled leaves a stale DONE task behind, and a later
161+
# initialize() awaiting it would resolve instantly WITHOUT
162+
# re-running — silently leaving the client uninitialized.
163+
self._init_task = None
99164
if not self._initialized:
100165
return
101-
await self._service.close()
102-
self._initialized = False
166+
try:
167+
await self._service.close()
168+
finally:
169+
self._initialized = False
103170

104171
async def __aenter__(self) -> AsyncMemoryClient:
105172
return self
@@ -117,6 +184,7 @@ async def ingest(self, input: IngestInput | dict[str, Any]) -> IngestResult:
117184
return await self._service.ingest(_coerce_ingest(input))
118185

119186
async def ingest_direct(self, input: IngestInput | dict[str, Any]) -> IngestResult:
187+
"""Identical to :meth:`ingest`; preserved for wrapper-subclass parity with TS."""
120188
self._assert_initialized()
121189
return await self._service.ingest(_coerce_ingest(input))
122190

@@ -125,6 +193,7 @@ async def search(self, request: SearchRequest | dict[str, Any]) -> SearchResultP
125193
return await self._service.search(_coerce_search(request))
126194

127195
async def search_direct(self, request: SearchRequest | dict[str, Any]) -> SearchResultPage:
196+
"""Identical to :meth:`search`; preserved for wrapper-subclass parity with TS."""
128197
self._assert_initialized()
129198
return await self._service.search(_coerce_search(request))
130199

@@ -133,6 +202,7 @@ async def package(self, request: PackageRequest | dict[str, Any]) -> ContextPack
133202
return await self._service.package(_coerce_package(request))
134203

135204
async def package_direct(self, request: PackageRequest | dict[str, Any]) -> ContextPackage:
205+
"""Identical to :meth:`package`; preserved for wrapper-subclass parity with TS."""
136206
self._assert_initialized()
137207
return await self._service.package(_coerce_package(request))
138208

@@ -181,6 +251,11 @@ def get_provider(self, name: str | None = None) -> BaseAsyncMemoryProvider:
181251

182252
@property
183253
def atomicmemory(self) -> AsyncAtomicMemoryHandle | None:
254+
"""Typed access to AtomicMemory-specific routes.
255+
256+
Returns ``None`` when the client is not yet initialized or the
257+
``atomicmemory`` provider was not configured.
258+
"""
184259
if not self._initialized:
185260
return None
186261
if "atomicmemory" not in self._service.get_configured_providers():
@@ -196,8 +271,13 @@ def _assert_initialized(self) -> None:
196271
raise NotInitializedError("AsyncMemoryClient is not initialized. Call await client.initialize() first.")
197272

198273

199-
def _pick_first_provider_key(providers: MemoryProviderConfigs) -> str | None:
200-
for key, value in providers.items():
201-
if value is not None and key != "default":
202-
return key
203-
return None
274+
def _mark_retrieved(task: asyncio.Task[None]) -> None:
275+
"""Retrieve the task's exception so asyncio never logs 'never retrieved'.
276+
277+
A run whose waiters were all cancelled fails unobserved; without this
278+
callback asyncio would log "Task exception was never retrieved" at GC.
279+
Correctness is unchanged: waiters still see errors through the shield,
280+
and stickiness is recorded by ``_run_initialize`` itself.
281+
"""
282+
if not task.cancelled():
283+
task.exception()

0 commit comments

Comments
 (0)