feat(memory): add DatabaseMemoryService with SQL backend and agent scratchpad#1
feat(memory): add DatabaseMemoryService with SQL backend and agent scratchpad#1Raman369AI wants to merge 7 commits intomainfrom
Conversation
…ratchpad Adds a durable, RDBMS-backed memory service compatible with any SQLAlchemy-supported async database (SQLite, PostgreSQL, MySQL, MariaDB) as a community alternative to the volatile InMemoryMemoryService. Key additions: - DatabaseMemoryService: implements BaseMemoryService with lazy table creation, idempotent session ingest, and delta event ingestion - MemorySearchBackend ABC + KeywordSearchBackend: LIKE/ILIKE search with AND-first → OR-fallback tokenisation strategy - Scratchpad KV store and append-only log for intermediate agent state - Four agent-callable BaseTool subclasses in adk_community.tools: ScratchpadGetTool, ScratchpadSetTool, ScratchpadAppendLogTool, ScratchpadGetLogTool (plus ready-to-use singleton instances) - 38 unit tests covering all methods, tool happy-paths, wrong-service errors, multi-user isolation, and session scoping - Optional sqlalchemy extra in pyproject.toml
There was a problem hiding this comment.
Code Review
This pull request introduces a new SQL-backed memory service, DatabaseMemoryService, which supports SQLAlchemy-compatible databases like SQLite, PostgreSQL, and MySQL. It includes a scratchpad for key-value storage and append-only logging, along with agent-callable tools to interact with these features. Additionally, the PR updates RedisSessionService and OpenMemoryService with formatting improvements and minor logic refinements. Feedback focuses on improving search accuracy by not aggressively stripping non-alphanumeric characters from query tokens, ensuring consistent UTC timestamp formatting, and excluding internal model 'thought' parts from searchable memory to maintain data relevance.
| tokens = [ | ||
| cleaned | ||
| for raw in query.split() | ||
| if (cleaned := re.sub(r'[^\w]', '', raw).lower()) | ||
| ] |
There was a problem hiding this comment.
Aggressively stripping all non-alphanumeric characters from query tokens causes search failures for terms containing punctuation (e.g., 'Gemini-1.5', 'C#', '3.10'). Since the search_text in the database is stored as raw text, the query tokens should not be stripped of these characters to ensure LIKE matching works as expected.
tokens = [t.lower() for t in query.split() if t.strip()]| from collections.abc import Mapping | ||
| from collections.abc import Sequence | ||
| from contextlib import asynccontextmanager | ||
| from datetime import datetime |
|
|
||
|
|
||
| def _format_timestamp(timestamp: float) -> str: | ||
| return datetime.fromtimestamp(timestamp).isoformat() |
There was a problem hiding this comment.
Using datetime.fromtimestamp(timestamp) without an explicit timezone uses the local system time, which can lead to inconsistent timestamps in distributed or cloud environments. It is recommended to use UTC for all stored timestamps.
| return datetime.fromtimestamp(timestamp).isoformat() | |
| return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat() |
| """Join all text parts of a Content into a single searchable string.""" | ||
| if not content or not content.parts: | ||
| return '' | ||
| return ' '.join(part.text for part in content.parts if part.text) |
There was a problem hiding this comment.
The search_text should exclude thought parts of the content. These parts represent internal model reasoning and are typically not intended to be part of the searchable memory. This change ensures consistency with the logic used in OpenMemoryService via utils.py.
| return ' '.join(part.text for part in content.parts if part.text) | |
| return ' '.join( | |
| part.text for part in content.parts if part.text and not part.thought | |
| ) |
| """Return True if the event has no usable text content.""" | ||
| if not event.content or not event.content.parts: | ||
| return True | ||
| return not any(part.text for part in event.content.parts if part.text) |
There was a problem hiding this comment.
When determining if an event should be skipped for memory indexing, thought parts should be excluded. This ensures that events containing only internal reasoning are not added to the searchable memory database.
| return not any(part.text for part in event.content.parts if part.text) | |
| return not any( | |
| part.text for part in event.content.parts if part.text and not part.thought | |
| ) |
Replicates upstream PR google#98 inside this fork for visibility and fork-local checks.
Upstream PR: google#98
Summary:
Local verification: