feat(memory): add DatabaseMemoryService with SQL backend and agent scratchpad#98
feat(memory): add DatabaseMemoryService with SQL backend and agent scratchpad#98Raman369AI wants to merge 2 commits intogoogle:mainfrom
Conversation
…ratchpad Adds a durable, RDBMS-backed memory service compatible with any SQLAlchemy-supported async database (SQLite, PostgreSQL, MySQL, MariaDB) as a community alternative to the volatile InMemoryMemoryService. Key additions: - DatabaseMemoryService: implements BaseMemoryService with lazy table creation, idempotent session ingest, and delta event ingestion - MemorySearchBackend ABC + KeywordSearchBackend: LIKE/ILIKE search with AND-first → OR-fallback tokenisation strategy - Scratchpad KV store and append-only log for intermediate agent state - Four agent-callable BaseTool subclasses in adk_community.tools: ScratchpadGetTool, ScratchpadSetTool, ScratchpadAppendLogTool, ScratchpadGetLogTool (plus ready-to-use singleton instances) - 38 unit tests covering all methods, tool happy-paths, wrong-service errors, multi-user isolation, and session scoping - Optional sqlalchemy extra in pyproject.toml
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a robust, durable memory solution for the ADK, moving beyond volatile in-memory storage. It provides a flexible SQL-backed service that integrates with various databases and includes a dedicated scratchpad for agents' intermediate working memory. This enhancement allows for persistent storage of agent interactions and internal states, improving the reliability and capabilities of ADK agents by enabling them to retain information across sessions and restarts. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a significant and well-designed feature: a DatabaseMemoryService that provides durable memory for ADK agents using SQLAlchemy, along with a scratchpad mechanism and associated tools. The implementation is of high quality, with good attention to database-specific details, optional dependencies, and comprehensive test coverage.
My review has identified a couple of potential issues. There's a race condition in the scratchpad's set operation that could occur under concurrent writes to a new key. Additionally, the keyword search backend incorrectly assumes ILIKE support for MySQL and MariaDB, which could lead to runtime errors. I've also included a minor suggestion to improve code readability in the search tokenization logic.
Overall, this is an excellent contribution that fills an important gap in the ADK. The identified issues should be addressed to ensure robustness and correctness across different database backends.
| async with self._session() as sql: | ||
| existing = await sql.get( | ||
| StorageScratchpadKV, (app_name, user_id, session_id, key) | ||
| ) | ||
| if existing is not None: | ||
| existing.value_json = value | ||
| else: | ||
| sql.add( | ||
| StorageScratchpadKV( | ||
| app_name=app_name, | ||
| user_id=user_id, | ||
| session_id=session_id, | ||
| key=key, | ||
| value_json=value, | ||
| ) | ||
| ) |
There was a problem hiding this comment.
The current implementation of set_scratchpad uses a "read-then-write" pattern (get then add or update). This is not atomic and can lead to a race condition if two concurrent calls try to set a value for the same new key. Both calls would find that the key doesn't exist and attempt to insert a new row, leading to an IntegrityError for one of them.
To make this operation atomic and robust against concurrency, you should use an "UPSERT" operation. SQLAlchemy supports this via dialect-specific on_conflict_do_update (for PostgreSQL/SQLite) or on_duplicate_key_update (for MySQL) constructs. This would avoid the race condition and make the method more robust.
| if TYPE_CHECKING: | ||
| from sqlalchemy.ext.asyncio import AsyncSession | ||
|
|
||
| _ILIKE_DIALECTS = frozenset({'postgresql', 'mysql', 'mariadb'}) |
There was a problem hiding this comment.
The _ILIKE_DIALECTS set includes 'mysql' and 'mariadb'. However, the ILIKE operator is specific to PostgreSQL. Using col.ilike() on MySQL or MariaDB with SQLAlchemy will likely result in an AttributeError at runtime, as these dialects do not natively support ILIKE.
For MySQL and MariaDB, LIKE is case-insensitive by default for most common collations. Given that LIKE is sufficient for the default case in MySQL/MariaDB, it's safer to restrict ILIKE usage to just PostgreSQL.
| _ILIKE_DIALECTS = frozenset({'postgresql', 'mysql', 'mariadb'}) | |
| _ILIKE_DIALECTS = frozenset({'postgresql'}) |
| tokens = [ | ||
| cleaned | ||
| for raw in query.split() | ||
| if raw.strip() | ||
| for cleaned in [re.sub(r'[^\w]', '', raw).lower()] | ||
| if cleaned | ||
| ] |
There was a problem hiding this comment.
This list comprehension is a bit dense and can be hard to read due to the nested for and the trick used to name an intermediate variable. It can be simplified for better readability and maintainability.
Using the walrus operator (available since Python 3.8, and your project requires >=3.9) makes the intent clearer and the code more concise.
tokens = [
cleaned
for raw in query.split()
if (cleaned := re.sub(r'[^\w]', '', raw).lower())
]|
Tracking issue: #99 |
Summary
DatabaseMemoryService, aBaseMemoryServicebacked by any SQLAlchemy async-compatible database (SQLite, PostgreSQL, MySQL, MariaDB)MemorySearchBackendABC andKeywordSearchBackend(LIKE/ILIKE, AND-first → OR-fallback tokenisation) for pluggable search strategiesBaseToolsubclasses ingoogle.adk_community.toolsMotivation
The existing
InMemoryMemoryServiceis volatile and test-only. Developers not using Vertex AI have no durable, self-hosted memory option.DatabaseMemoryServicefills that gap using SQLAlchemy async — the same pattern already used byDatabaseSessionServicein the core ADK.This PR was originally submitted to
google/adk-python(#4736) and redirected here by maintainer @rohityan.Changes
src/google/adk_community/memory/schemas/__init__.pysrc/google/adk_community/memory/schemas/memory_schema.pyadk_memory_entries,adk_scratchpad_kv,adk_scratchpad_log) with standaloneDeclarativeBaseand inlinedDynamicJSON/PreciseTimestamptypessrc/google/adk_community/memory/memory_search_backend.pyMemorySearchBackendABC +KeywordSearchBackendsrc/google/adk_community/memory/database_memory_service.pysrc/google/adk_community/tools/__init__.pysrc/google/adk_community/tools/scratchpad_tool.pyScratchpadGetTool,ScratchpadSetTool,ScratchpadAppendLogTool,ScratchpadGetLogTool+ singleton instancessrc/google/adk_community/memory/__init__.pytry/except ImportErrorpyproject.tomlsqlalchemyoptional extra (sqlalchemy[asyncio]>=2.0.0,aiosqlite>=0.19.0)tests/unittests/memory/test_database_memory_service.pyDesign notes
DeclarativeBase— no coupling to sessions schema; same DB can be sharedsession_id=''sentinel in scratchpad tables — user-level scope without nullable PK columnsasyncio.Lock(double-checked) — no explicitinitialize()call neededtry/except ImportErrorin__init__.py— SQLAlchemy + async driver are optional; users without them are unaffectedgoogle.adk_community.tools— avoids any changes to the coregoogle.adkpackageTest plan
All tests use
sqlite+aiosqlite:///:memory:— no external database required.uv sync --group dev pytest tests/unittests/memory/test_database_memory_service.py -v # 38 passedScenarios covered:
add_session_to_memory— filters empty events, persists content/author/timestampadd_events_to_memory— delta, skips duplicateevent_idadd_memory— directMemoryEntrypersist, auto-UUIDsearch_memory— AND match, OR fallback, empty query, no matchMemorySearchBackendis honouredValueErrorValueError