feat(memory): add DatabaseMemoryService with SQL backend and agent scratchpad #98

Open
Raman369AI wants to merge 2 commits into google:main from Raman369AI:feat/database-memory-service
@Raman369AI

Summary

  • Adds DatabaseMemoryService, a BaseMemoryService backed by any SQLAlchemy async-compatible database (SQLite, PostgreSQL, MySQL, MariaDB)
  • Adds MemorySearchBackend ABC and KeywordSearchBackend (LIKE/ILIKE, AND-first → OR-fallback tokenisation) for pluggable search strategies
  • Adds a scratchpad subsystem (KV store + append-only log) for intermediate agent working memory, exposed as four BaseTool subclasses in google.adk_community.tools
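The AND-first, OR-fallback strategy named above can be sketched roughly as follows. This is a minimal illustration using the standard-library sqlite3 module rather than the PR's async SQLAlchemy code; the `entries` table, the `tokenize` and `keyword_search` helpers, and the sample rows are all hypothetical.

```python
import re
import sqlite3


def tokenize(query: str) -> list[str]:
    """Lowercase each whitespace-separated word and strip non-word characters."""
    return [t for raw in query.split() if (t := re.sub(r"[^\w]", "", raw).lower())]


def keyword_search(conn: sqlite3.Connection, query: str) -> list[str]:
    """Return rows containing every token; if none do, rows containing any token."""
    tokens = tokenize(query)
    if not tokens:
        return []  # an empty query matches nothing
    params = [f"%{t}%" for t in tokens]
    for joiner in (" AND ", " OR "):  # strict pass first, relaxed pass second
        where = joiner.join("content LIKE ?" for _ in tokens)
        rows = conn.execute(
            f"SELECT content FROM entries WHERE {where} ORDER BY id", params
        ).fetchall()
        if rows:
            return [row[0] for row in rows]
    return []


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, content TEXT)")
conn.executemany(
    "INSERT INTO entries (content) VALUES (?)",
    [("Alice likes green tea",), ("Bob likes black coffee",), ("Carol likes hiking",)],
)

and_hits = keyword_search(conn, "green tea")   # both tokens appear in one row
or_hits = keyword_search(conn, "tea coffee")   # no row has both, so OR applies
```

Only the WHERE clause shape is built dynamically; token values are always passed as bound parameters.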

Motivation

The existing InMemoryMemoryService is volatile and test-only. Developers not using Vertex AI have no durable, self-hosted memory option. DatabaseMemoryService fills that gap using SQLAlchemy async — the same pattern already used by DatabaseSessionService in the core ADK.

This PR was originally submitted to google/adk-python (#4736) and redirected here by maintainer @rohityan.

Changes

| File | Action |
| --- | --- |
| src/google/adk_community/memory/schemas/__init__.py | New: package marker |
| src/google/adk_community/memory/schemas/memory_schema.py | New: ORM tables (adk_memory_entries, adk_scratchpad_kv, adk_scratchpad_log) with standalone DeclarativeBase and inlined DynamicJSON/PreciseTimestamp types |
| src/google/adk_community/memory/memory_search_backend.py | New: MemorySearchBackend ABC + KeywordSearchBackend |
| src/google/adk_community/memory/database_memory_service.py | New: main service class |
| src/google/adk_community/tools/__init__.py | New: package marker + exports |
| src/google/adk_community/tools/scratchpad_tool.py | New: ScratchpadGetTool, ScratchpadSetTool, ScratchpadAppendLogTool, ScratchpadGetLogTool + singleton instances |
| src/google/adk_community/memory/__init__.py | Modified: exports three new public symbols, guarded by try/except ImportError |
| pyproject.toml | Modified: adds sqlalchemy optional extra (sqlalchemy[asyncio]>=2.0.0, aiosqlite>=0.19.0) |
| tests/unittests/memory/test_database_memory_service.py | New: 38 unit tests |

Design notes

  • Standalone DeclarativeBase — no coupling to sessions schema; same DB can be shared
  • session_id='' sentinel in scratchpad tables — user-level scope without nullable PK columns
  • Lazy table creation behind asyncio.Lock (double-checked) — no explicit initialize() call needed
  • try/except ImportError in __init__.py — SQLAlchemy + async driver are optional; users without them are unaffected
  • Scratchpad tools live in google.adk_community.tools — avoids any changes to the core google.adk package
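For readers unfamiliar with the double-checked pattern behind the lazy table creation, here is a minimal sketch. The `LazyTables` class and its `_create_tables` stub are hypothetical stand-ins; the real service would run its schema creation against the async engine at that point.

```python
import asyncio


class LazyTables:
    """Hypothetical stand-in showing double-checked lazy initialisation."""

    def __init__(self) -> None:
        self._ready = False
        self._lock = asyncio.Lock()
        self.create_calls = 0  # instrumentation for the demo only

    async def _create_tables(self) -> None:
        # Placeholder for the real schema creation; the sleep forces the
        # concurrent callers below to overlap.
        await asyncio.sleep(0.01)
        self.create_calls += 1

    async def ensure_ready(self) -> None:
        if self._ready:  # first check: fast path, no lock taken
            return
        async with self._lock:
            if self._ready:  # second check: another task finished while we waited
                return
            await self._create_tables()
            self._ready = True


async def main() -> int:
    svc = LazyTables()
    # Ten concurrent first calls should trigger exactly one table creation.
    await asyncio.gather(*(svc.ensure_ready() for _ in range(10)))
    return svc.create_calls


create_calls = asyncio.run(main())
```

The first check keeps the steady-state cost to a single attribute read; the second check, made while holding the lock, stops tasks that queued during creation from repeating it.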

Test plan

All tests use sqlite+aiosqlite:///:memory: — no external database required.

uv sync --group dev
pytest tests/unittests/memory/test_database_memory_service.py -v
# 38 passed

Scenarios covered:

  • add_session_to_memory — filters empty events, persists content/author/timestamp
  • Re-ingest same session is idempotent (no duplicates)
  • add_events_to_memory — delta, skips duplicate event_id
  • add_memory — direct MemoryEntry persist, auto-UUID
  • search_memory — AND match, OR fallback, empty query, no match
  • Scratchpad KV — set/get/overwrite/delete/list, JSON types, session scoping
  • Scratchpad log — append/get, tag filter, limit, session scoping
  • Multi-user isolation — user A results do not leak to user B
  • Custom MemorySearchBackend is honoured
  • All 4 scratchpad tools — happy path and wrong-service ValueError
  • Engine construction errors raise ValueError
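As an illustration of the idempotent re-ingest and duplicate event_id scenarios, the sketch below shows one way such behaviour can be obtained. It is an assumption, not the PR's implementation: it uses sqlite3 with `INSERT OR IGNORE` on a hypothetical `(app_name, user_id, event_id)` primary key, whereas the actual service may deduplicate differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE memory_entries (
        app_name TEXT NOT NULL,
        user_id  TEXT NOT NULL,
        event_id TEXT NOT NULL,
        content  TEXT NOT NULL,
        PRIMARY KEY (app_name, user_id, event_id)
    )
    """
)


def ingest(conn, app_name, user_id, events):
    """Persist (event_id, text) pairs, skipping empty text and known event IDs."""
    for event_id, text in events:
        if not text.strip():
            continue  # filter events with no content
        conn.execute(
            "INSERT OR IGNORE INTO memory_entries VALUES (?, ?, ?, ?)",
            (app_name, user_id, event_id, text),
        )
    conn.commit()


def count(conn, user_id):
    return conn.execute(
        "SELECT COUNT(*) FROM memory_entries WHERE user_id = ?", (user_id,)
    ).fetchone()[0]


session_events = [("e1", "hello"), ("e2", "   "), ("e3", "world")]
ingest(conn, "app", "alice", session_events)
after_first = count(conn, "alice")

ingest(conn, "app", "alice", session_events)  # re-ingest the same session
after_second = count(conn, "alice")

ingest(conn, "app", "alice", [("e3", "world"), ("e4", "again")])  # delta ingest
after_delta = count(conn, "alice")
other_user = count(conn, "bob")  # a different user sees none of alice's rows
```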

…ratchpad

Adds a durable, RDBMS-backed memory service compatible with any
SQLAlchemy-supported async database (SQLite, PostgreSQL, MySQL, MariaDB)
as a community alternative to the volatile InMemoryMemoryService.

Key additions:
- DatabaseMemoryService: implements BaseMemoryService with lazy table
  creation, idempotent session ingest, and delta event ingestion
- MemorySearchBackend ABC + KeywordSearchBackend: LIKE/ILIKE search
  with AND-first → OR-fallback tokenisation strategy
- Scratchpad KV store and append-only log for intermediate agent state
- Four agent-callable BaseTool subclasses in adk_community.tools:
  ScratchpadGetTool, ScratchpadSetTool, ScratchpadAppendLogTool,
  ScratchpadGetLogTool (plus ready-to-use singleton instances)
- 38 unit tests covering all methods, tool happy-paths, wrong-service
  errors, multi-user isolation, and session scoping
- Optional sqlalchemy extra in pyproject.toml
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a robust, durable memory solution for the ADK, moving beyond volatile in-memory storage. It provides a flexible SQL-backed service that integrates with various databases and includes a dedicated scratchpad for agents' intermediate working memory. This enhancement allows for persistent storage of agent interactions and internal states, improving the reliability and capabilities of ADK agents by enabling them to retain information across sessions and restarts.

Highlights

  • Durable SQL-backed Memory Service: Introduced DatabaseMemoryService for persistent memory storage, supporting various SQLAlchemy async-compatible databases like SQLite, PostgreSQL, MySQL, and MariaDB.
  • Pluggable Memory Search Backends: Implemented a MemorySearchBackend abstract base class and a KeywordSearchBackend for flexible and efficient searching of stored memories using LIKE/ILIKE operations.
  • Agent Scratchpad Subsystem: Added a comprehensive scratchpad feature for agents, comprising a key-value store and an append-only log, accessible through four new BaseTool subclasses.
  • Optional Dependencies and Lazy Initialization: Ensured that SQLAlchemy and async database drivers are optional dependencies, preventing ImportError for users who do not require database memory. Database tables are created lazily upon first access.
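The optional-dependency guard can be sketched as below. The PR describes a try/except ImportError around the new imports in `__init__.py`; this version wraps the same idea in a hypothetical `build_exports` helper so it runs anywhere, with `json` standing in for an installed dependency and a deliberately nonexistent module name standing in for a missing driver.

```python
import importlib


def build_exports(always, optional):
    """Return the export list, adding optional symbols only if their module imports.

    `optional` maps a symbol name to the module it depends on.
    """
    exports = list(always)
    for symbol, module_name in optional.items():
        try:
            importlib.import_module(module_name)
        except ImportError:
            # Dependency not installed: leave the symbol out instead of failing
            # the import of the whole package.
            continue
        exports.append(symbol)
    return exports


exports = build_exports(
    always=["AlwaysAvailableService"],  # hypothetical name
    optional={
        "StdlibBackedService": "json",  # stand-in for an installed extra
        "SqlBackedService": "hypothetical_missing_sql_driver",  # not installed
    },
)
```

Catching `ImportError` also covers `ModuleNotFoundError`, which is a subclass of it.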

Changelog
  • pyproject.toml
    • Added sqlalchemy[asyncio] and aiosqlite as optional dependencies for the new database memory service.
  • src/google/adk_community/memory/__init__.py
    • Modified to export DatabaseMemoryService, KeywordSearchBackend, and MemorySearchBackend.
    • Wrapped new imports in a try/except ImportError block to make database dependencies optional.
  • src/google/adk_community/memory/database_memory_service.py
    • Added the DatabaseMemoryService class, implementing a SQL-backed memory service with support for various databases.
    • Included methods for adding sessions, events, and raw memory entries.
    • Implemented search_memory utilizing a pluggable search backend.
    • Provided set_scratchpad, get_scratchpad, delete_scratchpad, and list_scratchpad_keys for key-value scratchpad operations.
    • Added append_log and get_log for the append-only scratchpad log.
    • Incorporated lazy table creation and session management.
  • src/google/adk_community/memory/memory_search_backend.py
    • Added MemorySearchBackend as an abstract base class for defining memory search strategies.
    • Implemented KeywordSearchBackend for performing keyword searches with AND/OR logic using LIKE/ILIKE operators.
  • src/google/adk_community/memory/schemas/__init__.py
    • Created an empty __init__.py file to establish the schemas directory as a Python package.
  • src/google/adk_community/memory/schemas/memory_schema.py
    • Defined SQLAlchemy ORM models: StorageMemoryEntry, StorageScratchpadKV, and StorageScratchpadLog.
    • Introduced custom SQLAlchemy types DynamicJSON for flexible JSON storage and PreciseTimestamp for microsecond precision timestamps.
  • src/google/adk_community/tools/__init__.py
    • Created __init__.py to establish the tools directory as a Python package.
    • Exported the new scratchpad tool classes and their singleton instances.
  • src/google/adk_community/tools/scratchpad_tool.py
    • Implemented four BaseTool subclasses: ScratchpadGetTool, ScratchpadSetTool, ScratchpadAppendLogTool, and ScratchpadGetLogTool.
    • Provided helper functions to retrieve the DatabaseMemoryService from the tool context and determine session scope.
    • Ensured tools raise ValueError if the agent's memory service is not DatabaseMemoryService.
  • tests/unittests/memory/test_database_memory_service.py
    • Added 38 unit tests covering the full functionality of DatabaseMemoryService.
    • Included tests for add_session_to_memory, add_events_to_memory, add_memory, and search_memory.
    • Verified scratchpad KV operations (set, get, overwrite, delete, list) and JSON type handling.
    • Tested scratchpad log operations (append, get, tag filtering, limit).
    • Validated multi-user isolation for both memory and scratchpad.
    • Included tests for custom search backends and engine construction error handling.
    • Added tests for all four scratchpad BaseTool subclasses, including happy paths and wrong-service type errors.
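The wrong-service check described for the scratchpad tools might look roughly like the following. Every class here is a local stand-in defined only for this sketch, and `require_database_memory` is a hypothetical helper name; how the real tools obtain the memory service from the tool context is not shown in this summary and is not modelled here.

```python
class BaseMemoryService:
    """Local stand-in for the ADK base class."""


class DatabaseMemoryService(BaseMemoryService):
    """Local stand-in for the SQL-backed service."""


class InMemoryMemoryService(BaseMemoryService):
    """Local stand-in for the volatile service."""


def require_database_memory(service: BaseMemoryService) -> DatabaseMemoryService:
    """Return the service if it supports the scratchpad, else raise ValueError."""
    if not isinstance(service, DatabaseMemoryService):
        raise ValueError(
            "Scratchpad tools require DatabaseMemoryService, got "
            f"{type(service).__name__}"
        )
    return service


db_service = DatabaseMemoryService()
accepted = require_database_memory(db_service)

try:
    require_database_memory(InMemoryMemoryService())
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
```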
Activity
  • The pull request was initially submitted to google/adk-python (#4736) and was subsequently redirected to this repository by maintainer @rohityan.
  • The author provided a comprehensive test plan, detailing the steps to verify the changes, including uv sync --group dev and pytest tests/unittests/memory/test_database_memory_service.py -v.
  • The test plan outlined 11 scenario groups covered by the unit tests, ensuring thorough validation of the new features.
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a significant and well-designed feature: a DatabaseMemoryService that provides durable memory for ADK agents using SQLAlchemy, along with a scratchpad mechanism and associated tools. The implementation is of high quality, with good attention to database-specific details, optional dependencies, and comprehensive test coverage.

My review has identified a couple of potential issues. There's a race condition in the scratchpad's set operation that could occur under concurrent writes to a new key. Additionally, the keyword search backend treats MySQL and MariaDB as ILIKE dialects even though the ILIKE operator is specific to PostgreSQL. I've also included a minor suggestion to improve code readability in the search tokenization logic.

Overall, this is an excellent contribution that fills an important gap in the ADK. The identified issues should be addressed to ensure robustness and correctness across different database backends.

Comment on lines +380 to +395
async with self._session() as sql:
    existing = await sql.get(
        StorageScratchpadKV, (app_name, user_id, session_id, key)
    )
    if existing is not None:
        existing.value_json = value
    else:
        sql.add(
            StorageScratchpadKV(
                app_name=app_name,
                user_id=user_id,
                session_id=session_id,
                key=key,
                value_json=value,
            )
        )

high

The current implementation of set_scratchpad uses a "read-then-write" pattern (get then add or update). This is not atomic and can lead to a race condition if two concurrent calls try to set a value for the same new key. Both calls would find that the key doesn't exist and attempt to insert a new row, leading to an IntegrityError for one of them.

To make this operation atomic and robust against concurrency, you should use an "UPSERT" operation. SQLAlchemy supports this via dialect-specific on_conflict_do_update (for PostgreSQL/SQLite) or on_duplicate_key_update (for MySQL) constructs. This would avoid the race condition and make the method more robust.
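To make the UPSERT idea concrete, here is a minimal sketch using the standard-library sqlite3 module and raw SQL instead of the SQLAlchemy dialect constructs named above. The table layout mirrors the four-column key in the snippet under review, but the table name, helper functions, and JSON encoding are assumptions for illustration. SQLite needs version 3.24 or later for `ON CONFLICT ... DO UPDATE`.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE scratchpad_kv (
        app_name   TEXT NOT NULL,
        user_id    TEXT NOT NULL,
        session_id TEXT NOT NULL,  -- '' means user-level scope
        "key"      TEXT NOT NULL,
        value_json TEXT NOT NULL,
        PRIMARY KEY (app_name, user_id, session_id, "key")
    )
    """
)

# A single statement inserts or overwrites, so there is no window between a
# read and a write in which a concurrent writer could insert the same key.
UPSERT_SQL = """
    INSERT INTO scratchpad_kv (app_name, user_id, session_id, "key", value_json)
    VALUES (?, ?, ?, ?, ?)
    ON CONFLICT (app_name, user_id, session_id, "key")
    DO UPDATE SET value_json = excluded.value_json
"""


def set_scratchpad(conn, app_name, user_id, session_id, key, value):
    conn.execute(UPSERT_SQL, (app_name, user_id, session_id, key, json.dumps(value)))
    conn.commit()


def get_scratchpad(conn, app_name, user_id, session_id, key):
    row = conn.execute(
        'SELECT value_json FROM scratchpad_kv WHERE app_name = ? AND user_id = ? '
        'AND session_id = ? AND "key" = ?',
        (app_name, user_id, session_id, key),
    ).fetchone()
    return None if row is None else json.loads(row[0])


set_scratchpad(conn, "app", "alice", "", "plan", {"step": 1})
set_scratchpad(conn, "app", "alice", "", "plan", {"step": 2})    # overwrite
set_scratchpad(conn, "app", "alice", "s1", "plan", {"step": 9})  # session scope

user_level = get_scratchpad(conn, "app", "alice", "", "plan")
session_level = get_scratchpad(conn, "app", "alice", "s1", "plan")
row_count = conn.execute("SELECT COUNT(*) FROM scratchpad_kv").fetchone()[0]
```

With SQLAlchemy the equivalent statement would be built per dialect, which means the service would need to branch on the engine's dialect name.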

if TYPE_CHECKING:
    from sqlalchemy.ext.asyncio import AsyncSession

_ILIKE_DIALECTS = frozenset({'postgresql', 'mysql', 'mariadb'})

high

The _ILIKE_DIALECTS set includes 'mysql' and 'mariadb'. However, the ILIKE operator is specific to PostgreSQL. On other dialects, SQLAlchemy's col.ilike() does not emit ILIKE; it falls back to lower(col) LIKE lower(pattern), so listing MySQL and MariaDB here adds a lower() call on both sides without enabling a native operator.

For MySQL and MariaDB, LIKE is case-insensitive by default for most common collations. Given that LIKE is sufficient for the default case in MySQL/MariaDB, it's simpler to restrict ILIKE usage to just PostgreSQL.

Suggested change
- _ILIKE_DIALECTS = frozenset({'postgresql', 'mysql', 'mariadb'})
+ _ILIKE_DIALECTS = frozenset({'postgresql'})
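A small sketch of the dialect-based choice, under stated assumptions: `like_operator` is a hypothetical helper, not the PR's code, and the SQLite query at the end only shows that plain LIKE already ignores ASCII case on that backend. It says nothing about a particular MySQL or MariaDB collation, which would need checking against the target database.

```python
import sqlite3

# Mirrors the suggested one-element set; the name is reused for illustration.
ILIKE_DIALECTS = frozenset({"postgresql"})


def like_operator(dialect_name: str) -> str:
    """Pick the SQL operator for a case-insensitive substring match."""
    return "ILIKE" if dialect_name in ILIKE_DIALECTS else "LIKE"


operators = {
    name: like_operator(name)
    for name in ("postgresql", "mysql", "mariadb", "sqlite")
}

# SQLite's LIKE is case-insensitive for ASCII characters by default, so no
# ILIKE equivalent is needed there.
conn = sqlite3.connect(":memory:")
sqlite_like_ignores_case = conn.execute(
    "SELECT 'Green Tea' LIKE '%green%'"
).fetchone()[0]
```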

Comment on lines +88 to +94
tokens = [
    cleaned
    for raw in query.split()
    if raw.strip()
    for cleaned in [re.sub(r'[^\w]', '', raw).lower()]
    if cleaned
]

medium

This list comprehension is a bit dense and can be hard to read due to the nested for and the trick used to name an intermediate variable. It can be simplified for better readability and maintainability.

Using the walrus operator (available since Python 3.8, and your project requires >=3.9) makes the intent clearer and the code more concise.

    tokens = [
        cleaned
        for raw in query.split()
        if (cleaned := re.sub(r'[^\w]', '', raw).lower())
    ]
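A quick way to check that the two forms agree is to run both over a few inputs. The wrapper function names and sample strings below are illustrative only; the comprehension bodies are taken from the snippets in this thread.

```python
import re


def tokens_nested(query: str) -> list[str]:
    """Original form: nested `for` over a one-element list names the cleaned value."""
    return [
        cleaned
        for raw in query.split()
        if raw.strip()
        for cleaned in [re.sub(r'[^\w]', '', raw).lower()]
        if cleaned
    ]


def tokens_walrus(query: str) -> list[str]:
    """Suggested form: the walrus operator binds and filters in one step."""
    return [
        cleaned
        for raw in query.split()
        if (cleaned := re.sub(r'[^\w]', '', raw).lower())
    ]


samples = ["Hello, World!", "  spaced   out  ", "--- !!!", "snake_case v2.0", ""]
same = all(tokens_nested(s) == tokens_walrus(s) for s in samples)
hello = tokens_walrus("Hello, World!")
mixed = tokens_walrus("snake_case v2.0")
```

Dropping the `if raw.strip()` guard is safe because `str.split()` with no arguments never yields empty or whitespace-only strings.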

@Raman369AI

Tracking issue: #99
