
feat: add create_http_options to ContextCacheConfig for cache creation timeout #4702

Open
abhinavmaddineni wants to merge 5 commits into google:main from abhinavmaddineni:feat/async-cache-creation

Conversation

@abhinavmaddineni abhinavmaddineni commented Mar 4, 2026

Summary

  • Adds create_http_options: Optional[types.HttpOptions] to ContextCacheConfig — passed through to CreateCachedContentConfig when creating a cache
  • Allows users to set a timeout on CachedContent.create() calls, which can take 30-40s on Vertex AI
  • When timeout is exceeded, cache creation fails gracefully and the request proceeds without caching
  • Replaces the previous async_creation field which required global in-memory state
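The pass-through and fallback described in the bullets above can be sketched as follows. This is a minimal illustration, not the PR's implementation: `HttpOptions` and `ContextCacheConfig` here are simplified stand-ins for the real classes, and `create_cache_or_none` and `slow_create` are hypothetical names used only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class HttpOptions:
    """Hypothetical stand-in for google.genai types.HttpOptions."""

    timeout: Optional[int] = None  # milliseconds


@dataclass
class ContextCacheConfig:
    """Hypothetical stand-in holding only the field under discussion."""

    create_http_options: Optional[HttpOptions] = None


def create_cache_or_none(
    cache_config: Optional[ContextCacheConfig],
    create_fn: Callable[[Optional[HttpOptions]], str],
) -> Optional[str]:
    """Returns the cache name, or None so the caller can proceed uncached."""
    # Guard against a missing config before reading the optional field.
    http_options = cache_config.create_http_options if cache_config else None
    try:
        return create_fn(http_options)
    except Exception:  # for example, a timeout raised by the HTTP layer
        return None


def slow_create(http_options: Optional[HttpOptions]) -> str:
    """Simulates a backend that needs 30 s; any shorter timeout fails."""
    if (
        http_options is not None
        and http_options.timeout is not None
        and http_options.timeout < 30_000
    ):
        raise TimeoutError("cache creation timed out")
    return "cachedContents/example"
```

With a 10 s timeout the simulated creation fails and the helper returns `None`, which is the signal to continue the request without a cache. With no timeout configured, the call behaves as before.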

Usage

from google.adk.agents.context_cache_config import ContextCacheConfig
from google.genai import types

cache_config = ContextCacheConfig(
    cache_intervals=3,
    ttl_seconds=3600,
    min_tokens=1024,
    create_http_options=types.HttpOptions(timeout=10000),  # 10s in ms
)

Test plan

  • All existing cache tests pass (37/37)
  • New test: create_http_options is passed through to CreateCachedContentConfig
  • New test: cache creation without create_http_options works as before
  • Default behavior (create_http_options=None) is unchanged
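The two new tests in the plan above might look roughly like this sketch. It assumes a hypothetical `create_cache` helper and a mocked client whose `caches.create(model=..., config=...)` call stands in for the real cache creation request. The actual test names, fixtures, and config type in the PR may differ.

```python
from types import SimpleNamespace
from typing import Any, Optional
from unittest import mock


def create_cache(client: Any, model: str, cache_config: Optional[Any]) -> Any:
    """Hypothetical helper: forwards create_http_options only when it is set."""
    config: dict = {"ttl": "3600s"}
    if cache_config is not None and cache_config.create_http_options is not None:
        config["http_options"] = cache_config.create_http_options
    return client.caches.create(model=model, config=config)


def test_http_options_passed_through() -> None:
    client = mock.Mock()
    options = SimpleNamespace(timeout=10_000)
    create_cache(client, "some-model", SimpleNamespace(create_http_options=options))
    _, kwargs = client.caches.create.call_args
    # The exact object supplied by the user should reach the create call.
    assert kwargs["config"]["http_options"] is options


def test_default_omits_http_options() -> None:
    client = mock.Mock()
    create_cache(client, "some-model", SimpleNamespace(create_http_options=None))
    _, kwargs = client.caches.create.call_args
    # With the default of None, the create config is left untouched.
    assert "http_options" not in kwargs["config"]
```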

Fixes #4703

google-cla bot commented Mar 4, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant performance improvement for context caching by enabling asynchronous cache creation. By adding an async_creation option, the system can now defer potentially long-running cache generation tasks to background processes, allowing immediate requests to proceed without waiting. This change effectively eliminates substantial latency spikes previously caused by synchronous cache operations, particularly those involving slow external API calls, leading to a much smoother and more responsive user experience.

Highlights

  • Asynchronous Cache Creation: Introduced an async_creation option in ContextCacheConfig to enable non-blocking cache generation.
  • Latency Reduction: Implemented background asyncio.Task execution for CachedContent.create(), preventing current requests from being blocked by slow cache operations.
  • Uncached Fallback: Configured the system to allow requests to proceed uncached when a cache needs to be rebuilt, ensuring responsiveness.
  • Background Task Management: Added a module-level registry and utility functions to track, check, and clean up pending background cache creation tasks.


Changelog
  • src/google/adk/agents/context_cache_config.py
    • Added async_creation boolean field with a default of False to ContextCacheConfig.
    • Updated the __str__ method to include the new async_creation field in the string representation.
  • src/google/adk/models/gemini_context_cache_manager.py
    • Imported the asyncio module.
    • Introduced _pending_cache_tasks, a module-level dictionary to track background asyncio.Task instances for cache creation.
    • Added helper functions _cache_task_key, _check_pending_cache, and _cleanup_stale_tasks for managing the background task registry.
    • Modified handle_context_caching to check for and utilize completed background caches, and to launch new background cache creation tasks when async_creation is enabled and a cache needs to be rebuilt or created.
    • Implemented _launch_background_cache to create and manage asyncio.Task for background cache creation, including request snapshotting.
    • Added _snapshot_request to create a minimal, immutable copy of an LlmRequest for background tasks.
  • tests/unittests/agents/test_context_cache_config.py
    • Updated expected string representations in test_str_representation and test_str_representation_defaults to reflect the addition of the async_creation field.
Activity
  • No human activity has been recorded on this pull request yet.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@adk-bot
Collaborator

adk-bot commented Mar 4, 2026

Response from ADK Triaging Agent

Hello @abhinavmaddineni, thank you for creating this PR!

Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). You can find more information at https://cla.developers.google.com/.

Also, for a new feature, could you please create a GitHub issue and associate it with this PR?

This information will help reviewers to review your PR more efficiently. Thanks!

@adk-bot adk-bot added the core [Component] This issue is related to the core interface and implementation label Mar 4, 2026
CachedContent.create() API calls can take 30-40 seconds, blocking the
user's request when a cache needs to be recreated. This adds an
`async_creation` config option that defers cache creation to a background
asyncio task, letting the current request proceed uncached while the
cache is built for the next request.

When async_creation=False (default), behavior is completely unchanged.
@abhinavmaddineni abhinavmaddineni force-pushed the feat/async-cache-creation branch from 107ad40 to 452a9fa Compare March 4, 2026 15:51
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a valuable feature for asynchronous context cache creation, which should significantly improve latency in cache miss scenarios. The overall design, using a module-level task registry and snapshotting request data, is sound. However, I've identified a potential bug in the stale task cleanup logic that could lead to crashes and have suggested a more robust implementation. Additionally, I recommend adding unit tests for the new asynchronous pathways to ensure the long-term stability and maintainability of this complex but important feature.

Note: Security Review is unavailable for this PR.

…n timeout

Adds a `create_http_options` field to `ContextCacheConfig` that is passed
through to `CreateCachedContentConfig` when creating a cache. This allows
users to set a timeout (or other HTTP options) on the CachedContent.create()
call, which can take 30-40 seconds on Vertex AI.

When the timeout is exceeded, cache creation fails gracefully and the
request proceeds without caching.

Replaces the previous `async_creation` approach which required global
in-memory state that didn't scale across instances.

Fixes google#4703
@abhinavmaddineni abhinavmaddineni changed the title feat: add async_creation option to ContextCacheConfig feat: add create_http_options to ContextCacheConfig for cache creation timeout Mar 4, 2026
@abhinavmaddineni abhinavmaddineni marked this pull request as ready for review March 4, 2026 16:26
@rohityan rohityan self-assigned this Mar 4, 2026
@rohityan
Collaborator

rohityan commented Mar 4, 2026

Hi @abhinavmaddineni, thank you for your contribution! We appreciate you taking the time to submit this pull request.

  1. You need to fix the failing mypy-diff tests
  2. Fix failing unit tests.
  3. Fix formatting errors. You can use autoforma

@rohityan rohityan added the request clarification [Status] The maintainer need clarification or more information from the author label Mar 4, 2026
…matting

- Add null guard for cache_config before accessing create_http_options (mypy union-attr)
- Update test_runner_realistic_cache_config_scenario expected str to include new field
- Apply pyink formatting to comply with Google style
@abhinavmaddineni abhinavmaddineni force-pushed the feat/async-cache-creation branch from 9cbf285 to e1f12c8 Compare March 4, 2026 20:37
@abhinavmaddineni
Author

Hi @rohityan, thanks for the feedback! I've pushed a fix that should address all the CI failures (mypy, tests, formatting).

Regarding the feature itself: this adds an optional create_http_options field to ContextCacheConfig that lets users set a per-request timeout specifically for CachedContent.create() calls.

Why client-level http_options isn't sufficient: The Client(http_options=...) timeout applies globally to all API calls — both LLM generate/streaming calls and cache creation. Cache creation on Vertex AI can take 30-40s for large contexts, while LLM calls typically need a much shorter timeout. A single client-level timeout forces a tradeoff: set it high enough for cache creation, and you lose tight timeout protection on regular LLM calls. The per-request http_options on CreateCachedContentConfig exists precisely for this reason — our change simply exposes it through ContextCacheConfig so ADK users can take advantage of it.

Related issue: #4703
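The tradeoff described in this comment can be illustrated with a small sketch. It assumes that a per-request timeout, when set, takes precedence over the client-level one. `effective_timeout_ms` is a hypothetical helper written for this example, not part of ADK or the google-genai SDK.

```python
from typing import Optional


def effective_timeout_ms(
    client_timeout_ms: Optional[int],
    request_timeout_ms: Optional[int],
) -> Optional[int]:
    """Per-request value wins when set; otherwise the client-level value applies."""
    if request_timeout_ms is not None:
        return request_timeout_ms
    return client_timeout_ms


# Illustrative values only: a tight client-wide timeout for LLM calls and a
# separate, dedicated timeout for cache creation.
CLIENT_TIMEOUT_MS = 5_000
CACHE_CREATE_TIMEOUT_MS = 45_000

llm_call_timeout = effective_timeout_ms(CLIENT_TIMEOUT_MS, None)
cache_create_timeout = effective_timeout_ms(CLIENT_TIMEOUT_MS, CACHE_CREATE_TIMEOUT_MS)
```

Under this assumption, regular LLM calls keep the 5 s client timeout while cache creation gets its own budget, so neither setting constrains the other.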

Labels

  • core [Component] This issue is related to the core interface and implementation
  • request clarification [Status] The maintainer need clarification or more information from the author

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: add create_http_options to ContextCacheConfig for cache creation timeout control

3 participants