fix: recognize Bedrock inference profile ARNs for prompt caching#1883

Open
giulio-leone wants to merge 1 commit into strands-agents:main from
giulio-leone:fix/bedrock-arn-cache-support

Conversation

@giulio-leone
Contributor

Issue

Closes #1705

Problem

_cache_strategy only checked for 'claude' or 'anthropic' substrings in the model_id:

def _cache_strategy(self) -> str | None:
    model_id = self.config.get('model_id', '').lower()
    if 'claude' in model_id or 'anthropic' in model_id:
        return 'anthropic'
    return None

This works for system inference profile IDs (e.g. us.anthropic.claude-haiku-4-5-20251001-v1:0) but fails for application inference profile ARNs which have the form:

arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123def456

These ARNs don't contain 'claude' or 'anthropic', so automatic caching was silently disabled with a warning:

WARNING | model_id=<arn:aws:bedrock:...> | cache_config is enabled but this model does not support automatic caching

Solution

Added detection for inference profile ARNs: when model_id starts with arn: and contains inference-profile, optimistically enable caching.

Design Rationale

  • No extra API call: Avoided calling GetInferenceProfile to resolve the underlying model, which would add latency, require additional IAM permissions (bedrock:GetInferenceProfile), and create a network dependency on every converse call.
  • Safe default: Currently only Anthropic Claude models support prompt caching on Bedrock. For non-caching models, the cache point is silently ignored by the Converse API — no error is raised.
  • Covers both ARN types: Matches both application-inference-profile and inference-profile (cross-region) ARNs.

Testing

  • Added test_cache_strategy_anthropic_for_inference_profile_arn: Tests both application and cross-region inference profile ARNs
  • All 125 Bedrock tests pass (123 existing + 1 new with 2 assertions)
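A rough stand-alone sketch of what the new test asserts (the real test exercises `BedrockModel._cache_strategy`; the `cache_strategy` helper here is a hypothetical stand-in for the same string check):

```python
def cache_strategy(model_id: str):
    """Hypothetical stand-in for the model_id check in BedrockModel._cache_strategy."""
    model_id = model_id.lower()
    if model_id.startswith('arn:') and 'inference-profile' in model_id:
        return 'anthropic'
    if 'claude' in model_id or 'anthropic' in model_id:
        return 'anthropic'
    return None


def test_cache_strategy_anthropic_for_inference_profile_arn():
    # Application inference profile ARN: opaque id, no model name.
    app_arn = 'arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123def456'
    # Cross-region inference profile ARN.
    xr_arn = 'arn:aws:bedrock:us-east-1:123456789012:inference-profile/us.anthropic.claude-haiku-4-5-20251001-v1:0'
    assert cache_strategy(app_arn) == 'anthropic'
    assert cache_strategy(xr_arn) == 'anthropic'


test_cache_strategy_anthropic_for_inference_profile_arn()
```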

Changes

  • src/strands/models/bedrock.py: Extended _cache_strategy to detect inference profile ARNs
  • tests/strands/models/test_bedrock.py: Added test for ARN-based cache strategy detection

The _cache_strategy property only checked for 'claude' or 'anthropic'
substrings in the model_id.  This works for system inference profile
IDs (e.g. us.anthropic.claude-haiku-4-5-20251001-v1:0) but fails for
application inference profile ARNs which have the form:

  arn:aws:bedrock:<region>:<account>:application-inference-profile/<id>

These ARNs don't contain 'claude' or 'anthropic', so automatic
caching was silently disabled even when the user explicitly set
cache_config=CacheConfig(strategy='auto').

The fix detects ARNs that contain 'inference-profile' and
optimistically enables caching.  Currently only Anthropic Claude
models support prompt caching on Bedrock; for non-caching models the
cache point is silently ignored by the Converse API.

Closes strands-agents#1705
@giulio-leone
Contributor Author

Friendly ping: this adds recognition of Bedrock inference profile ARNs for prompt caching; the current check only matches model IDs that contain the model name.



Development

Successfully merging this pull request may close these issues.

[BUG] _supports_caching doesn't recognize Bedrock application inference profile ARNs
