feat(strands-memory): add event metadata support to AgentCoreMemorySessionManager #339

Merged
tejaskash merged 4 commits into main from worktree-pr1-metadata-support on Mar 16, 2026
@tejaskash tejaskash commented Mar 13, 2026

Summary

Adds user-supplied event metadata support to AgentCoreMemorySessionManager (Phase 1 of #149).

Static metadata:

  • New default_metadata field on AgentCoreMemoryConfig — attaches custom key-value metadata to every message event

Dynamic metadata (for traceId / Langfuse integration):

  • New metadata_provider field: a callable invoked at each event creation, so it can return per-invocation values (e.g. the current traceId). This is needed because Strands controls the append_message → create_message call path, so users can't pass per-call kwargs through agent().
  • Merge precedence: default_metadata < metadata_provider() < per-call metadata kwarg < internal keys
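
To make the precedence order concrete, here is a minimal sketch of a layered merge. It is not the PR's `_build_metadata()` implementation; the function name `merge_metadata` and its parameter names are assumptions chosen for illustration, and only the ordering of the layers is taken from the description above.

```python
from typing import Any, Callable, Dict, Optional

Metadata = Dict[str, Any]


def merge_metadata(
    default_metadata: Optional[Metadata] = None,
    metadata_provider: Optional[Callable[[], Metadata]] = None,
    per_call_metadata: Optional[Metadata] = None,
    internal_metadata: Optional[Metadata] = None,
) -> Metadata:
    """Merge metadata layers; later layers override earlier ones for the same key."""
    merged: Metadata = {}
    # Lowest precedence: static defaults from the config.
    merged.update(default_metadata or {})
    # Next: values computed at event-creation time.
    if metadata_provider is not None:
        merged.update(metadata_provider() or {})
    # Next: metadata passed explicitly for this call.
    merged.update(per_call_metadata or {})
    # Highest precedence: keys the session manager sets itself.
    merged.update(internal_metadata or {})
    return merged


result = merge_metadata(
    default_metadata={"project": {"stringValue": "atlas"}, "env": {"stringValue": "dev"}},
    metadata_provider=lambda: {"env": {"stringValue": "staging"}, "traceId": {"stringValue": "t-1"}},
    per_call_metadata={"traceId": {"stringValue": "t-2"}},
    internal_metadata={"agentId": {"stringValue": "agent-1"}},
)
```

In this sketch `env` ends up with the provider's value, `traceId` with the per-call value, and `project` keeps its default because no later layer sets it.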

Infrastructure:

  • _build_metadata() helper with validation: rejects reserved keys (stateType, agentId), enforces 15-key API limit
  • Refactors internal _message_buffer from raw tuple to BufferedMessage NamedTuple for clarity and extensibility
  • Metadata flows through both immediate-send and batched flush paths
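
The validation rules can be sketched roughly as follows. This is an illustration only: `validate_user_metadata` and `MAX_METADATA_KEYS` are hypothetical names, and the sketch assumes the 15-key limit is checked against the dict it is given (whether internal keys count toward the limit is not stated here).

```python
from typing import Any, Dict

# Reserved key names come from the PR description; the constant names are assumptions.
RESERVED_METADATA_KEYS = frozenset({"stateType", "agentId"})
MAX_METADATA_KEYS = 15


def validate_user_metadata(metadata: Dict[str, Any]) -> None:
    """Raise ValueError if metadata uses reserved keys or exceeds the key limit."""
    reserved = RESERVED_METADATA_KEYS.intersection(metadata)
    if reserved:
        raise ValueError(f"Metadata contains reserved keys: {sorted(reserved)}")
    if len(metadata) > MAX_METADATA_KEYS:
        raise ValueError(
            f"Metadata has {len(metadata)} keys; the limit is {MAX_METADATA_KEYS}"
        )


# Valid metadata passes silently.
validate_user_metadata({"project": {"stringValue": "atlas"}})

# Collect the error messages raised for two invalid inputs.
errors = []
bad_inputs = (
    {"stateType": {"stringValue": "x"}},
    {f"k{i}": {"stringValue": str(i)} for i in range(16)},
)
for bad in bad_inputs:
    try:
        validate_user_metadata(bad)
    except ValueError as exc:
        errors.append(str(exc))
```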

Usage example (Langfuse traceId)

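from langfuse.decorators import langfuse_context
from strands import Agent

from bedrock_agentcore.memory.integrations.strands.config import AgentCoreMemoryConfig
from bedrock_agentcore.memory.integrations.strands.session_manager import AgentCoreMemorySessionManager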

config = AgentCoreMemoryConfig(
    memory_id=MEM_ID,
    session_id=SESSION_ID,
    actor_id=ACTOR_ID,
    metadata_provider=lambda: {
        "traceId": {"stringValue": langfuse_context.get_current_trace_id() or ""}
    },
)
sm = AgentCoreMemorySessionManager(agentcore_memory_config=config, region_name="us-east-1")
agent = Agent(session_manager=sm)
agent("Hello!")  # Event gets the current traceId automatically

Test plan

  • 11 new unit tests in TestMetadataSupport (default metadata, per-call, merge precedence, reserved keys, max keys, no-metadata, batched, blob, provider called per event, provider merge with defaults, provider reserved keys rejected)
  • 3 new integration tests with positive/negative filter assertions (metadata round-trip, session resume, dynamic traceId with disjoint event sets)
  • 123 existing unit tests pass unchanged (buffer tuple → BufferedMessage migration)
  • Full suite: 1098 tests pass, 0 failures

Related

…ntCoreMemorySessionManager

Allow users to attach custom key-value metadata to conversation events
via a new `default_metadata` config field and per-call `metadata` kwarg.
Metadata is merged (internal > per-call > config defaults) and validated
against reserved keys and the 15-key API limit.

Also refactors the internal message buffer from a raw tuple to a
`BufferedMessage` NamedTuple for clarity and extensibility.

Closes #149 (Phase 1: Metadata)
Default is "user_context".
filter_restored_tool_context: When True, strip historical toolUse/toolResult blocks from
restored messages before loading them into Strands runtime memory. Default is False.
default_metadata: Optional default metadata key-value pairs to attach to every message event.
Contributor
Does this actually solve the customer's ask?
What if they need different metadata for each message event? Also, what exactly do they mean by "message_event": are they referring to memory records, AgentCore Memory events, or individual conversation turns? Are they trying to attach a distinct metadata field to each conversation turn?

Contributor
are we sure this is the interface the customer is looking for? Could we ask them to send an example code block of the support they want?

Contributor Author
Yes, it solves the ask. metadata_provider is a callable invoked at each create_message(), so each event gets whatever traceId is current at that moment. We have an integration test confirming two invocations with different traceIds produce disjoint, independently filterable event sets.

Per-turn metadata works out of the box with batch_size=1 (the default) since each turn = its own event. With batch_size > 1 multiple turns collapse into one event, so the last traceId wins — but that's an inherent tradeoff of batching, not a metadata limitation.

The customer is talking about STM events, not LTM records (those are extracted async and don't carry event metadata).

Contributor Author
Yes — the customer's use case is tagging events with a traceId from Langfuse that changes per invocation. metadata_provider (a callable) gives them exactly that. We have an integ test that confirms two invocations with different traceIds produce disjoint event sets filterable by list_events.
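
The per-invocation behavior described in this thread can be illustrated with a small stand-alone sketch. Nothing here calls the real session manager or Langfuse: `current_trace_id`, `create_event`, and `list_events_by_trace` are hypothetical stand-ins, and an in-memory list plays the role of the event store. The point is only that a provider invoked at event-creation time picks up whatever trace ID is current.

```python
import contextvars
from typing import Any, Dict, List

# Stand-in for a tracing library's "current trace" context (an assumption).
current_trace_id: contextvars.ContextVar[str] = contextvars.ContextVar(
    "current_trace_id", default=""
)


def metadata_provider() -> Dict[str, Any]:
    """Return metadata reflecting the trace that is active right now."""
    return {"traceId": {"stringValue": current_trace_id.get()}}


events: List[Dict[str, Any]] = []  # in-memory stand-in for stored events


def create_event(text: str) -> None:
    # The provider is called here, at creation time, not at config time.
    events.append({"text": text, "metadata": metadata_provider()})


def list_events_by_trace(trace_id: str) -> List[Dict[str, Any]]:
    return [e for e in events if e["metadata"]["traceId"]["stringValue"] == trace_id]


# First invocation.
current_trace_id.set("trace-1")
create_event("Hello!")
create_event("Hi, how can I help?")

# Second invocation with a different trace.
current_trace_id.set("trace-2")
create_event("Second invocation")

first = list_events_by_trace("trace-1")
second = list_events_by_trace("trace-2")
```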

…n metadata

Add `metadata_provider` config field — a callable invoked at each event
creation, enabling dynamic metadata like traceId that changes per
agent invocation. This solves the Langfuse/user-feedback use case where
a static `default_metadata` is insufficient because Strands controls
the append_message → create_message call path.

Merge precedence: default_metadata < metadata_provider() < per-call kwargs < internal keys.
session_id=SESSION_ID,
actor_id=ACTOR_ID,
default_metadata={
"project": {"stringValue": "atlas"},
Contributor
would we be able to build this map on behalf of the customer? It feels very verbose.

Ex:
{"project" : "atlas"} --> {"project" : { "stringValue": "atlas"}}

Contributor Author
Good call. Added auto-normalization — plain strings are now auto-wrapped:

{"project": "atlas"}  →  {"project": {"stringValue": "atlas"}}

Both forms accepted. Updated the README examples to use the simpler format.
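
A minimal sketch of this kind of auto-wrapping, assuming only plain `str` values are wrapped and everything else is passed through untouched. The function below is written for illustration and is not copied from the PR.

```python
from typing import Any, Dict


def normalize_metadata(metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap plain string values as {"stringValue": ...}; leave other values as-is."""
    normalized: Dict[str, Any] = {}
    for key, value in metadata.items():
        if isinstance(value, str):
            normalized[key] = {"stringValue": value}
        else:
            # Already in the verbose form (or some other shape): pass through.
            normalized[key] = value
    return normalized


short_form = normalize_metadata({"project": "atlas"})
verbose_form = normalize_metadata({"project": {"stringValue": "atlas"}})
```

Both calls produce the same dict, so callers can mix the two forms freely.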

RESERVED_METADATA_KEYS = frozenset({STATE_TYPE_KEY, AGENT_ID_KEY})


class BufferedMessage(NamedTuple):
Contributor
nice, I agree with the decision to add some structure here.
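
For readers unfamiliar with the pattern, here is a small sketch of what moving from a raw tuple to a NamedTuple buys. The class name `BufferedMessageSketch` and its fields are hypothetical; the real `BufferedMessage` fields are not shown in this thread.

```python
from typing import Any, Dict, NamedTuple, Optional


class BufferedMessageSketch(NamedTuple):
    """Hypothetical stand-in; field names are assumptions for illustration."""

    session_id: str
    text: str
    metadata: Optional[Dict[str, Any]] = None


# Raw-tuple style: callers must remember what each position means.
raw_entry = ("session-1", "Hello!", None)
raw_text = raw_entry[1]

# NamedTuple style: fields are self-describing and new ones can have defaults.
entry = BufferedMessageSketch(session_id="session-1", text="Hello!")
named_text = entry.text

# A NamedTuple is still a tuple, so positional unpacking keeps working.
session_id, text, metadata = entry
```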


def test_metadata_reserved_keys_rejected(self, session_manager):
"""ValueError raised when user metadata contains reserved keys."""
from bedrock_agentcore.memory.integrations.strands.session_manager import RESERVED_METADATA_KEYS
Contributor
nit: personally still new to python, but in most languages I'm used to seeing imports at the top unless we have a strong reason not to. lmk if python convention is different.

Contributor Author
Moved all inline from datetime import ... to the top of the test file. You're right — top-level is the Python convention too.

flush_interval_seconds: Optional[float] = Field(default=None, gt=0)
context_tag: str = Field(default="user_context", min_length=1)
filter_restored_tool_context: bool = Field(default=False)
default_metadata: Optional[Dict[str, Any]] = None
Contributor
Is there a reason we need Any here instead of the MetadataValue used internally?

Contributor Author
Tried switching to MetadataValue (a TypedDict) but Pydantic on Python < 3.12 rejects TypedDict from typing in model fields. Kept Any in the type annotation but added a field_validator that normalizes plain strings at config construction time, and normalize_metadata() at runtime for metadata_provider output. So the internal plumbing always gets the right shape regardless of what the user passes.

with `default_metadata` and `metadata_provider` (per-call values override both for the same key):

```python
session_manager.create_message(
```
Contributor
what happens when we flush messages in a batch and the metadata is different on each message? Does all the metadata get merged?

Contributor Author
When messages in a batch have different metadata, the metadata dicts are merged (later message's keys override earlier ones for the same key). So with batch_size > 1, the last value for each key wins in the combined event. This is documented in the batching tradeoff — with batch_size=1 (the default) each turn gets its own event with its own metadata, so no merging occurs.
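
The last-value-wins behavior described here can be sketched as a simple fold over the per-message metadata dicts. The helper name `merge_batch_metadata` is an assumption, and the sketch does not reproduce the actual flush code.

```python
from typing import Any, Dict, List, Optional


def merge_batch_metadata(per_message: List[Optional[Dict[str, Any]]]) -> Dict[str, Any]:
    """Combine metadata from buffered messages; later messages override earlier ones."""
    combined: Dict[str, Any] = {}
    for metadata in per_message:
        combined.update(metadata or {})  # messages without metadata contribute nothing
    return combined


batch = [
    {"traceId": {"stringValue": "trace-a"}, "project": {"stringValue": "atlas"}},
    None,
    {"traceId": {"stringValue": "trace-b"}},
]
combined = merge_batch_metadata(batch)
```

Here `traceId` takes the value from the last message that set it, while `project` survives from the first message because nothing later overrides it.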

- Auto-normalize plain string metadata values to {"stringValue": ...}
  so users can write {"project": "atlas"} instead of the verbose form.
  Applied via pydantic validator on default_metadata and at runtime for
  metadata_provider return values.
- Move inline datetime imports to top of test file (nit from Hweinstock)
- Fix lint/format issues that caused CI Lint and Format check to fail
- Add tests for normalization in both config and session manager
Hweinstock previously approved these changes Mar 16, 2026

@Hweinstock Hweinstock left a comment

LGTM! probably worth getting a quick review from @jariy17 as well since he is more knowledgeable here.

jariy17 previously approved these changes Mar 16, 2026
Pydantic v2 handles Callable natively, so arbitrary_types_allowed
is not needed. Removing it avoids any risk of breaking subclasses
or downstream validators.
@tejaskash tejaskash merged commit cd2f2a0 into main Mar 16, 2026
23 checks passed
@jariy17 jariy17 deleted the worktree-pr1-metadata-support branch March 16, 2026 21:15