fix: guardrail redact targets last user message, not trailing LTM context#1884
Open
giulio-leone wants to merge 1 commit intostrands-agents:mainfrom
Open
fix: guardrail redact targets last user message, not trailing LTM context#1884giulio-leone wants to merge 1 commit intostrands-agents:mainfrom
giulio-leone wants to merge 1 commit intostrands-agents:mainfrom
Conversation
When long-term memory (LTM) session managers like AgentCoreMemorySessionManager append an assistant message containing user context after the user turn, the guardrail redaction logic incorrectly redacted the LTM context instead of the actual user input. Root cause: the redact handler used `self.messages[-1]` which assumes the last message is the user's input. With LTM enabled, the message list looks like: [0] user: 'Tell me something bad' ← should be redacted [1] assistant: '<user_context>...</user_context>' ← was being redacted The fix replaces `self.messages[-1]` with a reverse search for the last message with `role == 'user'`, matching the pattern already used by `_find_last_user_text_message_index()` in the Bedrock model for guardrail_latest_message wrapping. Closes strands-agents#1639
44b6bb3 to
ce2e12f
Compare
Contributor
Author
|
Friendly ping — fixes guardrail redaction to target the actual last user message instead of trailing long-term memory context, which was causing false positive redactions. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Issue
Closes #1639
Problem
When guardrail redaction is enabled (
guardrail_redact_input=True) together with a long-term memory (LTM) session manager likeAgentCoreMemorySessionManager, the redact logic incorrectly modifies the LTM context message instead of the user's input.The LTM session manager appends an assistant message after the user turn:
The redact handler used
self.messages[-1], which blindly picked the last message regardless of role.Root Cause
In
agent.py, the guardrail redaction code assumedself.messages[-1]is always the user's input:With LTM enabled,
messages[-1]is the assistant's context message, not the user's input.Solution
Replaced
self.messages[-1]with a reverse search for the last message withrole == 'user':This matches the pattern already used by
_find_last_user_text_message_index()in the Bedrock model forguardrail_latest_messagewrapping.Testing
test_agent_redacts_user_message_not_ltm_context: Simulates the LTM scenario with a trailing assistant context message, verifies the user message is redacted and the LTM context is preservedChanges
src/strands/agent/agent.py: Changed guardrail redact handler to find last user-role messagetests/strands/agent/test_agent.py: Added test for LTM + guardrail interaction