feat: single @guardrail decorator with validator strategy pattern [AL-288] #736

Open
apetraru-uipath wants to merge 1 commit into main from feat/guardrails_decorators_second

apetraru-uipath commented Mar 27, 2026

What changed?

Replace three type-specific guardrail decorators (@pii_detection_guardrail, @prompt_injection_guardrail, @deterministic_guardrail) with a single unified @guardrail(validator=..., action=..., name=..., stage=...) decorator using the strategy pattern.

New API:

@guardrail(validator=PromptInjectionValidator(threshold=0.5), action=BlockAction(), name="LLM Prompt Injection", stage=GuardrailExecutionStage.PRE)
@guardrail(validator=PIIValidator(entities=[PIIDetectionEntity(PIIDetectionEntityType.EMAIL, 0.5)]), action=LogAction(), name="LLM PII")
def create_llm():
    return UiPathChat(model="gpt-4o")

Key changes:

  • Add decorators/validators/ package with GuardrailValidatorBase, PIIValidator, PromptInjectionValidator, DeterministicValidator
  • Validators carry what to check (entities, threshold, rules); decorator carries how to respond (action, name, stage)
  • A single validator instance can be reused across multiple @guardrail decorators with different actions or stages
  • Scope/stage validation at decoration time via validate_scope() / validate_stage()
  • Remove old type-specific decorator files (pii_detection.py, prompt_injection.py, deterministic.py); the new decorator is a full-parity replacement with no backward-compatibility shims
  • Fix double POST guardrail application on sync StructuredTool: StructuredTool.ainvoke (no coroutine) delegates to self.invoke via run_in_executor; overriding ainvoke in _GuardedTool caused POST to fire twice (producing "words++++" instead of "words++")
  • Run ruff check --fix + ruff format
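
The split between validators (what to check) and the decorator (how to respond) can be sketched as below. This is a minimal, self-contained illustration and not the code in this PR: `Stage`, `KeywordValidator`, `InputOnlyValidator`, `block`, `redact`, and this `guardrail` function are hypothetical stand-ins. Only the `validate_stage` name and the `validator`/`action`/`name`/`stage` keywords are taken from the description above.

```python
import functools
from enum import Enum


class Stage(Enum):
    PRE = "pre"
    POST = "post"


class KeywordValidator:
    """Hypothetical validator: knows WHAT to check and which stages it supports."""

    supported_stages = (Stage.PRE, Stage.POST)

    def __init__(self, keyword):
        self.keyword = keyword

    def validate_stage(self, stage):
        if stage not in self.supported_stages:
            raise ValueError(f"{type(self).__name__} does not support stage {stage.name}")

    def violates(self, text):
        return self.keyword in text


class InputOnlyValidator(KeywordValidator):
    supported_stages = (Stage.PRE,)


def block(name, text):
    """Hypothetical action: stop processing."""
    raise RuntimeError(f"{name}: blocked")


def redact(name, text):
    """Hypothetical action: replace the offending text."""
    return "[redacted]"


def guardrail(*, validator, action, name, stage=Stage.PRE):
    # Stage support is checked once, when the decorator is applied.
    validator.validate_stage(stage)

    def decorate(func):
        @functools.wraps(func)
        def wrapper(text):
            if stage is Stage.PRE and validator.violates(text):
                text = action(name, text)
            result = func(text)
            if stage is Stage.POST and validator.violates(result):
                result = action(name, result)
            return result

        return wrapper

    return decorate


secret = KeywordValidator("secret")  # one validator instance, reused twice


@guardrail(validator=secret, action=block, name="Input secrets", stage=Stage.PRE)
@guardrail(validator=secret, action=redact, name="Output secrets", stage=Stage.POST)
def echo(text):
    return f"{text} (contains a secret)"
```

In this sketch the same `secret` instance backs two guardrails with different actions and stages, and applying `InputOnlyValidator` with `stage=Stage.POST` fails when the decorator is applied rather than when the function is called.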

Files:

  • src/uipath_langchain/guardrails/decorators/guardrail.py — new unified decorator
  • src/uipath_langchain/guardrails/decorators/validators/ — 4 new files
  • src/uipath_langchain/guardrails/decorators/{pii_detection,prompt_injection,deterministic}.py — deleted
  • src/uipath_langchain/guardrails/{__init__,decorators/__init__}.py — updated exports
  • samples/joke-agent-decorator/graph.py — rewritten to use new API

How has this been tested?

  • End-to-end run with uv run uipath run agent '{"topic": "money"}' — verified all 3 scopes (AGENT, LLM, TOOL) fire correctly, words++ filter applies exactly once
  • End-to-end run with uv run uipath run agent '{"topic": "joke about Andrei Petraru"}' — verified Agent PII (PERSON) blocks with AgentRuntimeError
  • uv run pytest: all tests pass except 2 pre-existing failures caused by missing optional botocore/google dependencies (unrelated to this change)
  • uv run ruff check src/uipath_langchain/guardrails/ — clean
  • uv run ruff format — 4 files reformatted

Are there any breaking changes?

  • Under Feature Flag
  • None
  • DB migrations required
  • API/interface removals or renames
  • Other

Old decorator names (pii_detection_guardrail, prompt_injection_guardrail, deterministic_guardrail) are removed. This branch replaces them entirely — callers must migrate to @guardrail(validator=...). No other public API is affected.

Ticket

AL-288

apetraru-uipath force-pushed the feat/guardrails_decorators_second branch 2 times, most recently from 2e22bb5 to 3747eaa (March 27, 2026 15:35)
Add stage parameter to pii_detection_guardrail and prompt_injection_guardrail
decorators, matching the flexibility already available in deterministic_guardrail.
pii_detection_guardrail defaults to PRE_AND_POST; prompt_injection_guardrail
defaults to PRE and raises pydantic.ValidationError for POST or PRE_AND_POST
(prompt injection is input-only).

Fix _apply_llm_input_guardrail and _apply_guardrail_to_message_list: previously
concatenated the full conversation history and tried to replace the joined text
inside a single message, which silently failed in any multi-turn conversation.
Now evaluates only the last HumanMessage (PRE/input) or last AIMessage
(POST/output) — semantically correct and replacement works reliably.
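
A small sketch of why the joined-history replacement failed and how targeting the last message of one type avoids it. The `Message` dataclass and the helper names are hypothetical stand-ins for the LangChain message classes and the private helpers named in this commit; they are not the actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str  # "human" or "ai" (stand-in for HumanMessage / AIMessage)
    text: str


def last_message_of(messages, role):
    """Return the most recent message with the given role, or None."""
    for message in reversed(messages):
        if message.role == role:
            return message
    return None


def replace_in_last(messages, role, needle, replacement):
    """Apply a text modification to the last message of one role only."""
    target = last_message_of(messages, role)
    if target is None or needle not in target.text:
        return False
    target.text = target.text.replace(needle, replacement)
    return True


history = [
    Message("human", "hi"),
    Message("ai", "hello"),
    Message("human", "my email is a@b.c"),
]

# Old approach: the joined history is never a substring of a single message,
# so replacing it inside one message silently does nothing.
joined = "\n".join(m.text for m in history)
old_approach_matches = any(joined in m.text for m in history)

# New approach: evaluate and modify only the last human message.
changed = replace_in_last(history, "human", "a@b.c", "<EMAIL>")
```

With more than one turn, `old_approach_matches` is false while the targeted replacement succeeds and leaves earlier messages untouched.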

Replace _extract_text_from_messages with focused helpers: _get_last_human_message,
_get_last_ai_message, _extract_message_text, _apply_message_text_modification.
Add target_type parameter to _apply_guardrail_to_message_list so input and output
graph-scope wrappers target the correct message type.

Add the PII decorator and the joke-agent-decorator sample; BlockAction uses AgentRuntimeError.

Fix _wrap_llm_with_guardrail: factory functions returning BaseChatModel were not
wrapped (fell through StateGraph/dict branches). Also fix Pydantic setattr block
on UiPathChat by using __class__ swap to a dynamic subclass instead of
monkey-patching invoke/ainvoke.
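
The `__class__` swap can be illustrated without Pydantic. In this sketch, `StrictModel` is a hypothetical stand-in that rejects undeclared attributes the way a Pydantic model does, and `wrap_with_guardrail` is an assumed helper name; the real `_wrap_llm_with_guardrail` may differ in detail.

```python
class StrictModel:
    """Stand-in for a Pydantic model: only declared fields may be assigned."""

    _fields = ("model",)

    def __init__(self, model):
        object.__setattr__(self, "model", model)

    def __setattr__(self, name, value):
        if name not in self._fields:
            raise ValueError(f"{type(self).__name__} has no field {name!r}")
        object.__setattr__(self, name, value)

    def invoke(self, prompt):
        return f"{self.model}: {prompt}"


def wrap_with_guardrail(instance, check):
    """Swap the instance's class for a dynamic subclass that guards invoke()."""
    base = type(instance)

    class Guarded(base):
        def invoke(self, prompt):
            check(prompt)  # PRE-stage hook
            return super().invoke(prompt)

    Guarded.__name__ = f"Guarded{base.__name__}"
    # Reassign the type instead of setting an instance attribute. In this
    # stand-in we go through object.__setattr__ to skip the field check.
    object.__setattr__(instance, "__class__", Guarded)
    return instance


llm = StrictModel("gpt-4o")

try:
    llm.invoke = lambda prompt: prompt  # monkey-patching is rejected
    monkey_patch_allowed = True
except ValueError:
    monkey_patch_allowed = False

seen = []
wrap_with_guardrail(llm, seen.append)
reply = llm.invoke("tell me a joke")
```

The instance keeps its identity and state, still passes `isinstance` checks against the original class, and gains the guarded `invoke` through normal method resolution.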

Fix BlockAction swallowed by bare except: split try/except so only guardrail API
errors are suppressed; action exceptions (AgentRuntimeError) now propagate.
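
The try/except split might look roughly like this. `BlockedError`, `apply_guardrail`, and the callables passed in are hypothetical; the point is only that the action runs outside the `try`, so an exception it raises is not swallowed.

```python
import logging

logger = logging.getLogger("guardrails.sketch")


class BlockedError(Exception):
    """Stand-in for AgentRuntimeError raised by a blocking action."""


def apply_guardrail(text, evaluate, action):
    try:
        violated = evaluate(text)  # may fail because of the guardrail API
    except Exception as exc:
        # Only evaluation failures are suppressed: log and let the text through.
        logger.warning("guardrail evaluation failed: %s", exc)
        return text

    if violated:
        # Outside the try block, so errors raised by the action propagate.
        return action(text)
    return text


def failing_evaluate(text):
    raise ConnectionError("guardrail API unreachable")


def block(text):
    raise BlockedError("blocked by guardrail")
```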

Fix CompiledStateGraph not recognised: add _wrap_compiled_graph_with_guardrail
and handle CompiledStateGraph return type from factory functions.

Fix mypy errors in decorators.py: typed list[BaseMessage], CompiledStateGraph
type params, type: ignore[valid-type, misc] for dynamic subclass, and
type: ignore[method-assign] for CompiledStateGraph method patching.

Add prompt_injection_guardrail decorator: _create_prompt_injection_guardrail,
_apply_prompt_injection_guardrail, public prompt_injection_guardrail function;
exported from guardrails/__init__.py; stacked on create_llm() in
joke-agent-decorator/graph.py with BlockAction to block on detection.

Reformat decorators.py for consistency.

Middleware cleanup: delete monolithic middleware.py (duplicate of middlewares/
split files); update guardrails/__init__.py to import from .middlewares; update
joke-agent/graph.py to use new split-file API (tool_names -> tools, optional
scopes on PromptInjection, unconditional POST filter comment).

Revert renames: restore LoggingSeverityLevel as proper int Enum (ERROR, INFO,
WARNING, DEBUG) in actions.py; remove PromptInjectionValidatorType from enums.py;
fix pii_detection.py docstring to use Entity+PIIDetectionEntity; export
LoggingSeverityLevel from guardrails/__init__.py; update both samples to use
LoggingSeverityLevel instead of AgentGuardrailSeverityLevel.

Manual refinements: updates to actions.py, decorators.py, enums.py, models.py,
middlewares/pii_detection.py, guardrails/__init__.py, and joke-agent/graph.py.

Remove joke-agent/.agent/REQUIRED_STRUCTURE.md; further manual edits to
joke-agent/graph.py.

Refactor decorators.py into decorators/ package (pii.py, prompt_injection.py,
deterministic.py, _base.py) with tool-level guardrail support:
- Split monolithic decorators.py into decorators/ subpackage
- Add _wrap_tool_with_guardrail using __class__ swap (Pydantic-safe)
- Add deterministic_guardrail decorator (TOOL scope, local rules, no API call)
- Extend pii_guardrail to support BaseTool and optional tools= kwarg
- Extend _detect_scope to return GuardrailScope.TOOL for BaseTool instances
- Export deterministic_guardrail and RuleFunction from guardrails/__init__.py
- Update joke-agent-decorator/graph.py to demonstrate all three decorator types
  on analyze_joke_syntax tool (3x @deterministic_guardrail + @pii_guardrail)
- Add local CustomFilterAction to joke-agent-decorator sample

Tool guardrail fixes and sample updates:
- _base.py: unwrap LangGraph tool-call envelope (args) for rule evaluation;
  rewrap modified args so super().invoke() receives valid input; handle
  ToolMessage/Command in _extract_output for POST-stage deterministic rules.
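
A rough sketch of the unwrap/rewrap step. It assumes the envelope is a LangChain-style tool-call dict with `name`, `args`, `id`, and `type` keys; the function names are hypothetical and the detection logic in `_base.py` may be different.

```python
def unwrap_args(tool_input):
    """Return (args, was_wrapped) so rules see the real arguments."""
    if (
        isinstance(tool_input, dict)
        and tool_input.get("type") == "tool_call"
        and isinstance(tool_input.get("args"), dict)
    ):
        return tool_input["args"], True
    return tool_input, False


def rewrap_args(original, new_args, was_wrapped):
    """Put modified args back into the envelope the tool expects."""
    if was_wrapped:
        return {**original, "args": new_args}
    return new_args


call = {
    "type": "tool_call",
    "id": "call_1",
    "name": "analyze_joke_syntax",
    "args": {"joke": "donkey walks into a bar"},
}

args, wrapped = unwrap_args(call)
modified = {**args, "joke": args["joke"].replace("donkey", "[censored]")}
forwarded = rewrap_args(call, modified, wrapped)
```

Rewrapping keeps `id` and `name` intact, so the downstream `invoke` still receives a well-formed tool call.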
- joke-agent-decorator: Agent PII uses LogAction(WARNING) with custom message;
  README aligned with current guardrails and verification scenarios.

Deps and lockfiles: pyproject.toml updates (root and joke-agent-decorator);
remove samples/joke-agent-decorator/uv.lock and samples/joke-agent/uv.lock;
uv.lock at repo root updated.

Refactor _base.py: remove unnecessary casts in _evaluate_rules; catch only
ValueError (JSONDecodeError is a subclass of it); extract _apply_guardrail_to_message_list,
_apply_guardrail_to_input_messages, _apply_guardrail_to_output_messages and use
them in _wrap_stategraph_with_guardrail, _wrap_compiled_graph_with_guardrail,
and _wrap_function_with_guardrail to reduce cognitive complexity.
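
The exception simplification relies on `json.JSONDecodeError` being a subclass of `ValueError` in the standard library. A short illustration, with `parse_json_or_none` as a hypothetical helper name:

```python
import json


def parse_json_or_none(raw):
    """Return the decoded value, or None when the text is not valid JSON."""
    try:
        return json.loads(raw)
    except ValueError:  # also covers json.JSONDecodeError
        return None


is_subclass = issubclass(json.JSONDecodeError, ValueError)
good = parse_json_or_none('{"topic": "money"}')
bad = parse_json_or_none("{not json")
```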

Made-with: Cursor

Enable enabled_for_evals override across decorators and middlewares for PII,
prompt injection, and deterministic guardrails (default true, user-overridable),
plus docs/sample updates for the new parameter.

Fix mypy errors in pii_detection.py and prompt_injection.py: rebind guardrail
to a typed non-optional local variable so mypy can narrow the type inside
nested class closures.
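
The rebinding pattern looks roughly like this. `Guardrail` and `make_describer` are hypothetical names; the idea is that a type checker such as mypy does not always carry the narrowing of an Optional variable into a nested class or function, while a separately annotated local is non-optional everywhere.

```python
from typing import Optional


class Guardrail:
    def __init__(self, name: str) -> None:
        self.name = name


def make_describer(guardrail: Optional[Guardrail]):
    if guardrail is None:
        raise ValueError("guardrail is required")

    # Rebind to a local with a non-optional annotation so that code in the
    # nested class below is checked against Guardrail, not Optional[Guardrail].
    bound: Guardrail = guardrail

    class Describer:
        def describe(self) -> str:
            return bound.name

    return Describer()
```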

Made-with: Cursor

Fix _wrap_compiled_graph_with_guardrail: output guardrail (POST stage) was never
applied — invoke/ainvoke discarded the graph output without running
_apply_guardrail_to_output_messages. Now captures the output and evaluates it,
matching the behaviour of _wrap_stategraph_with_guardrail.
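
In outline, the fix is to pass the graph's return value through the POST hook instead of dropping it. `FakeGraph`, `wrap_graph`, and the hook functions below are hypothetical stand-ins for the compiled graph and the private helpers named above.

```python
class FakeGraph:
    """Stand-in for a compiled graph with an invoke() method."""

    def invoke(self, state):
        return {"messages": state["messages"] + ["ai: the password is hunter2"]}


def wrap_graph(graph, pre, post):
    original_invoke = graph.invoke

    def guarded_invoke(state):
        state = pre(state)
        output = original_invoke(state)
        # Capture the output and evaluate it at the POST stage.
        return post(output)

    graph.invoke = guarded_invoke
    return graph


def pre_hook(state):
    return state


def post_hook(output):
    cleaned = [m.replace("hunter2", "<REDACTED>") for m in output["messages"]]
    return {**output, "messages": cleaned}


graph = wrap_graph(FakeGraph(), pre_hook, post_hook)
result = graph.invoke({"messages": ["human: hi"]})
```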

Update pyproject.toml and uv.lock.

Fix double POST guardrail application on StructuredTool: remove ainvoke override
from _GuardedTool in deterministic.py and pii_detection.py. StructuredTool.ainvoke
delegates to self.invoke via run_in_executor, so the guardrail chain in invoke
already runs once; the ainvoke override caused a second POST application, producing
"words++++" instead of "words++".

Fix LogAction double-logging: replace print()+logger.log() with a single
logger.log() call using the guardrail name as context prefix.

Fix _evaluate_rules violation message: include guardrail name instead of
positional index so errors read "Rule <name> detected violation" rather than
"Rule 1 detected violation". Pass guardrail_name through _apply_pre/_apply_post
call sites in deterministic.py.

Fix AgentRuntimeError swallowed in guardrail middleware/decorator except handlers:
add explicit except AgentRuntimeError: raise before the generic except Exception
in pii_detection decorator (_apply_pre, _apply_post), pii_detection middleware
(_wrap_tool_call_func, _check_messages), and prompt_injection middleware
(_check_messages) so BlockAction errors propagate instead of being silently logged.

Refactor to single @guardrail decorator with validator strategy pattern:
- Replace pii_detection_guardrail, prompt_injection_guardrail, deterministic_guardrail
  with a unified @guardrail(validator=..., action=..., name=..., stage=...) decorator
- Add GuardrailValidatorBase ABC with PIIValidator, PromptInjectionValidator,
  DeterministicValidator implementations under decorators/validators/
- Validators encode WHAT to check; decorator encodes HOW to respond (action/name/stage)
- Scope and stage validation at decoration time via validate_scope()/validate_stage()
- Factory functions (create_llm, create_joke_agent) handled by deferring scope
  detection to call time; TOOL-only validators (DeterministicValidator) raise
  ValueError when applied to factory functions
- Remove old type-specific decorator files (pii_detection.py, prompt_injection.py,
  deterministic.py); update decorators/__init__.py and guardrails/__init__.py
- Rewrite joke-agent-decorator/graph.py to use new API with reusable validator instances
- Run ruff check + ruff format; fix B024 by making GuardrailValidatorBase a plain
  class instead of an ABC, since neither method is abstract (both are optional override points)
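
Deferring scope detection for factory functions could be sketched as follows. Everything here is a simplified assumption: `Tool` and `ChatModel` stand in for the real base classes, scopes are plain strings, and the decorator only tags the object with its scope instead of wrapping it.

```python
import functools


class Tool:
    """Stand-in for a tool instance."""


class ChatModel:
    """Stand-in for a chat model instance."""


def detect_scope(obj):
    if isinstance(obj, Tool):
        return "TOOL"
    if isinstance(obj, ChatModel):
        return "LLM"
    raise ValueError(f"cannot guard {type(obj).__name__}")


def guardrail(*, supported_scopes):
    def check(scope):
        if scope not in supported_scopes:
            raise ValueError(f"scope {scope} is not supported by this validator")

    def decorate(target):
        if isinstance(target, (Tool, ChatModel)):
            # Concrete object: the scope is known at decoration time.
            scope = detect_scope(target)
            check(scope)
            target.guarded_scope = scope
            return target

        # Plain function: treat it as a factory whose product is unknown for now.
        if supported_scopes == {"TOOL"}:
            raise ValueError("TOOL-only validators cannot decorate factory functions")

        @functools.wraps(target)
        def factory(*args, **kwargs):
            obj = target(*args, **kwargs)
            scope = detect_scope(obj)  # deferred until the object exists
            check(scope)
            obj.guarded_scope = scope
            return obj

        return factory

    return decorate


@guardrail(supported_scopes={"LLM", "TOOL"})
def create_llm():
    return ChatModel()
```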

Fix double POST on StructuredTool in new @guardrail decorator: remove ainvoke
override from _GuardedTool. StructuredTool.ainvoke (sync tools) delegates to
self.invoke via run_in_executor; having ainvoke overridden caused POST to fire
twice (once in invoke, once after super().ainvoke() returned).
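
The double application can be reproduced with plain asyncio. `SyncTool` is a hypothetical stand-in whose default `ainvoke` runs `self.invoke` in an executor, mirroring the delegation described above; it is not the LangChain class itself.

```python
import asyncio


class SyncTool:
    """Stand-in for a sync tool: ainvoke delegates to self.invoke in an executor."""

    def invoke(self, text):
        return text

    async def ainvoke(self, text):
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self.invoke, text)


def post_filter(text):
    return text + "++"


class DoubleGuardedTool(SyncTool):
    def invoke(self, text):
        return post_filter(super().invoke(text))

    async def ainvoke(self, text):
        # super().ainvoke() calls self.invoke, which is already guarded,
        # so applying the filter here runs POST a second time.
        return post_filter(await super().ainvoke(text))


class GuardedTool(SyncTool):
    # Only invoke is overridden; the inherited ainvoke reaches it once.
    def invoke(self, text):
        return post_filter(super().invoke(text))


double = asyncio.run(DoubleGuardedTool().ainvoke("words"))
single = asyncio.run(GuardedTool().ainvoke("words"))
```

Here `double` ends up as `"words++++"` and `single` as `"words++"`, which matches the symptom and the fix described in the commit message.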

Rename api_guardrail -> built_in_guardrail throughout decorator layer:
- Parameter and variable api_guardrail renamed to built_in_guardrail in guardrail.py
- Method build_api_guardrail renamed to build_built_in_guardrail in validators/_base.py,
  validators/pii.py, validators/prompt_injection.py
- Local variable guardrail = metadata.guardrail renamed to built_in_guardrail in
  _base.py (_wrap_stategraph_with_guardrail, _wrap_compiled_graph_with_guardrail,
  _wrap_function_with_guardrail)

Fix mypy errors in new @guardrail decorator layer (14 errors across 3 files):
- validators/pii.py: remove redundant supported_scopes/stages declarations that
  conflicted with ClassVar on the base class; base already provides empty lists
- validators/deterministic.py: add ClassVar import; fix supported_stages annotation
  from bare list to ClassVar[list[GuardrailExecutionStage]]
- decorators/guardrail.py: add BaseMessage import; fix messages: list ->
  list[BaseMessage]; add [Any, Any] / [Any, Any, Any] type params to StateGraph
  and CompiledStateGraph function signatures
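
The `ClassVar` annotations can be illustrated like this. The class names are hypothetical, and treating an empty list as "no restriction" is an assumption made for the sketch; the commit only states that the base class provides empty lists.

```python
from enum import Enum
from typing import ClassVar


class Stage(Enum):
    PRE = "pre"
    POST = "post"


class ValidatorBase:
    # Declared as ClassVar so subclasses override a class attribute rather
    # than an instance field.
    supported_stages: ClassVar[list[Stage]] = []


class InheritingValidator(ValidatorBase):
    # No redeclaration: the base class default applies.
    pass


class ToolRuleValidator(ValidatorBase):
    # An override should repeat the ClassVar annotation instead of a bare list.
    supported_stages: ClassVar[list[Stage]] = [Stage.PRE, Stage.POST]
```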