Category: Engineering Date: February 25, 2026 Read Time: 12 min read
AI agents are powerful, but how do you ensure they don't go rogue? Today we're releasing Predicate Secure - a drop-in security wrapper that adds enterprise-grade authorization and verification to browser automation agents. Think of it as a safety harness for your AI agents.
📦 Open Source: The complete demo is available on GitHub at PredicateSystems/predicate-secure (see the demo/ folder). Get started in 5 minutes with local LLM verification.
Predicate Secure integrates with your existing AI agent frameworks, including browser-use, LangChain, PydanticAI, raw Playwright, and OpenClaw, in just 3-5 lines of code. This frictionless adoption means you can add robust security without rewriting your agents.
This post walks through our comprehensive demo that showcases the complete agent security loop: pre-execution authorization, browser automation, and post-execution verification using local LLMs.
When AI agents interact with browsers and web services, they need guardrails. A misconfigured prompt or unexpected model behavior could lead to:
- Navigating to unauthorized domains
- Clicking sensitive buttons or forms
- Exposing credentials or API keys
- Performing unauthorized actions (e.g., deleting all emails)
- Executing actions outside policy boundaries
Traditional approaches rely on prompt engineering or hope for the best. Predicate Secure takes a different approach: enforce policy before execution, verify outcomes after.
Predicate Secure implements a complete three-phase agent loop that combines:
- Pre-execution authorization - Deterministic policy-based decisions
- Action execution - Controlled browser automation
- Post-execution verification - Deterministic assertion checking
This is not a probabilistic safety approach. Every action is governed by explicit policy rules (deterministic authorization) and validated against concrete predicates (deterministic verification). The LLM's role is constrained to generating verification predicates based on observed state changes - the actual verification execution is deterministic.
Phase 1: Pre-Execution Authorization
- Policy-based decision: Is this action allowed?
- Deterministic rule evaluation
Phase 2: Action Execution
- Browser automation with snapshot capture
- Controlled execution environment
Phase 3: Post-Execution Verification
- LLM-generated assertions validate outcomes
- Deterministic predicate evaluation
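The three phases can be sketched in plain Python. This is a minimal illustration of the control flow only, not the Predicate Secure implementation: `StepResult`, `run_secure_step`, and the toy `authorize`/`execute` callables are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class StepResult:
    authorized: bool
    executed: bool
    verified: bool


def run_secure_step(
    action: str,
    resource: str,
    authorize: Callable[[str, str], bool],
    execute: Callable[[], Dict[str, str]],
    checks: List[Callable[[Dict[str, str]], bool]],
) -> StepResult:
    """Run one action through authorize -> execute -> verify."""
    # Phase 1: deterministic policy decision; a denied action never runs.
    if not authorize(action, resource):
        return StepResult(authorized=False, executed=False, verified=False)

    # Phase 2: perform the action and capture the resulting state.
    state = execute()

    # Phase 3: every predicate is a plain true/false check on that state.
    verified = all(check(state) for check in checks)
    return StepResult(authorized=True, executed=True, verified=verified)


# Toy stand-ins for the policy engine and the browser (hypothetical).
allowed = {("browser.navigate", "https://www.example.com")}
result = run_secure_step(
    action="browser.navigate",
    resource="https://www.example.com",
    authorize=lambda a, r: (a, r) in allowed,
    execute=lambda: {"url": "https://www.example.com/"},
    checks=[lambda s: "example.com" in s["url"]],
)
print(result)
```

The point of the shape is that the execute callable is never reached when authorization fails, and that the verification phase contains no model call, only boolean checks.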
The demo showcases a complete end-to-end implementation with:
- No external service dependencies - runs fully offline after the initial model download
- Free - Local LLM verification
1. Predicate Runtime SDK (predicate-runtime==1.1.2)
- Browser automation via AsyncPredicateBrowser
- Semantic element detection with the find() DSL
- Visual overlay for element highlighting
- Automatic Chrome extension injection
2. Predicate Authority (predicate-authority>=0.1.0)
- YAML-based policy enforcement
- Fail-closed authorization (deny by default)
- Optional Rust-based sidecar for production
- Flexible identity: Local IdP, Okta, Entra ID (Azure AD), OIDC
3. Local LLM Verification (Qwen 2.5 7B Instruct)
- Generates verification predicates from page state changes
- Runs completely offline on Apple Silicon (MPS)
- ~14GB model, 5-second cold start after initial download
4. Cloud Tracing (Optional)
- Upload authorization and verification events to Predicate Studio
- Visualize execution timeline in web UI
- Track decisions across agent runs
Predicate Secure wraps your existing agent code in 3-5 lines - no rewrites needed:
| Framework | Adapter | Integration Effort |
|---|---|---|
| browser-use | BrowserUseAdapter | 3 lines |
| LangChain | SentienceLangChainCore | 4 lines |
| PydanticAI | predicate.integrations.pydanticai | 3 lines |
| Raw Playwright | AgentRuntime.from_playwright_page() | 5 lines |
| OpenClaw | OpenClawAdapter | 3 lines |
Success: All adapters are production-ready and maintained in the predicate-runtime SDK. Drop-in security for any agent framework.
The demo executes a simple but complete browser task:
✓ Navigate to https://www.example.com with policy check
✓ Take snapshot with visual element overlay
✓ Find and click "Learn more" link using semantic query
✓ Verify URL contains "example-domains" after navigation
✓ Upload trace to Predicate Studio (if API key provided)
Each action goes through the full authorization + verification loop.
Instead of brittle CSS selectors, we use semantic queries:
from predicate import find
# Find link by semantic properties, not CSS
element = find(snapshot, "role=link text~'Learn more'")
if element:
    print(f"Found: {element.text} (ID: {element.id})")
    print(f"Clickable: {element.visual_cues.is_clickable}")
    await click_element(element)

The find() function understands:
- ARIA roles (role=link, role=button)
- Text content matching (text~'substring')
- Visual cues (clickability, visibility)
- Element importance ranking
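To make the query semantics concrete, here is a toy matcher for the two term types shown above (`key=value` for exact match, `key~'value'` for substring match). It is a sketch for illustration, not the SDK's `find()`: the `Element` class, `toy_find`, the case-insensitive substring rule, and the use of a numeric `importance` field as a tie-breaker are all assumptions.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    id: int
    role: str
    text: str
    importance: int = 0


# Matches key=value (exact) or key~'value' (substring) terms.
_TERM = re.compile(r"(\w+)(=|~)(?:'([^']*)'|(\S+))")


def toy_find(elements: List[Element], query: str) -> Optional[Element]:
    """Return the highest-importance element matching every term, or None."""
    terms = [
        (m.group(1), m.group(2), m.group(3) if m.group(3) is not None else m.group(4))
        for m in _TERM.finditer(query)
    ]
    matches = []
    for el in elements:
        ok = True
        for key, op, value in terms:
            actual = str(getattr(el, key, ""))
            if op == "=" and actual != value:
                ok = False
            elif op == "~" and value.lower() not in actual.lower():
                ok = False
        if ok:
            matches.append(el)
    # Prefer the most important match; None when nothing matched.
    return max(matches, key=lambda e: e.importance, default=None)


elements = [
    Element(id=1, role="button", text="Learn more", importance=5),
    Element(id=6, role="link", text="Learn more...", importance=3),
    Element(id=7, role="link", text="Other", importance=9),
]
found = toy_find(elements, "role=link text~'Learn more'")
print(found)
```

Because the query describes what the element is rather than where it sits in the DOM, it keeps working when class names or nesting change, which is the resilience argument made against CSS selectors.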
Authorization rules are declarative YAML:
# Allow navigation to safe domains
- name: allow-navigation-safe-domains
  effect: ALLOW
  principals:
    - "agent:demo-browser"
  actions:
    - "browser.navigate"
  resources:
    - "https://www.example.com*"
    - "https://www.google.com*"
  conditions:
    required_labels:
      - "browser_initialized"

# Allow clicks on safe element types
- name: allow-browser-click-safe-elements
  effect: ALLOW
  principals:
    - "agent:demo-browser"
  actions:
    - "browser.click"
  resources:
    - "element:role=link[*"
    - "element:role=button[*"
    - "element#*"  # By snapshot ID
  conditions:
    required_labels:
      - "element_visible"
      - "snapshot_captured"

# Default deny (fail-closed)
- name: default-deny
  effect: DENY
  principals:
    - "*"
  actions:
    - "*"
  resources:
    - "*"

Note: The policy is fail-closed: any action not explicitly allowed is denied. This prevents agents from taking unexpected actions.
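A fail-closed evaluator for rules of this shape can be sketched in a few lines. This is an illustration under stated assumptions, not predicate-authority's engine: the `Rule` class and `decide` function are hypothetical, a trailing `*` is treated as a prefix wildcard, and the first matching rule wins.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Set


@dataclass
class Rule:
    name: str
    effect: str  # "ALLOW" or "DENY"
    principals: List[str]
    actions: List[str]
    resources: List[str]
    required_labels: FrozenSet[str] = field(default_factory=frozenset)


def _matches(pattern: str, value: str) -> bool:
    """Exact match, or prefix match when the pattern ends with '*'."""
    if pattern.endswith("*"):
        return value.startswith(pattern[:-1])
    return pattern == value


def decide(rules: List[Rule], principal: str, action: str,
           resource: str, labels: Set[str]) -> str:
    """Return the effect of the first matching rule; deny if none match."""
    for rule in rules:
        if (
            any(_matches(p, principal) for p in rule.principals)
            and any(_matches(a, action) for a in rule.actions)
            and any(_matches(r, resource) for r in rule.resources)
            and rule.required_labels <= labels
        ):
            return rule.effect
    # Fail-closed: no rule matched, so the action is denied.
    return "DENY"


RULES = [
    Rule(
        name="allow-navigation-safe-domains",
        effect="ALLOW",
        principals=["agent:demo-browser"],
        actions=["browser.navigate"],
        resources=["https://www.example.com*", "https://www.google.com*"],
        required_labels=frozenset({"browser_initialized"}),
    ),
    Rule("default-deny", "DENY", ["*"], ["*"], ["*"]),
]

decision = decide(RULES, "agent:demo-browser", "browser.navigate",
                  "https://www.example.com/", {"browser_initialized"})
print(decision)
```

One design point worth checking in any real policy: under the prefix interpretation assumed here, a pattern such as `https://www.example.com*` would also match `https://www.example.com.evil.test/`. Ending the pattern with a slash before the wildcard (`https://www.example.com/*`) avoids that class of mistake.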
After each action, the local LLM analyzes the state changes and generates deterministic verification predicates (assertions to check):
Important: The LLM is NOT doing visual verification. Instead, it generates structured assertions (like url_contains, element_exists) based on observed state changes. The actual verification execution is deterministic - predicates are evaluated as true/false checks.
# Capture pre and post snapshots
pre_snapshot = await get_page_summary()
result = await execute_action()
post_snapshot = await get_page_summary()
# LLM generates verification plan (what to check, not the check itself)
verification_plan = verifier.generate_verification_plan(
    action="click",
    action_target="element#6",
    pre_snapshot_summary=pre_snapshot,
    post_snapshot_summary=post_snapshot,
    context={"task": "Find and click Learn more link"},
)

# Execute generated predicates deterministically
for verification in verification_plan.verifications:
    passed = execute_predicate(
        verification.predicate,  # e.g., "url_contains"
        verification.args,       # e.g., ["example-domains"]
    )
    if not passed:
        raise AssertionError("Post-execution verification failed")

The LLM sees both snapshots and generates a structured verification plan:
{
  "verifications": [
    {
      "predicate": "url_contains",
      "args": ["example-domains"]
    },
    {
      "predicate": "snapshot_changed",
      "args": []
    }
  ],
  "reasoning": "Verify navigation by checking URL change and snapshot difference."
}

For Production Workflows:
For well-understood web flows (like QA testing flows or regular business processes), you can skip LLM generation and use human-defined predicates directly:
# Predefined verification for known workflows
verification_plan = VerificationPlan(
    action="click",
    verifications=[
        VerificationSpec(predicate="url_contains", args=["example-domains"]),
        VerificationSpec(predicate="element_exists", args=["h1"]),
        VerificationSpec(predicate="snapshot_changed", args=[]),
    ],
    reasoning="Predefined checks for 'Learn more' click flow",
)

# Execute the same way - deterministic evaluation
all_passed = execute_verifications(verification_plan)

This approach is faster (no LLM inference), more predictable (explicit assertions), and ideal for regression testing of known workflows. Use LLM-generated predicates for exploratory tasks or novel scenarios.
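Whether the plan comes from the LLM or from a human, the execution side can be a fixed registry of predicate functions. The sketch below shows one way to evaluate a JSON plan of the shape shown earlier. The registry, the `run_plan` helper, the `State` dictionary layout, and the example URLs are assumptions made for illustration and are not the SDK's `execute_predicate`.

```python
import json
from typing import Any, Callable, Dict, List

# State captured before/after an action (hypothetical shape for this sketch).
State = Dict[str, Any]

# Fixed allowlist of predicates: the model can only pick names from here.
PREDICATES: Dict[str, Callable[[State, State, List[str]], bool]] = {
    "url_contains": lambda pre, post, args: args[0] in post["url"],
    "url_changed": lambda pre, post, args: pre["url"] != post["url"],
    "snapshot_changed": lambda pre, post, args: pre["elements"] != post["elements"],
    "element_exists": lambda pre, post, args: args[0] in post["elements"],
}


def run_plan(plan_json: str, pre: State, post: State) -> bool:
    """Evaluate every verification in the plan; True only if all pass."""
    plan = json.loads(plan_json)
    for item in plan["verifications"]:
        name = item["predicate"]
        if name not in PREDICATES:
            # Unknown predicate from the model: fail loudly rather than skip.
            raise ValueError(f"unknown predicate: {name}")
        if not PREDICATES[name](pre, post, item.get("args", [])):
            return False
    return True


# Illustrative states; the URLs are made up for this example.
pre_state = {"url": "https://www.example.com/", "elements": ["h1", "a"]}
post_state = {"url": "https://docs.example.test/example-domains", "elements": ["h1", "nav"]}
plan_text = json.dumps({
    "verifications": [
        {"predicate": "url_contains", "args": ["example-domains"]},
        {"predicate": "snapshot_changed", "args": []},
    ],
    "reasoning": "Verify navigation by checking URL change and snapshot difference.",
})
print(run_plan(plan_text, pre_state, post_state))
```

Rejecting unknown predicate names keeps the model's influence limited to choosing among checks that already exist, which is what makes the verification step deterministic.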
Enable visual debugging with snapshot overlays:
from predicate.snapshot import snapshot_async
from predicate.models import SnapshotOptions
snap = await snapshot_async(
    browser,
    SnapshotOptions(
        show_overlay=True,  # Highlights detected elements in browser
        screenshot=False,
    ),
)
print(f"Captured {len(snap.elements)} elements")
# Watch the browser - you'll see colored boxes around detected elements!

This is invaluable for debugging why an agent can't find an element.
Here's what the demo produces when run:
╭──────────────── Demo Configuration ─────────────────╮
│ Predicate Secure Browser Automation Demo │
│ Task: Navigate to example.com and verify page loads │
│ Start URL: https://www.example.com │
│ Principal: agent:demo-browser │
╰─────────────────────────────────────────────────────╯
Initializing Local LLM Verifier...
⠋ Loading Qwen 2.5 7B model...
✓ Verifier initialized
Initializing Cloud Tracer...
☁️ Cloud tracing enabled (Pro tier)
✓ Cloud tracer initialized
Run ID: 777c0308-82c8-454d-98df-5a603d12d418
View trace: https://studio.predicatesystems.dev/runs/...
Step 1: Initializing Browser...
✓ Browser started
Step 2: Executing Browser Task...
→ Action: navigate (https://www.example.com)
Pre-execution: Checking authorization...
✓ Action authorized
Executing action...
✓ Action executed
Post-execution: Generating verification plan...
i Generated 1 verifications
Reasoning: Fallback: verify URL changed after navigation
Executing verifications...
[1] url_changed()
✓ Passed
✓ All verifications passed
→ Action: snapshot (current_page)
Pre-execution: Checking authorization...
✓ Action authorized
Executing action...
Snapshot captured: 2 elements
(Watch the browser - elements are highlighted!)
✓ Action executed
Post-execution: Generating verification plan...
i Generated 1 verifications
Reasoning: Verify page load by checking URL contains domain.
Executing verifications...
[1] url_contains(example.com)
✓ Passed
✓ All verifications passed
→ Finding link with text: 'Learn more'
✓ Found element: Learn more (ID: 6)
Role: link, Clickable: True
→ Action: click (element#6)
Pre-execution: Checking authorization...
✓ Action authorized
Executing action...
Clicked at coordinates: (256.0, 198.078125)
✓ Action executed
Post-execution: Generating verification plan...
i Generated 2 verifications
Reasoning: Verify navigation and page load.
Executing verifications...
[1] url_contains(example.com)
✓ Passed
[2] snapshot_changed()
✓ Passed
✓ All verifications passed
✓ Task completed successfully
Cleaning up...
✓ Browser closed
Uploading trace to Predicate Studio...
✅ Trace uploaded successfully
View in Studio: https://studio.predicatesystems.dev/runs/...
✓ Python 3.11+ (Python 3.11.9 recommended)
✓ 16GB+ RAM (for 7B model) or 8GB+ (for 3B model)
✓ Apple Silicon Mac (MPS support) or CUDA GPU
✓ 10GB disk space for model files
# Change into your local clone of the repository
cd /path/to/Sentience/predicate-secure/py-predicate-secure
# Install SDK
pip install -e .
# Install demo dependencies
cd demo
pip install -r requirements.txt
# Install Playwright browsers
playwright install chromium

Create a .env file in the demo directory:
# Browser display (false = show browser)
BROWSER_HEADLESS=false
# LLM model for verification
LLM_MODEL_NAME=Qwen/Qwen2.5-7B-Instruct
LLM_DEVICE=auto # Automatically detects MPS/CUDA/CPU
LLM_MAX_TOKENS=512
LLM_TEMPERATURE=0.0
# Optional: Predicate API key for cloud tracing
# PREDICATE_API_KEY=your-api-key-here
# Demo configuration
DEMO_START_URL=https://www.example.com
DEMO_TASK_DESCRIPTION=Navigate to example.com and verify page loads
DEMO_PRINCIPAL_ID=agent:demo-browser

Success: The demo works completely offline (after initial model download). No API key required!
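For readers wiring these settings into their own script, a typed loader makes the defaults and optional values explicit. This is a sketch, not the demo's actual loader: `DemoConfig` and `_as_bool` are hypothetical, and the defaults are simply the values from the .env example above.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class DemoConfig:
    headless: bool
    model_name: str
    device: str
    max_tokens: int
    temperature: float
    api_key: Optional[str]

    @classmethod
    def from_env(cls, env: Mapping[str, str]) -> "DemoConfig":
        """Build a config from an environment-like mapping."""
        return cls(
            headless=_as_bool(env.get("BROWSER_HEADLESS", "false")),
            model_name=env.get("LLM_MODEL_NAME", "Qwen/Qwen2.5-7B-Instruct"),
            device=env.get("LLM_DEVICE", "auto"),
            max_tokens=int(env.get("LLM_MAX_TOKENS", "512")),
            temperature=float(env.get("LLM_TEMPERATURE", "0.0")),
            # An unset or empty key means cloud tracing stays disabled.
            api_key=env.get("PREDICATE_API_KEY") or None,
        )


config = DemoConfig.from_env({"BROWSER_HEADLESS": "false", "LLM_MAX_TOKENS": "256"})
print(config)
```

In a real script the mapping would typically be `os.environ` after the .env file has been loaded. Treating a missing `PREDICATE_API_KEY` as `None` mirrors the offline-by-default behavior described above.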
# Simple mode with in-process authorization
python secure_browser_demo.py
# First run: Model downloads automatically (~14GB, 2-5 minutes)
# Subsequent runs: Fast startup (~5 seconds)

Based on real demo runs on Apple Silicon (M-series):
| Metric | Value | Notes |
|---|---|---|
| Model Load Time | ~5 seconds | After initial download |
| LLM Inference Time | ~3-5 seconds | Per verification plan generation |
| Snapshot Capture | ~1 second | With API or local extension |
| Authorization Check | <1ms | In-process policy evaluation |
| Total Action Loop | ~5-10 seconds | Including verification |
| Memory Usage | ~8GB | 7B model on MPS |
For production, use the Rust-based predicate-authorityd sidecar. The sidecar is optional but recommended for enterprise deployments.
# Start sidecar with local IdP mode
export LOCAL_IDP_SIGNING_KEY="your-production-secret-key"
predicate-authorityd run \
--host 127.0.0.1 \
--port 8787 \
--mode local_only \
--policy-file policies/browser_automation.yaml \
--identity-mode local-idp \
--local-idp-issuer "http://localhost/predicate-local-idp" \
--local-idp-audience "api://predicate-authority"
# Verify sidecar is running
curl http://127.0.0.1:8787/health

The sidecar integrates with your existing identity provider:
Okta:
predicate-authorityd run \
--identity-mode oidc \
--oidc-issuer https://your-domain.okta.com \
--oidc-client-id <client-id> \
--oidc-client-secret <secret> \
--policy-file policies/browser_automation.yaml

Entra ID (Azure AD):
predicate-authorityd run \
--identity-mode entra \
--entra-tenant-id <tenant-id> \
--entra-client-id <client-id> \
--entra-client-secret <secret> \
--policy-file policies/browser_automation.yaml

Generic OIDC:
predicate-authorityd run \
--identity-mode oidc \
--oidc-issuer https://your-idp.com \
--oidc-client-id <client-id> \
--oidc-client-secret <secret> \
--policy-file policies/browser_automation.yaml

Benefits of sidecar mode:
✓ Centralized authorization across multiple agents
✓ Production-grade audit logging
✓ Hot-reload policy changes without agent restart
✓ Fleet management and monitoring
✓ Higher performance (Rust vs Python)
✓ Enterprise identity integration (Okta, Entra ID, OIDC)
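When agents depend on the sidecar, a startup script may want to wait until the /health endpoint from the curl check above responds before launching any agent. The sketch below polls that URL with the standard library. It assumes only that a healthy sidecar answers with HTTP 200; the response body format is not specified in this post, and `http_ok` and `wait_until_healthy` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_healthy(
    url: str,
    probe: Callable[[str], bool] = http_ok,
    attempts: int = 10,
    delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the probe up to `attempts` times; True as soon as it succeeds."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Example call (requires a sidecar listening locally):
# ready = wait_until_healthy("http://127.0.0.1:8787/health")
```

If the sidecar never becomes healthy, refusing to start the agent is consistent with the fail-closed stance of the policy itself.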
For enterprise deployments with Predicate Cloud:
export PREDICATE_API_KEY="your-api-key"
predicate-authorityd run \
--mode cloud_connected \
--control-plane-url https://api.predicatesystems.dev \
--tenant-id your-tenant \
--project-id your-project \
--predicate-api-key $PREDICATE_API_KEY

This enables:
- Centralized policy management
- Real-time monitoring dashboard
- Historical audit trails
- Team collaboration on policies
Don't rely on prompt engineering alone. Combine policy-based authorization with deterministic verification of LLM-generated predicates for robust safety.
Qwen 2.5 7B provides sufficient reasoning for verification predicates while running completely offline on consumer hardware.
The find() DSL with role-based and text-based matching is more resilient than brittle CSS selectors.
Snapshot overlays that highlight detected elements make debugging agent behavior dramatically faster.
We're actively developing Predicate Secure with upcoming features:
- Multi-step verification chains - Complex assertion flows
- Replay killswitches - Emergency agent shutdown
- Vision fallback - Handle CAPTCHAs and complex UIs
- Permission recovery - Graceful handling of authorization failures
- Temporal integration - Durable execution for long-running agents
The demo is open source and available in the Sentience repository under predicate-secure/py-predicate-secure/demo.
Try Predicate Secure in 5 minutes. No API key required - runs completely offline with local LLM verification.
📚 Demo README 📖 Architecture Doc 📘 Predicate Authority User Manual 🔧 SDK Python Docs
Have questions or feedback? Reach out to us on GitHub or Discord.
Built with ❤️ by the Predicate team.