feat: add financial governance evaluators (spend limits + transaction policy) #141
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,185 @@ | ||
| # Financial Governance Evaluators for Agent Control | ||
|
|
||
| Evaluators that enforce financial spend limits and transaction policies for autonomous AI agents. | ||
|
|
||
| As agents transact autonomously via protocols like [x402](https://github.com/coinbase/x402) and payment layers like [agentpay-mcp](https://github.com/AI-Agent-Economy/agentpay-mcp), enterprises need governance over what agents spend. These evaluators bring financial policy enforcement into the Agent Control framework. | ||
|
|
||
| ## Evaluators | ||
|
|
||
| ### `financial_governance.spend_limit` | ||
|
|
||
| Tracks cumulative agent spend and enforces rolling budget limits. Stateful — records approved transactions and checks new ones against accumulated spend. | ||
|
|
||
| - **Per-transaction cap** — reject any single payment above a threshold | ||
| - **Rolling period budget** — reject payments that would exceed a time-windowed budget | ||
| - **Context-aware overrides** — different limits per channel, agent, or session via evaluate metadata | ||
| - **Pluggable storage** — abstract `SpendStore` protocol with built-in `InMemorySpendStore`; bring your own PostgreSQL, Redis, etc. | ||
|
|
||
| ### `financial_governance.transaction_policy` | ||
|
|
||
| Static policy checks with no state tracking. Enforces structural rules on individual transactions. | ||
|
|
||
| - **Currency allowlist** — only permit specific currencies (e.g., `["USDC", "USDT"]`) | ||
| - **Recipient blocklist/allowlist** — control which addresses an agent can pay | ||
| - **Amount bounds** — minimum and maximum per-transaction limits | ||
|
|
||
| ## Installation | ||
|
|
||
| ```bash | ||
| # From the repo root (development) | ||
| cd evaluators/contrib/financial-governance | ||
| pip install -e ".[dev]" | ||
| ``` | ||
|
|
||
| ## Configuration | ||
|
|
||
| ### Spend Limit | ||
|
|
||
| ```yaml | ||
| controls: | ||
|   - name: spend-limit | ||
|     evaluator: | ||
|       type: financial_governance.spend_limit | ||
|       config: | ||
|         max_per_transaction: 100.0 # Max USDC per single payment | ||
|         max_per_period: 1000.0 # Rolling 24h budget | ||
|         period_seconds: 86400 # Budget window (default: 24 hours) | ||
|         currency: USDC # Currency to govern | ||
|     selector: | ||
|       path: input # Extract step.input (transaction dict) | ||
|     action: deny | ||
| ``` | ||
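The three numeric settings above interact in a simple way: the per-transaction cap is checked first, then the rolling-window total. The following is a minimal sketch of that logic, not the PR's implementation. The `RollingBudget` class and its `check_and_record` method are hypothetical names, and treating `0.0` as "limit disabled" follows the `SpendLimitConfig` docstring later in this diff.

```python
from __future__ import annotations

import time
from collections import deque


class RollingBudget:
    """Hypothetical helper: per-transaction cap plus rolling-period budget."""

    def __init__(self, max_per_transaction: float, max_per_period: float, period_seconds: int):
        self.max_per_transaction = max_per_transaction
        self.max_per_period = max_per_period
        self.period_seconds = period_seconds
        # Approved payments as (timestamp, amount), oldest first.
        self._spend: deque[tuple[float, float]] = deque()

    def check_and_record(self, amount: float, now: float | None = None) -> tuple[bool, str]:
        now = time.time() if now is None else now
        # Drop records that have aged out of the rolling window.
        cutoff = now - self.period_seconds
        while self._spend and self._spend[0][0] < cutoff:
            self._spend.popleft()
        # A limit of 0.0 is treated as disabled.
        if self.max_per_transaction > 0 and amount > self.max_per_transaction:
            return False, "per-transaction cap exceeded"
        spent = sum(a for _, a in self._spend)
        if self.max_per_period > 0 and spent + amount > self.max_per_period:
            return False, "period budget exceeded"
        # Only approved payments count against the budget.
        self._spend.append((now, amount))
        return True, "ok"
```

With the values from the YAML above, a single 150 USDC payment is rejected by the cap, and an eleventh 100 USDC payment within 24 hours is rejected by the period budget.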
| ### Transaction Policy | ||
| ```yaml | ||
| controls: | ||
|   - name: transaction-policy | ||
|     evaluator: | ||
|       type: financial_governance.transaction_policy | ||
|       config: | ||
|         allowed_currencies: [USDC, USDT] | ||
|         blocked_recipients: ["0xDEAD..."] | ||
|         allowed_recipients: ["0xALICE...", "0xBOB..."] | ||
|         min_amount: 0.01 | ||
|         max_amount: 5000.0 | ||
|     selector: | ||
|       path: input | ||
|     action: deny | ||
| ``` | ||
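Because these checks are stateless, they can be pictured as a pure function over the config and one transaction. The sketch below is illustrative only: `Policy` and `find_violation` are hypothetical names, the check order is an assumption, and so is the convention that an empty list or a `max_amount` of `0.0` means "no restriction". The recipient strings are placeholders, not real addresses.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical mirror of the transaction_policy config keys."""

    allowed_currencies: list[str] = field(default_factory=list)
    blocked_recipients: list[str] = field(default_factory=list)
    allowed_recipients: list[str] = field(default_factory=list)
    min_amount: float = 0.0
    max_amount: float = 0.0  # assumption: 0.0 disables the upper bound


def find_violation(policy: Policy, tx: dict) -> str | None:
    """Return a reason string for the first failed check, or None if the transaction passes."""
    currency, recipient, amount = tx["currency"], tx["recipient"], float(tx["amount"])
    if policy.allowed_currencies and currency not in policy.allowed_currencies:
        return "currency not allowed"
    # The blocklist is checked before the allowlist so a blocked address is always rejected.
    if recipient in policy.blocked_recipients:
        return "recipient is blocked"
    if policy.allowed_recipients and recipient not in policy.allowed_recipients:
        return "recipient not in allowlist"
    if amount < policy.min_amount:
        return "amount below minimum"
    if policy.max_amount > 0 and amount > policy.max_amount:
        return "amount above maximum"
    return None
```

A control with `action: deny` would then block the step whenever a reason is returned.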
| ## Selector Paths | ||
| Both evaluators support two selector configurations: | ||
| - **`selector.path: "input"`** (recommended) — The evaluator receives `step.input` directly, which should be the transaction dict. | ||
| - **`selector.path: "*"`** — The evaluator receives the full Step object. It automatically extracts `step.input` for transaction fields and `step.context` for channel/agent/session metadata. | ||
|
|
||
| ## Input Data Schema | ||
|
|
||
| The transaction dict (from `step.input`) should contain: | ||
|
|
||
| ```python | ||
| # step.input — transaction payload | ||
| { | ||
| "amount": 50.0, # required — transaction amount | ||
| "currency": "USDC", # required — payment currency | ||
| "recipient": "0xABC...", # required — payment recipient | ||
| } | ||
| ``` | ||
|
|
||
| ## Context-Aware Limits | ||
|
|
||
| Context fields (`channel`, `agent_id`, `session_id`) and per-context limit overrides can be provided in two ways: | ||
|
|
||
| **Option A: Via `step.context`** (recommended for engine integration) | ||
|
|
||
| ```python | ||
| step = Step( | ||
|     type="tool", | ||
|     name="payment", | ||
|     input={"amount": 75.0, "currency": "USDC", "recipient": "0xABC"}, | ||
|     context={ | ||
|         "channel": "experimental", | ||
|         "agent_id": "agent-42", | ||
|         "channel_max_per_transaction": 50.0, | ||
|         "channel_max_per_period": 200.0, | ||
|     }, | ||
| ) | ||
| ``` | ||
|
|
||
| When using `selector.path: "*"`, the evaluator merges `step.context` fields into the transaction data automatically. When using `selector.path: "input"`, context fields must be included directly in `step.input`. | ||
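The difference between the two selector paths can be sketched as a small normalization step. This is an assumption-laden illustration rather than the evaluator's code: the Step is modelled as a plain dict with `input` and `context` keys, `to_transaction` is a hypothetical name, and letting inline keys in `input` take precedence over `context` is a choice made here for the example.

```python
from __future__ import annotations

from typing import Any


def to_transaction(data: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical normalizer: accept either a full Step-like dict or a bare transaction dict."""
    step_input = data.get("input")
    if isinstance(step_input, dict):
        # selector.path "*": merge step.context into the transaction fields.
        merged = dict(step_input)
        for key, value in (data.get("context") or {}).items():
            merged.setdefault(key, value)  # assumption: inline keys win over context
        return merged
    # selector.path "input": the dict already is the transaction.
    return dict(data)
```

Either way the evaluator logic downstream sees one flat dict containing `amount`, `currency`, `recipient`, and any context fields.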
|
|
||
| **Option B: Inline in the transaction dict** (simpler, for direct SDK use) | ||
|
|
||
| ```python | ||
| result = await evaluator.evaluate({ | ||
|     "amount": 75.0, | ||
|     "currency": "USDC", | ||
|     "recipient": "0xABC", | ||
|     "channel": "experimental", | ||
|     "channel_max_per_transaction": 50.0, | ||
|     "channel_max_per_period": 200.0, | ||
| }) | ||
| ``` | ||
|
|
||
| Spend budgets are **scoped by context** — spend in channel A does not count against channel B's budget. When no context fields are present, budgets are global. | ||
|
|
||
| ## Custom SpendStore | ||
|
|
||
| The `SpendStore` protocol requires two methods. Implement them for your backend: | ||
|
|
||
| ```python | ||
| from agent_control_evaluator_financial_governance.spend_limit import ( | ||
|     SpendStore, | ||
|     SpendLimitConfig, | ||
|     SpendLimitEvaluator, | ||
| ) | ||
| class PostgresSpendStore: | ||
|     """Example: PostgreSQL-backed spend tracking.""" | ||
|     def __init__(self, connection_string: str): | ||
|         self._conn = connect(connection_string) | ||
|     def record_spend(self, amount: float, currency: str, metadata: dict | None = None) -> None: | ||
|         self._conn.execute( | ||
|             "INSERT INTO agent_spend (amount, currency, metadata, recorded_at) VALUES (%s, %s, %s, NOW())", | ||
|             (amount, currency, json.dumps(metadata)), | ||
|         ) | ||
|     def get_spend(self, currency: str, since_timestamp: float) -> float: | ||
|         row = self._conn.execute( | ||
|             "SELECT COALESCE(SUM(amount), 0) FROM agent_spend WHERE currency = %s AND recorded_at >= to_timestamp(%s)", | ||
|             (currency, since_timestamp), | ||
|         ).fetchone() | ||
|         return float(row[0]) | ||
| # Use it: | ||
| store = PostgresSpendStore("postgresql://...") | ||
| evaluator = SpendLimitEvaluator(config, store=store) | ||
| ``` | ||
|
|
||
| ## Running Tests | ||
|
|
||
| ```bash | ||
| cd evaluators/contrib/financial-governance | ||
| pip install -e ".[dev]" | ||
| pytest tests/ -v | ||
| ``` | ||
|
|
||
| ## Design Decisions | ||
|
|
||
| 1. **Decoupled from data source** — The `SpendStore` protocol means no new tables in core Agent Control. Bring your own persistence. | ||
| 2. **Context-aware limits** — Override keys in the evaluate data dict allow per-channel, per-agent, or per-session limits without multiple evaluator instances. | ||
| 3. **Python SDK compatible** — Uses the standard evaluator interface; works with both the server and the Python SDK evaluation engine. | ||
| 4. **Fail-open on errors** — Missing or malformed data returns `matched=False` with an `error` field, following Agent Control conventions. | ||
|
Contributor
This line does not match the implementation anymore. After the fix, malformed runtime payload returns
||
|
|
||
| ## Related Projects | ||
|
|
||
| - [x402](https://github.com/coinbase/x402) — HTTP 402 payment protocol | ||
| - [agentpay-mcp](https://github.com/up2itnow0822/agentpay-mcp) — MCP server for non-custodial agent payments | ||
|
|
||
| ## License | ||
|
|
||
| Apache-2.0 — see [LICENSE](../../../LICENSE). | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,55 @@ | ||
| [project] | ||
| name = "agent-control-evaluator-financial-governance" | ||
|
Contributor
Nice to have this as a standalone package, but I do not think it is actually reachable for end users yet. As-is, I do not think
||
| version = "0.1.0" | ||
| description = "Financial governance evaluators for agent-control — spend limits and transaction policy enforcement" | ||
| readme = "README.md" | ||
| requires-python = ">=3.12" | ||
| license = { text = "Apache-2.0" } | ||
| authors = [{ name = "agent-control contributors" }] | ||
| keywords = ["agent-control", "evaluator", "financial", "spend-limit", "x402", "agentpay"] | ||
| classifiers = [ | ||
| "Development Status :: 4 - Beta", | ||
| "Intended Audience :: Developers", | ||
| "License :: OSI Approved :: Apache Software License", | ||
| "Programming Language :: Python :: 3", | ||
| "Programming Language :: Python :: 3.12", | ||
| "Topic :: Software Development :: Libraries", | ||
| ] | ||
| dependencies = [ | ||
| "agent-control-evaluators>=3.0.0", | ||
| "agent-control-models>=3.0.0", | ||
| ] | ||
|
|
||
| [project.optional-dependencies] | ||
| dev = [ | ||
| "pytest>=8.0.0", | ||
| "pytest-asyncio>=0.23.0", | ||
| "pytest-cov>=4.0.0", | ||
| "ruff>=0.1.0", | ||
| "mypy>=1.8.0", | ||
| ] | ||
|
|
||
| [project.entry-points."agent_control.evaluators"] | ||
| "financial_governance.spend_limit" = "agent_control_evaluator_financial_governance.spend_limit:SpendLimitEvaluator" | ||
| "financial_governance.transaction_policy" = "agent_control_evaluator_financial_governance.transaction_policy:TransactionPolicyEvaluator" | ||
|
|
||
| [build-system] | ||
| requires = ["hatchling"] | ||
| build-backend = "hatchling.build" | ||
|
|
||
| [tool.hatch.build.targets.wheel] | ||
| packages = ["src/agent_control_evaluator_financial_governance"] | ||
|
|
||
| [tool.ruff] | ||
| line-length = 100 | ||
| target-version = "py312" | ||
|
|
||
| [tool.ruff.lint] | ||
| select = ["E", "F", "I"] | ||
|
|
||
| [tool.pytest.ini_options] | ||
| asyncio_mode = "auto" | ||
|
|
||
| [tool.uv.sources] | ||
| agent-control-evaluators = { path = "../../builtin", editable = true } | ||
| agent-control-models = { path = "../../../models", editable = true } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,46 @@ | ||
| """Financial governance evaluators for agent-control. | ||
| Provides two evaluators for enforcing financial policy on AI agent transactions: | ||
| - ``financial_governance.spend_limit``: Tracks cumulative spend against rolling | ||
| period budgets and per-transaction caps. | ||
| - ``financial_governance.transaction_policy``: Static policy checks — allowlists, | ||
| blocklists, amount bounds, and permitted currencies. | ||
| Both evaluators are registered automatically when this package is installed and | ||
| the ``agent_control.evaluators`` entry point group is discovered. | ||
| Example usage in an agent-control control config:: | ||
| { | ||
| "condition": { | ||
| "selector": {"path": "*"}, | ||
|
Contributor
If this is meant to be a real Agent Control config example, I think
|
||
| "evaluator": { | ||
| "name": "financial_governance.spend_limit", | ||
| "config": { | ||
| "max_per_transaction": 100.0, | ||
| "max_per_period": 1000.0, | ||
| "period_seconds": 86400, | ||
| "currency": "USDC" | ||
| } | ||
| } | ||
| }, | ||
| "action": {"decision": "deny"} | ||
| } | ||
| """ | ||
|
|
||
| from agent_control_evaluator_financial_governance.spend_limit import ( | ||
| SpendLimitConfig, | ||
| SpendLimitEvaluator, | ||
| ) | ||
| from agent_control_evaluator_financial_governance.transaction_policy import ( | ||
| TransactionPolicyConfig, | ||
| TransactionPolicyEvaluator, | ||
| ) | ||
|
|
||
| __all__ = [ | ||
| "SpendLimitEvaluator", | ||
| "SpendLimitConfig", | ||
| "TransactionPolicyEvaluator", | ||
| "TransactionPolicyConfig", | ||
| ] | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,12 @@ | ||
| """Spend-limit evaluator package.""" | ||
|
|
||
| from .config import SpendLimitConfig | ||
| from .evaluator import SpendLimitEvaluator | ||
| from .store import InMemorySpendStore, SpendStore | ||
|
|
||
| __all__ = [ | ||
| "SpendLimitEvaluator", | ||
| "SpendLimitConfig", | ||
| "SpendStore", | ||
| "InMemorySpendStore", | ||
| ] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,68 @@ | ||
| """Configuration model for the spend-limit evaluator.""" | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| from pydantic import Field, field_validator | ||
|
|
||
| from agent_control_evaluators import EvaluatorConfig | ||
|
|
||
|
|
||
| class SpendLimitConfig(EvaluatorConfig): | ||
|     """Configuration for :class:`~.evaluator.SpendLimitEvaluator`. | ||
|     All monetary fields are expressed in the units of *currency*. | ||
|     Attributes: | ||
|         max_per_transaction: Hard cap on any single transaction amount. A | ||
|             transaction whose ``amount`` exceeds this value is blocked | ||
|             regardless of accumulated period spend. Set to ``0.0`` to disable. | ||
|         max_per_period: Maximum total spend allowed within the rolling | ||
|             *period_seconds* window. Set to ``0.0`` to disable. | ||
|         period_seconds: Length of the rolling budget window in seconds. | ||
|             Defaults to ``86400`` (24 hours). | ||
|         currency: Currency symbol this policy applies to (e.g. ``"USDC"``). | ||
|             Transactions whose currency does not match are passed through as | ||
|             *not matched* (i.e. allowed). | ||
|     Example config dict:: | ||
|         { | ||
|             "max_per_transaction": 500.0, | ||
|             "max_per_period": 5000.0, | ||
|             "period_seconds": 86400, | ||
|             "currency": "USDC" | ||
|         } | ||
|     """ | ||
|
|
||
|     max_per_transaction: float = Field( | ||
|
Contributor
I would be pretty hesitant to use
||
|         default=0.0, | ||
|         ge=0.0, | ||
|         description=( | ||
|             "Per-transaction spend cap in *currency* units. " | ||
|             "0.0 means no per-transaction limit." | ||
|         ), | ||
|     ) | ||
|     max_per_period: float = Field( | ||
|         default=0.0, | ||
|         ge=0.0, | ||
|         description=( | ||
|             "Maximum cumulative spend allowed in the rolling period window. " | ||
|             "0.0 means no period limit." | ||
|         ), | ||
|     ) | ||
|     period_seconds: int = Field( | ||
|         default=86_400, | ||
|         ge=1, | ||
|         description="Rolling budget window length in seconds (default: 86400 = 24 h).", | ||
|     ) | ||
|     currency: str = Field( | ||
|         ..., | ||
|         min_length=1, | ||
|         description="Currency symbol this policy applies to (e.g. 'USDC', 'ETH').", | ||
|     ) | ||
|
|
||
|     @field_validator("currency") | ||
|     @classmethod | ||
|     def normalize_currency(cls, v: str) -> str: | ||
|         """Normalize currency symbol to upper-case for consistent comparison.""" | ||
|         return v.upper() | ||
The custom store example is already stale relative to the protocol above. `SpendStore.get_spend()` now takes `scope`, so anyone copying this signature will implement the wrong interface and miss the context-scoped budget behavior. I would update the example to include `scope: dict[str, str] | None = None` and show how the metadata filter is applied.
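To make the requested change concrete, here is a minimal in-memory sketch of a store whose `get_spend()` accepts the `scope` parameter mentioned in the comment. It is not the PR's `InMemorySpendStore`. The class name `ScopedInMemoryStore` is hypothetical, and the matching rule (every key in `scope` must equal the same key in the recorded metadata) is an assumption about how the filter could work.

```python
from __future__ import annotations

import time


class ScopedInMemoryStore:
    """Hypothetical store: spend records are filtered by currency, time, and scope."""

    def __init__(self) -> None:
        # Each record is (timestamp, amount, currency, metadata).
        self._records: list[tuple[float, float, str, dict]] = []

    def record_spend(self, amount: float, currency: str, metadata: dict | None = None) -> None:
        self._records.append((time.time(), amount, currency, dict(metadata or {})))

    def get_spend(
        self,
        currency: str,
        since_timestamp: float,
        scope: dict[str, str] | None = None,
    ) -> float:
        total = 0.0
        for ts, amount, cur, meta in self._records:
            if cur != currency or ts < since_timestamp:
                continue
            # Assumed rule: a record counts only if it matches every scope key.
            if scope and any(meta.get(k) != v for k, v in scope.items()):
                continue
            total += amount
        return total
```

With no `scope`, the total is global for the currency; with `scope={"channel": "a"}`, only spend recorded with that channel in its metadata is counted. In a PostgreSQL-backed store, the same filter could plausibly be expressed as a JSONB containment condition on the `metadata` column, assuming that column is stored as `jsonb`.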