Merged
71 commits
c79aa3e
docs: add initial specification for extensibility hooks in Mellea
araujof Feb 4, 2026
88c2a24
docs: update hook system spec to factor component hooks and address d…
araujof Feb 4, 2026
f3f751b
docs: add clarifications for component hook payload fields and additi…
araujof Feb 5, 2026
dac992e
docs: add implementation plan
araujof Feb 6, 2026
506eae1
docs: update implementation plan
araujof Feb 6, 2026
d184f08
docs: minor cleanups to implementation plan
araujof Feb 6, 2026
aa33f71
feat: update to reflect programmatic and functional-first design
araujof Feb 15, 2026
3c620de
feat: specify hook payload write protection
araujof Feb 17, 2026
c8a9c36
chore: add optional dependency for plugin framework
araujof Feb 17, 2026
7105c83
feat: implemented hook system and initial set of hook types
araujof Feb 17, 2026
91c61fd
feat: add plugin examples
araujof Feb 17, 2026
5e132d3
refactor: update examples to use MelleaHookType enum
araujof Feb 18, 2026
584d0de
feat: add PluginMode enum
araujof Feb 18, 2026
80796c1
refactor: drop estimated_tokens from generation_pre_call payload
araujof Feb 18, 2026
8160ee2
feat: add context manager block support for plugins and plugin sets
araujof Feb 18, 2026
1bc091a
docs: update hook system specification to document with-block support
araujof Feb 18, 2026
1d1defa
chore: removed unused imports
araujof Feb 18, 2026
e19e6dd
feat: implement tool hooks
araujof Feb 18, 2026
93eec9d
feat: update example for tool call hooks
araujof Feb 18, 2026
6479a58
chore: tune internal log levels for clarity
araujof Feb 18, 2026
0b0ffd6
chore: tune internal log levels for clarity
araujof Feb 19, 2026
ece94d2
docs: updated spec with not implemented payload fields
araujof Feb 19, 2026
b569852
fix: minor implementation bugs and tests
araujof Feb 19, 2026
0e2bc7f
Update hook_system.md
HendrikStrobelt Feb 17, 2026
0cb9472
refactor: use cpex package; update handling of modified_payloads
araujof Mar 3, 2026
e8e7146
chore: update lock file
araujof Mar 3, 2026
a9d081e
chore: bump cpex version to 0.1.0.dev2
araujof Mar 3, 2026
d4bf2ff
fix: mode semantics
araujof Mar 4, 2026
a46d461
feat: implemented fire_and_forget mode
araujof Mar 4, 2026
b0a1322
feat: update execution mode map
araujof Mar 4, 2026
352f22e
feat: update plugin modes and specs
araujof Mar 5, 2026
77e3259
feat: update examples with concurrent hooks
araujof Mar 5, 2026
432d791
feat: update examples
araujof Mar 5, 2026
91dcc53
feat: refine has_plugins to accept hook type
araujof Mar 5, 2026
25a5893
chore: update cpex version
araujof Mar 5, 2026
b19c006
chore: cleanup
araujof Mar 5, 2026
bf46536
refactor: tool call hook types
araujof Mar 5, 2026
ae7bb40
refactor: tool hooks example; payload mutation handling
araujof Mar 6, 2026
a36ce94
fix: PR review comments
araujof Mar 6, 2026
0b778c2
chore: renamed dependency group from cpex to hooks
araujof Mar 6, 2026
88dc2fc
chore: lint and formatting fixes
araujof Mar 6, 2026
c7dc264
fix: mypy and remaining lint issues
araujof Mar 7, 2026
819b81e
docs: updated specs to reflect implementation changes
araujof Mar 7, 2026
778c694
refactor: backend generate_from_context wrapper
araujof Mar 7, 2026
d26f886
refactor: generation pre call
araujof Mar 7, 2026
e813f74
fix: generation_post_call hook placement
araujof Mar 7, 2026
e303061
feat: added modify result object, unregister function, other cleanups
araujof Mar 7, 2026
8df145f
feat: improve handling of generate_post_call
araujof Mar 8, 2026
695ca37
fix: previously existing mypy issues (can be cherry picked to fix mai…
araujof Mar 8, 2026
d82c61b
refactor: improves invoke_hook function; deduplicate context and othe…
araujof Mar 8, 2026
b9b32b7
docs: update hook specs
araujof Mar 8, 2026
14752f8
refactor: drop backend_kwargs from session_pre_init payload
araujof Mar 8, 2026
e921747
refactor: improvements and bug fixes
araujof Mar 9, 2026
a31128d
refactor: remove unimplemented generation_stream_chunk from hook type…
araujof Mar 9, 2026
04a602e
fix: regression introduced with weakrefs
araujof Mar 9, 2026
2d085e2
chore: cleanup
araujof Mar 9, 2026
98e9799
docs: add tutorial-style examples for plugins
araujof Mar 9, 2026
2b3939c
chore: fix formatting issues in new examples
araujof Mar 9, 2026
51375b1
refactor: drop weakrefs, unwrap session in payloads, and refactor exe…
araujof Mar 10, 2026
af488a2
docs: clarify payload mutability approach in docstring
araujof Mar 10, 2026
c6f8a48
docs: update hook spec to document payload mutability approach
araujof Mar 10, 2026
e932021
refactor: set default to silence plugin errors; added acceptance test…
araujof Mar 10, 2026
f2b149b
fix: minor regressions in examples
araujof Mar 10, 2026
5ba857e
docs: initial user docs for plugins
araujof Mar 10, 2026
e6f0087
refactor: converted a few writable fields into observe-only fields in…
araujof Mar 10, 2026
777324d
docs: minor updates and nits to the plugin docs
araujof Mar 10, 2026
f731758
refactor: move pre and post gen hooks
jakelorocco Mar 10, 2026
77aa36c
refactor: modify plugin fixture and fix tests
jakelorocco Mar 10, 2026
73a402b
refactor: refactor tests for hook_call sites
jakelorocco Mar 10, 2026
bc16c09
refactor: move to local imports for hooks; fix pre-commit issues
jakelorocco Mar 10, 2026
ef22055
fix: add back type error ignore
jakelorocco Mar 10, 2026
1,769 changes: 1,769 additions & 0 deletions docs/dev/hook_system.md

Large diffs are not rendered by default.

1,230 changes: 1,230 additions & 0 deletions docs/dev/hook_system_implementation_plan.md

Large diffs are not rendered by default.

970 changes: 970 additions & 0 deletions docs/docs/core-concept/plugins.mdx

Large diffs are not rendered by default.

3 changes: 2 additions & 1 deletion docs/docs/docs.json
@@ -54,7 +54,8 @@
 "core-concept/tuning",
 "core-concept/adapters",
 "core-concept/alora",
-"core-concept/interoperability"
+"core-concept/interoperability",
+"core-concept/plugins"
 ]
 }
 ]
Empty file.
128 changes: 128 additions & 0 deletions docs/examples/plugins/class_plugin.py
@@ -0,0 +1,128 @@
# pytest: ollama, llm
#
# Class-based plugin — group related hooks in a single Plugin subclass.
#
# This example creates a PII protection plugin that:
# 1. Blocks input containing SSN patterns before component execution
# 2. Scans LLM output for SSN patterns after generation (observe-only)
#
# Run:
# uv run python docs/examples/plugins/class_plugin.py

import logging
import re
import sys

from mellea import start_session
from mellea.plugins import HookType, Plugin, PluginViolationError, block, hook, register

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
    datefmt="%Y-%m-%dT%H:%M:%S",
)
logging.getLogger("httpx").setLevel(logging.ERROR)
logging.getLogger("fancy_logger").setLevel(logging.ERROR)
log = logging.getLogger("class_plugin")


class PIIRedactor(Plugin, name="pii-redactor", priority=5):
    """Detects PII patterns: blocks matching input and flags matching output.

    Despite the class name, nothing is rewritten. Input is rejected and
    output is only scanned, so ``redaction_count`` counts detections.

    .. warning:: Shared mutable state
        ``redaction_count`` is shared across all hook invocations. This is
        safe today because all hooks run on the same ``asyncio`` event loop,
        but would require a lock or ``contextvars`` if hooks ever execute in
        parallel threads.
    """

    def __init__(self, patterns: list[str] | None = None):
        self.patterns = patterns or [
            r"\d{3}-\d{2}-\d{4}",  # SSN
            r"\b\d{16}\b",  # credit card (simplified)
        ]
        self.redaction_count = 0

    @hook(HookType.COMPONENT_PRE_EXECUTE)
    async def reject_pii_input(self, payload, ctx):
        """Block component execution if the action contains PII patterns."""
        if payload.component_type != "Instruction":
            return
        original = (
            str(payload.action._description) if payload.action._description else ""
        )
        if self._contains_pii(original):
            log.warning("[pii-redactor] PII detected in component action, blocking")
            self.redaction_count += 1
            return block(
                "Input contains PII patterns that must be removed before processing",
                code="PII_INPUT_DETECTED",
            )
        log.info("[pii-redactor] no PII found in input")

    @hook(HookType.GENERATION_POST_CALL)
    async def scan_output(self, payload, ctx):
        """Scan LLM output for PII and log a warning if detected.

        ``generation_post_call`` is observe-only: plugins cannot modify the
        ``model_output``. This hook therefore only inspects the output and
        records a warning for downstream monitoring/alerting.
        """
        mot_value = getattr(payload.model_output, "value", None)
        if mot_value is None:
            log.info("[pii-redactor] output not yet computed, skipping output scan")
            return
        original = str(mot_value)
        if self._contains_pii(original):
            log.warning("[pii-redactor] PII detected in LLM output (observe-only)")
            self.redaction_count += 1
        else:
            log.info("[pii-redactor] no PII found in output")

    def _contains_pii(self, text: str) -> bool:
        return any(re.search(p, text) for p in self.patterns)


# Create an instance and register it globally
redactor = PIIRedactor()
register(redactor)

if __name__ == "__main__":
    log.info("--- Class-based Plugin example (PII Redactor) ---")
    log.info("")

    with start_session() as m:
        log.info("Session started (id=%s)", m.id)
        log.info("")

        # Request 1: contains an SSN, so the input hook blocks execution.
        log.info("Request 1: input with PII (should be blocked)")
        try:
            m.instruct(
                "Summarize this customer record: "
                "Name: Jane Doe, SSN: 123-45-6789, Status: Active"
            )
        except PluginViolationError as e:
            log.info(
                "Blocked as expected on %s: [%s] %s", e.hook_type, e.code, e.reason
            )
        log.info("")

        # Request 2: clean input with no PII, so it reaches the LLM.
        # If the LLM output contains PII, scan_output logs a warning (observe-only).
        log.info("Request 2: clean input (should succeed)")
        try:
            result = m.instruct("Name the three primary colors.")
            log.info("Result: %s", result)
        except PluginViolationError as e:
            log.warning(
                "Execution blocked on %s: [%s] %s (plugin=%s)",
                e.hook_type,
                e.code,
                e.reason,
                e.plugin_name,
            )
            sys.exit(1)

    log.info("")
    log.info("Total PII detections: %d", redactor.redaction_count)
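Review note: the matching logic in `_contains_pii` can be exercised without a model backend or a Mellea install. The sketch below copies the two default patterns from `PIIRedactor` into a standalone function. `contains_pii` and `DEFAULT_PATTERNS` are illustrative names and are not part of `mellea.plugins`. It also surfaces one property of the patterns as written: the SSN regex has no word boundaries, so it matches inside a longer digit run.

```python
from __future__ import annotations

import re

# Copied from the PIIRedactor defaults in class_plugin.py.
DEFAULT_PATTERNS = [
    r"\d{3}-\d{2}-\d{4}",  # SSN
    r"\b\d{16}\b",  # credit card (simplified)
]


def contains_pii(text: str, patterns: list[str] | None = None) -> bool:
    """Return True if any pattern matches anywhere in the text."""
    active = patterns if patterns is not None else DEFAULT_PATTERNS
    return any(re.search(p, text) for p in active)


if __name__ == "__main__":
    samples = [
        "Name: Jane Doe, SSN: 123-45-6789, Status: Active",
        "Name the three primary colors.",
        "Card on file: 4111111111111111",
        "Tracking id 9123-45-67890",  # not an SSN, but the SSN regex still matches
    ]
    for sample in samples:
        print(f"{contains_pii(sample)!s:5}  {sample}")
```

If false positives on longer identifiers matter, wrapping the SSN pattern in `\b...\b` would be one option; whether that fits the plugin's intent is a judgment call for the author.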
130 changes: 130 additions & 0 deletions docs/examples/plugins/execution_modes.py
@@ -0,0 +1,130 @@
# pytest: ollama, llm
#
# Execution modes — all five PluginMode values side by side.
#
# This example registers five hooks on the same hook type
# (COMPONENT_PRE_EXECUTE), each using a different execution mode.
# It demonstrates:
#
# 1. SEQUENTIAL — serial, can block + modify
# 2. TRANSFORM — serial, can modify only (blocks suppressed)
# 3. AUDIT — serial, observe-only (modifications discarded, blocks logged)
# 4. CONCURRENT — parallel, can block only (modifications discarded)
# 5. FIRE_AND_FORGET — background, observe-only (result ignored)
#
# Execution order: SEQUENTIAL → TRANSFORM → AUDIT → CONCURRENT → FIRE_AND_FORGET
#
# Run:
# uv run python docs/examples/plugins/execution_modes.py

import logging

from mellea import start_session
from mellea.plugins import (
    HookType,
    PluginMode,
    PluginViolationError,
    block,
    hook,
    modify,
    plugin_scope,
)

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
    datefmt="%Y-%m-%dT%H:%M:%S",
)
logging.getLogger("httpx").setLevel(logging.ERROR)
logging.getLogger("fancy_logger").setLevel(logging.ERROR)
log = logging.getLogger("execution_modes")


# --- Hook 1: SEQUENTIAL (priority=10) ---
# Serial, chained execution. Can block the pipeline and modify writable
# payload fields. Each hook receives the payload from the prior one.


@hook(HookType.COMPONENT_PRE_EXECUTE, mode=PluginMode.SEQUENTIAL, priority=10)
async def sequential_hook(payload, ctx):
    """Sequential hook: can block + modify, runs inline in priority order."""
    log.info("[SEQUENTIAL p=10] component=%s", payload.component_type)


# --- Hook 2: TRANSFORM (priority=20) ---
# Serial, chained execution after all SEQUENTIAL hooks. Can modify writable
# payload fields but CANNOT block — blocking results are suppressed with a
# warning. Ideal for data transformation (PII redaction, prompt rewriting).


@hook(HookType.COMPONENT_PRE_EXECUTE, mode=PluginMode.TRANSFORM, priority=20)
async def transform_hook(payload, ctx):
    """Transform hook: can modify but cannot block."""
    log.info("[TRANSFORM p=20] enriching model_options")
    opts = dict(payload.model_options or {})
    opts.setdefault("temperature", 0.7)
    return modify(payload, model_options=opts)


# --- Hook 3: AUDIT (priority=30) ---
# Serial execution after TRANSFORM. Observe-only: payload modifications are
# discarded and violations are logged but do NOT block. Use for monitoring,
# metrics, and gradual policy rollout.


@hook(HookType.COMPONENT_PRE_EXECUTE, mode=PluginMode.AUDIT, priority=30)
async def audit_hook(payload, ctx):
    """Audit hook: observe-only; violations logged but not enforced."""
    log.info("[AUDIT p=30] would block, but audit mode only logs")
    return block("Audit-mode violation: for monitoring only", code="AUDIT_001")


# --- Hook 4: CONCURRENT (priority=40) ---
# Dispatched in parallel after AUDIT. Can block the pipeline (fail-fast on
# first blocking result) but payload modifications are discarded to avoid
# non-deterministic last-writer-wins races.


@hook(HookType.COMPONENT_PRE_EXECUTE, mode=PluginMode.CONCURRENT, priority=40)
async def concurrent_hook(payload, ctx):
    """Concurrent hook: can block but cannot modify, runs in parallel."""
    log.info("[CONCURRENT p=40] component=%s", payload.component_type)


# --- Hook 5: FIRE_AND_FORGET (priority=50) ---
# Dispatched via asyncio.create_task() after all other phases. Receives a
# copy-on-write snapshot of the payload. Cannot modify payloads or block
# execution. Any exceptions are logged but do not propagate.
# The log line may appear *after* the main result is printed.


@hook(HookType.COMPONENT_PRE_EXECUTE, mode=PluginMode.FIRE_AND_FORGET, priority=50)
async def fire_and_forget_hook(payload, ctx):
    """Fire-and-forget hook: runs in background, never blocks."""
    log.info("[FIRE_AND_FORGET p=50] logging in the background")


if __name__ == "__main__":
    log.info("--- Execution modes example ---")
    log.info("")

    with start_session() as m:
        with plugin_scope(
            sequential_hook,
            transform_hook,
            audit_hook,
            concurrent_hook,
            fire_and_forget_hook,
        ):
            try:
                result = m.instruct("Name the four seasons.")
                log.info("")
                log.info("Result: %s", result)
            except PluginViolationError as e:
                log.error("Blocked: %s", e)

    log.info("")
    log.info(
        "Note: the FIRE_AND_FORGET log may have appeared after the result; "
        "that is expected behavior."
    )
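Review note: the phase ordering and per-mode capabilities described in the header comment can be summarized as a small table plus a sort key. The sketch below is a standalone model of that description, not Mellea's dispatcher. `Mode`, `CAPABILITIES`, and `dispatch_order` are hypothetical names, and it assumes that lower priority values run first within a phase, which the example's p=10 to p=50 numbering suggests but does not state.

```python
from enum import IntEnum


class Mode(IntEnum):
    """Hypothetical stand-in for PluginMode, numbered in phase order."""

    SEQUENTIAL = 0
    TRANSFORM = 1
    AUDIT = 2
    CONCURRENT = 3
    FIRE_AND_FORGET = 4


# mode -> (can_block, can_modify), transcribed from the header comment
# of execution_modes.py.
CAPABILITIES = {
    Mode.SEQUENTIAL: (True, True),
    Mode.TRANSFORM: (False, True),
    Mode.AUDIT: (False, False),
    Mode.CONCURRENT: (True, False),
    Mode.FIRE_AND_FORGET: (False, False),
}


def dispatch_order(hooks):
    """Order (name, mode, priority) tuples by phase, then by priority.

    Assumption: lower priority values run first within a phase.
    """
    return sorted(hooks, key=lambda h: (h[1], h[2]))


if __name__ == "__main__":
    registered = [
        ("fire_and_forget_hook", Mode.FIRE_AND_FORGET, 50),
        ("audit_hook", Mode.AUDIT, 30),
        ("sequential_hook", Mode.SEQUENTIAL, 10),
        ("concurrent_hook", Mode.CONCURRENT, 40),
        ("transform_hook", Mode.TRANSFORM, 20),
    ]
    for name, mode, priority in dispatch_order(registered):
        can_block, can_modify = CAPABILITIES[mode]
        print(f"{mode.name:16} p={priority:<3} block={can_block} modify={can_modify}")
```

Under this model the phase dominates the priority: a SEQUENTIAL hook with priority 99 still sorts ahead of a TRANSFORM hook with priority 1, which matches the comment that TRANSFORM runs "after all SEQUENTIAL hooks".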
106 changes: 106 additions & 0 deletions docs/examples/plugins/payload_modification.py
@@ -0,0 +1,106 @@
# pytest: ollama, llm
#
# Payload modification — how to modify payloads in hooks.
#
# This example demonstrates:
# 1. Using modify() to change writable payload fields
# 2. Using model_copy(update={...}) directly for fine-grained control
# 3. What happens when you try to modify a non-writable field (silently discarded)
#
# Run:
# uv run python docs/examples/plugins/payload_modification.py

import logging

from mellea import start_session
from mellea.plugins import (
    HookType,
    PluginMode,
    PluginResult,
    hook,
    modify,
    plugin_scope,
)

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
    datefmt="%Y-%m-%dT%H:%M:%S",
)
logging.getLogger("httpx").setLevel(logging.ERROR)
logging.getLogger("fancy_logger").setLevel(logging.ERROR)
log = logging.getLogger("payload_modification")


# ---------------------------------------------------------------------------
# Hook 1: Inject a max_tokens cap via modify() helper
#
# generation_pre_call writable fields include: model_options, format, tool_calls
# ---------------------------------------------------------------------------


@hook(HookType.GENERATION_PRE_CALL, mode=PluginMode.SEQUENTIAL, priority=10)
async def cap_max_tokens(payload, ctx):
    """Cap max_tokens to 256 on every generation call."""
    opts = dict(payload.model_options or {})
    if opts.get("max_tokens", float("inf")) > 256:
        log.info("[cap_max_tokens] capping max_tokens to 256")
        opts["max_tokens"] = 256
        return modify(payload, model_options=opts)
    log.info("[cap_max_tokens] max_tokens already within cap")


# ---------------------------------------------------------------------------
# Hook 2: Inject default model options via modify()
#
# component_pre_execute writable fields include: requirements, model_options, ...
# Like Hook 1, this uses the modify() helper; Hook 3 uses model_copy(update={...}).
# ---------------------------------------------------------------------------


@hook(HookType.COMPONENT_PRE_EXECUTE, mode=PluginMode.SEQUENTIAL, priority=10)
async def inject_default_options(payload, ctx):
    """Ensure a default temperature is set on every component execution."""
    opts = dict(payload.model_options or {})
    if "temperature" not in opts:
        log.info("[inject_default_options] setting default temperature=0.7")
        opts["temperature"] = 0.7
        return modify(payload, model_options=opts)
    log.info("[inject_default_options] temperature already set")


# ---------------------------------------------------------------------------
# Hook 3: Attempt to modify a non-writable field (observe it is discarded)
#
# generation_pre_call does NOT include 'hook', 'action', or 'context' as writable.
# This hook uses model_copy(update={...}) to change 'hook'. The change is silently
# discarded by the payload policy enforcement, and the original value is used.
# ---------------------------------------------------------------------------


@hook(HookType.GENERATION_PRE_CALL, mode=PluginMode.SEQUENTIAL, priority=20)
async def attempt_non_writable(payload, ctx):
    """Try to modify a non-writable field; the change will be silently discarded."""
    log.info("[attempt_non_writable] attempting to modify 'hook' (non-writable)")
    # This modification will be filtered out by the payload policy
    modified = payload.model_copy(update={"hook": "tampered"})
    return PluginResult(continue_processing=True, modified_payload=modified)


if __name__ == "__main__":
    log.info("--- Payload modification example ---")
    log.info("")

    with start_session() as m:
        with plugin_scope(cap_max_tokens, inject_default_options, attempt_non_writable):
            result = m.instruct(
                "Summarize the benefits of open-source software in one sentence."
            )
            log.info("")
            log.info("Result: %s", result)

    log.info("")
    log.info(
        "Note: the 'hook' field modification in attempt_non_writable was silently "
        "discarded by the payload policy; only writable fields are accepted."
    )
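Review note: the "silently discarded" behavior can be pictured as a filter applied between the payload a hook returns and the payload the pipeline keeps. The sketch below is a stdlib-only approximation under stated assumptions. It uses a frozen dataclass where the real payloads appear to be Pydantic models (the example calls `model_copy`), and `Payload`, `WRITABLE`, and `apply_policy` are hypothetical names, not the enforcement code in this PR. The writable set is taken from the Hook 1 comment for `generation_pre_call`.

```python
from dataclasses import dataclass, field, fields, replace


@dataclass(frozen=True)
class Payload:
    """Hypothetical stand-in for a generation_pre_call payload."""

    hook: str
    model_options: dict = field(default_factory=dict)


# Writable fields as listed in the Hook 1 comment (assumed complete here).
WRITABLE = frozenset({"model_options", "format", "tool_calls"})


def apply_policy(original: Payload, modified: Payload, writable=WRITABLE) -> Payload:
    """Accept changes to writable fields only; drop every other change."""
    accepted = {
        f.name: getattr(modified, f.name)
        for f in fields(original)
        if f.name in writable
        and getattr(modified, f.name) != getattr(original, f.name)
    }
    # replace() builds a new frozen instance, leaving the original untouched.
    return replace(original, **accepted)


if __name__ == "__main__":
    before = Payload(hook="generation_pre_call")
    returned = replace(before, hook="tampered", model_options={"max_tokens": 256})
    after = apply_policy(before, returned)
    print(after.hook, after.model_options)
```

Filtering per field, rather than rejecting the whole payload, means a hook that touches both a writable and a non-writable field still has its legitimate change applied. Whether the real policy behaves that way for mixed edits is not shown in this diff.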