
Add support for LLM middleware #1872

Open

tkattkat wants to merge 6 commits into main from Add-support-for-LLM-middleware

Conversation

Collaborator

@tkattkat tkattkat commented Mar 23, 2026

why

  • In some scenarios, users need more granular control over, and access to, the LLM at the provider level

what changed

  • Added support for passing custom "middleware" in the model options

test plan

  • wrote tests
  • tested locally with own middleware

Summary by cubic

Adds first-class LLM middleware to intercept and wrap model calls for logging, usage tracking, or request transforms without changing call sites. Per-call middleware is used only for that call and is not cached.

  • New Features

    • Accepts middleware: LanguageModelV2Middleware in V3 model config; LLMProvider and getAISDKLanguageModel wrap models via wrapLanguageModel from ai.
    • Supports per-call overrides via model options; override middleware is passed through LLMProvider.getClient, applies only to that call, and is never cached.
    • Works across providers and preserves modelId, doGenerate, and doStream. Tests cover usage capture, streaming, errors, chaining, and edge cases.
  • Migration

    • To use: pass middleware in model config or a per-call override, e.g. { modelName: 'openai/gpt-4o', apiKey: '...', middleware }.
    • Middleware is only applied in direct/local execution and is not serialized over HTTP.
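The migration notes above can be sketched in code. The following is a minimal, self-contained illustration of the middleware shape and pass-through behavior; the real type is `LanguageModelV2Middleware` and the real wrapping is done by `wrapLanguageModel` from the AI SDK, while the tiny harness below only simulates that for clarity (all names here are illustrative, not the project's API):

```typescript
// Hypothetical minimal stand-ins for the AI SDK middleware shape.
type GenerateResult = { text: string; usage: { totalTokens: number } };
type GenerateFn = () => Promise<GenerateResult>;

interface MiddlewareSketch {
  wrapGenerate?: (opts: { doGenerate: GenerateFn }) => Promise<GenerateResult>;
}

// A logging middleware that records token usage without altering the result.
const usageLog: number[] = [];
const loggingMiddleware: MiddlewareSketch = {
  async wrapGenerate({ doGenerate }) {
    const result = await doGenerate();
    usageLog.push(result.usage.totalTokens); // side effect: track usage
    return result;                           // pass the result through unchanged
  },
};

// Simulates how a provider might apply the middleware around a model call.
async function callModel(mw?: MiddlewareSketch): Promise<string> {
  // Stand-in for the underlying provider call.
  const doGenerate: GenerateFn = async () => ({ text: "ok", usage: { totalTokens: 42 } });
  const result = mw?.wrapGenerate
    ? await mw.wrapGenerate({ doGenerate })
    : await doGenerate();
  return result.text;
}

callModel(loggingMiddleware).then((text) => {
  console.log(text, usageLog[0]); // prints "ok 42"
});
```

In the actual config this would be passed as the `middleware` field shown above, e.g. `{ modelName: 'openai/gpt-4o', apiKey: '...', middleware }`.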

Written for commit b09e42c. Summary will update on new commits. Review in cubic


changeset-bot bot commented Mar 23, 2026

🦋 Changeset detected

Latest commit: b09e42c

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 5 packages
Name Type
@browserbasehq/stagehand Patch
@browserbasehq/browse-cli Patch
@browserbasehq/stagehand-evals Patch
@browserbasehq/stagehand-server-v3 Patch
@browserbasehq/stagehand-server-v4 Patch


@tkattkat tkattkat marked this pull request as draft March 23, 2026 18:25

@cubic-dev-ai cubic-dev-ai bot left a comment


2 issues found across 5 files

Confidence score: 3/5

  • In packages/core/lib/v3/v3.ts, per-request model.middleware passed via act/extract/observe overrides is being discarded, so middleware that callers expect to run will silently not execute.
  • The tests in packages/core/tests/unit/llm-middleware.test.ts validate JavaScript semantics (typeof, destructuring) and simulated internals more than app behavior, so they may give false confidence and miss this regression path.
  • Given the concrete runtime impact and high confidence on the middleware issue, this PR carries some merge risk until that path is corrected or covered by behavior-level tests.
  • Pay close attention to packages/core/lib/v3/v3.ts, packages/core/tests/unit/llm-middleware.test.ts - middleware override flow appears broken and current tests may not reliably detect it.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/tests/unit/llm-middleware.test.ts">

<violation number="1" location="packages/core/tests/unit/llm-middleware.test.ts:284">
P2: These tests verify JavaScript language semantics (destructuring and `typeof` operator) rather than application logic. Simulating internal functions provides a false sense of test coverage.</violation>
</file>
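The violation above is asking for a behavior-level test: one that asserts the middleware actually runs when a call goes through the production path, rather than checking `typeof` or destructuring semantics. A hedged sketch of that style of test, using only illustrative names (`makeClient` here is a stand-in, not the project's factory):

```typescript
// Observable side effect to assert on at the behavior level.
let middlewareRan = false;

const middleware = {
  async wrapGenerate({ doGenerate }: { doGenerate: () => Promise<string> }) {
    middlewareRan = true; // proves the middleware executed
    return doGenerate();
  },
};

// Hypothetical stand-in for the production factory that should apply
// the middleware; a real test would call the actual resolver instead.
function makeClient(mw?: typeof middleware) {
  const base = async () => "result";
  return {
    doGenerate: () => (mw ? mw.wrapGenerate({ doGenerate: base }) : base()),
  };
}

async function testMiddlewareRuns(): Promise<void> {
  middlewareRan = false;
  const client = makeClient(middleware);
  const out = await client.doGenerate();
  if (out !== "result") throw new Error("unexpected result");
  if (!middlewareRan) throw new Error("middleware did not run"); // behavior-level assertion
}

testMiddlewareRuns();
```

The key difference from the criticized tests is that the assertion fails if the wiring is broken, not just if JavaScript itself misbehaves.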

<file name="packages/core/lib/v3/v3.ts">

<violation number="1" location="packages/core/lib/v3/v3.ts:452">
P2: Per-request `model.middleware` is silently discarded, so middleware passed via `act/extract/observe` model overrides never runs.</violation>
</file>
Architecture diagram
sequenceDiagram
    participant User
    participant SH as Stagehand (V3)
    participant LP as LLMProvider
    participant AISDK as AI SDK (Internal)
    participant MW as Middleware (User Provided)
    participant Provider as Model Provider (OpenAI/Anthropic/etc.)

    Note over User,Provider: Initialization Flow

    User->>SH: new Stagehand({ model: { modelName, middleware, ... } })
    SH->>SH: CHANGED: resolveModelConfiguration()
    Note right of SH: Extracts middleware from model object
    SH->>LP: NEW: new LLMProvider(logger, middleware)

    Note over User,Provider: Runtime Request Flow (e.g., .act(), .extract())

    SH->>LP: getClient(modelName)
    LP->>AISDK: CHANGED: getAISDKLanguageModel(..., middleware)
    
    AISDK->>AISDK: Create base provider model
    
    alt NEW: Middleware provided
        AISDK->>AISDK: wrapLanguageModel({ model, middleware })
        Note right of AISDK: Applies AI SDK v2 Middleware
    end
    
    AISDK-->>LP: Returns (Wrapped) LanguageModelV2
    LP-->>SH: Returns AISdkClient

    SH->>LP: client.doGenerate(prompt)
    
    alt NEW: Middleware Active
        LP->>MW: wrapGenerate({ doGenerate, model, ... })
        MW->>Provider: doGenerate()
        Provider-->>MW: Raw result + Usage
        MW-->>LP: Result (intercepted/transformed)
    else No Middleware
        LP->>Provider: doGenerate()
        Provider-->>LP: Raw result
    end

    LP-->>SH: Return inference result
    SH-->>User: Return action/data

    Note over MW,Provider: Middleware only applies to local execution.<br/>Not serialized over HTTP/API.

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@tkattkat tkattkat marked this pull request as ready for review March 23, 2026 20:18

@cubic-dev-ai cubic-dev-ai bot left a comment


2 issues found across 5 files

Confidence score: 3/5

  • There is a concrete regression risk in packages/core/lib/v3/v3.ts: per-call middleware cache key collisions can cause one middleware to reuse another middleware’s cached LLM client, leading to incorrect runtime behavior.
  • The issue is high-confidence and user-facing (severity 7/10, confidence 9/10), which moves this PR into moderate merge risk rather than a low-risk merge.
  • The test concern in packages/core/tests/unit/llm-middleware.test.ts is lower severity but important: a tautological test won’t validate the real resolver path, so this bug class may slip through CI.
  • Pay close attention to packages/core/lib/v3/v3.ts and packages/core/tests/unit/llm-middleware.test.ts - fix cache key isolation and add a resolver-driven test to prevent recurrence.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/lib/v3/v3.ts">

<violation number="1" location="packages/core/lib/v3/v3.ts:486">
P1: Per-call middleware cache keys collide, so different middleware functions can reuse the wrong cached LLM client.</violation>
</file>

<file name="packages/core/tests/unit/llm-middleware.test.ts">

<violation number="1" location="packages/core/tests/unit/llm-middleware.test.ts:296">
P2: This test is tautological: it only destructures a local object instead of exercising the actual model-configuration resolver, so it won’t catch regressions in production logic.</violation>
</file>
Architecture diagram
sequenceDiagram
    participant User as User/Caller
    participant V3 as V3 Engine
    participant LP as LLMProvider
    participant SDK as AI SDK (External)
    participant MW as Middleware (User Provided)

    Note over User, MW: Initialization Flow (Global Middleware)

    User->>V3: new V3({ model: { ..., middleware } })
    V3->>V3: resolveModelConfiguration()
    V3->>LP: CHANGED: new LLMProvider(logger, middleware)
    V3->>LP: getClient(modelName, clientOptions)
    LP->>LP: getAISDKLanguageModel()
    LP->>SDK: NEW: wrapLanguageModel({ model, middleware })
    SDK-->>LP: Wrapped Model instance
    LP-->>V3: AISdkClient (using wrapped model)

    Note over User, MW: Execution Flow (Per-Call Override)

    User->>V3: act / extract / observe ({ model: { ..., middleware: overrideMW } })
    V3->>V3: NEW: Check overrideLlmClients cache (incl. middleware state)
    alt Cache Miss
        V3->>LP: getClient(..., { middleware: overrideMW })
        LP->>LP: NEW: effectiveMiddleware = overrideMW ?? globalMiddleware
        LP->>SDK: wrapLanguageModel({ model, effectiveMiddleware })
        SDK-->>LP: Wrapped Model instance
        LP-->>V3: AISdkClient
        V3->>V3: NEW: Cache client by model + options + middleware
    end

    Note over V3, MW: Runtime Interaction

    V3->>LP: llmClient.doGenerate(prompt)
    LP->>MW: NEW: wrapGenerate({ doGenerate, model })
    MW->>MW: Custom logic (Logging / Transformation)
    MW->>SDK: doGenerate()
    SDK-->>MW: Result + Usage Data
    MW-->>LP: Result
    LP-->>V3: Final Response


@miguelg719
Collaborator

@cubic-dev-ai

Contributor

cubic-dev-ai bot commented Mar 26, 2026


@miguelg719 I have started the AI code review. It will take a few minutes to complete.


@cubic-dev-ai cubic-dev-ai bot left a comment


1 issue found across 5 files

Confidence score: 3/5

  • There is a concrete regression risk in packages/core/lib/v3/v3.ts: cached clients in overrideLlmClients are keyed by model name + client options, so calls with different per-call middleware can incorrectly reuse an earlier client.
  • Because this issue is medium severity (6/10) with fairly solid confidence (7/10) and can change runtime behavior across requests, this carries some user-impacting risk if merged as-is.
  • This is likely straightforward to address by aligning cache keying with middleware-sensitive inputs; once fixed, the PR should be in safer shape to merge.
  • Pay close attention to packages/core/lib/v3/v3.ts - client cache reuse may ignore middleware differences and apply the wrong middleware path.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/lib/v3/v3.ts">

<violation number="1" location="packages/core/lib/v3/v3.ts:499">
P2: Per-call middleware is passed to getClient, but overrideLlmClients is still keyed only by modelName + clientOptions. That means a later call with a different middleware can reuse the cached client from the first call, so the override is ignored.</violation>
</file>
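The fix the review suggests, aligning cache keying with middleware-sensitive inputs, can be sketched as follows. Names like `getClient` mirror the review comment, but the implementation is purely illustrative, not the project's code; one plausible approach is to assign each middleware object a stable identity token, since functions and closures cannot be meaningfully serialized into a string key:

```typescript
type Middleware = object;

const clientCache = new Map<string, { id: number }>();
const middlewareIds = new WeakMap<Middleware, number>();
let nextId = 0;
let built = 0;

// Assign each middleware object a stable identity token for the cache key.
function middlewareKey(mw?: Middleware): string {
  if (!mw) return "none";
  if (!middlewareIds.has(mw)) middlewareIds.set(mw, nextId++);
  return `mw:${middlewareIds.get(mw)}`;
}

function getClient(modelName: string, options: Record<string, unknown>, mw?: Middleware) {
  // Include middleware identity in the key so clients built with
  // different middleware never collide in the cache.
  const key = `${modelName}|${JSON.stringify(options)}|${middlewareKey(mw)}`;
  let client = clientCache.get(key);
  if (!client) {
    client = { id: built++ }; // stand-in for building a wrapped client
    clientCache.set(key, client);
  }
  return client;
}

const mwA = {};
const mwB = {};
const a = getClient("openai/gpt-4o", {}, mwA);
const b = getClient("openai/gpt-4o", {}, mwB);
console.log(a === b); // prints false: distinct middleware get distinct clients
```

With this keying, repeated calls with the same middleware still hit the cache, while a later call with different per-call middleware builds its own client instead of reusing the wrong one.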
Architecture diagram
sequenceDiagram
    participant User as Caller/Client
    participant V3 as V3 Orchestrator
    participant Prov as LLMProvider
    participant AI as AI SDK (wrapLanguageModel)
    participant Model as LanguageModelV2

    Note over User, Model: Initialization with Global Middleware
    User->>V3: new Stagehand({ model: { middleware, ... } })
    V3->>V3: resolveModelConfiguration()
    V3->>Prov: NEW: new LLMProvider(logger, middleware)
    
    Note over User, Model: Execution with Per-Call Override
    User->>V3: act / extract / observe({ model: { middleware: perCallMw, ... } })
    
    V3->>V3: CHANGED: Determine effective middleware
    alt Per-call middleware provided
        V3->>V3: Use perCallMw
    else No per-call override
        V3->>V3: Use global middleware
    end

    V3->>Prov: getClient(modelName, options, { middleware })
    Prov->>Prov: getAISDKLanguageModel(..., middleware)
    
    alt Middleware exists
        Prov->>AI: NEW: wrapLanguageModel({ model, middleware })
        AI-->>Prov: Wrapped Model Instance
    else No middleware
        Prov-->>Prov: Raw Model Instance
    end
    
    Prov-->>V3: AISdkClient (with wrapped model)
    
    V3->>Model: doGenerate() / doStream()
    
    Note over Model: Middleware Interception Logic
    opt Middleware Active
        Model->>Model: NEW: wrapGenerate() / wrapStream()
        Note right of Model: Interceptor can log, track usage,<br/>or transform requests
    end

    Model-->>V3: Result (usage, content, etc.)
    V3-->>User: Response

