🦋 Changeset detected

Latest commit: b09e42c

The changes in this PR will be included in the next version bump. This PR includes changesets to release 5 packages.

Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR.
2 issues found across 5 files
Confidence score: 3/5
- In `packages/core/lib/v3/v3.ts`, per-request `model.middleware` passed via `act`/`extract`/`observe` overrides is being discarded, so middleware that callers expect to run will silently not execute.
- The tests in `packages/core/tests/unit/llm-middleware.test.ts` validate JavaScript semantics (`typeof`, destructuring) and simulated internals more than app behavior, so they may give false confidence and miss this regression path.
- Given the concrete runtime impact and high confidence on the middleware issue, this PR carries some merge risk until that path is corrected or covered by behavior-level tests.
- Pay close attention to `packages/core/lib/v3/v3.ts` and `packages/core/tests/unit/llm-middleware.test.ts`: the middleware override flow appears broken and current tests may not reliably detect it.
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/core/tests/unit/llm-middleware.test.ts">
<violation number="1" location="packages/core/tests/unit/llm-middleware.test.ts:284">
P2: These tests verify JavaScript language semantics (destructuring and `typeof` operator) rather than application logic. Simulating internal functions provides a false sense of test coverage.</violation>
</file>
<file name="packages/core/lib/v3/v3.ts">
<violation number="1" location="packages/core/lib/v3/v3.ts:452">
P2: Per-request `model.middleware` is silently discarded, so middleware passed via `act/extract/observe` model overrides never runs.</violation>
</file>
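The discarded-override failure mode in violation 1 can be illustrated with a minimal, self-contained sketch. The names below (`resolveModelConfigBuggy`, `resolveModelConfigFixed`, `ModelConfig`) are hypothetical stand-ins for the resolution path in `v3.ts`, not the actual implementation:

```typescript
// Hypothetical stand-ins for the per-call override resolution in v3.ts.
interface ModelConfig {
  modelName: string;
  middleware?: object[];
}

// Buggy shape: destructuring picks out modelName and silently drops
// middleware, so a per-call override never reaches the client factory.
function resolveModelConfigBuggy(
  globalCfg: ModelConfig,
  override?: ModelConfig,
): ModelConfig {
  const { modelName } = override ?? globalCfg;
  return { modelName };
}

// Fixed shape: per-call middleware wins, falling back to the global config.
function resolveModelConfigFixed(
  globalCfg: ModelConfig,
  override?: ModelConfig,
): ModelConfig {
  return {
    modelName: override?.modelName ?? globalCfg.modelName,
    middleware: override?.middleware ?? globalCfg.middleware,
  };
}
```

A behavior-level test would call the fixed resolver and assert the override's middleware survives, which is exactly the coverage the review says is missing.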
Architecture diagram
```mermaid
sequenceDiagram
participant User
participant SH as Stagehand (V3)
participant LP as LLMProvider
participant AISDK as AI SDK (Internal)
participant MW as Middleware (User Provided)
participant Provider as Model Provider (OpenAI/Anthropic/etc.)
Note over User,Provider: Initialization Flow
User->>SH: new Stagehand({ model: { modelName, middleware, ... } })
SH->>SH: CHANGED: resolveModelConfiguration()
Note right of SH: Extracts middleware from model object
SH->>LP: NEW: new LLMProvider(logger, middleware)
Note over User,Provider: Runtime Request Flow (e.g., .act(), .extract())
SH->>LP: getClient(modelName)
LP->>AISDK: CHANGED: getAISDKLanguageModel(..., middleware)
AISDK->>AISDK: Create base provider model
alt NEW: Middleware provided
AISDK->>AISDK: wrapLanguageModel({ model, middleware })
Note right of AISDK: Applies AI SDK v2 Middleware
end
AISDK-->>LP: Returns (Wrapped) LanguageModelV2
LP-->>SH: Returns AISdkClient
SH->>LP: client.doGenerate(prompt)
alt NEW: Middleware Active
LP->>MW: wrapGenerate({ doGenerate, model, ... })
MW->>Provider: doGenerate()
Provider-->>MW: Raw result + Usage
MW-->>LP: Result (intercepted/transformed)
else No Middleware
LP->>Provider: doGenerate()
Provider-->>LP: Raw result
end
LP-->>SH: Return inference result
SH-->>User: Return action/data
Note over MW,Provider: Middleware only applies to local execution.<br/>Not serialized over HTTP/API.
```
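The interception step in the diagram above can be sketched with simplified types. `wrapModel` and the interfaces below are reduced stand-ins for the AI SDK's `wrapLanguageModel` and `LanguageModelV2Middleware`, not the real API surface:

```typescript
// Simplified stand-ins for LanguageModelV2 and its middleware hook.
interface GenerateResult {
  text: string;
  usage: { totalTokens: number };
}

interface Model {
  doGenerate(prompt: string): Promise<GenerateResult>;
}

interface Middleware {
  wrapGenerate(args: {
    doGenerate: () => Promise<GenerateResult>;
  }): Promise<GenerateResult>;
}

// Reduced analogue of wrapLanguageModel: returns a model whose doGenerate
// routes through the middleware, which decides when to call the inner model.
function wrapModel(model: Model, middleware: Middleware): Model {
  return {
    doGenerate: (prompt) =>
      middleware.wrapGenerate({
        doGenerate: () => model.doGenerate(prompt),
      }),
  };
}

// Example middleware: forwards the call and records token usage.
const capturedUsage: number[] = [];
const usageTracker: Middleware = {
  async wrapGenerate({ doGenerate }) {
    const result = await doGenerate();
    capturedUsage.push(result.usage.totalTokens);
    return result;
  },
};
```

Because the wrapped model has the same `doGenerate` shape as the raw one, call sites like `.act()` and `.extract()` need no changes, which is the point of the feature.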
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
2 issues found across 5 files
Confidence score: 3/5
- There is a concrete regression risk in `packages/core/lib/v3/v3.ts`: per-call middleware cache key collisions can cause one middleware to reuse another middleware's cached LLM client, leading to incorrect runtime behavior.
- The issue is high-confidence and user-facing (severity 7/10, confidence 9/10), which moves this PR into moderate merge risk rather than a low-risk merge.
- The test concern in `packages/core/tests/unit/llm-middleware.test.ts` is lower severity but important: a tautological test won't validate the real resolver path, so this bug class may slip through CI.
- Pay close attention to `packages/core/lib/v3/v3.ts` and `packages/core/tests/unit/llm-middleware.test.ts`: fix cache key isolation and add a resolver-driven test to prevent recurrence.
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/core/lib/v3/v3.ts">
<violation number="1" location="packages/core/lib/v3/v3.ts:486">
P1: Per-call middleware cache keys collide, so different middleware functions can reuse the wrong cached LLM client.</violation>
</file>
<file name="packages/core/tests/unit/llm-middleware.test.ts">
<violation number="1" location="packages/core/tests/unit/llm-middleware.test.ts:296">
P2: This test is tautological: it only destructures a local object instead of exercising the actual model-configuration resolver, so it won’t catch regressions in production logic.</violation>
</file>
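The collision in violation 1 comes down to a cache key that omits the middleware. The names below (`makeKey`, `clientCache`, `getClient`) are illustrative, not the actual `v3.ts` code:

```typescript
// Illustrative client cache keyed only by model name + client options.
interface Client {
  modelName: string;
  middleware?: object;
}

const clientCache = new Map<string, Client>();

function makeKey(modelName: string, options: Record<string, unknown>): string {
  return `${modelName}:${JSON.stringify(options)}`; // middleware not included
}

function getClient(
  modelName: string,
  options: Record<string, unknown>,
  middleware?: object,
): Client {
  const key = makeKey(modelName, options);
  let client = clientCache.get(key);
  if (!client) {
    client = { modelName, middleware };
    clientCache.set(key, client);
  }
  // A later call with the same model and options but different middleware
  // hits the same key and gets the first client back, so its middleware is
  // silently ignored.
  return client;
}
```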
Architecture diagram
```mermaid
sequenceDiagram
participant User as User/Caller
participant V3 as V3 Engine
participant LP as LLMProvider
participant SDK as AI SDK (External)
participant MW as Middleware (User Provided)
Note over User, MW: Initialization Flow (Global Middleware)
User->>V3: new V3({ model: { ..., middleware } })
V3->>V3: resolveModelConfiguration()
V3->>LP: CHANGED: new LLMProvider(logger, middleware)
V3->>LP: getClient(modelName, clientOptions)
LP->>LP: getAISDKLanguageModel()
LP->>SDK: NEW: wrapLanguageModel({ model, middleware })
SDK-->>LP: Wrapped Model instance
LP-->>V3: AISdkClient (using wrapped model)
Note over User, MW: Execution Flow (Per-Call Override)
User->>V3: act / extract / observe ({ model: { ..., middleware: overrideMW } })
V3->>V3: NEW: Check overrideLlmClients cache (incl. middleware state)
alt Cache Miss
V3->>LP: getClient(..., { middleware: overrideMW })
LP->>LP: NEW: effectiveMiddleware = overrideMW ?? globalMiddleware
LP->>SDK: wrapLanguageModel({ model, effectiveMiddleware })
SDK-->>LP: Wrapped Model instance
LP-->>V3: AISdkClient
V3->>V3: NEW: Cache client by model + options + middleware
end
Note over V3, MW: Runtime Interaction
V3->>LP: llmClient.doGenerate(prompt)
LP->>MW: NEW: wrapGenerate({ doGenerate, model })
MW->>MW: Custom logic (Logging / Transformation)
MW->>SDK: doGenerate()
SDK-->>MW: Result + Usage Data
MW-->>LP: Result
LP-->>V3: Final Response
```
@miguelg719 I have started the AI code review. It will take a few minutes to complete.
1 issue found across 5 files
Confidence score: 3/5
- There is a concrete regression risk in `packages/core/lib/v3/v3.ts`: cached clients in `overrideLlmClients` are keyed by model name + client options, so calls with different per-call middleware can incorrectly reuse an earlier client.
- Because this issue is medium severity (6/10) with fairly solid confidence (7/10) and can change runtime behavior across requests, this carries some user-impacting risk if merged as-is.
- This is likely straightforward to address by aligning cache keying with middleware-sensitive inputs; once fixed, the PR should be in safer shape to merge.
- Pay close attention to `packages/core/lib/v3/v3.ts`: client cache reuse may ignore middleware differences and apply the wrong middleware path.
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/core/lib/v3/v3.ts">
<violation number="1" location="packages/core/lib/v3/v3.ts:499">
P2: Per-call middleware is passed to getClient, but overrideLlmClients is still keyed only by modelName + clientOptions. That means a later call with a different middleware can reuse the cached client from the first call, so the override is ignored.</violation>
</file>
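One way to resolve the keying problem, consistent with the behavior the PR summary describes (per-call middleware "is never cached"), is to bypass the cache entirely whenever per-call middleware is present. This is a sketch with illustrative names, not the merged implementation:

```typescript
interface Client {
  modelName: string;
  middleware?: object;
}

const overrideLlmClients = new Map<string, Client>();

function getClient(
  modelName: string,
  options: Record<string, unknown>,
  middleware?: object,
): Client {
  if (middleware) {
    // Per-call middleware: build a fresh client and never cache it, so no
    // key scheme can cause one middleware to leak into another call.
    return { modelName, middleware };
  }
  const key = `${modelName}:${JSON.stringify(options)}`;
  let client = overrideLlmClients.get(key);
  if (!client) {
    client = { modelName };
    overrideLlmClients.set(key, client);
  }
  return client;
}
```

The trade-off is a fresh client per middleware-bearing call; the alternative, folding middleware identity into the cache key, keeps caching but requires a stable way to identify middleware functions.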
Architecture diagram
```mermaid
sequenceDiagram
participant User as Caller/Client
participant V3 as V3 Orchestrator
participant Prov as LLMProvider
participant AI as AI SDK (wrapLanguageModel)
participant Model as LanguageModelV2
Note over User, Model: Initialization with Global Middleware
User->>V3: new Stagehand({ model: { middleware, ... } })
V3->>V3: resolveModelConfiguration()
V3->>Prov: NEW: new LLMProvider(logger, middleware)
Note over User, Model: Execution with Per-Call Override
User->>V3: act / extract / observe({ model: { middleware: perCallMw, ... } })
V3->>V3: CHANGED: Determine effective middleware
alt Per-call middleware provided
V3->>V3: Use perCallMw
else No per-call override
V3->>V3: Use global middleware
end
V3->>Prov: getClient(modelName, options, { middleware })
Prov->>Prov: getAISDKLanguageModel(..., middleware)
alt Middleware exists
Prov->>AI: NEW: wrapLanguageModel({ model, middleware })
AI-->>Prov: Wrapped Model Instance
else No middleware
Prov-->>Prov: Raw Model Instance
end
Prov-->>V3: AISdkClient (with wrapped model)
V3->>Model: doGenerate() / doStream()
Note over Model: Middleware Interception Logic
opt Middleware Active
Model->>Model: NEW: wrapGenerate() / wrapStream()
Note right of Model: Interceptor can log, track usage,<br/>or transform requests
end
Model-->>V3: Result (usage, content, etc.)
V3-->>User: Response
```
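The reviews above repeatedly ask for behavior-level coverage instead of tautological destructuring checks. A sketch of what that looks like: call a resolver and assert on its output. Here `resolveModel` is a hypothetical stand-in for the production model-configuration resolver in `v3.ts`:

```typescript
// Hypothetical stand-in for the production model-configuration resolver.
interface ModelCfg {
  modelName: string;
  middleware?: object[];
}

function resolveModel(globalCfg: ModelCfg, override?: ModelCfg): ModelCfg {
  return {
    modelName: override?.modelName ?? globalCfg.modelName,
    middleware: override?.middleware ?? globalCfg.middleware,
  };
}

// Behavior-level check: exercises the resolver itself, so a regression that
// drops per-call middleware fails this test instead of slipping through.
function testPerCallMiddlewareSurvives(): void {
  const mw = [{}];
  const resolved = resolveModel(
    { modelName: "openai/gpt-4o" },
    { modelName: "openai/gpt-4o", middleware: mw },
  );
  if (resolved.middleware !== mw) {
    throw new Error("per-call middleware was dropped by the resolver");
  }
}
```

A test like this would have caught both reported regressions (the discarded override and the cache reuse), because it asserts on what the production path returns rather than on JavaScript language semantics.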
Summary by cubic
Adds first-class LLM middleware to intercept and wrap model calls for logging, usage tracking, or request transforms without changing call sites. Per-call middleware is used only for that call and is not cached.

New Features

- `middleware: LanguageModelV2Middleware` in `V3` model config; `LLMProvider` and `getAISDKLanguageModel` wrap models via `wrapLanguageModel` from `ai`.
- Per-call `model` options; override middleware is passed through `LLMProvider.getClient`, applies only to that call, and is never cached.
- `modelId`, `doGenerate`, and `doStream`. Tests cover usage capture, streaming, errors, chaining, and edge cases.

Migration

- `middleware` in `model` config or a per-call override, e.g. `{ modelName: 'openai/gpt-4o', apiKey: '...', middleware }`.

Written for commit b09e42c. Summary will update on new commits. Review in cubic.