Add mastra auto-instrumentation by Qard · Pull Request #1901 · braintrustdata/braintrust-sdk-javascript

Stephen Belanger (Qard) · 2026-04-23T22:34:26Z

Fixes #1890

Summary

add Mastra auto-instrumentation configs and wire them into the hook, bundler plugin, webpack loader, and public exports
add Mastra tracing channels and a runtime plugin for agent, tool, workflow run, and workflow step spans with current-span context binding
add focused Mastra tests and an e2e scenario covering agent generate/stream, Tool.execute, workflow runs, and workflow-step tool nesting

Testing

pnpm --dir js exec vitest run src/auto-instrumentations/configs/mastra.test.ts src/instrumentation/plugins/mastra-plugin.test.ts src/instrumentation/registry.test.ts
pnpm run test:e2e -- mastra-instrumentation

Notes

pnpm --dir js test -- mastra and pnpm --dir js test -- auto-instrumentations still hit unrelated existing network-dependent failures in src/framework.test.ts and src/logger-misc.test.ts
pnpm run formatting still reports pre-existing unformatted fixture files under js/tests/auto-instrumentations/fixtures/test-files/

Luca Forstner (lforst)

Can we also add the mastra e2e test scenario to the ci summary?

Luca Forstner (lforst)

In the e2e test summary it looks like we don't capture LLM spans.

Do you think that is an issue?

stale

…iterals (#2036) ## Summary - `areInterfaceSignaturesCompatible` was flagging adding an optional property to an inline object literal type (e.g. `mastra?: boolean` on `InstrumentationConfig.integrations`) as a breaking change. - Root cause: it compared each property's *type* as a string, then only re-checked via `isUnionTypeWidening` when the strings differed. Inline object literals aren't unions, so any new field tipped it into "modified" territory. - Fix: delegate to the existing `areObjectTypeDefinitionsCompatible` helper when both sides are inline object literals, mirroring the pattern already in `areTypeAliasSignaturesCompatible`. Also apply the same delegation inside `areObjectTypeDefinitionsCompatible` itself so nested object literals are handled the same way. - Lifted the inline `isObjectType` helper to a module-scope `isObjectLiteralType` so the interface, object-literal, and intersection comparators share one definition. After this lands, the `js-api-compatibility (20)` check on PRs that add optional integration keys (#1891, #1897, #1901, future SDK integrations) will pass instead of failing informationally. ## Test plan - [x] Added 4 regression tests in `describe("areInterfaceSignaturesCompatible", ...)`: - optional property added to inline object literal (the `InstrumentationConfig` case) → passes - required property added → still rejected - property removed → still rejected - optional property added to deeply nested inline object literal → passes - [x] `pnpm exec vitest run tests/api-compatibility/api-compatibility.test.ts` — all 14 interface-compat tests pass plus the rest of the file - [x] `pnpm run fix:formatting` and `pnpm run lint` clean #skip-changeset — this is a test-only change with no impact on the published package. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Stephen Belanger (Qard) · 2026-05-25T04:09:20Z

Luca Forstner (@lforst) Getting some provider limit failures, but otherwise should be good for another review.

Luca Forstner (lforst) · 2026-05-26T08:43:10Z

@@ -0,0 +1,268 @@
+/**


This is the most important feedback.

This goes more into the direction I find good. However, I find this approach still quite sketchy. I would find this acceptable if we were in a hurry to get this out asap, which is not the case.

I honestly think we should invest the time to use require-in-the-middle and import-in-the-middle whenever we are targeting top-level-exports (ie. stable API) which seems to be the case here. That way we don't need sketchy custom one-off instrumentation code for all packages.

IITM would be substantially worse. It's very unreliable and has been known-broken with half the major AI SDKs for a while now because there are a bunch of design issues which are simply not addressable via basic ESM loaders wrapping. I went with orchestrion in the first place because most of the modules we instrument would crash IITM. As for RITM, it's not particularly useful at this point either given many of the AI SDKs are ESM-only now.

There's talk of rewriting the two from scratch on the newer sync loader hooks API, but no one has really been putting any work into that. Unless we want to take on rewriting RITM + IITM ourselves, I don't think there's a viable path there yet.

IITM would be substantially worse.

Can why it would be substantially worse for this particular case?

It's very unreliable

From experience, IITM only has problems when there are actual problems. I have not seen it be unreliable. Can you elaborate what you mean by it being unreliable?

Its only downside that I know of you have already mentioned, which is that it does not work for very specific packages if they export things in a certain dodgy way.

Right now looking at the PR, probabilistically extracting a chunk seems more unreliable (ie. less future-proof) than using IITM to patch public API (which tends to be stable). To me, unless IITM/RITM are broken for Mastra, it seems like the best tool for the job.

When I initially built the auto-instrumentation a few months ago IITM was the first thing I tried because, despite knowing it had some problems, it overall had more adoption just from being around longer. But I quickly found that, since we were starting from scratch, building for the future seemed a better approach.

With IITM, both OpenAI and AI SDK crashed the process on load because of syntax support issues. It's also known to have issues with patching only applying externally and so code which calls other functions from the same file often don't work. Also IITM wraps literally every module, even ones not being patched, in proxies and injects a bunch of setter registration stuff which adds a ton of overhead to every access. That's not only a big performance hit but also a significant stability risk given its poor track record with module parsing, which it had to do whether or not it was actually patching the module or not, since it would not know the patches to apply until after the load due to the ordering of things.

IITM itself is much more hacky than what we're doing here. Yes, there's a bit of jank in what we're doing here, but we can smooth that over as our AST-based patching matures. Our options are: working but with a bit of our own jank to smooth out, and aligning with the direction of the segment leaders, or often broken and concealing a ton of jank within which provides a significant unaddressable risk. As one of the people responsible for creating IITM in the first place, I don't think the tradeoffs are worth it.

The core architecture of IITM trying to treat modules as load-time-patchable like RITM simply does not fit the reality that ESM just doesn't work the way CJS does, and so it accumulated a ton of hacks to try and work around the numerous sharp edges to the point that it's now even more of a mess than when it started. We built it at Datadog and then abandoned it not too long after handing it off to the Node.js foundation because it was just more trouble than it was worth. Datadog created Orchestrion shortly after specifically because of this failure, and now there's several vendors migrating to that instead. Using IITM at this point feels to me like a step backwards.

Also IITM wraps literally every module, even ones not being patched

I believe this no longer has to be the case since createAddHookMessageChannel was added: https://github.com/nodejs/import-in-the-middle#only-intercepting-hooked-modules I have had a nice experience using it in the past.

I also don't have anything against using orchestrion. It covers the cases that IITM/RITM can't. However, I don't understand you mentioning it because we are not using it in this PR.

In any case, I feel like I won't convince you on this so feel free to proceed.

Replaces the chunk-AST instrumentation with a Braintrust ObservabilityExporter that the loader auto-installs into every `new Mastra(...)`. Survives Mastra's content-hashed chunk renames release-to-release because the loader only patches the stable `dist/index.{js,cjs}` entrypoints — chunk paths are read out of the original source at load time, not pinned in a version table. Two integration paths: - Auto: under `node --import braintrust/hook.mjs`, the loader rewrites Mastra's entry into a Proxy-wrapped class that auto-constructs an `Observability` instance if none was provided, and appends a Proxy wrap to `@mastra/observability`'s entry so the constructor injects our exporter into every config. - Manual: `import { BraintrustObservabilityExporter } from "braintrust";` and pass it to `new Mastra({ observability: new Observability({...}) })`. Requires `@mastra/core >= 1.20.0` for the auto path; manual integration works on any Mastra version that accepts an `ObservabilityExporter`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Stephen Belanger (Qard) requested review from Abhijeet Prasad (AbhiPrasad) and Luca Forstner (lforst) April 23, 2026 22:34

Stephen Belanger (Qard) self-assigned this Apr 23, 2026

Luca Forstner (lforst) previously approved these changes Apr 24, 2026

View reviewed changes

Stephen Belanger (Qard) force-pushed the mastra-auto-instrumentation branch from 37f8aaa to 43c826c Compare May 6, 2026 17:11

Stephen Belanger (Qard) requested a review from Luca Forstner (lforst) May 6, 2026 20:07

Stephen Belanger (Qard) force-pushed the mastra-auto-instrumentation branch from 443f70c to 8d58469 Compare May 6, 2026 20:13

Luca Forstner (lforst) reviewed May 6, 2026

View reviewed changes

Stephen Belanger (Qard) force-pushed the mastra-auto-instrumentation branch from c01e4fd to ffcc4ea Compare May 7, 2026 16:01

Stephen Belanger (Qard) force-pushed the mastra-auto-instrumentation branch 3 times, most recently from b59c93e to 282b413 Compare May 21, 2026 07:21

Luca Forstner (lforst) reviewed May 21, 2026

View reviewed changes

Stephen Belanger (Qard) mentioned this pull request May 21, 2026

fix(api-compat): allow optional property additions to inline object literals #2036

Merged

3 tasks

Stephen Belanger (Qard) force-pushed the mastra-auto-instrumentation branch 2 times, most recently from f7f5fbe to ea0197d Compare May 25, 2026 04:02

Stephen Belanger (Qard) requested a review from Luca Forstner (lforst) May 25, 2026 04:08

Luca Forstner (lforst) reviewed May 26, 2026

View reviewed changes

Stephen Belanger (Qard) force-pushed the mastra-auto-instrumentation branch 3 times, most recently from 4a5d3fb to 7068c6a Compare May 26, 2026 15:06

Luca Forstner (lforst) approved these changes May 27, 2026

View reviewed changes

Comment thread js/src/auto-instrumentations/configs/all.ts Outdated

Stephen Belanger (Qard) force-pushed the mastra-auto-instrumentation branch from 7068c6a to 9aa07c2 Compare May 27, 2026 13:35

Conversation

Stephen Belanger (Qard) commented Apr 23, 2026

Summary

Testing

Notes

Uh oh!

Luca Forstner (lforst) left a comment

Choose a reason for hiding this comment

Uh oh!

Luca Forstner (lforst) left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Stephen Belanger (Qard) commented May 25, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Luca Forstner (lforst) May 26, 2026

Choose a reason for hiding this comment

Uh oh!

Stephen Belanger (Qard) May 26, 2026

Choose a reason for hiding this comment

Uh oh!

Luca Forstner (lforst) May 26, 2026

Choose a reason for hiding this comment

Uh oh!

Stephen Belanger (Qard) May 27, 2026

Choose a reason for hiding this comment

Uh oh!

Luca Forstner (lforst) May 27, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants