Skip to content

fix(agentic-tools): return binary HTTP responses as base64 envelope#36242

Open
fmontes wants to merge 5 commits into
mainfrom
fmontes/fix-response-from-mcp-server
Open

fix(agentic-tools): return binary HTTP responses as base64 envelope#36242
fmontes wants to merge 5 commits into
mainfrom
fmontes/fix-response-from-mcp-server

Conversation

@fmontes

@fmontes fmontes commented Jun 19, 2026

Copy link
Copy Markdown
Member

Proposed Changes

Fixes the MCP agentic-tools adapter corrupting binary HTTP responses (libs/agentic-tools/src/lib/http-client.ts). response.text() UTF-8-decoded every non-JSON body, replacing invalid bytes with U+FFFD — so any binary file asset (images, fonts, .ico) pulled via MCP api.request came back irreversibly corrupted. The dotCMS endpoints were already correct; the bug was entirely in the adapter.

  • Response parsing now branches by content-type:
    • application/json.json()
    • textual types (text/*, application/xml, application/javascript, +json/+xml, application/x-www-form-urlencoded) → .text()
    • everything else → response.arrayBuffer() returned as a base64 envelope { __dotcmsBinary: true, contentType, base64, byteLength }, so the raw bytes survive JSON.stringify across the sandbox boundary in execute.ts.
  • Error responses are read as text regardless of content-type, so dotCMS HTML/text error messages are preserved (the previous error path assumed string/JSON).
  • 25MB size cap on the binary path (MAX_BINARY_RESPONSE_BYTES), matching the existing MAX_REMOTE_FILE_BYTES on the upload side.
  • New responseType?: 'auto' | 'base64' option on RequestOptions to force the binary path regardless of declared content-type.
  • Exported BinaryResponseEnvelope type + isBinaryResponseEnvelope() guard so consumers can detect and base64-decode the body.

Checklist

  • Tests — new http-client.spec.ts with a real 1×1 PNG fixture asserting the round-tripped bytes are byte-exact to the source (the regression guard), plus JSON / textual / +json / forced-base64 / error-as-text cases. All 6 pass; lint clean.
  • Translations — N/A
  • Security Implications Contemplated — size cap guards against memory exhaustion; auth/SSRF handling unchanged.

Additional Info

The fix mirrors the existing upload path (Buffer.from(arrayBuffer).toString('base64')). execute.ts was not changed — its JSON.stringify boundary now serializes the envelope correctly; consumers can import isBinaryResponseEnvelope to decode when ready.

Fixes #36241

🤖 Generated with Claude Code

response.text() UTF-8-decoded every non-JSON body, corrupting binary
file assets (images, fonts, ico) into U+FFFD. Parse JSON via .json(),
textual content types as strings, and everything else as a base64
{ __dotcmsBinary, contentType, base64, byteLength } envelope so bytes
survive JSON.stringify across the sandbox boundary. Error responses are
read as text so server messages are preserved, with a 25MB cap on the
binary path. Adds a PNG round-trip regression test.

Fixes #36241

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the Area : Frontend PR changes Angular/TypeScript frontend code label Jun 19, 2026
@claude

claude Bot commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Claude finished @fmontes's task in 1m 55s —— View job


Rollback Safety Analysis

  • Read rollback-unsafe categories reference
  • Get full PR diff
  • Analyze diff against all unsafe categories
  • Apply appropriate label

Result: ✅ Safe To Rollback

All 5 changed files are TypeScript frontend code in core-web/libs/agentic-tools and core-web/apps/mcp-server. The changes do not touch any of the rollback-unsafe categories:

Category checked Verdict
C-1 Structural DB model change ✅ No DB changes
C-2 Elasticsearch mapping change ✅ No ES code
C-3 Content JSON model version bump ✅ No CURRENT_MODEL_VERSION change
C-4 DROP TABLE / DROP COLUMN ✅ No SQL
H-1 One-way data migration ✅ No data migration
H-2 RENAME TABLE / COLUMN ✅ No SQL
H-3 PK restructuring ✅ No SQL
H-4 New ContentType field type ✅ No Java field types
H-5 Storage provider change ✅ No storage provider change
H-6 DROP PROCEDURE / FUNCTION ✅ No SQL
H-7 NOT NULL column without default ✅ No SQL
H-8 VTL viewtool contract change ✅ No VTL / Java viewtool changes
M-1 Column type change ✅ No SQL
M-2 Push publishing bundle format ✅ No bundle XML
M-3 REST/GraphQL API contract change ✅ No REST endpoint changes
M-4 OSGi plugin API breakage ✅ No OSGi interface changes

The change is purely an in-process fix to the MCP adapter's response-parsing logic. Rolling back to N-1 simply reverts to the old response.text() behaviour (the binary-corruption bug). No data, schema, or API contract is affected.

Label applied: AI: Safe To Rollback

@github-actions

github-actions Bot commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

🤖 Bedrock Review — deepseek.v3.2

[🟡 Medium] core-web/libs/agentic-tools/src/lib/http-client.ts:305 — The responseType option is documented but not validated in the request method. If an invalid string is passed (e.g., 'raw'), it will default to undefined and be treated as 'auto', which may be fine, but the behavior is implicit. Consider adding a runtime check or type narrowing.

[🟡 Medium] core-web/libs/agentic-tools/src/lib/http-client.ts:94 — The isTextualContentType function includes +json and +xml but does not account for potential extra parameters (e.g., charset). The check ct.includes('+json') could match application/foo+json; charset=utf-8, which is acceptable, but note that includes is case-sensitive; content-type headers are lowercased, so this is safe.

[🟡 Medium] core-web/libs/agentic-tools/src/lib/http-client.ts:110 — In readBinaryResponse, Number(response.headers.get('content-length')) can return NaN if the header is not a number. Number.isFinite(NaN) is false, so the early size check will be skipped, which is fine, but the variable name declaredLength could be misleading. Consider renaming to contentLengthValue and adding a comment.

[🟠 High] core-web/libs/agentic-tools/src/lib/http-client.ts:124Buffer.from(buffer) creates a new Buffer from an ArrayBuffer. This works, but in a Node.js environment, Buffer.from(buffer) may create an unnecessary copy. Consider using Buffer.from(buffer) directly (it's fine) or Buffer.from(buffer, 0, buffer.byteLength) for clarity. No functional issue.

[🟡 Medium] core-web/libs/agentic-tools/src/lib/http-client.spec.ts:52 — The makeResponse stub's arrayBuffer method returns buffer.buffer.slice(...). This returns an ArrayBuffer, but the slice may not be correct if the Buffer is a view on a larger ArrayBuffer. In tests, the Buffer is created directly, so it's safe, but the implementation is fragile. Consider buffer.buffer.slice(buffer.byteOffset, buffer.byteOffset + buffer.byteLength) is already used, which is correct.

[🟡 Medium] core-web/libs/agentic-tools/src/lib/http-client.spec.ts:165 — The test for oversized binary response mocks arrayBuffer but does not simulate the early rejection via Content-Length. The test checks that arrayBuffer is not called, which is correct, but the mock fetch returns a Response-like object with a headers.get that returns the oversized string. This is fine.

[🟡 Medium] core-web/libs/agentic-tools/src/lib/http-client.spec.ts:188 — The isBinaryResponseEnvelope test for missing contentType or byteLength uses two separate assertions, but the second assertion (isBinaryResponseEnvelope({ __dotcmsBinary: true, base64: 'AA==', contentType: 'image/png' })) is missing byteLength. The test expects false, which is correct, but the description says "missing contentType or byteLength" — the second case is missing byteLength, so the test is valid.

[🟡 Medium] core-web/libs/agentic-tools/tsconfig.lib.json:12 — Adding exclude for *.spec.ts and *.test.ts is a build configuration change. Ensure this aligns with the project's existing conventions for library builds. No functional issue.

[🟡 Medium] core-web/apps/mcp-server/src/tools/execute.ts:43 — The documentation update adds a note about binary responses. This is a documentation change only, so no code issues.

[🟡 Medium] core-web/libs/agentic-tools/src/index.ts:5 — Exporting isBinaryResponseEnvelope and BinaryResponseEnvelope type is fine, but ensure there are no missing exports for other types. No issues.

The PR is clean with minor considerations around type safety and test robustness.


Run: #27829885903 · tokens: in: 4814 · out: 923 · total: 5737

Tell the model that binary file-asset endpoints (e.g. /api/v2/assets,
/dA) return a { __dotcmsBinary, contentType, base64, byteLength }
envelope whose base64 is the raw bytes to decode — not text.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes binary HTTP response corruption in the agentic-tools MCP adapter by avoiding UTF-8 .text() decoding for non-textual payloads, returning a base64 “binary envelope” instead so bytes survive the sandbox JSON.stringify boundary.

Changes:

  • Added content-type–based response decoding with a base64 BinaryResponseEnvelope path for non-textual bodies (plus responseType: 'base64' to force it).
  • Updated error handling to always read non-OK responses as text for readable server error messages.
  • Added Jest regression coverage including a real PNG byte-exact round-trip assertion, and exported envelope helpers from the package entrypoint.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
core-web/libs/agentic-tools/src/lib/http-client.ts Implements binary-safe response parsing via base64 envelope + type guard, with a binary-size cap.
core-web/libs/agentic-tools/src/lib/http-client.spec.ts Adds regression and behavior tests for JSON/text/+json/forced-base64/error-as-text cases.
core-web/libs/agentic-tools/src/index.ts Exposes BinaryResponseEnvelope and isBinaryResponseEnvelope to consumers.

Comment thread core-web/libs/agentic-tools/src/lib/http-client.ts
Comment thread core-web/libs/agentic-tools/src/lib/http-client.ts
Comment thread core-web/libs/agentic-tools/src/lib/http-client.ts
Address PR review:
- isBinaryResponseEnvelope now also validates contentType (string) and
  byteLength (number), so a partial object can't pass the guard.
- readBinaryResponse rejects early via Content-Length before buffering
  the body, so an oversized response can't OOM before the cap fires; the
  post-read check stays as the backstop for absent/lying headers.
- Tests for both: oversized Content-Length rejection (asserts the body
  is never buffered) and malformed-envelope guard cases.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@fmontes

fmontes commented Jun 19, 2026

Copy link
Copy Markdown
Member Author

Thanks for the review. Addressed in 92fca06:

  • [🟠 High] type guard missing fields (also raised by Copilot) — isBinaryResponseEnvelope now validates contentType and byteLength, with tests for the partial-object cases.
  • [🟡 Medium] +json/+xml suffix match — leaving as-is for now. .includes('+json') is broad in theory, but media types like application/octet-stream+json don't exist in practice and the worst case is a structured-text body returned as a string rather than corrupted bytes (the bug this PR fixes). Can tighten to an end-anchored suffix check in a follow-up if desired.

Also hardened the binary size cap to reject early via Content-Length before buffering (per Copilot's review).

The @nx/js:tsc build compiled src/**/*.spec.ts into the library because
tsconfig.lib.json had no exclude — it lacked jest/test-runner types and
broke `nx build agentic-tools`. This never surfaced before because the
lib had no spec files until this PR added http-client.spec.ts. Exclude
*.spec.ts / *.test.ts from the lib build (they still compile under
tsconfig.spec.json for the test target).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI: Safe To Rollback Area : Frontend PR changes Angular/TypeScript frontend code

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

MCP agentic-tools adapter corrupts binary HTTP responses

2 participants