Summary
The self-hosted agent_toolset_20260401 read tool (src/anthropic/lib/tools/agent_toolset.py, beta_read_tool) decodes every file as UTF-8 via target.read_text(). Reading a binary file (image or PDF) raises an uncaught UnicodeDecodeError, surfaced to the model as a raw tool error — even though the tool-result contract already supports image/document content blocks.
So a Managed Agents agent running in a self-hosted environment (client.beta.environments.work.worker(...).handle_item() / SessionToolRunner) cannot read an image or PDF. This bites the document skills (docx/pdf/pptx/xlsx) that render slides/pages to images for visual QA — every such read fails. (The hosted product and Claude Code's Read both handle images, so this is specific to the open-source self-hosted toolset.)
Version
anthropic==0.103.1. Still present on main / v0.105.2 — beta_read_tool is read_text()-only there too. I couldn't find an existing issue for it.
Repro
- Self-hosted CMA agent with the agent_toolset_20260401 toolset (a StandardSandbox / agent_toolset_20260401 tool).
- The agent creates or has an image, e.g. /workspace/slide-1.jpg.
- The agent calls read(file_path="/workspace/slide-1.jpg").
- The tool raises (surfaced as the tool result):
UnicodeDecodeError('utf-8', b'\xff\xd8\xff\xe0...', 0, 1, 'invalid start byte')
(\xff\xd8 is the JPEG SOI marker.) The same happens for PDFs (%PDF).
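The decode failure can be reproduced without the SDK at all. The following is a minimal sketch, assuming the tool's read boils down to a UTF-8 `read_text()` as described in this report; `try_read_as_text` and the temporary file are illustrative names, not part of the toolset.

```python
import tempfile
from pathlib import Path

# First bytes of a JFIF JPEG: SOI marker (FF D8) followed by an APP0 marker (FF E0).
JPEG_PREFIX = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"


def try_read_as_text(path: Path) -> str:
    """Decode a file as UTF-8 and report the outcome as a short string."""
    try:
        return path.read_text(encoding="utf-8")
    except UnicodeDecodeError as exc:
        return f"{type(exc).__name__}: {exc.reason} at byte {exc.start}"


with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "slide-1.jpg"
    image.write_bytes(JPEG_PREFIX)
    outcome = try_read_as_text(image)
    print(outcome)  # UnicodeDecodeError: invalid start byte at byte 0
```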
Root cause
beta_read_tool's inner read:
text = target.read_text() # UTF-8; raises UnicodeDecodeError on binary
Only ToolError and OSError are caught; UnicodeDecodeError (a ValueError) propagates uncaught.
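To illustrate why the existing handler misses it, here is a small sketch. `ToolError` below is a hypothetical stand-in class (only the name comes from the SDK), and `read_like_the_tool` merely imitates the handler shape described above; neither is the SDK's actual code.

```python
class ToolError(Exception):
    """Hypothetical stand-in for the SDK's ToolError."""


def read_like_the_tool(raw: bytes) -> str:
    """Imitate the described handler: only ToolError and OSError are caught."""
    try:
        return raw.decode("utf-8")
    except (ToolError, OSError) as exc:
        return f"handled: {exc}"


# UnicodeDecodeError derives from ValueError, not OSError,
# so the except clause above never sees it.
assert issubclass(UnicodeDecodeError, ValueError)
assert not issubclass(UnicodeDecodeError, OSError)

try:
    read_like_the_tool(b"\xff\xd8\xff\xe0")
    escaped = False
except UnicodeDecodeError:
    escaped = True
print(escaped)  # True
```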
Why this is straightforward to fix
The tool-result type already supports content blocks, and the runner forwards them:
- BetaFunctionToolResultType = Union[str, Iterable[BetaContent]] (_beta_functions.py)
- ToolError's own docstring shows an image block example
- _beta_session_runner._to_session_content already forwards image / document / search_result blocks through to the session
So read can simply return an image/document block for binary files.
Suggested fix
Detect image/PDF files (by extension and/or magic bytes) and return a base64 content block instead of decoding as text:
# image/jpeg|png|gif|webp -> "image"; application/pdf -> "document"
media = _binary_media_type(target)
if media is not None:
    data = base64.standard_b64encode(target.read_bytes()).decode("ascii")
    kind = "document" if media == "application/pdf" else "image"
    return [{"type": kind, "source": {"type": "base64", "media_type": media, "data": data}}]
text = target.read_text()
...
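The snippet above leaves `_binary_media_type` undefined. One possible shape for it, using magic bytes only, is sketched below. The helper names `sniff_media_type` and `to_content_block` are hypothetical, and the prefix table is an assumption about which formats to accept, not something taken from the SDK.

```python
import base64
from pathlib import Path
from typing import Optional

# (magic prefix, media type) pairs, checked against the first bytes of the file.
_MAGIC_PREFIXES = (
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
)


def sniff_media_type(head: bytes) -> Optional[str]:
    """Map the first bytes of a file to a media type, or None if unrecognized."""
    for prefix, media in _MAGIC_PREFIXES:
        if head.startswith(prefix):
            return media
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "image/webp"
    return None


def _binary_media_type(target: Path) -> Optional[str]:
    """Hypothetical helper from the snippet above: sniff the first 12 bytes."""
    with target.open("rb") as fh:
        return sniff_media_type(fh.read(12))


def to_content_block(media: str, raw: bytes) -> dict:
    """Build the base64 block shape used in the suggested fix."""
    kind = "document" if media == "application/pdf" else "image"
    data = base64.standard_b64encode(raw).decode("ascii")
    return {"type": kind, "source": {"type": "base64", "media_type": media, "data": data}}
```

Sniffing content rather than trusting the extension means a mislabeled file (say, a PNG saved as .txt) still avoids the text decode path, at the cost of one small extra read.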
A separate (larger) size cap for binary makes sense, since images routinely exceed the 256 KB READ_MAX_BYTES text cap. Happy to open a PR if useful.
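A rough sketch of what a split cap might look like follows. `BINARY_READ_MAX_BYTES` and its 5 MiB value are hypothetical, and `READ_MAX_BYTES` is set here to 256 KiB based only on the figure quoted above, not on the SDK source. Note that base64 inflates the payload by roughly a third, so a cap on raw bytes and a cap on the encoded block are not the same number.

```python
import base64

READ_MAX_BYTES = 256 * 1024  # text cap as quoted in this issue (assumed 256 KiB)
BINARY_READ_MAX_BYTES = 5 * 1024 * 1024  # hypothetical binary cap, not an SDK constant


def base64_len(raw_len: int) -> int:
    """Length of the padded base64 encoding of raw_len bytes."""
    return 4 * ((raw_len + 2) // 3)


def within_cap(raw_len: int, is_binary: bool) -> bool:
    """Check a file size against the cap for its kind."""
    limit = BINARY_READ_MAX_BYTES if is_binary else READ_MAX_BYTES
    return raw_len <= limit


size = 300 * 1024  # a 300 KiB rendered slide
print(within_cap(size, is_binary=False), within_cap(size, is_binary=True))
print(base64_len(size))  # size of the encoded payload in bytes
```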
Interim workaround
For anyone else self-hosting CMA: we monkeypatch agent_toolset.beta_read_tool in our worker before handle_item() to do exactly the above (binary → base64 image/document block, text → delegate to the original read). Restores visual QA of rendered output for the document skills.
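For reference, the wrapping logic is roughly the following sketch. It deliberately does not show the patching line itself, because that depends on how `beta_read_tool` is defined in the installed SDK version. `wrap_read` is a hypothetical name, the `original_read` callable is assumed to take a path string and return text, and detection here is by extension only.

```python
import base64
from pathlib import Path
from typing import Callable, Union

ReadResult = Union[str, list]

# Extension-to-media-type table; an assumption about which formats to pass through.
_EXT_MEDIA = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
    ".pdf": "application/pdf",
}


def wrap_read(original_read: Callable[[str], str]) -> Callable[[str], ReadResult]:
    """Return a read function that sends binary files down a base64 path."""

    def read(file_path: str) -> ReadResult:
        media = _EXT_MEDIA.get(Path(file_path).suffix.lower())
        if media is None:
            # Not a known binary type: delegate to the original text read.
            return original_read(file_path)
        raw = Path(file_path).read_bytes()
        kind = "document" if media == "application/pdf" else "image"
        data = base64.standard_b64encode(raw).decode("ascii")
        return [{"type": kind, "source": {"type": "base64", "media_type": media, "data": data}}]

    return read
```

Because the wrapper only intercepts known extensions, every other path still goes through the original read, including its existing error handling and size cap.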