
fix: Content-filter proxy + MCP servers for OpenCode on Databricks #56

Merged

datasciencemonkey merged 21 commits into main from fix/litellm-empty-content-blocks on Mar 11, 2026

Conversation

@datasciencemonkey (Owner) commented Mar 11, 2026

Summary

  • Adds a content-filter proxy running on localhost:4000 inside the container that sanitizes requests/responses between OpenCode and Databricks AI Gateway
  • Fixes multiple OpenCode bugs that cause "Bad Request" errors with Databricks Foundation Models
  • Adds DeepWiki and Exa MCP server support for OpenCode
  • Scope: Databricks Claude models only — Gemini and GPT/Codex have additional compatibility issues tracked separately

What the proxy fixes

OpenCode → localhost:4000 (proxy) → Databricks AI Gateway → Claude

Request-side (sanitize what OpenCode sends):

  • Strip empty/whitespace-only text content blocks (OpenCode #5028)
  • Strip orphaned tool_result blocks with no matching tool_use (multi-pass, cascading)
  • Replace empty assistant message content with placeholder
  • Strip $schema, additionalProperties from tool parameter definitions
  • Strip stream_options from requests
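
For illustration, a minimal sketch of the empty-block stripping, assuming OpenAI-style message dicts (function name is hypothetical; the real logic lives in content_filter_proxy.py and also covers the orphaned-tool_result and schema cases sketched later in this thread):

```python
def sanitize_messages(messages: list[dict]) -> list[dict]:
    """Drop empty text blocks and patch empty assistant content."""
    cleaned = []
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list):
            # Strip empty/whitespace-only text blocks (OpenCode #5028).
            content = [
                block for block in content
                if not (block.get("type") == "text"
                        and not str(block.get("text", "")).strip())
            ]
            msg = {**msg, "content": content}
        if (msg.get("role") == "assistant"
                and not msg.get("content")
                and not msg.get("tool_calls")):
            # The API rejects empty assistant content outright;
            # "." is the minimal valid placeholder.
            msg = {**msg, "content": "."}
        if msg.get("content") or msg.get("tool_calls"):
            cleaned.append(msg)  # drop messages left empty by filtering
    return cleaned
```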

Response-side (fix what Databricks returns):

  • Remap databricks-tool-call back to real tool names
  • Fix finish_reason: "stop" → "tool_calls" when tools are invoked
  • SSE stream parsing with buffered tool name resolution
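
Sketched for the non-streaming case (helper names are hypothetical; the SSE path additionally buffers chunks until the real tool name resolves before re-emitting):

```python
def fix_response(resp: dict, real_name_for: dict[str, str]) -> dict:
    """Remap placeholder tool names and correct finish_reason.

    real_name_for is a hypothetical lookup built from the request's
    tool definitions; how the real name is recovered is elided here.
    """
    for choice in resp.get("choices", []):
        tool_calls = (choice.get("message") or {}).get("tool_calls") or []
        for call in tool_calls:
            fn = call.get("function", {})
            if fn.get("name") == "databricks-tool-call":
                fn["name"] = real_name_for.get(call.get("id"), fn["name"])
        if tool_calls and choice.get("finish_reason") == "stop":
            # OpenCode only executes tools when finish_reason says so.
            choice["finish_reason"] = "tool_calls"
    return resp
```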

Root cause analysis

The bugs trace through three files in OpenCode's core: processor.ts, message-v2.ts, and transform.ts (walked through in the root-cause comment below).

Why not PR #52's fork approach

PR #52 forks OpenCode to add a native Databricks provider. After analysis:

  • Doesn't fix the root cause — no commits sanitize empty content blocks
  • Fork maintenance burden — must track upstream indefinitely
  • Supply chain risk — personal GitHub fork built from source at deploy time vs. community-scrutinized npm package
  • Scope creep — bundles fork + spawner app + GitHub CLI + perf fixes

Valuable parts of PR #52 filed as separate issues: #53, #54, #55.

MCP servers

Added remote MCP servers to OpenCode config:

  • DeepWiki — AI-powered documentation for any GitHub repository
  • Exa — Web search and code context retrieval
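
The entries written by setup_opencode.py are roughly of this shape (shown as the Python dict the setup script serializes; exact schema keys follow OpenCode's remote-MCP config and the URLs are the servers' published endpoints, both assumptions here):

```python
# Remote MCP servers added to both the gateway and fallback configs.
mcp_servers = {
    "deepwiki": {
        "type": "remote",
        "url": "https://mcp.deepwiki.com/mcp",  # StreamableHTTP endpoint
        "enabled": True,
    },
    "exa": {
        "type": "remote",
        "url": "https://mcp.exa.ai/mcp",
        "enabled": True,
    },
}
```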

Model support status

| Model | Status | Notes |
| --- | --- | --- |
| Claude Opus 4.6 | ✅ Working | Via content-filter proxy |
| Claude Sonnet 4.6 | ✅ Working | Via content-filter proxy |
| Gemini Flash/Pro | ⚠️ Partial | Schema stripping helps; further compat issues likely |
| GPT Codex | ❌ Not working | Needs Responses API handling in OpenCode SDK |

Changes

| File | What |
| --- | --- |
| content_filter_proxy.py (new) | Standalone proxy with request/response sanitization |
| setup_litellm.py | Setup script — kills stale proxy, starts new one, health check |
| setup_opencode.py | Routes through proxy, adds MCP servers, @ai-sdk/openai for GPT |
| app.py | Adds sequential proxy setup step before agents |
| requirements.txt | Unchanged (no new dependencies) |
| docs/plans/2026-03-11-* | Design doc with trade-off analysis |

Removal path

When OpenCode fixes #5028:

  1. Delete content_filter_proxy.py and setup_litellm.py
  2. Revert baseURL in setup_opencode.py
  3. Remove proxy step from app.py

Test plan

  • Deploy to Databricks Apps
  • Claude Opus — basic chat works
  • Claude Opus — tool calling works
  • Claude Opus — compaction works
  • Proxy health check (curl http://127.0.0.1:4000/health)
  • Diagnostic logging captures all requests
  • Empty text blocks stripped
  • Orphaned tool_results stripped
  • Empty assistant content replaced with placeholder
  • DeepWiki MCP works in OpenCode
  • Exa MCP works in OpenCode
  • Long session (10+ turns with tools)

🤖 Generated with Claude Code

datasciencemonkey and others added 2 commits March 11, 2026 12:49
…nCode

OpenCode intermittently sends empty text content blocks in messages, which
Databricks Foundation Model API strictly rejects with "text content blocks
must be non-empty" (OpenCode #5028). This adds a LiteLLM proxy running on
localhost:4000 inside the container that strips these blocks before they
reach the API.

Simpler alternative to PR #52's fork approach — no fork maintenance, proven
fix via LiteLLM PR #20384, preserves full AI Gateway/MLflow/UC governance.

Changes:
- setup_litellm.py: new setup script, starts LiteLLM proxy with health check
- setup_opencode.py: route baseURL through localhost:4000 instead of direct
- app.py: add litellm setup step (sequential, before parallel agent setup)
- requirements.txt: add litellm>=1.60
- docs/plans: design document with analysis of PR #52 trade-offs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@datasciencemonkey (Owner, Author) commented:

Root Cause: OpenCode Source Code References

The empty content block bug traces through three files in OpenCode's core:

1. Origin — processor.ts (streaming handler)

processor.ts#L213-L226 — text-start initializes a text part with text: "" and immediately persists it. If the LLM emits text-start → text-end with no deltas in between (happens between thinking blocks), an empty text part is saved to the DB.

processor.ts#L230-L245 — text-end trims but does NOT check for empty before persisting. trimEnd() on whitespace-only content produces "", which is still saved.

2. Propagation — message-v2.ts (conversation history)

message-v2.ts#L289 — toModelMessages() includes empty text parts in the conversation history sent to the LLM. It skips messages with zero parts, but a message with one empty text part passes through.

3. Partial fix exists — but only for @ai-sdk/anthropic

transform.ts#L23-L42 — normalizeMessages() already filters text: "" parts, but only for the Anthropic provider. Other providers (including @ai-sdk/openai-compatible, which we use for Databricks) receive the empty blocks.

Why the fork (PR #52) doesn't fix this

The fork adds a native Databricks provider but doesn't apply the normalizeMessages() filtering from transform.ts. The bug is in processor.ts persistence, not the provider layer.

Why LiteLLM fixes it

LiteLLM strips empty content blocks on every outbound request at the HTTP level — regardless of which provider OpenCode uses internally. It catches everything that processor.ts produces and message-v2.ts propagates.

datasciencemonkey and others added 2 commits March 11, 2026 13:17
--detailed_debug is a boolean flag (not key-value) and --drop_params
is a config setting, not a CLI arg. Invalid args were causing LiteLLM
to fail to start. Moved drop_params into the YAML config under
litellm_settings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LiteLLM proxy requires litellm[proxy] (fastapi, uvicorn, etc.) which
was failing to start in the container. Replaced with a minimal ~80 line
HTTP proxy using stdlib http.server + requests (already installed).

Same sanitization logic: strips empty/whitespace-only text content blocks
from messages before forwarding to Databricks. Zero new dependencies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
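
A condensed skeleton of that design (the actual content_filter_proxy.py adds the sanitization passes and SSE streaming, elided here; the upstream URL is a placeholder):

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import requests  # already available via databricks-sdk's dependencies

UPSTREAM = "https://<workspace>/serving-endpoints"  # placeholder

class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        body["messages"] = sanitize_messages(body.get("messages", []))
        upstream = requests.post(
            UPSTREAM + self.path, json=body,
            headers={"Authorization": self.headers.get("Authorization", "")},
        )
        self.send_response(upstream.status_code)
        self.send_header("Content-Type",
                         upstream.headers.get("Content-Type", "application/json"))
        self.end_headers()
        self.wfile.write(upstream.content)

# A threading server lets /health answer while a completion is in flight.
ThreadingHTTPServer(("127.0.0.1", 4000), ProxyHandler).serve_forever()
```

sanitize_messages here refers to the sketch in the PR description above.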
@datasciencemonkey self-assigned this Mar 11, 2026
datasciencemonkey and others added 13 commits March 11, 2026 15:32
Replace inline f-string proxy with standalone content_filter_proxy.py:

Request-side (sanitize what OpenCode sends):
  - Strip empty/whitespace-only text content blocks (#5028)
  - Strip orphaned tool_result blocks (Anthropic format)
  - Strip orphaned tool messages (OpenAI format)
  - Remove empty messages after filtering

Response-side (fix what Databricks returns):
  - Remap 'databricks-tool-call' back to real tool names
  - Fix finish_reason: 'stop' → 'tool_calls' when tools invoked
  - SSE stream parsing with buffered tool name resolution

Zero external dependencies (stdlib http.server + requests via databricks-sdk).
ThreadingMixIn for concurrent health checks during streaming.
All fixes verified with unit tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Key fix: prev_tool_ids now checks the CLEANED message list (not
original indices), so cascading orphans are caught within the same
pass. Multi-pass loop runs up to 5 passes for deeply cascading cases.

Added diagnostic logging to ~/.content-filter-proxy-debug.log:
  - Full message structure on each request
  - Every strip/drop action with IDs and reason
  - Upstream error responses

Handles toolu_bdrk_ format IDs from Databricks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
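
The cascading cleanup reads roughly like this (hypothetical names; Anthropic-format blocks, where tool_use carries id and tool_result carries tool_use_id):

```python
MAX_PASSES = 5

def strip_orphaned_tool_results(messages: list[dict]) -> list[dict]:
    """Drop tool_result blocks whose tool_use no longer exists,
    re-checking against the cleaned list so cascades are caught."""
    for _ in range(MAX_PASSES):
        # Valid ids come from the *current* (already cleaned) list.
        valid_ids = {
            block["id"]
            for msg in messages if isinstance(msg.get("content"), list)
            for block in msg["content"]
            if block.get("type") == "tool_use"
        }
        changed, cleaned = False, []
        for msg in messages:
            content = msg.get("content")
            if isinstance(content, list):
                kept = [b for b in content
                        if b.get("type") != "tool_result"
                        or b.get("tool_use_id") in valid_ids]
                changed = changed or len(kept) != len(content)
                if not kept:
                    changed = True
                    continue  # message emptied by filtering: drop it
                msg = {**msg, "content": kept}
            cleaned.append(msg)
        messages = cleaned
        if not changed:
            break
    return messages
```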
The GPT Codex endpoints on Databricks AI Gateway only support the
Responses API (openai/v1/responses), not Chat Completions. The
@ai-sdk/openai-compatible SDK defaults to /chat/completions which
fails. Switch to @ai-sdk/openai which natively supports both APIs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
OpenCode doesn't auto-install npm packages from provider config.
Add explicit npm install of @ai-sdk/openai alongside opencode-ai
so GPT models can use the Responses API.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update setup step label, comments, and provider names to reflect
that we use a custom content-filter proxy, not LiteLLM.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Gemini rejects $schema, $ref, $defs, additionalProperties in tool
parameter definitions, and stream_options at the top level. Proxy
now detects Gemini models by name and recursively strips these
fields before forwarding.

Only applies to requests with "gemini" in the model name — Claude
and GPT requests are untouched.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previous approach tried to detect Gemini models by name — unreliable.
Now strips $schema, additionalProperties, stream_options universally.
These fields are never needed by any downstream API (Claude/GPT ignore
them, Gemini rejects them). Safe for all models.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
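
The universal strip is a short recursive walk, roughly:

```python
STRIP_KEYS = {"$schema", "additionalProperties"}

def strip_keys(obj):
    """Recursively remove schema keys Gemini rejects; Claude/GPT ignore them."""
    if isinstance(obj, dict):
        return {k: strip_keys(v) for k, v in obj.items() if k not in STRIP_KEYS}
    if isinstance(obj, list):
        return [strip_keys(v) for v in obj]
    return obj

def sanitize_tools(body: dict) -> dict:
    body.pop("stream_options", None)  # rejected by Gemini at the top level
    if "tools" in body:
        body["tools"] = strip_keys(body["tools"])
    return body
```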
logging.basicConfig does nothing if root logger is already configured
(e.g., by 'import requests'). Switch to explicit FileHandler on a
named logger so debug output always writes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
FileHandler buffers in long-running processes and never flushes.
Switch to StreamHandler(stderr) which is already redirected to
~/.content-filter-proxy.log and writes immediately.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
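
The resulting pattern, roughly:

```python
import logging
import sys

# Named logger with an explicit handler: immune to an already-configured
# root logger, and stderr is already redirected to the proxy's log file.
log = logging.getLogger("content-filter-proxy")
log.setLevel(logging.DEBUG)
handler = logging.StreamHandler(sys.stderr)  # writes immediately, no flush problem
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
log.addHandler(handler)
log.propagate = False  # avoid double-logging through the root logger
```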
On redeploy, the old proxy keeps running on port 4000 from the
previous container init. New proxy crashes with "Address already in
use". Now reads PID file and kills the old process first.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PID file approach fails when old process has a different PID than
recorded. Use fuser -k or lsof to find and kill whatever is
listening on port 4000 before starting new proxy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
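
Roughly the cleanup that now runs before the proxy starts (a sketch using standard fuser/lsof invocations):

```python
import subprocess

def kill_port_listener(port: int = 4000) -> None:
    """Kill whatever is listening on the port: fuser first, lsof fallback."""
    try:
        # fuser -k sends SIGKILL to every process using the TCP port.
        subprocess.run(["fuser", "-k", f"{port}/tcp"],
                       capture_output=True, timeout=10)
    except FileNotFoundError:
        # lsof -t prints bare PIDs bound to the port; kill each one.
        result = subprocess.run(["lsof", "-t", f"-i:{port}"],
                                capture_output=True, text=True, timeout=10)
        for pid in result.stdout.split():
            subprocess.run(["kill", "-9", pid], capture_output=True)
```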
Message [15] had assistant: str(0 chars) — empty string content that
the API rejects. We were preserving it to avoid breaking alternation,
but the API rejects it anyway. Now replaces with '.' as minimal valid
content. Also handles null content on assistant messages without
tool_calls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds remote MCP servers for both gateway and fallback configs:
- DeepWiki: AI-powered docs for any GitHub repo
- Exa: web search and code context retrieval

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@datasciencemonkey changed the title from "fix: LiteLLM local proxy to sanitize empty content blocks" to "fix: Content-filter proxy + MCP servers for OpenCode on Databricks" on Mar 11, 2026
datasciencemonkey and others added 4 commits March 11, 2026 18:44
- setup_litellm.py → setup_proxy.py
- LITELLM_PROXY_URL → CONTENT_FILTER_PROXY_URL
- Step ID "litellm" → "proxy"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
DeepWiki moved from SSE to StreamableHTTP transport.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
If the Databricks SDK fails to resolve the token owner at startup,
auth was failing open — allowing unauthenticated access to the
terminal and all coding agents.

Now fails closed on Databricks Apps: if app_owner is None, deny all
access. Fail-open is only allowed for local development (detected by
absence of DATABRICKS_APP_PORT and /app/python/source_code).

Also denies access when no user identity is in the request headers
on Databricks Apps (shouldn't happen, but defense in depth).

Fixes #57

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
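
In outline (helper names are hypothetical; the detection heuristic is the one described above):

```python
import os

def on_databricks_apps() -> bool:
    # Local dev is detected by the *absence* of both markers.
    return bool(os.environ.get("DATABRICKS_APP_PORT")) \
        or os.path.exists("/app/python/source_code")

def is_authorized(app_owner: str | None, request_user: str | None) -> bool:
    if on_databricks_apps():
        # Fail closed: unresolved owner or missing identity header => deny.
        if app_owner is None or request_user is None:
            return False
        return request_user == app_owner
    return True  # fail-open is acceptable only for local development
```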
Adds CSP that:
- Restricts scripts to 'self' + 'unsafe-inline' (inline <script> block)
- Restricts styles to 'self' + 'unsafe-inline' (inline style attrs)
- Blocks all external script/style sources
- Allows WebSocket connections (connect-src 'self' ws: wss:)
- Prevents framing (frame-ancestors 'none')
- Blocks form submissions to external origins

'unsafe-inline' for scripts is needed because the app has an embedded
<script> block in index.html. Moving to nonce-based CSP would require
server-side nonce injection — tracked as a future improvement.

Fixes #58

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
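
The policy amounts to roughly this header value:

```python
CSP = "; ".join([
    "default-src 'self'",
    "script-src 'self' 'unsafe-inline'",  # embedded <script> in index.html
    "style-src 'self' 'unsafe-inline'",   # inline style attributes
    "connect-src 'self' ws: wss:",        # WebSocket terminal connections
    "frame-ancestors 'none'",             # no framing
    "form-action 'self'",                 # no external form submissions
])
# Set on every response: response.headers["Content-Security-Policy"] = CSP
```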


Development

Successfully merging this pull request may close these issues.

Call Missing content in messages: text content blocks must be non-empty
