fix: Content-filter proxy + MCP servers for OpenCode on Databricks #56
datasciencemonkey merged 21 commits into main.
Conversation
OpenCode intermittently sends empty text content blocks in messages, which the Databricks Foundation Model API strictly rejects with "text content blocks must be non-empty" (OpenCode #5028). This adds a LiteLLM proxy running on localhost:4000 inside the container that strips these blocks before they reach the API. It is a simpler alternative to PR #52's fork approach: no fork maintenance, a proven fix via LiteLLM PR #20384, and full AI Gateway/MLflow/UC governance is preserved.

Changes:
- setup_litellm.py: new setup script, starts the LiteLLM proxy with a health check
- setup_opencode.py: route baseURL through localhost:4000 instead of direct
- app.py: add litellm setup step (sequential, before parallel agent setup)
- requirements.txt: add litellm>=1.60
- docs/plans: design document with analysis of PR #52 trade-offs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
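For illustration, a minimal sketch of the kind of sanitization the proxy performs: drop text content blocks whose text is empty or whitespace-only before the request reaches the Foundation Model API. The function name and message shapes below are illustrative, not the code shipped in this PR.

```python
# Illustrative sketch only -- not the actual proxy code from this PR.
def strip_empty_text_blocks(messages):
    """Remove text content blocks that are empty or whitespace-only."""
    cleaned = []
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list):
            kept = [
                block for block in content
                if not (block.get("type") == "text"
                        and not (block.get("text") or "").strip())
            ]
            msg = {**msg, "content": kept}
        cleaned.append(msg)
    return cleaned


# Example: the empty text block is dropped, the tool_use block survives.
msgs = [{"role": "assistant",
         "content": [{"type": "text", "text": ""},
                     {"type": "tool_use", "id": "t1", "name": "bash", "input": {}}]}]
print(strip_empty_text_blocks(msgs))
```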
Root Cause: OpenCode Source Code References

The empty content block bug traces through three files in OpenCode's core:

1. Origin —
--detailed_debug is a boolean flag (not key-value) and --drop_params is a config setting, not a CLI arg. Invalid args were causing LiteLLM to fail to start. Moved drop_params into the YAML config under litellm_settings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
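As a reference, a hedged sketch of what moving drop_params into the proxy's YAML config could look like; the config path and the model_list entry are illustrative assumptions, not taken from this repo.

```python
# Sketch: put drop_params under litellm_settings instead of passing it as a CLI arg.
# File path and model_list contents below are illustrative assumptions.
from pathlib import Path

LITELLM_CONFIG = """\
litellm_settings:
  drop_params: true

model_list:
  - model_name: databricks-claude
    litellm_params:
      model: databricks/databricks-claude-sonnet-4
"""

Path("/tmp/litellm_config.yaml").write_text(LITELLM_CONFIG)
```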
The LiteLLM proxy requires litellm[proxy] (fastapi, uvicorn, etc.), which was failing to start in the container. Replaced it with a minimal ~80-line HTTP proxy using stdlib http.server + requests (already installed). Same sanitization logic: strips empty/whitespace-only text content blocks from messages before forwarding to Databricks. Zero new dependencies. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
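A rough skeleton of what a stdlib-only forwarding proxy looks like. This is a simplified sketch under assumed names, not the ~80-line proxy shipped in this PR; error handling, streaming, and the actual sanitization are omitted.

```python
# Simplified sketch of a stdlib forwarding proxy; not the shipped content_filter_proxy.py.
import json
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import requests  # already available via databricks-sdk, per the commit message

UPSTREAM = os.environ.get("DATABRICKS_HOST", "https://example.cloud.databricks.com")


class ProxyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        payload = json.loads(body or b"{}")
        # ... sanitize payload["messages"] here before forwarding ...
        resp = requests.post(
            UPSTREAM + self.path,
            json=payload,
            headers={"Authorization": self.headers.get("Authorization", "")},
            timeout=600,
        )
        self.send_response(resp.status_code)
        self.send_header("Content-Type",
                         resp.headers.get("Content-Type", "application/json"))
        self.end_headers()
        self.wfile.write(resp.content)


if __name__ == "__main__":
    # Threaded server so a /health probe can be answered during a long request.
    ThreadingHTTPServer(("127.0.0.1", 4000), ProxyHandler).serve_forever()
```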
Replace inline f-string proxy with standalone content_filter_proxy.py.

Request-side (sanitize what OpenCode sends):
- Strip empty/whitespace-only text content blocks (#5028)
- Strip orphaned tool_result blocks (Anthropic format)
- Strip orphaned tool messages (OpenAI format)
- Remove empty messages after filtering

Response-side (fix what Databricks returns):
- Remap 'databricks-tool-call' back to real tool names
- Fix finish_reason: 'stop' → 'tool_calls' when tools invoked
- SSE stream parsing with buffered tool name resolution

Zero external dependencies (stdlib http.server + requests via databricks-sdk). ThreadingMixIn for concurrent health checks during streaming. All fixes verified with unit tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
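A hedged sketch of the response-side idea for a non-streaming Chat Completions response; the real proxy also parses SSE streams and buffers tool names, and the field shapes and the single-tool remapping heuristic below are assumptions.

```python
# Sketch of the response-side fixes on a non-streaming OpenAI-style response.
# The SSE/streaming variant in the actual proxy is more involved.
def fix_response(resp: dict, requested_tool_names: list[str]) -> dict:
    for choice in resp.get("choices", []):
        msg = choice.get("message", {})
        tool_calls = msg.get("tool_calls") or []
        for call in tool_calls:
            fn = call.get("function", {})
            # Remap the placeholder name back to a real tool name when it is unambiguous.
            if fn.get("name") == "databricks-tool-call" and len(requested_tool_names) == 1:
                fn["name"] = requested_tool_names[0]
        # If the model actually invoked tools, report finish_reason accordingly.
        if tool_calls and choice.get("finish_reason") == "stop":
            choice["finish_reason"] = "tool_calls"
    return resp
```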
Key fix: prev_tool_ids now checks the CLEANED message list (not original indices), so cascading orphans are caught within the same pass. Multi-pass loop runs up to 5 passes for deeply cascading cases.

Added diagnostic logging to ~/.content-filter-proxy-debug.log:
- Full message structure on each request
- Every strip/drop action with IDs and reason
- Upstream error responses

Handles toolu_bdrk_ format IDs from Databricks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
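The key point here is that orphan detection has to look at the already-cleaned list, so one dropped message can orphan the next within the same pass. A simplified sketch for OpenAI-style tool messages; the message shapes and pass limit are assumptions, not the repo's exact code.

```python
# Simplified multi-pass orphan stripping (OpenAI-style messages); shapes are assumptions.
def drop_orphaned_tool_messages(messages, max_passes=5):
    for _ in range(max_passes):
        cleaned, changed = [], False
        prev_tool_ids = set()  # tool_call ids declared by the most recent *kept* assistant msg
        for msg in messages:
            if msg.get("role") == "assistant":
                prev_tool_ids = {c.get("id") for c in (msg.get("tool_calls") or [])}
            elif msg.get("role") == "tool" and msg.get("tool_call_id") not in prev_tool_ids:
                changed = True
                continue  # orphaned tool result -> drop it
            cleaned.append(msg)
        messages = cleaned
        if not changed:  # stable: no cascading orphans left
            break
    return messages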
The GPT Codex endpoints on Databricks AI Gateway only support the Responses API (openai/v1/responses), not Chat Completions. The @ai-sdk/openai-compatible SDK defaults to /chat/completions which fails. Switch to @ai-sdk/openai which natively supports both APIs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
OpenCode doesn't auto-install npm packages from provider config. Add explicit npm install of @ai-sdk/openai alongside opencode-ai so GPT models can use the Responses API. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update setup step label, comments, and provider names to reflect that we use a custom content-filter proxy, not LiteLLM. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Gemini rejects $schema, $ref, $defs, additionalProperties in tool parameter definitions, and stream_options at the top level. Proxy now detects Gemini models by name and recursively strips these fields before forwarding. Only applies to requests with "gemini" in the model name — Claude and GPT requests are untouched. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previous approach tried to detect Gemini models by name — unreliable. Now strips $schema, additionalProperties, stream_options universally. These fields are never needed by any downstream API (Claude/GPT ignore them, Gemini rejects them). Safe for all models. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
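A sketch of the universal stripping described here: recursively drop $schema and additionalProperties from tool parameter schemas and remove stream_options from the top level of the request. Function names are illustrative.

```python
# Sketch: universally strip fields that Gemini rejects and other models ignore.
DROP_KEYS = {"$schema", "additionalProperties"}

def scrub_schema(node):
    """Recursively remove DROP_KEYS from nested dicts/lists (tool parameter schemas)."""
    if isinstance(node, dict):
        return {k: scrub_schema(v) for k, v in node.items() if k not in DROP_KEYS}
    if isinstance(node, list):
        return [scrub_schema(v) for v in node]
    return node

def scrub_request(payload: dict) -> dict:
    payload = dict(payload)
    payload.pop("stream_options", None)            # top-level field some backends reject
    if "tools" in payload:
        payload["tools"] = scrub_schema(payload["tools"])
    return payload
```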
logging.basicConfig does nothing if root logger is already configured (e.g., by 'import requests'). Switch to explicit FileHandler on a named logger so debug output always writes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
FileHandler buffers in long-running processes and never flushes. Switch to StreamHandler(stderr) which is already redirected to ~/.content-filter-proxy.log and writes immediately. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
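A sketch of the logging setup this lands on: a named logger with a StreamHandler on stderr, which writes immediately; stderr itself is assumed to be redirected to ~/.content-filter-proxy.log by the setup script, per the commit message.

```python
# Sketch: explicit handler on a named logger so output appears even if the
# root logger was already configured by an imported library.
import logging
import sys

log = logging.getLogger("content_filter_proxy")
log.setLevel(logging.DEBUG)
handler = logging.StreamHandler(sys.stderr)  # stderr is redirected to the log file at startup
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
log.addHandler(handler)

log.debug("proxy debug logging initialized")
```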
On redeploy, the old proxy keeps running on port 4000 from the previous container init. New proxy crashes with "Address already in use". Now reads PID file and kills the old process first. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PID file approach fails when old process has a different PID than recorded. Use fuser -k or lsof to find and kill whatever is listening on port 4000 before starting new proxy. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
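A sketch of the port-based cleanup using fuser with an lsof fallback; the exact commands and error handling in the setup script may differ.

```python
# Sketch: free port 4000 before starting the proxy, regardless of any recorded PID.
import subprocess

def kill_port_listener(port: int = 4000) -> None:
    try:
        # fuser -k kills whatever owns the TCP port (non-zero exit if nothing is listening)
        if subprocess.run(["fuser", "-k", f"{port}/tcp"], capture_output=True).returncode == 0:
            return
    except FileNotFoundError:
        pass  # fuser not installed in this image
    # Fallback: ask lsof for the listener PIDs and kill them directly.
    out = subprocess.run(["lsof", "-t", f"-i:{port}"], capture_output=True, text=True)
    for pid in out.stdout.split():
        subprocess.run(["kill", "-9", pid], capture_output=True)
```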
Message [15] had assistant: str(0 chars) — empty string content that the API rejects. We were preserving it to avoid breaking alternation, but the API rejects it anyway. Now replaces with '.' as minimal valid content. Also handles null content on assistant messages without tool_calls. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
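A small sketch of that rule, with an assumed OpenAI-style message shape: empty or null string content on an assistant message without tool_calls is replaced with a single period.

```python
# Sketch: ensure assistant messages always carry non-empty string content.
def ensure_nonempty_assistant(msg: dict) -> dict:
    if msg.get("role") == "assistant" and not msg.get("tool_calls"):
        content = msg.get("content")
        if content is None or (isinstance(content, str) and not content.strip()):
            msg = {**msg, "content": "."}  # minimal valid content the API accepts
    return msg
```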
Adds remote MCP servers for both gateway and fallback configs:
- DeepWiki: AI-powered docs for any GitHub repo
- Exa: web search and code context retrieval

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- setup_litellm.py → setup_proxy.py
- LITELLM_PROXY_URL → CONTENT_FILTER_PROXY_URL
- Step ID "litellm" → "proxy"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
DeepWiki moved from SSE to StreamableHTTP transport. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
If the Databricks SDK failed to resolve the token owner at startup, auth was failing open, allowing unauthenticated access to the terminal and all coding agents. Now it fails closed on Databricks Apps: if app_owner is None, deny all access. Fail-open is only allowed for local development (detected by the absence of DATABRICKS_APP_PORT and /app/python/source_code).

Also denies access when no user identity is in the request headers on Databricks Apps (shouldn't happen, but defense in depth).

Fixes #57

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
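A hedged sketch of the fail-closed decision; the function names, environment detection, and the exact ownership check are illustrative, not the app's actual auth code.

```python
# Illustrative sketch of fail-closed access control; not the app's actual code.
import os
from typing import Optional

def is_local_dev() -> bool:
    # On Databricks Apps these markers are present; locally they are not.
    return ("DATABRICKS_APP_PORT" not in os.environ
            and not os.path.exists("/app/python/source_code"))

def is_allowed(app_owner: Optional[str], request_user: Optional[str]) -> bool:
    if is_local_dev():
        return True                      # fail-open only for local development
    if app_owner is None or request_user is None:
        return False                     # fail closed on Databricks Apps
    return request_user == app_owner
```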
Adds a CSP that:
- Restricts scripts to 'self' + 'unsafe-inline' (inline <script> block)
- Restricts styles to 'self' + 'unsafe-inline' (inline style attrs)
- Blocks all external script/style sources
- Allows WebSocket connections (connect-src 'self' ws: wss:)
- Prevents framing (frame-ancestors 'none')
- Blocks form submissions to external origins

'unsafe-inline' for scripts is needed because the app has an embedded <script> block in index.html. Moving to a nonce-based CSP would require server-side nonce injection; tracked as a future improvement.

Fixes #58

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
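A sketch of a header value matching that description; the exact directive set in the app (including whether a default-src is set) may differ.

```python
# Sketch of a Content-Security-Policy value matching the commit description.
CSP = "; ".join([
    "default-src 'self'",
    "script-src 'self' 'unsafe-inline'",   # inline <script> block in index.html
    "style-src 'self' 'unsafe-inline'",    # inline style attributes
    "connect-src 'self' ws: wss:",         # WebSocket terminal connections
    "frame-ancestors 'none'",              # prevent framing
    "form-action 'self'",                  # block external form submissions
])
print(CSP)
```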
Summary
Content-filter proxy on localhost:4000 inside the container that sanitizes requests/responses between OpenCode and Databricks AI Gateway.

What the proxy fixes
OpenCode → localhost:4000 (proxy) → Databricks AI Gateway → Claude

Request-side (sanitize what OpenCode sends):
- Strips empty/whitespace-only text content blocks (#5028)
- Strips orphaned tool_result blocks with no matching tool_use (multi-pass, cascading)
- Strips $schema, additionalProperties from tool parameter definitions
- Strips stream_options from requests

Response-side (fix what Databricks returns):
- Remaps databricks-tool-call back to real tool names
- Fixes finish_reason: "stop" → "tool_calls" when tools are invoked

Root cause analysis
The bugs trace through three files in OpenCode's core:
1. processor.ts#L213-L226 — creates empty text parts during streaming
2. message-v2.ts#L289 — propagates them into conversation history
3. transform.ts#L23-L42 — has a fix, but only for @ai-sdk/anthropic, not @ai-sdk/openai-compatible

Why not PR #52's fork approach
PR #52 forks OpenCode to add a native Databricks provider. After analysis, the proxy approach was preferred: no fork to maintain, and full AI Gateway/MLflow/UC governance is preserved.
Valuable parts of PR #52 filed as separate issues: #53, #54, #55.
MCP servers
Added remote MCP servers to the OpenCode config: DeepWiki (AI-powered docs for any GitHub repo) and Exa (web search and code context retrieval).
Model support status
Changes
- content_filter_proxy.py (new)
- setup_litellm.py
- setup_opencode.py (@ai-sdk/openai for GPT)
- app.py
- requirements.txt
- docs/plans/2026-03-11-*

Removal path
When OpenCode fixes #5028:
- Delete content_filter_proxy.py and setup_litellm.py
- Revert the baseURL change in setup_opencode.py
- Remove the proxy setup step from app.py

Test plan
- Proxy health check (curl http://127.0.0.1:4000/health)

🤖 Generated with Claude Code