Experimental: MCP server for Genie Code integration #156

Open

datasciencemonkey wants to merge 40 commits into main from coda-mcp

Conversation

@datasciencemonkey (Owner)

Summary

  • Adds MCP server endpoint at /mcp for Databricks Genie Code integration
  • 5 MCP tools: create_session, run_task, get_status, get_result, close_session
  • File-based session/task state for stateless HTTP transport
  • 57 tests passing (37 unit + 15 server + 5 integration)

Status

Experimental — needs real-world testing with Genie Code on a deployed Databricks App.

Disk-based state manager for MCP sessions and tasks.
Pure Python module with no Flask dependency — just file I/O.

Manages session directories at ~/.coda/sessions/{session-id}/
with tasks as subdirectories containing prompt.txt, status.jsonl,
and result.json. Includes SessionBusyError/SessionNotFoundError
exceptions and the ---CODA-TASK--- prompt wrapping convention.

37 tests covering full session/task lifecycle, edge cases,
and error handling — all using tmp_path isolation.

Implements coda_create_session, coda_run_task, coda_get_status,
coda_get_result, and coda_close_session via FastMCP with ToolAnnotations.
Delegates disk state to task_manager.py; PTY ops via optional app hooks.
Background watcher thread polls for result.json with timeout support.
Includes 15 tests covering tool registration, disk-only mode, PTY hook
integration, busy-session errors, and all CRUD paths.
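
The watcher described above can be sketched like this, under the assumption that it simply polls the task directory for result.json until a deadline. `wait_for_result` is an illustrative name, not the server's actual function.

```python
import json
import time
from pathlib import Path

def wait_for_result(task_dir: Path, timeout: float = 30.0, interval: float = 0.05):
    """Return the parsed result.json, or None if the timeout expires."""
    deadline = time.monotonic() + timeout
    result_path = task_dir / "result.json"
    while time.monotonic() < deadline:
        if result_path.exists():
            return json.loads(result_path.read_text())
        time.sleep(interval)
    return None  # caller reports the task as still running
```
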

Exercises the full MCP flow with mocked PTY hooks:
- Happy-path: create session, run task, poll status, get result, close
- Busy session rejects second task
- context_hint=new_topic written to prompt.txt
- permissions=yolo produces --yolo flag
- Closing nonexistent session returns error
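
Two of the behaviors exercised above, sketched in isolation. `SessionBusyError` comes from task_manager.py; `build_cli_args` and the plain-dict session are hypothetical stand-ins for the real PTY plumbing.

```python
class SessionBusyError(Exception):
    """Raised when a session already has a running task."""

def build_cli_args(permissions: str = "default") -> list:
    # permissions=yolo adds the --yolo flag to the spawned CLI
    args = ["coda"]
    if permissions == "yolo":
        args.append("--yolo")
    return args

def run_task(session: dict, prompt: str) -> None:
    # A busy session rejects a second task rather than queueing it
    if session.get("active_task"):
        raise SessionBusyError("session already has a running task")
    session["active_task"] = prompt
```
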

Replace the 5-tool poll-heavy MCP API with a 3-tool fire-and-forget model:
- coda_run: auto-creates ephemeral session, returns immediately
- coda_inbox: dashboard of all background tasks (no polling needed)
- coda_get_result: pull full structured result for completed tasks

Key changes:
- Sessions are ephemeral (auto-close on task completion)
- Task chaining via previous_session_id (reads prior session results)
- meta.json tracks task metadata for inbox scanning
- Concurrency limit configurable via CODA_MAX_CONCURRENT env var
- 24h TTL cleanup for expired sessions
- Hermes instructions updated for ephemeral sessions + prior context
- 22 tests covering full flow, chaining, concurrency, auto-close, cleanup
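
A hedged sketch of the fire-and-forget model: coda_run creates an ephemeral session, writes meta.json, and returns immediately, while coda_inbox scans the meta.json files instead of polling individual tasks. Field names and the explicit `root` parameter are illustrative, not the server's schema.

```python
import json
import uuid
from pathlib import Path

def coda_run(root: Path, prompt: str) -> str:
    """Submit a task; the caller does not wait for completion."""
    session_id = uuid.uuid4().hex
    session_dir = root / session_id
    session_dir.mkdir(parents=True)
    (session_dir / "prompt.txt").write_text(prompt)
    # meta.json is what coda_inbox scans for the dashboard view
    (session_dir / "meta.json").write_text(
        json.dumps({"session_id": session_id, "status": "running"})
    )
    return session_id

def coda_inbox(root: Path) -> list:
    """Dashboard of all background tasks, no polling required."""
    return [json.loads(p.read_text()) for p in sorted(root.glob("*/meta.json"))]
```
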

Documents the 3-tool fire-and-forget + inbox pattern with sequence
diagram, data model, tool reference, migration guide, and limitations.

uvicorn + mcp_asgi.py wraps Flask in Starlette's WSGIMiddleware, which
asserts scope["type"] == "http" — WebSocket upgrades (scope type
"websocket") cause AssertionError, forcing Socket.IO to fall back to
HTTP polling with visible jank.

gunicorn + gthread + simple-websocket handles WebSocket natively.
MCP is already served via Flask Blueprint (mcp_endpoint.py) at /mcp —
no ASGI bridge needed.
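
A simplified reproduction of the failure mode above: WSGIMiddleware only translates HTTP scopes into WSGI environs, so a WebSocket upgrade (scope type "websocket") trips an assertion before the wrapped Flask app is ever called. This stub mimics that check rather than quoting Starlette's source.

```python
def wsgi_middleware_entry(scope: dict) -> str:
    # Mirrors the scope-type assertion at the entry of WSGIMiddleware
    assert scope["type"] == "http"
    return "handled by Flask via WSGI"

print(wsgi_middleware_entry({"type": "http"}))
try:
    wsgi_middleware_entry({"type": "websocket"})
except AssertionError:
    print("websocket upgrade rejected; Socket.IO falls back to polling")
```
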

Three tests assumed v1 behavior (long-lived, reusable sessions):
- test_marks_session_idle → test_marks_session_closed (sessions auto-close)
- test_can_create_new_task_after_complete → test_closed_session_rejects_new_task
- test_multiple_completed_tasks_accumulate → test_multiple_tasks_across_sessions
  (each task gets its own session, verified via list_all_tasks)

Gateway discovery (3): Added SKIP_CLAUDE_INSTALL env var to bypass
curl|bash in tests. Replaced vacuous `if settings_path.exists()` guards
with `assert` so missing files fail loudly instead of silently passing.

Session detach (3): Mocked subprocess.run (pgrep/ps) in process
detection tests — sandbox blocks sysmon access. Mocked pty.openpty in
EOF cleanup test — sandbox denies /dev/pty allocation.

npm version (1): Added functional npm probe to skip condition — npm
cache is root-owned on this machine, so npm commands fail with EPERM.

task_manager (3): Already fixed in prior commit — tests updated for v2
ephemeral session model.

Reduces root-level clutter by organizing 8 setup_*.py files into setup/
and 3 install_*.sh files into scripts/. Updated all subprocess paths in
app.py, added PYTHONPATH injection in _run_step() so setup scripts can
still import from utils.py at repo root, and updated test path references.

275 tests passing. Post-commit hook unchanged (references sync_to_workspace.py
at $APP_DIR root).

Moves mcp_server.py, mcp_endpoint.py, mcp_asgi.py, and task_manager.py
into a coda_mcp/ package. Uses coda_mcp (not mcp/) to avoid shadowing
the pip mcp package used by FastMCP imports. Updated all cross-imports
in source and test files.

275 tests passing.

- Updated project structure tree for setup/, scripts/, coda_mcp/ layout
- Added CoDA MCP server section with value proposition and usage examples
  for Genie Code, Claude Desktop, Cursor, and any MCP client
- Added /mcp to API endpoints table
- Fixed setup_mlflow.py path reference
- Updated CLAUDE.md with CoDA MCP server entry

- MLflow tracing: README said MLFLOW_CLAUDE_TRACING_ENABLED=true but
  code sets "false" (intentional per b8a06c9). Updated README to match.
- Parallel setup: README said "7" but code runs 6 parallel + 1 sequential.
  Fixed to "6".
- Skills count: README said 39 but directory has 43 (4 BDD skills were
  unlisted). Updated badge, heading, and added BDD skills table.
- CLAUDE.md: updated skills count to 43, MCP servers to 3.

Security audit findings:
- Removed _check_origin() from mcp_endpoint.py — was defined but never
  called, creating false confidence that origin validation existed.
  Removed unused os and ensure_https imports.
- Added os.chmod(path, 0o600) to all config file writes in cli_auth.py
  (settings.json, auth.json, .env, config.yaml) so tokens aren't
  world-readable. Matches pat_rotator.py's existing chmod on
  ~/.databrickscfg.
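
The tightening described above, in isolation: write the config file, then chmod it to 0o600 so tokens aren't world-readable. `write_private` is an illustrative helper, not cli_auth.py's actual API.

```python
import os
from pathlib import Path

def write_private(path: Path, content: str) -> None:
    path.write_text(content)
    os.chmod(path, 0o600)  # owner read/write only; no group/world access
```
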

…run_step

Closes critical and high test coverage gaps identified by audit:
- content_filter_proxy.py: 45 tests covering message sanitization,
  orphaned tool_result stripping, SSE streaming, tool name remapping,
  token caching
- sync_to_workspace.py: 11 tests covering path-escape guard, OAuth
  env stripping, config reading, error handling
- _run_step (app.py): 7 tests covering DATABRICKS_CLIENT_ID/SECRET
  stripping, PYTHONPATH injection, PATH setup

275 → 338 tests passing.

The PAT reconfiguration path (line 329) runs setup scripts via
subprocess.run but didn't inject PYTHONPATH like _run_step does.
After the Tier 1 move to setup/, the scripts couldn't resolve
`from utils import ...` during PAT rotation reconfiguration.

Covers the PAT reconfiguration subprocess path that was missing
PYTHONPATH injection — the exact bug caught in production.

Genie Code requires FastMCP's native transport (streamable_http_app)
per docs. The Flask Blueprint reimplementation at /mcp didn't satisfy
the MCP protocol expectations, causing "MCP server could not be added".

Switch app.yaml from gunicorn to uvicorn with mcp_asgi.py which mounts
FastMCP natively at /mcp and Flask via WSGIMiddleware for everything else.
WebSocket falls back to HTTP polling under ASGI (documented, works).

WSGIMiddleware cannot handle WebSocket upgrades, causing Socket.IO to
fall back to HTTP polling under uvicorn. Add a python-socketio
AsyncServer that intercepts /socket.io/ at the ASGI level before
WSGIMiddleware, enabling native WebSocket alongside MCP.

Architecture: socketio.ASGIApp → mcp_starlette(/mcp) → WSGI(Flask)
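
The routing order above can be sketched library-free, under the assumption that Socket.IO traffic is intercepted first, /mcp goes to the FastMCP Starlette app, and everything else falls through to Flask behind WSGIMiddleware. The string returns stand in for the real ASGI sub-applications.

```python
import asyncio

async def dispatch(scope: dict) -> str:
    path = scope.get("path", "")
    if path.startswith("/socket.io/"):
        return "socketio.ASGIApp (native WebSocket)"
    if path.startswith("/mcp"):
        return "FastMCP streamable_http_app"
    return "Flask via WSGIMiddleware (HTTP only)"
```
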

python-socketio 5.16.1 uses other_asgi_app, not other_app.

Databricks Apps proxy injects identity headers (X-Forwarded-Email) on
HTTP requests but not on WebSocket upgrade requests. Starting with
polling ensures auth succeeds during the HTTP handshake, then Socket.IO
transparently upgrades to WebSocket without re-triggering auth.

Also adds diagnostic logging to the ASGI connect handler to trace
proxy header presence on future connection issues.

The app's own URL (mcp-test-coda-*.databricksapps.com) differs from
DATABRICKS_HOST (workspace URL). Socket.IO was rejecting the app origin
as not in ALLOWED_ORIGINS. Since Databricks proxy handles authentication,
Socket.IO CORS can safely use '*'.

Make fire-and-forget pattern unmistakable in both server instructions
and coda_run docstring. Explicitly tell LLM clients: do NOT follow up
with coda_inbox after submitting — only check when user asks.

Databricks Apps proxy requires OAuth, not PATs. This bridge script
translates between Claude Code's stdio MCP transport and the app's
Streamable HTTP endpoint, injecting fresh OAuth tokens via
`databricks auth token` on each request.

Config via env vars (CODA_MCP_URL, DATABRICKS_PROFILE) in
Claude Code settings.json — no hardcoded values in the script.
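
The bridge's token refresh can be sketched as shelling out to the Databricks CLI and pulling the access token from its JSON output. `parse_token` and `fetch_token` are illustrative names; the real logic lives in tools/coda-bridge.py.

```python
import json
import subprocess

def parse_token(cli_output: str) -> str:
    return json.loads(cli_output)["access_token"]

def fetch_token(profile=None) -> str:
    # Fresh OAuth token on each request, per the commit message above
    cmd = ["databricks", "auth", "token"]
    if profile:
        cmd += ["--profile", profile]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_token(out.stdout)
```
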

Databricks Apps use OAuth, not PATs. Updated the MCP client section
to document the stdio bridge approach (tools/coda-bridge.py) and
added tools/ to the project structure.

Prevents Hermes from executing destructive operations (DROP, DELETE,
truncate, CLI deletes, permission changes) via prompt-level instructions.
Destructive ops require explicit approval via needs_approval status.