Experimental: MCP server for Genie Code integration #156
Open
datasciencemonkey wants to merge 40 commits into main from
Conversation
Disk-based state manager for MCP sessions and tasks.
Pure Python module with no Flask dependency — just file I/O.
Manages session directories at ~/.coda/sessions/{session-id}/
with tasks as subdirectories containing prompt.txt, status.jsonl,
and result.json. Includes SessionBusyError/SessionNotFoundError
exceptions and the ---CODA-TASK--- prompt wrapping convention.
37 tests covering full session/task lifecycle, edge cases,
and error handling — all using tmp_path isolation.
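The on-disk layout described above can be sketched as follows. This is a hypothetical illustration of the convention (function name `create_task` and the exact status record shape are assumptions), not the actual task_manager.py API:

```python
import json
import uuid
from pathlib import Path

SESSIONS_ROOT = Path.home() / ".coda" / "sessions"

def create_task(session_id: str, prompt: str, root: Path = SESSIONS_ROOT) -> Path:
    """Create a task subdirectory and write the wrapped prompt."""
    task_dir = root / session_id / f"task-{uuid.uuid4().hex[:8]}"
    task_dir.mkdir(parents=True, exist_ok=False)
    # ---CODA-TASK--- wrapping convention: delimit the task body so the
    # agent can distinguish it from surrounding instructions.
    wrapped = f"---CODA-TASK---\n{prompt}\n---CODA-TASK---\n"
    (task_dir / "prompt.txt").write_text(wrapped)
    # status.jsonl is append-only: one JSON object per status transition.
    with (task_dir / "status.jsonl").open("a") as f:
        f.write(json.dumps({"status": "pending"}) + "\n")
    return task_dir
```

result.json would be written into the same task directory once the task completes.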
Implements coda_create_session, coda_run_task, coda_get_status, coda_get_result, and coda_close_session via FastMCP with ToolAnnotations. Delegates disk state to task_manager.py; PTY ops via optional app hooks. Background watcher thread polls for result.json with timeout support. Includes 15 tests covering tool registration, disk-only mode, PTY hook integration, busy-session errors, and all CRUD paths.
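The background watcher described above might look like this minimal sketch (the callback shape and poll interval are assumptions; the real mcp_server.py version may differ):

```python
import json
import threading
import time
from pathlib import Path

def watch_for_result(task_dir: Path, timeout: float, on_done, poll_interval: float = 0.05):
    """Background watcher thread: poll for result.json until it appears
    or the timeout elapses, then invoke the callback."""
    def _poll():
        deadline = time.monotonic() + timeout
        result_path = task_dir / "result.json"
        while time.monotonic() < deadline:
            if result_path.exists():
                on_done(json.loads(result_path.read_text()))
                return
            time.sleep(poll_interval)
        # Timeout support: report instead of hanging forever.
        on_done({"status": "timeout"})
    t = threading.Thread(target=_poll, daemon=True)
    t.start()
    return t
```

The daemon flag keeps a stuck watcher from blocking process shutdown.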
Exercises the full MCP flow with mocked PTY hooks:
- Happy-path: create session, run task, poll status, get result, close
- Busy session rejects second task
- context_hint=new_topic written to prompt.txt
- permissions=yolo produces --yolo flag
- Closing nonexistent session returns error
…Socket support) Co-authored-by: Isaac
…, single-user, task protocol)
Replace the 5-tool poll-heavy MCP API with a 3-tool fire-and-forget model:
- coda_run: auto-creates ephemeral session, returns immediately
- coda_inbox: dashboard of all background tasks (no polling needed)
- coda_get_result: pull full structured result for completed tasks

Key changes:
- Sessions are ephemeral (auto-close on task completion)
- Task chaining via previous_session_id (reads prior session results)
- meta.json tracks task metadata for inbox scanning
- Concurrency limit configurable via CODA_MAX_CONCURRENT env var
- 24h TTL cleanup for expired sessions
- Hermes instructions updated for ephemeral sessions + prior context
- 22 tests covering full flow, chaining, concurrency, auto-close, cleanup
Documents the 3-tool fire-and-forget + inbox pattern with sequence diagram, data model, tool reference, migration guide, and limitations.
uvicorn + mcp_asgi.py wraps Flask in Starlette's WSGIMiddleware, which asserts scope["type"] == "http" — WebSocket upgrades (scope type "websocket") cause AssertionError, forcing Socket.IO to fall back to HTTP polling with visible jank. gunicorn + gthread + simple-websocket handles WebSocket natively. MCP is already served via Flask Blueprint (mcp_endpoint.py) at /mcp — no ASGI bridge needed.
Three tests assumed v1 behavior (long-lived, reusable sessions):
- test_marks_session_idle → test_marks_session_closed (sessions auto-close)
- test_can_create_new_task_after_complete → test_closed_session_rejects_new_task
- test_multiple_completed_tasks_accumulate → test_multiple_tasks_across_sessions (each task gets its own session, verified via list_all_tasks)
Gateway discovery (3): Added SKIP_CLAUDE_INSTALL env var to bypass curl|bash in tests. Replaced vacuous `if settings_path.exists()` guards with `assert` so missing files fail loudly instead of silently passing.

Session detach (3): Mocked subprocess.run (pgrep/ps) in process detection tests — sandbox blocks sysmon access. Mocked pty.openpty in EOF cleanup test — sandbox denies /dev/pty allocation.

npm version (1): Added functional npm probe to skip condition — npm cache is root-owned on this machine, so npm commands fail with EPERM.

task_manager (3): Already fixed in prior commit — tests updated for v2 ephemeral session model.
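The subprocess.run stubbing described for the session-detach tests follows a standard unittest.mock pattern, roughly like this (the checker function and its pgrep invocation are hypothetical stand-ins for the real process-detection code):

```python
import subprocess
from unittest.mock import patch

def session_process_alive(session_id: str) -> bool:
    """Hypothetical process check: pgrep for a process tagged with the session id."""
    proc = subprocess.run(["pgrep", "-f", session_id], capture_output=True)
    return proc.returncode == 0

def test_detects_live_process():
    # Sandbox blocks pgrep/ps, so stub subprocess.run with a canned result.
    fake = subprocess.CompletedProcess(args=["pgrep"], returncode=0, stdout=b"1234\n")
    with patch("subprocess.run", return_value=fake):
        assert session_process_alive("abc123")
```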
Reduces root-level clutter by organizing 8 setup_*.py files into setup/ and 3 install_*.sh files into scripts/. Updated all subprocess paths in app.py, added PYTHONPATH injection in _run_step() so setup scripts can still import from utils.py at repo root, and updated test path references. 275 tests passing. Post-commit hook unchanged (references sync_to_workspace.py at $APP_DIR root).
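The PYTHONPATH injection in _run_step can be sketched as follows (function signature and the explicit `repo_root` parameter are assumptions for illustration; the real _run_step lives in app.py):

```python
import os
import subprocess
import sys
from pathlib import Path

def run_step(script: str, repo_root: Path) -> subprocess.CompletedProcess:
    """Run a setup script with the repo root prepended to PYTHONPATH so
    scripts moved into setup/ can still `from utils import ...`."""
    env = os.environ.copy()
    env["PYTHONPATH"] = str(repo_root) + os.pathsep + env.get("PYTHONPATH", "")
    return subprocess.run(
        [sys.executable, script], env=env, capture_output=True, text=True
    )
```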
Moves mcp_server.py, mcp_endpoint.py, mcp_asgi.py, and task_manager.py into a coda_mcp/ package. Uses coda_mcp (not mcp/) to avoid shadowing the pip mcp package used by FastMCP imports. Updated all cross-imports in source and test files. 275 tests passing.
- Updated project structure tree for setup/, scripts/, coda_mcp/ layout
- Added CoDA MCP server section with value proposition and usage examples for Genie Code, Claude Desktop, Cursor, and any MCP client
- Added /mcp to API endpoints table
- Fixed setup_mlflow.py path reference
- Updated CLAUDE.md with CoDA MCP server entry
- MLflow tracing: README said MLFLOW_CLAUDE_TRACING_ENABLED=true but code sets "false" (intentional per b8a06c9). Updated README to match.
- Parallel setup: README said "7" but code runs 6 parallel + 1 sequential. Fixed to "6".
- Skills count: README said 39 but directory has 43 (4 BDD skills were unlisted). Updated badge, heading, and added BDD skills table.
- CLAUDE.md: updated skills count to 43, MCP servers to 3.
Security audit findings:
- Removed _check_origin() from mcp_endpoint.py — was defined but never called, creating false confidence that origin validation existed. Removed unused os and ensure_https imports.
- Added os.chmod(path, 0o600) to all config file writes in cli_auth.py (settings.json, auth.json, .env, config.yaml) so tokens aren't world-readable. Matches pat_rotator.py's existing chmod on ~/.databrickscfg.
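The write-then-chmod pattern is small but easy to get wrong; a minimal sketch (helper name is hypothetical, not the cli_auth.py API):

```python
import json
import os
from pathlib import Path

def write_secret_config(path: Path, data: dict) -> None:
    """Write a config file containing tokens, then restrict it to
    owner read/write (0o600) so it isn't world-readable."""
    path.write_text(json.dumps(data))
    os.chmod(path, 0o600)
```

Note the chmod happens after every write, since write_text on an existing file preserves whatever mode the file already had.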
…run_step

Closes critical and high test coverage gaps identified by audit:
- content_filter_proxy.py: 45 tests covering message sanitization, orphaned tool_result stripping, SSE streaming, tool name remapping, token caching
- sync_to_workspace.py: 11 tests covering path-escape guard, OAuth env stripping, config reading, error handling
- _run_step (app.py): 7 tests covering DATABRICKS_CLIENT_ID/SECRET stripping, PYTHONPATH injection, PATH setup

275 → 338 tests passing.
The PAT reconfiguration path (line 329) runs setup scripts via subprocess.run but didn't inject PYTHONPATH like _run_step does. After the Tier 1 move to setup/, the scripts couldn't resolve `from utils import ...` during PAT rotation reconfiguration.
Covers the PAT reconfiguration subprocess path that was missing PYTHONPATH injection — the exact bug caught in production.
Genie Code requires FastMCP's native transport (streamable_http_app) per docs. The Flask Blueprint reimplementation at /mcp didn't satisfy the MCP protocol expectations, causing "MCP server could not be added". Switch app.yaml from gunicorn to uvicorn with mcp_asgi.py which mounts FastMCP natively at /mcp and Flask via WSGIMiddleware for everything else. WebSocket falls back to HTTP polling under ASGI (documented, works).
WSGIMiddleware cannot handle WebSocket upgrades, causing Socket.IO to fall back to HTTP polling under uvicorn. Add a python-socketio AsyncServer that intercepts /socket.io/ at the ASGI level before WSGIMiddleware, enabling native WebSocket alongside MCP. Architecture: socketio.ASGIApp → mcp_starlette(/mcp) → WSGI(Flask)
python-socketio 5.16.1 uses other_asgi_app, not other_app.
Databricks Apps proxy injects identity headers (X-Forwarded-Email) on HTTP requests but not on WebSocket upgrade requests. Starting with polling ensures auth succeeds during the HTTP handshake, then Socket.IO transparently upgrades to WebSocket without re-triggering auth. Also adds diagnostic logging to the ASGI connect handler to trace proxy header presence on future connection issues.
The app's own URL (mcp-test-coda-*.databricksapps.com) differs from DATABRICKS_HOST (workspace URL). Socket.IO was rejecting the app origin as not in ALLOWED_ORIGINS. Since Databricks proxy handles authentication, Socket.IO CORS can safely use '*'.
Make the fire-and-forget pattern unmistakable in both the server instructions and the coda_run docstring. Explicitly tell LLM clients: do NOT follow up with coda_inbox after submitting — only check when the user asks.
Databricks Apps proxy requires OAuth, not PATs. This bridge script translates between Claude Code's stdio MCP transport and the app's Streamable HTTP endpoint, injecting fresh OAuth tokens via `databricks auth token` on each request. Config via env vars (CODA_MCP_URL, DATABRICKS_PROFILE) in Claude Code settings.json — no hardcoded values in the script.
Databricks Apps use OAuth, not PATs. Updated the MCP client section to document the stdio bridge approach (tools/coda-bridge.py) and added tools/ to the project structure.
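The token-injection half of the bridge can be sketched like this. The `run` parameter is an assumption added so the sketch is testable without the Databricks CLI; the real tools/coda-bridge.py also proxies the stdio MCP traffic itself:

```python
import json
import subprocess

def fresh_auth_header(profile: str, run=subprocess.run) -> dict:
    """Shell out to `databricks auth token` for the given profile and
    build an Authorization header with the fresh OAuth token."""
    proc = run(
        ["databricks", "auth", "token", "--profile", profile],
        capture_output=True, text=True, check=True,
    )
    token = json.loads(proc.stdout)["access_token"]
    return {"Authorization": f"Bearer {token}"}
```

Calling this per request (rather than caching) is what keeps the token fresh across the CLI's own refresh cycle.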
Prevents Hermes from executing destructive operations (DROP, DELETE, truncate, CLI deletes, permission changes) via prompt-level instructions. Destructive ops require explicit approval via needs_approval status.
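The PR enforces this guard at the prompt level, but the triage logic it asks the model to perform amounts to something like this hypothetical classifier (pattern list is illustrative, not the actual instruction text):

```python
import re

# Illustrative destructive-operation patterns: SQL DDL/DML, recursive
# deletes, permission changes.
DESTRUCTIVE = re.compile(
    r"\b(DROP|DELETE|TRUNCATE)\b|rm\s+-rf|\bchmod\b", re.IGNORECASE
)

def triage(op: str) -> str:
    """Return 'needs_approval' for destructive operations, else 'approved'."""
    return "needs_approval" if DESTRUCTIVE.search(op) else "approved"
```

Routing destructive operations through a distinct needs_approval status (rather than refusing outright) lets the user explicitly unblock them.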
Summary
MCP server at /mcp for Databricks Genie Code integration.

Status
Experimental — needs real-world testing with Genie Code on a deployed Databricks App.