Skip to content

Latest commit

 

History

History
204 lines (159 loc) · 6.92 KB

File metadata and controls

204 lines (159 loc) · 6.92 KB

WARP.md

This file provides guidance to WARP (warp.dev) when working with code in this repository.

Project Overview

fMedCP (Functional Medical Context Protocol) is a Python-based medical AI analysis tool that integrates Claude's reasoning capabilities with medical databases. It combines electronic health records (EHR) and biomedical knowledge graphs to provide evidence-based medical responses using the Model Context Protocol (MCP).

Key Architecture

Core Components

  • MCP Server: Provides secure, read-only access to medical databases using FastMCP
  • Claude Integration: Orchestrates AI-driven medical analysis using Claude Sonnet 4
  • Database Integrations:
    • Neo4j for biomedical knowledge graphs (requires APOC plugin)
    • SQL Server for electronic health records

Main Files

  • MedCP.py: Core implementation containing MCP server, Claude integration, and main processing function
  • example.py: Usage example demonstrating the complete workflow
  • pyproject.toml: Python project configuration with dependencies

Data Flow

  1. User provides medical question via run_medcp() function
  2. MedCP server starts with configured database connections
  3. Claude receives specialized medical tools (knowledge graph queries, clinical record queries)
  4. AI analyzes question using available data sources
  5. Results exported to structured markdown files with performance metrics

Common Development Commands

Installation and Setup

# Install dependencies using uv (preferred) or pip
pip install fastmcp neo4j pymssql pydantic anthropic python-dotenv mcp asyncio-mqtt

# Copy environment template
cp example.env .env
# Edit .env with your actual credentials

Running the System

# Basic usage
python example.py

# Direct usage in code
python -c "from MedCP import run_medcp; result = run_medcp(...)"

Development Tools

# Auto-sync changes (basic git workflow)
./auto_sync.sh          # On Unix/macOS
auto_sync.bat           # On Windows

# Check dependencies
uv sync                 # If using uv package manager

Testing Medical Queries

  • No formal test suite detected
  • Testing is done through example queries in example.py
  • Results are saved to results/ directory as markdown files

Environment Configuration

Required Environment Variables

# Claude API Configuration
CLAUDE_API_KEY=your_claude_api_key_here

# Neo4j Knowledge Graph Configuration
KNOWLEDGE_GRAPH_URI=bolt://localhost:7687
KNOWLEDGE_GRAPH_USERNAME=neo4j
KNOWLEDGE_GRAPH_PASSWORD=your_neo4j_password
KNOWLEDGE_GRAPH_DATABASE=spoke

# SQL Server Clinical Records Configuration
CLINICAL_RECORDS_SERVER=your_sql_server_host
CLINICAL_RECORDS_DATABASE=your_ehr_database
CLINICAL_RECORDS_USERNAME=your_sql_username
CLINICAL_RECORDS_PASSWORD=your_sql_password

# Optional Configuration
MEDCP_LOG_LEVEL=INFO
MEDCP_NAMESPACE=MedCP

Database Requirements

Neo4j Knowledge Graph

  • Requires APOC plugin for schema introspection: CALL apoc.meta.schema()
  • Default database name: "spoke"
  • Must support read-only Cypher queries
  • Used for biomedical knowledge inference (drug interactions, pathways, disease relationships)

SQL Server Clinical Records

  • Must have accessible clinical data tables
  • Requires read-only SELECT permissions
  • Used for patient demographics, diagnosis patterns, treatment outcomes
  • Queries are validated to prevent write operations

Security and Safety Features

Read-Only Operations

  • All database queries are strictly validated to prevent modifications
  • Knowledge graph queries filtered to exclude write operations (MERGE, CREATE, SET, DELETE)
  • Clinical queries limited to SELECT statements only
  • SQL injection protection through parameterized queries

Query Validation

  • ClinicalQueryValidator class ensures only safe operations
  • Cypher injection prevention with parameter binding
  • Automatic query sanitization and validation

Key Functions and Usage Patterns

Main Entry Point

run_medcp(
    question_name="unique_id",
    question_text="What is the prevalence of...",
    claude_api_key=api_key,
    kg_uri=kg_uri,
    kg_username=kg_username,
    kg_password=kg_password,
    clinical_server=clinical_server,
    clinical_database=clinical_database,
    clinical_username=clinical_username,
    clinical_password=clinical_password,
    # Optional parameters
    use_knowledge_graph=True,
    use_clinical_records=True,
    max_tokens=20000,
    max_iterations=50,
    output_dir="results"
)

Selective Data Source Usage

# Use only knowledge graph (no clinical records)
result = run_medcp(..., use_knowledge_graph=True, use_clinical_records=False)

# Use only clinical records (no knowledge graph)
result = run_medcp(..., use_knowledge_graph=False, use_clinical_records=True)

Available MCP Tools

When working with the system, Claude has access to these specialized tools:

Knowledge Graph Tools

  • get_knowledge_graph_schema(): Discover available biomedical entities and relationships
  • query_knowledge_graph(cypher_query, parameters): Execute read-only Cypher queries

Clinical Records Tools

  • list_clinical_tables(): List available clinical data tables with schemas
  • query_clinical_records(sql_query): Execute read-only SQL queries on patient data

Output and Results

Console Output

  • Real-time progress and analysis steps
  • Performance metrics (tokens used, elapsed time, tool calls)
  • Final medical analysis results

Markdown Files

  • Saved to configurable output directory (default: current directory)
  • Contains complete question, Claude's response, reasoning steps, and metrics
  • Structured format for documentation and review

Error Handling and Troubleshooting

Common Issues

  • "APOC plugin not installed": Install Neo4j APOC plugin and configure dbms.security.procedures.unrestricted=apoc.*
  • Connection timeouts: Increase max_tokens and max_iterations for complex queries
  • Rate limits: Reduce token limits or implement delays between queries

Performance Tuning

  • Start with smaller token limits (15000) for exploratory questions
  • Use selective data sources for focused analysis
  • Monitor token usage for cost management
  • Complex questions may require max_tokens=30000 and max_iterations=75

Integration with External Systems

Claude API

  • Uses Claude Sonnet 4 model (claude-sonnet-4-20250514)
  • Implements rate limit handling with automatic retries
  • Token usage tracking and reporting

Model Context Protocol (MCP)

  • Uses FastMCP for server implementation
  • Provides standardized interface for medical data access
  • Supports tool annotations for read-only, destructive, and idempotent operations

Development Notes

  • The system embeds the MCP server in the same process as the Claude client for efficiency
  • All connections are properly cleaned up with context managers and try-finally blocks
  • Logging is configurable and suppresses verbose framework output by default
  • Thread-safe implementation allows concurrent usage