CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Development Commands

Environment Setup

# Initial setup (using uv for modern Python package management)
make setup-env    # Create .venv using uv
make install      # Install all dependencies via uv sync
source .venv/bin/activate  # Activate virtual environment

# Database setup
docker-compose up -d  # Start IRIS database
make setup-db         # Initialize database
make load-data        # Load sample data

Testing

# Run tests using the script runner
./scripts/ci/run-tests.sh           # Run all tests
./scripts/ci/run-tests.sh -t unit   # Run only unit tests
./scripts/ci/run-tests.sh -t integration -v  # Integration tests with verbose output
./scripts/ci/run-tests.sh -p -c     # Parallel execution without coverage

# Direct pytest execution
pytest tests/                       # All tests
pytest tests/unit/                  # Unit tests only
pytest tests/integration/           # Integration tests only
pytest tests/e2e/                   # End-to-end tests only
pytest --cov=iris_rag --cov=rag_templates  # With coverage

# Backend mode testing (Feature 035)
make test-community                 # Test with Community Edition mode (1 connection)
make test-enterprise                # Test with Enterprise Edition mode (999 connections)
make test-backend-contracts         # Run backend mode contract tests
IRIS_BACKEND_MODE=community pytest tests/  # Manual backend mode override

Linting and Formatting

# Format code (apply isort and black per pyproject.toml configuration)
black .
isort .

# Lint code
flake8 .
mypy iris_rag/

Docker Operations

# Core services
make docker-up        # Start core services (IRIS, Redis, API, Streamlit)
make docker-down      # Stop all services
make docker-logs      # View logs from all services

# Development environment
make docker-up-dev    # Start with Jupyter notebook
make docker-shell     # Open shell in API container
make docker-iris-shell  # Open IRIS database shell

# Full development setup
make docker-dev       # Start dev environment, wait for health, init data

RAGAS Evaluation

# Quick evaluation on sample data
make test-ragas-sample

# Full evaluation on 1000 PMC documents
make test-ragas-1000

# Dockerized evaluation
make test-ragas-sample-docker
make test-ragas-1000-docker

Architecture Overview

Core Framework Structure

  • iris_rag/: Main RAG framework package
    • core/: Abstract base classes (RAGPipeline, VectorStore) and models
    • pipelines/: RAG pipeline implementations (BasicRAG, CRAG, GraphRAG, HybridGraphRAG)
    • storage/: Vector store implementations, primarily IRISVectorStore
    • services/: Business logic services (entity extraction, storage management)
    • config/: Configuration management and pipeline-specific configs
    • validation/: Pipeline validation and requirements checking
    • memory/: Memory management and incremental indexing components
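
To make this layout concrete, a minimal import sketch follows; the class names come from the list above, but the exact re-export paths within each subpackage are assumptions.

# Sketch only: exact import paths within the subpackages are assumptions
from iris_rag import create_pipeline                  # top-level pipeline factory
from iris_rag.core import RAGPipeline, VectorStore    # abstract base classes
from iris_rag.storage import IRISVectorStore          # IRIS-backed vector store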

Available RAG Pipelines (via create_pipeline())

  1. basic → BasicRAGPipeline - Standard vector similarity search
  2. basic_rerank → BasicRAGRerankingPipeline - Vector search + cross-encoder reranking
  3. crag → CRAGPipeline - Corrective RAG with self-evaluation
  4. graphrag → HybridGraphRAGPipeline - Hybrid search (vector + text + graph + RRF)
  5. pylate_colbert → PyLateColBERTPipeline - ColBERT late interaction retrieval

Additional Pipeline (Direct Import):

  6. IRIS-Global-GraphRAG - Academic papers with 3D visualization and global communities

Key Integration Points

  • Vector Database: InterSystems IRIS with native vector search capabilities
  • LLM Integration: OpenAI and Anthropic APIs via common.utils.get_llm_func (see the sketch after this list)
  • Bridge Adapters: Generic memory components for external system integration
  • Validation Framework: Automated pipeline requirement validation and setup
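
A minimal sketch of wiring an LLM into a pipeline via common.utils.get_llm_func; the get_llm_func arguments and the llm_func parameter on create_pipeline are assumptions, not confirmed API.

# Sketch only: get_llm_func signature and the llm_func parameter are assumptions
from common.utils import get_llm_func
from iris_rag import create_pipeline

llm_func = get_llm_func()                # assumed to return a prompt -> answer callable
pipeline = create_pipeline(
    pipeline_type="basic",
    llm_func=llm_func,                   # hypothetical parameter name
)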

Pipeline Factory Pattern

from iris_rag import create_pipeline

# Create with validation (recommended)
pipeline = create_pipeline(
    pipeline_type="basic",           # basic, basic_rerank, crag, graphrag, pylate_colbert
    validate_requirements=True,       # Auto-validate DB setup
    auto_setup=False                 # Auto-fix issues if True
)

# All pipelines share the same standardized API
result = pipeline.query(query="What is diabetes?", top_k=5)

# Standardized response format (100% LangChain & RAGAS compatible):
# - result["answer"]: LLM-generated answer
# - result["retrieved_documents"]: List[Document] with full metadata
# - result["contexts"]: List[str] for RAGAS evaluation
# - result["sources"]: Source references in metadata
# - result["metadata"]: Pipeline-specific metadata fields
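
As a short follow-up, the standardized result can be consumed directly; the keys below are exactly those documented above, while the metadata access assumes the LangChain-style Document interface.

# Consume the standardized response (keys as documented above)
print(result["answer"])                        # LLM-generated answer
for doc in result["retrieved_documents"]:      # Document objects with metadata
    print(doc.metadata.get("source"))          # assumes LangChain-style metadata dict
contexts = result["contexts"]                  # plain strings, ready for RAGAS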

Testing Architecture

  • Unit Tests: tests/unit/ - Component-level testing
  • Integration Tests: tests/integration/ - Cross-component functionality
  • E2E Tests: tests/e2e/ - Full pipeline workflows
  • Contract Tests: tests/contract/ - API contract validation (TDD approach)
  • Enterprise Scale Tests: 10K document testing with mocking support

Test Fixture Strategy (.DAT Fixture-First Principle)

Constitutional Requirement: All integration and E2E tests with ≥10 entities MUST use .DAT fixtures loaded via iris-devtools. See .specify/memory/constitution.md for complete IRIS testing principles.

Performance Benefits:

  • .DAT fixtures: 0.5-2 seconds for 100 entities (binary IRIS format)
  • JSON fixtures: 39-75 seconds for same data
  • Speedup: 100-200x faster test execution

When to Use What:

Need test data?
├─ Unit test (mocked components)?
│  └─ Use programmatic fixtures (Python code)
│
├─ Integration test (real IRIS database)?
│  ├─ < 10 entities or simple data?
│  │  └─ Use programmatic fixtures
│  │
│  └─ ≥ 10 entities or complex relationships?
│     └─ Use .DAT fixtures (REQUIRED)
│
└─ E2E test (full pipeline)?
   └─ Use .DAT fixtures (REQUIRED)

Fixture Management Commands:

# List available fixtures
make fixture-list

# Get fixture details
make fixture-info FIXTURE=medical-graphrag-20

# Load fixture into IRIS
make fixture-load FIXTURE=medical-graphrag-20

# Create new fixture from current database
make fixture-create FIXTURE=my-test-data

# Validate fixture integrity
make fixture-validate FIXTURE=medical-graphrag-20

Using Fixtures in Tests:

# Automatic fixture loading via pytest marker
@pytest.mark.dat_fixture("medical-graphrag-20")
def test_with_fixture():
    # Fixture automatically loaded before test
    # Database contains 21 entities, 15 relationships
    pass

# Manual fixture loading via FixtureManager
from tests.fixtures.manager import FixtureManager

def test_manual_load():
    manager = FixtureManager()
    result = manager.load_fixture(
        fixture_name="medical-graphrag-20",
        cleanup_first=True,
        validate_checksum=True,
    )
    assert result.success

Fixture Infrastructure (✅ Production Ready): The unified fixture infrastructure provides:

  • Fast .DAT Loading: 100-200x faster than JSON (via iris-devtools)
  • Checksum Validation: SHA256 integrity checking for data consistency
  • Version Management: Semantic versioning with migration history tracking
  • State Tracking: Session-wide fixture state to prevent schema loops
  • pytest Integration: Automatic cleanup via @pytest.mark.dat_fixture decorator

Fixture Documentation:

  • Complete Status: FIXTURE_INFRASTRUCTURE_COMPLETE.md (implementation overview)
  • CLI Reference: python -m tests.fixtures.cli --help
  • API Documentation: tests/fixtures/manager.py (FixtureManager class)
  • Constitution: .specify/memory/constitution.md (Principle II)

Backend Mode Configuration (Feature 035)

Purpose: Prevent license pool exhaustion in IRIS Community Edition while allowing parallel execution in Enterprise Edition.

Modes:

  • Community: Single connection limit, sequential test execution
  • Enterprise: 999 connections, parallel test execution

Configuration Precedence (highest to lowest):

  1. IRIS_BACKEND_MODE environment variable
  2. .specify/config/backend_modes.yaml file
  3. Default (community mode)

Usage Examples:

# Pytest fixtures (auto-configured)
def test_example(iris_connection, backend_configuration):
    assert backend_configuration.max_connections == 1  # community mode

# Manual configuration
from iris_rag.testing import load_configuration, ConnectionPool

config = load_configuration()
pool = ConnectionPool(mode=config.mode)
with pool.acquire() as conn:
    # Use connection
    pass
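
The environment-variable override (highest precedence) can be exercised like this; setting the variable from Python is for illustration only, and the max_connections attribute on the loaded configuration is an assumption based on the fixture example above.

# Demonstrate precedence: IRIS_BACKEND_MODE wins over backend_modes.yaml and the default
import os
from iris_rag.testing import load_configuration

os.environ["IRIS_BACKEND_MODE"] = "enterprise"   # or "community"
config = load_configuration()
print(config.mode, config.max_connections)       # expected: enterprise, 999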

Troubleshooting:

  • License pool exhaustion: Switch to IRIS_BACKEND_MODE=community
  • Tests timing out: Check connection pool limits with config.max_connections
  • Edition mismatch error: Set IRIS_BACKEND_MODE to match your IRIS edition

Configuration Management

  • Default Config: iris_rag/config/default_config.yaml
  • Pipeline Configs: config/pipelines.yaml
  • Environment: .env file for API keys and database connections
  • Docker Compose: Multiple compose files for different deployment scenarios

HybridGraphRAG Required Dependencies

The HybridGraphRAG pipeline requires iris-vector-graph for operation:

Installation:

pip install rag-templates[hybrid-graphrag]

This installs the iris-vector-graph package providing iris_graph_core integration for 50x performance improvements.

Requirements:

  • iris-vector-graph>=1.6.0 is now a mandatory dependency
  • No fallback mechanisms - the pipeline fails fast with a clear error message if the package is missing (see the sketch after this list)
  • All retrieval methods (hybrid, rrf, text, vector, kg) require iris-vector-graph
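
A minimal sketch of that fail-fast behavior; the exact exception type raised when iris-vector-graph is absent is an assumption.

# Sketch only: the exception type is an assumption
from iris_rag import create_pipeline

try:
    pipeline = create_pipeline(pipeline_type="graphrag", validate_requirements=True)
except ImportError as exc:
    print(f"HybridGraphRAG unavailable, install rag-templates[hybrid-graphrag]: {exc}")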

Testing GraphRAG Pipelines

Important: HybridGraphRAG integration tests are intentionally skipped in CI because they require:

  1. Configured LLM for entity extraction from documents
  2. iris-vector-graph tables populated with embeddings and optimized indexes
  3. Full knowledge graph (entities + relationships) extracted from documents

Test fixtures cannot provide this setup because:

  • Entity extraction requires LLM API calls (not available/practical in test environment)
  • iris-vector-graph requires optimized HNSW tables with real embeddings
  • Simple 3-document fixtures cannot replicate the complexity of real knowledge graphs

Three-Tier Testing Strategy:

GraphRAG testing uses a pragmatic three-tier approach:

Tier 1: Contract Tests (Automated CI) ✅

pytest tests/contract/test_graphrag_fixtures.py  # 13/13 passing

  • Purpose: Validate API interfaces and fixture loading
  • Coverage: Data structures, fixture service, validation logic
  • Run in CI: Yes - fast (< 1s), reliable, no dependencies
  • When to run: Always (part of standard test suite)

Tier 2: Realistic Integration Tests (Manual, Development) ℹ️

# Run against real database with 221K+ entities
IRIS_PORT=21972 pytest tests/integration/test_graphrag_realistic.py -v
IRIS_PORT=21972 pytest tests/integration/test_graphrag_with_real_data.py -v

  • Purpose: Validate GraphRAG against production-like data
  • Coverage: KG traversal, vector fallback, metadata completeness
  • Run in CI: No - requires IRIS_PORT environment configuration
  • When to run: During development, before major releases
  • Database requirement: 100+ entities, 50+ relationships

Tier 3: E2E HybridGraphRAG Tests (Skipped) ⏭️

pytest tests/integration/test_hybridgraphrag_e2e.py  # All skipped with clear reasons

  • Purpose: End-to-end validation of all 5 query methods
  • Status: Intentionally skipped - requires LLM + iris-vector-graph setup
  • Alternative: Manual testing with real data (see below)

Why Integration Tests are Skipped:

  • Previous "passing" integration tests were false positives - they used 2,376 pre-existing documents in the database, not the 3-document test fixtures
  • Maintaining complex LLM mocking + iris-vector-graph setup is brittle and provides little value
  • Contract tests plus manual validation with real data provide a better signal

Data Flow

  1. Document Ingestion: Load documents via pipeline.load_documents()
  2. Chunking & Embedding: Automatic text segmentation and vector generation
  3. Storage: Vectors and metadata stored in IRIS vector tables
  4. Query Processing: Multi-modal retrieval (vector, text, graph) depending on pipeline
  5. Generation: LLM synthesis with retrieved context
  6. Response: Standardized response format with sources and metadata
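
A minimal end-to-end sketch of this flow; the load_documents argument name is hypothetical.

# Steps 1-3 (ingestion, chunking/embedding, storage) happen inside load_documents()
from iris_rag import create_pipeline

pipeline = create_pipeline(pipeline_type="basic", validate_requirements=True)
pipeline.load_documents(documents_path="data/sample_docs")   # hypothetical argument name

# Steps 4-6: retrieval, generation, standardized response
result = pipeline.query(query="What is diabetes?", top_k=5)
print(result["answer"], result["sources"])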

REST API (Production-Grade)

Location: iris_rag/api/

The REST API provides production-ready HTTP endpoints for all RAG pipelines with enterprise features:

Features:

  • API key authentication (bcrypt-hashed)
  • Three-tier rate limiting (60/100/1000 requests/min)
  • Request/response logging with audit trail
  • WebSocket streaming for real-time progress
  • Async document upload with validation
  • Health monitoring for all components
  • Elasticsearch-inspired error responses
  • 100% LangChain & RAGAS compatible

Quick Start:

# Setup database tables
make api-setup-db

# Create API key
make api-create-key NAME="My Key" EMAIL=user@example.com

# Start server (development mode)
make api-run

# Start server (production mode, 4 workers)
make api-run-prod

# Open API documentation
make api-docs  # http://localhost:8000/docs

CLI Commands:

# Server operations
python -m iris_rag.api.cli run [--host HOST] [--port PORT] [--workers N] [--reload]
python -m iris_rag.api.cli health

# API key management
python -m iris_rag.api.cli create-key --name NAME --owner-email EMAIL [--tier TIER]
python -m iris_rag.api.cli list-keys [--owner-email EMAIL]
python -m iris_rag.api.cli revoke-key --key-id KEY_ID

# Database operations
python -m iris_rag.api.cli setup-db

# Cleanup job (run daily via cron)
python -m iris_rag.api.cleanup_job

API Endpoints:

  • POST /api/v1/{pipeline}/_search - Execute query (requires auth)
  • GET /api/v1/pipelines - List available pipelines (public)
  • GET /api/v1/pipelines/{name} - Get pipeline details (public)
  • POST /api/v1/documents/upload - Upload documents (requires write permission)
  • GET /api/v1/documents/operations/{id} - Track upload progress
  • GET /api/v1/health - System health check (public)
  • WS /ws - WebSocket streaming (requires auth)

Authentication:

# All requests (except /health and /pipelines) require API key
Authorization: ApiKey <base64(key_id:key_secret)>

# Example
Authorization: ApiKey N2M5ZTY2NzktNzQyNS00MGRlLTk0NGItZTA3ZmMxZjkwYWU3Om15X3NlY3JldF9rZXk=
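
The same header can be built programmatically; a minimal Python sketch using requests, with placeholder credentials.

# Build the ApiKey header as base64(key_id:key_secret) and call the search endpoint
import base64
import requests

key_id, key_secret = "your-key-id", "your-key-secret"    # placeholders
token = base64.b64encode(f"{key_id}:{key_secret}".encode()).decode()

resp = requests.post(
    "http://localhost:8000/api/v1/basic/_search",
    headers={"Authorization": f"ApiKey {token}", "Content-Type": "application/json"},
    json={"query": "What are the symptoms of diabetes?", "top_k": 5},
)
print(resp.json()["answer"])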

Query Example:

curl -X POST http://localhost:8000/api/v1/basic/_search \
  -H "Authorization: ApiKey <your-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the symptoms of diabetes?",
    "top_k": 5
  }'

Response Format (RAGAS compatible):

{
  "response_id": "uuid",
  "request_id": "uuid",
  "answer": "Generated answer text...",
  "retrieved_documents": [
    {
      "doc_id": "uuid",
      "content": "Document text...",
      "score": 0.95,
      "metadata": {"source": "file.pdf", "page_number": 127}
    }
  ],
  "sources": ["file.pdf"],
  "contexts": ["Document text..."],
  "pipeline_name": "basic",
  "execution_time_ms": 1456,
  "retrieval_time_ms": 345,
  "generation_time_ms": 1089,
  "tokens_used": 2345
}

Rate Limiting:

| Tier       | Requests/Minute | Requests/Hour | Max Concurrent |
|------------|-----------------|---------------|----------------|
| Basic      | 60              | 1,000         | 5              |
| Premium    | 100             | 5,000         | 10             |
| Enterprise | 1,000           | 50,000        | 20             |

Error Handling (Elasticsearch-inspired):

{
  "error": {
    "type": "validation_exception",
    "reason": "Invalid parameter value",
    "details": {
      "field": "top_k",
      "rejected_value": -5,
      "message": "Must be positive integer between 1 and 100",
      "min_value": 1,
      "max_value": 100
    }
  }
}

Database Cleanup:

# Run cleanup job manually
python -m iris_rag.api.cleanup_job

# Schedule with cron (daily at 2 AM)
0 2 * * * cd /path/to/rag-templates && .venv/bin/python -m iris_rag.api.cleanup_job >> logs/cleanup.log 2>&1

Testing:

make api-test                 # Run all API tests
make api-test-contracts       # Run contract tests (TDD)
make api-test-integration     # Run integration tests

Configuration: config/api_config.yaml

server:
  host: 0.0.0.0
  port: 8000
  workers: 4

database:
  pool_size: 20
  max_overflow: 10

pipelines:
  enabled: [basic, basic_rerank, crag, graphrag, pylate_colbert]

rate_limiting:
  max_concurrent_per_key: 10

logging:
  retention_days: 30

Complete Documentation: iris_rag/api/README.md

Git Workflow (Three-Tier Repository Strategy)

Repository Structure

The project uses three repositories for selective public sharing:

origin (private)    → isc-tdyar/iris-vector-rag-private
fork (public)       → isc-tdyar/iris-vector-rag
upstream (community)→ intersystems-community/iris-vector-rag

Current remotes:

git remote -v
# origin    https://github.com/isc-tdyar/iris-vector-rag-private.git
# fork      https://github.com/isc-tdyar/iris-vector-rag.git
# upstream  https://github.com/intersystems-community/iris-vector-rag.git

Daily Development Workflow

1. Private Work (Default)

# Work on features privately
git commit -am "feat: experimental feature"
git push origin main  # Push to private repo only

2. Selective Public Sharing

# Cherry-pick commits for public release
git checkout -b public/feature-name
git cherry-pick <commit-hash>  # Select specific commits
git push fork public/feature-name

# Create PR on GitHub: fork:public/feature-name → upstream:main

3. Emergency Sync (Rare)

# Sync all repositories immediately (requires write access)
git push origin main && git push fork main && git push upstream main

Release Workflow

Version Bump and Publish:

# 1. Update version
vim pyproject.toml  # version = "0.5.x"

# 2. Build and publish to PyPI
uv build
twine upload dist/iris_vector_rag-*.whl dist/iris_vector_rag-*.tar.gz

# 3. Commit and tag
git commit -am "chore: bump version to 0.5.x"
git tag -a v0.5.x -m "Release v0.5.x"

# 4. Push to all repositories
git push origin main
git push fork main
git push upstream main
git push --tags

IMPORTANT: Always use twine for PyPI publishing (not uv publish). See Constitution Principle X.

Troubleshooting Git Operations

Divergent branches error:

# Use merge strategy to reconcile
git pull fork main --no-rebase --no-edit

Check remote status:

git remote -v
git fetch --all
git log --oneline --graph --all --decorate -10

See Also

  • Complete Git Guide: CONTRIBUTING.md
  • Constitution: .specify/memory/constitution.md (Principle XI)

Active Technologies

  • Python 3.10+; the existing codebase targets 3.10-3.12 (051-enterprise-enhancements)
  • InterSystems IRIS database, using the existing RAG.SourceDocuments table (051-enterprise-enhancements)

Recent Changes

  • 051-enterprise-enhancements: Added Python 3.10+ (existing codebase uses 3.10-3.12)