Skip to content

Latest commit

 

History

History
441 lines (356 loc) · 13.8 KB

File metadata and controls

441 lines (356 loc) · 13.8 KB

SingularityLLM Test Coverage Implementation Plan

Executive Summary

This plan addresses critical gaps in SingularityLLM's test coverage for user-facing API functionality. Analysis revealed that while infrastructure testing is strong, advanced features that users directly manipulate lack comprehensive test coverage, creating production reliability risks.

Critical Findings

Coverage Status

WELL TESTED        ████████████████████ 80%
├─ Provider Infrastructure
├─ System Prompts  
├─ Core Chat/Streaming
├─ Session Management
└─ Pipeline System

UNDER TESTED       ████░░░░░░░░░░░░░░░░ 20%
├─ Builder API Methods
├─ Vision/Multimodal
├─ Input Validation
└─ Embeddings

MISSING TESTS      ░░░░░░░░░░░░░░░░░░░░  0%
├─ File Management
├─ Knowledge Bases
├─ Context Caching
├─ Fine-tuning
├─ Assistants API
└─ Batch Processing

Risk Assessment

  • CRITICAL: 6 major feature areas with 0% test coverage
  • HIGH: Advanced APIs with minimal validation
  • MEDIUM: Pipeline customization gaps
  • LOW: Input boundary testing missing

Implementation Strategy

Approach: Hybrid Layer-by-Layer + Quick Wins

  1. Foundation First: Establish patterns and infrastructure
  2. High-Impact Features: Address most-used advanced capabilities
  3. Provider-Specific: Handle provider-dependent features
  4. Enterprise Features: Complete coverage for complex workflows

Dependency Chain

Test Infrastructure
        |
        v
Input Validation ──┐
        |          |
        v          v
Builder API    Quick Wins
        |          |
        v          v
Advanced Features ─┘
        |
        v
Provider-Specific Features
        |
        v
Enterprise Features

Phase 1: Foundation & Infrastructure

1.1 Test Infrastructure Setup

Files to Create:

  • test/support/advanced_feature_helpers.ex
  • Test fixtures for multimodal content
  • Enhanced mock provider capabilities

Key Components:

# test/support/advanced_feature_helpers.ex
defmodule SingularityLLM.Testing.AdvancedFeatureHelpers do
  def setup_mock_file_upload do
    # Mock file upload responses
  end
  
  def create_test_image_fixture do
    # Base64 test images for vision testing
  end
  
  def assert_api_lifecycle(create_fn, list_fn, get_fn, delete_fn) do
    # Reusable lifecycle testing pattern
  end
end

1.2 Input Validation & Builder API

Files to Create:

  • test/singularity_llm/input_validation_test.exs
  • test/singularity_llm/chat_builder_test.exs

Critical Tests:

# Input boundary testing
test "temperature boundary validation" do
  assert_raise FunctionClauseError, fn ->
    SingularityLLM.build(:openai, messages) |> with_temperature(3.0)
  end
end

# Pipeline manipulation testing  
test "insert_before maintains correct order" do
  pipeline = builder
    |> insert_before(ExecuteRequest, CustomPlug)
    |> inspect_pipeline()
  
  assert_pipeline_order(pipeline, [FetchConfig, CustomPlug, ExecuteRequest])
end

1.3 Test Visibility & Quick Wins

Placeholder Files to Create:

test/integration/
├── file_management_test.exs       (@tag :skip, @tag :file_management)
├── knowledge_base_test.exs        (@tag :skip, @tag :knowledge_base)  
├── context_caching_test.exs       (@tag :skip, @tag :context_caching)
├── fine_tuning_test.exs           (@tag :skip, @tag :fine_tuning)
├── assistants_test.exs            (@tag :skip, @tag :assistants)
└── batch_processing_test.exs      (@tag :skip, @tag :batch_processing)

Immediate Wins:

  • Test deprecated stream_chat/3 function
  • Expand configured?/1 testing for unconfigured providers
  • Test SingularityLLM.run/2 with custom pipelines
  • Add comprehensive session persistence testing

Phase 2: High-Impact Advanced Features

2.1 File Management API Testing

Primary Test File: test/integration/file_management_test.exs

Test Coverage:

describe "file lifecycle" do
  test "upload -> list -> get -> delete workflow" do
    # Full lifecycle using SingularityLLM.* functions
    {:ok, file} = SingularityLLM.upload_file(:openai, "test.pdf", opts)
    {:ok, files} = SingularityLLM.list_files(:openai)
    assert file.id in Enum.map(files, & &1.id)
    
    {:ok, retrieved} = SingularityLLM.get_file(:openai, file.id)
    assert retrieved.id == file.id
    
    :ok = SingularityLLM.delete_file(:openai, file.id)
    {:ok, updated_files} = SingularityLLM.list_files(:openai)
    refute file.id in Enum.map(updated_files, & &1.id)
  end
  
  test "handles file format validation" do
    # Test various file types, size limits
  end
  
  test "error scenarios" do
    # Invalid files, missing permissions, corrupted uploads
  end
end

2.2 Vision & Multimodal API Testing

Primary Test File: test/integration/vision_test.exs

Test Coverage:

describe "vision capabilities" do
  test "image loading from various sources" do
    # File paths, URLs, base64 encoding
    {:ok, image1} = SingularityLLM.load_image("test/fixtures/test.jpg")
    {:ok, image2} = SingularityLLM.load_image("data:image/jpeg;base64,...")
    
    message = SingularityLLM.vision_message("What's in this image?", [image1, image2])
    assert length(message.content) == 3  # text + 2 images
  end
  
  test "provider capability checking" do
    assert SingularityLLM.supports_vision?(:openai, "gpt-4-vision-preview")
    refute SingularityLLM.supports_vision?(:openai, "gpt-3.5-turbo")
  end
  
  test "format validation and error handling" do
    # Unsupported formats, corrupted images, size limits
  end
end

2.3 Embeddings & Search Testing

Primary Test File: test/integration/embeddings_test.exs

Test Coverage:

describe "embeddings generation" do
  test "single and batch input processing" do
    {:ok, response} = SingularityLLM.embeddings(:openai, "Hello world")
    assert is_list(response.embeddings)
    assert length(response.embeddings) == 1
    
    {:ok, batch_response} = SingularityLLM.embeddings(:openai, ["Hello", "World"])
    assert length(batch_response.embeddings) == 2
  end
  
  test "embedding index creation and search" do
    texts = ["Document 1 content", "Document 2 content"]
    {:ok, index} = SingularityLLM.create_embedding_index(:openai, texts)
    
    results = SingularityLLM.search_embeddings(index, "content query")
    assert is_list(results)
  end
end

Phase 3: Provider-Specific Advanced Features

3.1 Knowledge Base Operations Testing

Primary Test File: test/integration/knowledge_base_test.exs

Provider Focus: Gemini (primary), others as available

Test Structure:

describe "knowledge base lifecycle" do
  test "create and manage knowledge bases" do
    {:ok, kb} = SingularityLLM.create_knowledge_base(:gemini, "test-kb")
    {:ok, kbs} = SingularityLLM.list_knowledge_bases(:gemini)
    assert kb.name in Enum.map(kbs, & &1.name)
  end
  
  test "document management workflow" do
    {:ok, kb} = SingularityLLM.create_knowledge_base(:gemini, "docs-kb")
    
    document = %{title: "Test Doc", content: "Test content"}
    {:ok, doc} = SingularityLLM.add_document(:gemini, kb.name, document)
    
    {:ok, docs} = SingularityLLM.list_documents(:gemini, kb.name)
    assert doc.id in Enum.map(docs, & &1.id)
    
    results = SingularityLLM.semantic_search(:gemini, kb.name, "test query")
    assert is_list(results)
  end
end

3.2 Context Caching API Testing

Primary Test File: test/integration/context_caching_test.exs

Provider Focus: Anthropic (primary), others as available

3.3 Fine-tuning Workflows Testing

Primary Test File: test/integration/fine_tuning_test.exs

Provider Focus: OpenAI (primary), others as available

Phase 4: Enterprise Features & Completion

4.1 Assistants API Testing

Primary Test File: test/integration/assistants_test.exs

Complex Workflow Testing:

describe "assistants workflow" do
  test "complete assistant interaction" do
    # Create assistant
    {:ok, assistant} = SingularityLLM.create_assistant(:openai, 
      name: "Test Assistant",
      instructions: "You are helpful"
    )
    
    # Create thread
    {:ok, thread} = SingularityLLM.create_thread(:openai)
    
    # Add message and run
    {:ok, _message} = SingularityLLM.create_message(:openai, thread.id, "Hello")
    {:ok, run} = SingularityLLM.run_assistant(:openai, thread.id, assistant.id)
    
    # Verify execution
    assert run.status in ["queued", "in_progress", "completed"]
  end
end

4.2 Batch Processing Testing

Primary Test File: test/integration/batch_processing_test.exs

4.3 Final Integration & Documentation

Implementation Details

Test Infrastructure Patterns

Leveraging Existing Systems

  • 25x Test Caching: Minimize API costs for integration tests
  • Mock Providers: Use for development and CI environments
  • Tagging System: Follow established patterns for test organization
  • PUBLIC_API_TESTING.md: Maintain consistency with existing guidelines

New Patterns to Establish

# Lifecycle testing pattern
defmacro test_api_lifecycle(feature_name, create_fn, list_fn, get_fn, delete_fn) do
  quote do
    test "#{unquote(feature_name)} complete lifecycle" do
      # Standard create -> list -> get -> delete pattern
    end
  end
end

# Provider capability testing
defmacro test_provider_support(providers, feature_test_fn) do
  quote do
    for provider <- unquote(providers) do
      @tag "provider:#{provider}"
      test "#{provider} supports feature" do
        unquote(feature_test_fn).(provider)
      end
    end
  end
end

Error Handling Patterns

# Consistent error assertion patterns
def assert_api_error(fun, expected_error_type) do
  case fun.() do
    {:error, error} -> assert error.type == expected_error_type
    result -> flunk("Expected error, got: #{inspect(result)}")
  end
end

# Provider-specific error handling
def assert_provider_error(provider, fun, expected_errors) do
  expected = Map.get(expected_errors, provider, :generic_error)
  assert_api_error(fun, expected)
end

Resource Management

API Key Management

# Cost-effective testing strategy
def with_test_caching(test_name, fun) do
  if SingularityLLM.Testing.cache_available?(test_name) do
    SingularityLLM.Testing.get_cached_result(test_name)
  else
    result = fun.()
    SingularityLLM.Testing.cache_result(test_name, result)
    result
  end
end

Test Data Management

  • Image Fixtures: Small test images in multiple formats
  • Document Fixtures: Various file types for upload testing
  • Mock Responses: Comprehensive response libraries for offline testing

Success Metrics

Coverage Targets

  • 100% of user-facing API functions have integration tests
  • 95% of user-configurable parameters have boundary tests
  • 90% of error scenarios have explicit test coverage
  • 100% of advanced features have complete lifecycle tests

Performance Targets

  • Core test suite: Under 5 minutes execution time
  • Full integration suite: Under 30 minutes with caching
  • CI pipeline: No degradation in build times

Quality Targets

  • Zero regressions in existing functionality
  • Clear error messages for all validation failures
  • Comprehensive documentation for all new test patterns

Risk Mitigation

API Cost Control

  • Primary Strategy: Maximize use of 25x test caching system
  • Secondary Strategy: Mock providers for development and CI
  • Monitoring: Track API usage and costs per test suite run
  • Fallback: Graceful degradation when API limits reached

Provider Dependencies

  • Multi-provider Testing: Test against multiple providers where available
  • Provider Capability Detection: Automatic feature availability checking
  • Graceful Degradation: Skip tests for unsupported provider features
  • Mock Fallbacks: Use mock providers when live providers unavailable

Team Capacity

  • Incremental Delivery: Each phase delivers independent value
  • Flexible Prioritization: Can pause after any completed phase
  • Parallel Development: Provider-specific features can be developed concurrently
  • Knowledge Transfer: Clear documentation and patterns for team scaling

Next Steps

Immediate Actions (Days 1-5)

Day 1-2: Infrastructure Foundation

  1. Create test/support/advanced_feature_helpers.ex
  2. Set up test fixtures for multimodal content
  3. Enhance mock provider capabilities
  4. Establish API key management patterns

Day 3-4: Quick Wins Implementation

  1. Create test/singularity_llm/input_validation_test.exs
  2. Create test/singularity_llm/chat_builder_test.exs
  3. Implement pipeline manipulation testing using inspect_pipeline/1
  4. Add comprehensive boundary testing for user parameters

Day 5: Visibility & Planning

  1. Create all 6 placeholder test files with appropriate tags
  2. Update test documentation and contribution guidelines
  3. Establish development workflow and review process
  4. Validate test infrastructure with initial implementations

Success Validation

  • Week 1: Foundation complete, all gaps visible in test output
  • Week 2: Input validation and builder API fully tested
  • Week 6: High-impact advanced features covered
  • Week 10: Provider-specific features complete
  • Week 12: Full coverage of user-facing API functionality

Ongoing Maintenance

  • Weekly Reviews: Progress against coverage targets
  • Monthly Audits: Test suite performance and cost optimization
  • Quarterly Updates: New feature integration and pattern evolution
  • Annual Assessment: Comprehensive test strategy review

Plan Status: Ready for implementation
Next Phase: Begin with Day 1-2 infrastructure setup
Success Criteria: Complete test coverage for all user-manipulable API functionality