This plan addresses critical gaps in SingularityLLM's test coverage for user-facing API functionality. Analysis revealed that while infrastructure testing is strong, advanced features that users directly manipulate lack comprehensive test coverage, creating production reliability risks.
WELL TESTED ████████████████████ 80%
├─ Provider Infrastructure
├─ System Prompts
├─ Core Chat/Streaming
├─ Session Management
└─ Pipeline System
UNDER TESTED ████░░░░░░░░░░░░░░░░ 20%
├─ Builder API Methods
├─ Vision/Multimodal
├─ Input Validation
└─ Embeddings
MISSING TESTS ░░░░░░░░░░░░░░░░░░░░ 0%
├─ File Management
├─ Knowledge Bases
├─ Context Caching
├─ Fine-tuning
├─ Assistants API
└─ Batch Processing
- CRITICAL: 6 major feature areas with 0% test coverage
- HIGH: Advanced APIs with minimal validation
- MEDIUM: Pipeline customization gaps
- LOW: Input boundary testing missing
- Foundation First: Establish patterns and infrastructure
- High-Impact Features: Address most-used advanced capabilities
- Provider-Specific: Handle provider-dependent features
- Enterprise Features: Complete coverage for complex workflows
Test Infrastructure
|
v
Input Validation ───────┐
        |               |
        v               v
  Builder API       Quick Wins
        |               |
        v               v
Advanced Features ──────┘
|
v
Provider-Specific Features
|
v
Enterprise Features
Files to Create:
- test/support/advanced_feature_helpers.ex
- Test fixtures for multimodal content
- Enhanced mock provider capabilities
Key Components:
# test/support/advanced_feature_helpers.ex
defmodule SingularityLLM.Testing.AdvancedFeatureHelpers do
def setup_mock_file_upload do
# Mock file upload responses
end
def create_test_image_fixture do
# Base64 test images for vision testing
end
def assert_api_lifecycle(create_fn, list_fn, get_fn, delete_fn) do
# Reusable lifecycle testing pattern
end
end

Files to Create:
- test/singularity_llm/input_validation_test.exs
- test/singularity_llm/chat_builder_test.exs
Critical Tests:
# Input boundary testing
test "temperature boundary validation" do
assert_raise FunctionClauseError, fn ->
SingularityLLM.build(:openai, messages) |> with_temperature(3.0)
end
end
# Pipeline manipulation testing
test "insert_before maintains correct order" do
pipeline = builder
|> insert_before(ExecuteRequest, CustomPlug)
|> inspect_pipeline()
assert_pipeline_order(pipeline, [FetchConfig, CustomPlug, ExecuteRequest])
end

Placeholder Files to Create:
test/integration/
├── file_management_test.exs (@tag :skip, @tag :file_management)
├── knowledge_base_test.exs (@tag :skip, @tag :knowledge_base)
├── context_caching_test.exs (@tag :skip, @tag :context_caching)
├── fine_tuning_test.exs (@tag :skip, @tag :fine_tuning)
├── assistants_test.exs (@tag :skip, @tag :assistants)
└── batch_processing_test.exs (@tag :skip, @tag :batch_processing)
Immediate Wins:
- Test deprecated stream_chat/3 function
- Expand configured?/1 testing for unconfigured providers
- Test SingularityLLM.run/2 with custom pipelines
- Add comprehensive session persistence testing
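A minimal sketch of the configured?/1 quick win. It assumes credential lookup falls back to environment variables and that the mock provider is always configured — both assumptions should be checked against the actual library before implementation:

```elixir
describe "configured?/1 for unconfigured providers" do
  test "returns false when a provider has no credentials" do
    # Assumption: credential lookup falls back to env vars; clear the key first
    System.delete_env("OPENAI_API_KEY")
    refute SingularityLLM.configured?(:openai)
  end

  test "mock provider needs no credentials" do
    assert SingularityLLM.configured?(:mock)
  end
end
```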
Primary Test File: test/integration/file_management_test.exs
Test Coverage:
describe "file lifecycle" do
test "upload -> list -> get -> delete workflow" do
# Full lifecycle using SingularityLLM.* functions
{:ok, file} = SingularityLLM.upload_file(:openai, "test.pdf", opts)
{:ok, files} = SingularityLLM.list_files(:openai)
assert file.id in Enum.map(files, & &1.id)
{:ok, retrieved} = SingularityLLM.get_file(:openai, file.id)
assert retrieved.id == file.id
:ok = SingularityLLM.delete_file(:openai, file.id)
{:ok, updated_files} = SingularityLLM.list_files(:openai)
refute file.id in Enum.map(updated_files, & &1.id)
end
test "handles file format validation" do
# Test various file types, size limits
end
test "error scenarios" do
# Invalid files, missing permissions, corrupted uploads
end
end

Primary Test File: test/integration/vision_test.exs
Test Coverage:
describe "vision capabilities" do
test "image loading from various sources" do
# File paths, URLs, base64 encoding
{:ok, image1} = SingularityLLM.load_image("test/fixtures/test.jpg")
{:ok, image2} = SingularityLLM.load_image("data:image/jpeg;base64,...")
message = SingularityLLM.vision_message("What's in this image?", [image1, image2])
assert length(message.content) == 3 # text + 2 images
end
test "provider capability checking" do
assert SingularityLLM.supports_vision?(:openai, "gpt-4-vision-preview")
refute SingularityLLM.supports_vision?(:openai, "gpt-3.5-turbo")
end
test "format validation and error handling" do
# Unsupported formats, corrupted images, size limits
end
end

Primary Test File: test/integration/embeddings_test.exs
Test Coverage:
describe "embeddings generation" do
test "single and batch input processing" do
{:ok, response} = SingularityLLM.embeddings(:openai, "Hello world")
assert is_list(response.embeddings)
assert length(response.embeddings) == 1
{:ok, batch_response} = SingularityLLM.embeddings(:openai, ["Hello", "World"])
assert length(batch_response.embeddings) == 2
end
test "embedding index creation and search" do
texts = ["Document 1 content", "Document 2 content"]
{:ok, index} = SingularityLLM.create_embedding_index(:openai, texts)
results = SingularityLLM.search_embeddings(index, "content query")
assert is_list(results)
end
end

Primary Test File: test/integration/knowledge_base_test.exs
Provider Focus: Gemini (primary), others as available
Test Structure:
describe "knowledge base lifecycle" do
test "create and manage knowledge bases" do
{:ok, kb} = SingularityLLM.create_knowledge_base(:gemini, "test-kb")
{:ok, kbs} = SingularityLLM.list_knowledge_bases(:gemini)
assert kb.name in Enum.map(kbs, & &1.name)
end
test "document management workflow" do
{:ok, kb} = SingularityLLM.create_knowledge_base(:gemini, "docs-kb")
document = %{title: "Test Doc", content: "Test content"}
{:ok, doc} = SingularityLLM.add_document(:gemini, kb.name, document)
{:ok, docs} = SingularityLLM.list_documents(:gemini, kb.name)
assert doc.id in Enum.map(docs, & &1.id)
results = SingularityLLM.semantic_search(:gemini, kb.name, "test query")
assert is_list(results)
end
end

Primary Test File: test/integration/context_caching_test.exs
Provider Focus: Anthropic (primary), others as available
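This section has no example yet. The following is a hypothetical sketch of the lifecycle test, assuming functions named create_cached_context/3 and delete_cached_context/2 and a :cached_context chat option — the real API names must be confirmed against the library:

```elixir
describe "context caching lifecycle" do
  test "create -> use -> delete cached context" do
    long_document = File.read!("test/fixtures/long_document.txt")

    # Cache a large, reusable context once
    {:ok, cache} = SingularityLLM.create_cached_context(:anthropic, long_document, ttl: 300)

    # Subsequent chats reference the cache instead of resending the document
    {:ok, response} =
      SingularityLLM.chat(:anthropic, [%{role: "user", content: "Summarize the document"}],
        cached_context: cache.id
      )

    assert is_binary(response.content)

    :ok = SingularityLLM.delete_cached_context(:anthropic, cache.id)
  end
end
```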
Primary Test File: test/integration/fine_tuning_test.exs
Provider Focus: OpenAI (primary), others as available
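This section also lacks an example. A hypothetical sketch follows, assuming create_fine_tune/2, list_fine_tunes/1, and cancel_fine_tune/2 exist with these shapes (to be verified against the library):

```elixir
describe "fine-tuning job lifecycle" do
  test "create, inspect, and cancel a job" do
    {:ok, file} =
      SingularityLLM.upload_file(:openai, "test/fixtures/training.jsonl", purpose: "fine-tune")

    {:ok, job} =
      SingularityLLM.create_fine_tune(:openai, training_file: file.id, model: "gpt-3.5-turbo")

    {:ok, jobs} = SingularityLLM.list_fine_tunes(:openai)
    assert job.id in Enum.map(jobs, & &1.id)

    # Cancel immediately so integration runs stay cheap
    {:ok, cancelled} = SingularityLLM.cancel_fine_tune(:openai, job.id)
    assert cancelled.status in ["cancelled", "cancelling"]
  end
end
```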
Primary Test File: test/integration/assistants_test.exs
Complex Workflow Testing:
describe "assistants workflow" do
test "complete assistant interaction" do
# Create assistant
{:ok, assistant} = SingularityLLM.create_assistant(:openai,
name: "Test Assistant",
instructions: "You are helpful"
)
# Create thread
{:ok, thread} = SingularityLLM.create_thread(:openai)
# Add message and run
{:ok, _message} = SingularityLLM.create_message(:openai, thread.id, "Hello")
{:ok, run} = SingularityLLM.run_assistant(:openai, thread.id, assistant.id)
# Verify execution
assert run.status in ["queued", "in_progress", "completed"]
end
end

Primary Test File: test/integration/batch_processing_test.exs
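As with the other missing-coverage areas, a hypothetical sketch can anchor this file. It assumes create_batch/2 and a polling helper wait_for_batch/3, plus a request_counts field on the batch struct — all names to be confirmed:

```elixir
describe "batch processing" do
  test "submit and poll a batch of requests" do
    requests = [
      %{custom_id: "req-1", messages: [%{role: "user", content: "Hello"}]},
      %{custom_id: "req-2", messages: [%{role: "user", content: "World"}]}
    ]

    {:ok, batch} = SingularityLLM.create_batch(:openai, requests)
    assert batch.status in ["validating", "in_progress", "completed"]

    # Poll until a terminal state, bounded so the test cannot hang
    {:ok, final} = SingularityLLM.wait_for_batch(:openai, batch.id, timeout: 60_000)
    assert final.request_counts.total == 2
  end
end
```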
- 25x Test Caching: Minimize API costs for integration tests
- Mock Providers: Use for development and CI environments
- Tagging System: Follow established patterns for test organization
- PUBLIC_API_TESTING.md: Maintain consistency with existing guidelines
# Lifecycle testing pattern
defmacro test_api_lifecycle(feature_name, create_fn, list_fn, get_fn, delete_fn) do
quote do
test "#{unquote(feature_name)} complete lifecycle" do
# Standard create -> list -> get -> delete pattern
end
end
end
# Provider capability testing
defmacro test_provider_support(providers, feature_test_fn) do
quote do
for provider <- unquote(providers) do
@tag provider: provider
test "#{provider} supports feature" do
unquote(feature_test_fn).(provider)
end
end
end
end

# Consistent error assertion patterns
def assert_api_error(fun, expected_error_type) do
case fun.() do
{:error, error} -> assert error.type == expected_error_type
result -> flunk("Expected error, got: #{inspect(result)}")
end
end
# Provider-specific error handling
def assert_provider_error(provider, fun, expected_errors) do
expected = Map.get(expected_errors, provider, :generic_error)
assert_api_error(fun, expected)
end

# Cost-effective testing strategy
def with_test_caching(test_name, fun) do
if SingularityLLM.Testing.cache_available?(test_name) do
SingularityLLM.Testing.get_cached_result(test_name)
else
result = fun.()
SingularityLLM.Testing.cache_result(test_name, result)
result
end
end

- Image Fixtures: Small test images in multiple formats
- Document Fixtures: Various file types for upload testing
- Mock Responses: Comprehensive response libraries for offline testing
- 100% of user-facing API functions have integration tests
- 95% of user-configurable parameters have boundary tests
- 90% of error scenarios have explicit test coverage
- 100% of advanced features have complete lifecycle tests
- Core test suite: Under 5 minutes execution time
- Full integration suite: Under 30 minutes with caching
- CI pipeline: No degradation in build times
- Zero regressions in existing functionality
- Clear error messages for all validation failures
- Comprehensive documentation for all new test patterns
- Primary Strategy: Maximize use of 25x test caching system
- Secondary Strategy: Mock providers for development and CI
- Monitoring: Track API usage and costs per test suite run
- Fallback: Graceful degradation when API limits reached
- Multi-provider Testing: Test against multiple providers where available
- Provider Capability Detection: Automatic feature availability checking
- Graceful Degradation: Skip tests for unsupported provider features
- Mock Fallbacks: Use mock providers when live providers unavailable
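The capability-detection and graceful-degradation points above could share one helper. A sketch assuming a generic SingularityLLM.supports?/2 capability check (the plan only confirms supports_vision?/2, so adjust to the real API):

```elixir
# Hypothetical helper: run a test body only when the provider has a capability,
# otherwise report a skip instead of failing.
def with_capability(provider, capability, fun) do
  if SingularityLLM.supports?(provider, capability) do
    fun.()
  else
    # Graceful degradation: surface the skip for visibility in test output
    IO.puts("Skipping #{capability} test: unsupported by #{provider}")
    :skipped
  end
end
```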
- Incremental Delivery: Each phase delivers independent value
- Flexible Prioritization: Can pause after any completed phase
- Parallel Development: Provider-specific features can be developed concurrently
- Knowledge Transfer: Clear documentation and patterns for team scaling
- Create test/support/advanced_feature_helpers.ex
- Set up test fixtures for multimodal content
- Enhance mock provider capabilities
- Establish API key management patterns
- Create test/singularity_llm/input_validation_test.exs
- Create test/singularity_llm/chat_builder_test.exs
- Implement pipeline manipulation testing using inspect_pipeline/1
- Add comprehensive boundary testing for user parameters
- Create all 6 placeholder test files with appropriate tags
- Update test documentation and contribution guidelines
- Establish development workflow and review process
- Validate test infrastructure with initial implementations
- Week 1: Foundation complete, all gaps visible in test output
- Week 2: Input validation and builder API fully tested
- Week 6: High-impact advanced features covered
- Week 10: Provider-specific features complete
- Week 12: Full coverage of user-facing API functionality
- Weekly Reviews: Progress against coverage targets
- Monthly Audits: Test suite performance and cost optimization
- Quarterly Updates: New feature integration and pattern evolution
- Annual Assessment: Comprehensive test strategy review
Plan Status: Ready for implementation
Next Phase: Begin with Day 1-2 infrastructure setup
Success Criteria: Complete test coverage for all user-manipulable API functionality