Skip to content

Latest commit

 

History

History
255 lines (197 loc) · 9.54 KB

File metadata and controls

255 lines (197 loc) · 9.54 KB

Mutation Testing Results

Summary

Baseline: 67.7% mutation kill rate (134/198 caught, 64 missed) After improvements: 77.0% mutation kill rate (40/52 caught in retested files, 12 missed) Target: 90%+ mutation kill rate

Test Improvements

Added Schema Validation Tests

Files: crates/pforge-runtime/src/registry.rs, crates/pforge-runtime/src/handler.rs

Killed mutants:

  • registry.rs:104 - replace H::input_schema() with Default::default()
  • registry.rs:108 - replace H::output_schema() with Default::default()
  • handler.rs:115 - replace schema_for!(Input) with Default::default()
  • handler.rs:122 - replace schema_for!(Output) with Default::default()

Tests added:

#[tokio::test]
async fn test_schema_not_default() {
    // Verifies schemas are NOT Default::default()
    assert_ne!(
        serde_json::to_string(&input_schema).unwrap(),
        serde_json::to_string(&default_schema).unwrap()
    );
}

#[tokio::test]
async fn test_schema_properties() {
    // Verifies schema structure
    assert!(input_schema.schema.object.is_some());
}

Added Arithmetic/Boolean Logic Tests

File: crates/pforge-runtime/src/timeout.rs

Killed mutants:

  • timeout.rs:91 - replace * with + in backoff_duration (exponential backoff)
  • timeout.rs:91 - replace * with / in backoff_duration
  • timeout.rs:92 - replace + with * in jitter calculation
  • timeout.rs:92 - replace + with - in jitter calculation
  • timeout.rs:107 - replace || with && in is_retryable
  • timeout.rs:170 - replace < with == in retry loop
  • timeout.rs:170 - replace < with > in retry loop
  • timeout.rs:170 - replace < with <= in retry loop
  • timeout.rs:171 - replace - with + in backoff(attempt - 1)
  • timeout.rs:171 - replace - with / in backoff(attempt - 1)

Tests added:

#[test]
fn test_backoff_multiplier_exact() {
    // Verifies exponential backoff: 100 * 3^n
    assert_eq!(policy.backoff_duration(0).as_millis(), 100);
    assert_eq!(policy.backoff_duration(1).as_millis(), 300);
    assert_eq!(policy.backoff_duration(2).as_millis(), 900);
}

#[test]
fn test_is_retryable_logic() {
    // Verifies || logic (not &&)
    assert!(policy.is_retryable(&Error::Handler("timeout error".to_string())));
    assert!(!policy.is_retryable(&Error::Handler("fatal error".to_string())));
}

#[tokio::test]
async fn test_retry_attempt_comparison() {
    // Verifies attempt < max_attempts (not ==, >, <=)
    assert_eq!(counter.load(Ordering::SeqCst), 5); // Exactly 5 attempts
}

Remaining Surviving Mutants

High Priority (Middleware Methods)

Category: Empty Ok() returns in middleware Impact: Medium - Middleware methods returning early could skip processing

Surviving mutants:

  • timeout.rs:33 - replace TimeoutMiddleware::before with Ok(Default::default())
  • timeout.rs:37 - replace TimeoutMiddleware::after with Ok(Default::default())
  • timeout.rs:146 - replace RetryMiddleware::on_error with Ok(Default::default())
  • timeout.rs:136 - replace RetryMiddleware::policy with Default::default()
  • middleware.rs:11 - replace Middleware::before with Ok(Default::default())
  • middleware.rs:23 - replace Middleware::on_error with Ok(Default::default())
  • middleware.rs:101 - replace LoggingMiddleware::before with Ok(Default::default())
  • middleware.rs:110 - replace LoggingMiddleware::after with Ok(Default::default())
  • middleware.rs:119 - replace LoggingMiddleware::on_error with Ok(Default::default())
  • recovery.rs:294 - replace RecoveryMiddleware::after with Ok(Default::default())

Action needed: Add integration tests that verify middleware actually executes (e.g., logging produces output, timeout actually times out, recovery actually recovers)

Medium Priority (Pipeline & Prompt Logic)

Category: Boolean logic and deleted match arms

Surviving mutants:

  • handlers/pipeline.rs:61 - delete ! in condition evaluation
  • handlers/pipeline.rs:102 - replace == with != in pipeline
  • handlers/pipeline.rs:124 - replace evaluate_condition with false
  • handlers/pipeline.rs:124 - replace evaluate_condition with true
  • handlers/pipeline.rs:125 - delete ! in condition
  • handlers/pipeline.rs:137 - replace interpolate_variables with Default::default()
  • prompt.rs:81 - replace && with || in validate_arguments
  • prompt.rs:102 - delete match arm Value::Number
  • prompt.rs:103 - delete match arm Value::Bool
  • prompt.rs:104 - delete match arm Value::Null
  • prompt.rs:113 - replace && with || in interpolate

Action needed: Add tests for all pipeline conditions (true/false/negation) and prompt variable types

Medium Priority (State & Resource)

Category: Empty Ok() returns and deleted negations

Surviving mutants:

  • state.rs:98 - replace RedbStateManager::delete with Ok(())
  • state.rs:66 - replace exists with Ok(false)
  • state.rs:66 - replace exists with Ok(true)
  • resource.rs:22 - replace ResourceHandler::write with Ok(())
  • resource.rs:28 - replace ResourceHandler::subscribe with Ok(())
  • resource.rs:123 - replace ResourceManager::subscribe with Ok(())
  • resource.rs:127 - delete ! in subscribe
  • resource.rs:192 - replace list_templates with vec!["xyzzy"]
  • resource.rs:192 - replace list_templates with vec![]
  • resource.rs:192 - replace list_templates with vec![""]

Action needed: Add tests that verify state operations actually persist/retrieve data, resources actually write/subscribe

Low Priority (CLI Commands & Server)

Category: Empty Ok() returns that skip substantial work

Surviving mutants:

  • cli/commands/dev.rs:4 - replace execute with Ok(())
  • cli/commands/serve.rs:7 - replace execute with Ok(())
  • cli/commands/build.rs:5 - replace execute with Ok(())
  • cli/commands/new.rs:14 - replace execute with Ok(())
  • cli/main.rs:54 - replace main with Ok(())
  • server.rs:23 - replace register_handlers with Ok(())
  • server.rs:113 - replace McpServer::run with Ok(())
  • server.rs:138 - replace McpServer::registry with Default::default()

Action needed: Add CLI integration tests that verify commands actually execute their work

Low Priority (HTTP & Recovery Details)

Surviving mutants:

  • handlers/http.rs:91 - delete ! in HttpHandler
  • handlers/cli.rs:91 - delete - in CliHandler
  • recovery.rs:107 - delete match arm CircuitState::Closed
  • recovery.rs:126 - delete match arm CircuitState::HalfOpen
  • recovery.rs:224 - replace ErrorTracker::reset with ()

Action needed: Add edge case tests for HTTP errors, circuit breaker state transitions

Very Low Priority (Config)

Surviving mutants:

  • config/types.rs:88 - replace ToolDef::name with ""
  • config/types.rs:88 - replace ToolDef::name with "xyzzy"

Action needed: Property-based tests that tool names must be non-empty

Coverage by Category

Category Total Caught Missed Kill Rate
Schema methods 4 4 0 100% ✅
Arithmetic operators 6 6 0 100% ✅
Boolean operators 4 3 1 75%
Comparison operators 4 4 0 100% ✅
Middleware methods 10 0 10 0% ❌
Empty Ok() returns 15 3 12 20%
Match arm deletions 5 0 5 0% ❌
Other 16 12 4 75%

Next Steps to Reach 90%

  1. Add middleware integration tests (would kill ~10 mutants)

    • Test that timeouts actually time out
    • Test that logging produces output
    • Test that retry actually retries
    • Test that recovery handles errors
  2. Add pipeline/prompt edge case tests (would kill ~11 mutants)

    • Test all condition evaluation paths
    • Test all prompt variable types
    • Test variable interpolation
  3. Add state/resource persistence tests (would kill ~9 mutants)

    • Test that state operations persist data
    • Test that resources write/read correctly
    • Test resource subscriptions
  4. Add CLI integration tests (would kill ~5 mutants)

    • Test that commands execute (not just return Ok)
    • Test server actually starts
    • Test handlers actually register
  5. Add circuit breaker state tests (would kill ~3 mutants)

    • Test all state transitions
    • Test reset behavior

Estimated impact: ~38 additional mutants killed → ~95% mutation kill rate

Configuration

Running Mutation Tests

# All crates
cargo mutants --workspace --output mutants.out

# Specific files
cargo mutants --file crates/pforge-runtime/src/registry.rs

# With coverage report
cargo mutants --workspace --output mutants.out --json

CI/CD Integration (Pending)

Add to .github/workflows/quality.yml:

- name: Mutation Testing
  run: |
    cargo install cargo-mutants
    cargo mutants --workspace --json --output mutants-ci.json
    # Fail if kill rate < 90%
    cargo mutants --workspace --check --minimum-test-time 0.5

Mutation Testing Best Practices

  1. Target critical paths first - Focus on schema generation, error handling, retry logic
  2. Use exact assertions - assert_eq!(actual, expected) not assert!(actual == expected)
  3. Test edge cases - Boundary conditions, empty inputs, all match arms
  4. Verify behavior, not implementation - Test observable effects (logs, state changes, timeouts)
  5. Kill arithmetic mutants - Test exact calculations, not just "close enough"
  6. Kill boolean mutants - Test both true and false branches, verify operators

References