
feat: add smoke tests for CLI integration testing#14

Merged
mldangelo merged 14 commits into main from feat/add-smoke-tests
Jan 11, 2026

Conversation

@mldangelo
Member

Summary

This PR adds comprehensive smoke tests to verify end-to-end CLI functionality using the echo provider.

Smoke tests are high-level integration tests that run against the actual installed promptfoo CLI via subprocess, testing the Python wrapper integration with the Node.js CLI.

What's Added

Smoke Test Suite

20 smoke tests covering:

Basic CLI Operations (5 tests)

  • Version flag
  • Help output
  • Unknown command handling
  • Missing file error handling

Eval Command (7 tests)

  • Basic evaluation with echo provider
  • Output formats (JSON, YAML, CSV)
  • Command-line flags:
    • --max-concurrency
    • --repeat
    • --verbose
    • --no-cache

Exit Codes (3 tests)

  • Exit code 0 for success
  • Exit code 100 for assertion failures
  • Exit code 1 for configuration errors
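This exit-code contract is small enough to capture in a helper; a hypothetical sketch (the function name is ours, the code values are the ones the suite tests):

```python
def describe_exit_code(code: int) -> str:
    """Map promptfoo eval exit codes to their documented meanings.

    Hypothetical helper; the values follow the contract exercised
    by the exit-code smoke tests.
    """
    meanings = {
        0: "success: all assertions passed",
        1: "configuration or runtime error",
        100: "one or more assertions failed",
    }
    return meanings.get(code, f"unexpected exit code {code}")
```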

Echo Provider (2 tests)

  • Basic prompt echoing
  • Multiple variable handling

Assertions (3 tests)

  • contains assertion
  • icontains assertion (case-insensitive)
  • Failing assertion behavior

Test Fixtures

  • fixtures/configs/basic.yaml - Simple test with passing assertions
  • fixtures/configs/failing-assertion.yaml - Test with failing assertion
  • fixtures/configs/assertions.yaml - Multiple assertions test
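For orientation, a fixture like basic.yaml plausibly looks something like this. This is an illustrative sketch, not the actual file; the prompt and variable values are inferred from the "Hello"/"World" checks in the tests:

```yaml
# Illustrative sketch of a minimal echo-provider fixture
description: Basic smoke test
providers:
  - echo
prompts:
  - "Hello, {{name}}!"
tests:
  - vars:
      name: World
    assert:
      - type: contains
        value: Hello
      - type: contains
        value: World
```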

Configuration

  • Added pytest configuration with smoke marker for selective testing
  • Updated pyproject.toml with pytest options
  • Comprehensive README documenting smoke test purpose and usage
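With the smoke marker registered in pyproject.toml, tests opt in via `pytest.mark`; a minimal sketch (the placeholder test name is ours):

```python
import pytest

# Module-level marker: everything in this file counts as a smoke test,
# selectable with `pytest -m smoke` and excludable with `-m 'not smoke'`.
pytestmark = pytest.mark.smoke


@pytest.mark.smoke
def test_cli_starts() -> None:
    # Placeholder body; real smoke tests shell out to the CLI.
    assert True
```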

Why Echo Provider?

The echo provider is perfect for smoke tests because:

  1. No external dependencies - No API keys or network calls required
  2. Deterministic - Always returns the same output for the same input
  3. Fast - No network latency
  4. Predictable - Easy to write assertions against

Running Smoke Tests

# Run all smoke tests
pytest tests/smoke/

# Run only smoke-marked tests
pytest tests/ -m smoke

# Skip smoke tests (unit tests only)
pytest tests/ -m 'not smoke'

# Run a specific test
pytest tests/smoke/test_smoke.py::TestEvalCommand::test_basic_eval

Test Results

All 20 smoke tests pass ✅

$ pytest tests/smoke/ -v
======================== test session starts ========================
tests/smoke/test_smoke.py::TestBasicCLI::test_version_flag PASSED
tests/smoke/test_smoke.py::TestBasicCLI::test_help_flag PASSED
tests/smoke/test_smoke.py::TestBasicCLI::test_eval_help PASSED
tests/smoke/test_smoke.py::TestBasicCLI::test_unknown_command PASSED
tests/smoke/test_smoke.py::TestBasicCLI::test_missing_config_file PASSED
tests/smoke/test_smoke.py::TestEvalCommand::test_basic_eval PASSED
tests/smoke/test_smoke.py::TestEvalCommand::test_json_output PASSED
tests/smoke/test_smoke.py::TestEvalCommand::test_yaml_output PASSED
tests/smoke/test_smoke.py::TestEvalCommand::test_csv_output PASSED
tests/smoke/test_smoke.py::TestEvalCommand::test_max_concurrency_flag PASSED
tests/smoke/test_smoke.py::TestEvalCommand::test_repeat_flag PASSED
tests/smoke/test_smoke.py::TestEvalCommand::test_verbose_flag PASSED
tests/smoke/test_smoke.py::TestExitCodes::test_success_exit_code PASSED
tests/smoke/test_smoke.py::TestExitCodes::test_failure_exit_code PASSED
tests/smoke/test_smoke.py::TestExitCodes::test_config_error_exit_code PASSED
tests/smoke/test_smoke.py::TestEchoProvider::test_echo_provider_basic PASSED
tests/smoke/test_smoke.py::TestEchoProvider::test_echo_provider_with_multiple_vars PASSED
tests/smoke/test_smoke.py::TestAssertions::test_contains_assertion PASSED
tests/smoke/test_smoke.py::TestAssertions::test_multiple_assertions PASSED
tests/smoke/test_smoke.py::TestAssertions::test_failing_assertion PASSED
=================== 20 passed in 31.93s ====================

Notes

  • Smoke tests are slower than unit tests (~32s total) because they spawn subprocesses
  • They require Node.js and promptfoo to be installed
  • They test the integration between Python wrapper and Node.js CLI
  • All existing unit tests still pass (43 passed, 3 skipped)

Inspired By

These smoke tests are inspired by the Node.js promptfoo project's smoke test suite in test/smoke/, adapted for the Python wrapper with similar structure and coverage.

🤖 Generated with Claude Code

mldangelo and others added 9 commits January 6, 2026 06:27

feat: add smoke tests for CLI integration testing

- Add smoke tests that verify end-to-end CLI functionality
- Test basic CLI operations (--version, --help, error handling)
- Test eval command with echo provider (no external dependencies)
- Test output formats (JSON, YAML, CSV)
- Test CLI flags (--repeat, --max-concurrency, --verbose, --no-cache)
- Test exit codes (0 for success, 100 for failures, 1 for errors)
- Test assertions (contains, icontains, failing assertions)
- Add pytest configuration with 'smoke' marker for selective testing
- Add comprehensive README documenting smoke test purpose and usage

Total: 20 smoke tests, all passing ✅

Smoke tests run against the installed promptfoo CLI via subprocess,
testing the Python wrapper integration with the Node.js CLI.

Run smoke tests:
  pytest tests/smoke/              # Run all smoke tests
  pytest tests/ -m smoke           # Run only smoke-marked tests
  pytest tests/ -m 'not smoke'     # Skip smoke tests (unit tests only)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

ci: run unit tests and smoke tests in CI

Previously the CI was only testing CLI invocation but not running pytest.

Changes:
- Install dev dependencies (pytest, mypy, ruff) in test jobs
- Run unit tests with: pytest tests/ -v -m 'not smoke'
- Run smoke tests with: pytest tests/smoke/ -v
- Both 'test' and 'test-npx-fallback' jobs now run full test suite

This ensures:
✅ Unit tests run on all platforms (ubuntu, windows) and Python versions (3.9, 3.13)
✅ Smoke tests verify end-to-end CLI functionality
✅ Both global install and npx fallback paths are tested

fix: use Optional for Python 3.9 compatibility in smoke tests

fix: make platform-specific tests work on both Unix and Windows

- Split test_split_path into platform-specific versions (Unix/Windows)
- Split test_find_external_promptfoo_prevents_recursion for platform paths
- Use platform-appropriate node path in test_main_exits_when_neither_external_nor_npx_available
- Tests now skip appropriately on incompatible platforms

fix: increase smoke test timeout for npx fallback scenarios

The first npx call can be slow as it downloads promptfoo.
Increased timeout from 60s to 120s to accommodate this.

fix: handle None stdout/stderr in smoke tests

Add safety checks for None values from subprocess.run() output,
which can occur on Windows in certain error conditions.

Resolved conflict in tests/test_cli.py by keeping platform-appropriate
node_path implementation from feature branch.

fix: address linting issues and add temp output to gitignore

- Fix line too long (123 > 120) in test_cli.py
- Run ruff format on test files
- Add tests/smoke/.temp-output/ to .gitignore

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

docs: update AGENTS.md with smoke test documentation

- Add comprehensive testing strategy section with unit vs smoke tests
- Document test directory structure
- Add smoke test details and commands
- Update CI/CD section to mention both test types
- Update project structure to include tests directory

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
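The "handle None stdout/stderr" safety check mentioned in the commits above amounts to a small guard around subprocess output; an illustrative sketch:

```python
import subprocess
from typing import Tuple


def safe_output(proc: subprocess.CompletedProcess) -> Tuple[str, str]:
    """Return (stdout, stderr) with None coerced to empty strings.

    subprocess.run() can yield None for stdout/stderr on Windows in
    certain error conditions; coercing avoids a TypeError when tests
    concatenate or search the streams.
    """
    return proc.stdout or "", proc.stderr or ""
```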

Copilot AI left a comment


Pull request overview

This PR adds a comprehensive smoke test suite to verify end-to-end CLI functionality for the Python wrapper. The tests use the echo provider to avoid external dependencies and test the integration between the Python wrapper and the Node.js CLI.

Changes:

  • Added 20 smoke tests covering CLI operations, eval command, exit codes, echo provider functionality, and assertions
  • Split platform-specific unit tests in tests/test_cli.py for better cross-platform testing
  • Added pytest configuration with smoke test marker for selective test execution
  • Updated CI workflows to run unit tests and smoke tests separately

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 7 comments.

Summary per file:

  • tests/test_cli.py - Split platform-specific tests for Unix and Windows PATH handling and recursion prevention
  • tests/smoke/test_smoke.py - New comprehensive smoke test suite with 20 tests for CLI integration
  • tests/smoke/fixtures/configs/basic.yaml - Basic test configuration with passing assertions
  • tests/smoke/fixtures/configs/failing-assertion.yaml - Test configuration with an intentionally failing assertion
  • tests/smoke/fixtures/configs/assertions.yaml - Test configuration with multiple assertions, including case-insensitive matching
  • tests/smoke/__init__.py - Package initialization for smoke tests
  • tests/smoke/README.md - Comprehensive documentation for the smoke test suite
  • pyproject.toml - Added pytest configuration with smoke test marker
  • .gitignore - Added smoke test temporary output directory
  • .github/workflows/test.yml - Updated to run unit tests and smoke tests separately with proper dev dependencies


Comment on lines 90 to 96
    def test_version_flag(self):
        """Test --version flag outputs version."""
        stdout, stderr, exit_code = run_promptfoo(["--version"])

        assert exit_code == 0
        # Should output a version number (semver format)
        assert stdout.strip(), "Version output should not be empty"

Copilot AI Jan 11, 2026


Test method is missing return type annotation. For consistency with the rest of the codebase (see tests/test_cli.py), test methods should include '-> None' return type annotation.

Comment on lines 98 to 272
    def test_help_flag(self):
        """Test --help flag outputs help."""
        stdout, stderr, exit_code = run_promptfoo(["--help"])

        assert exit_code == 0
        assert "promptfoo" in stdout.lower()
        assert "eval" in stdout.lower()

    def test_eval_help(self):
        """Test 'eval --help' outputs eval command help."""
        stdout, stderr, exit_code = run_promptfoo(["eval", "--help"])

        assert exit_code == 0
        assert "--config" in stdout or "-c" in stdout
        assert "--output" in stdout or "-o" in stdout

    def test_unknown_command(self):
        """Test unknown command returns error."""
        stdout, stderr, exit_code = run_promptfoo(
            ["unknowncommand123"],
            expect_error=True,
        )

        assert exit_code != 0
        output = stdout + stderr
        assert "unknown" in output.lower() or "not found" in output.lower()

    def test_missing_config_file(self):
        """Test missing config file returns error."""
        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", "nonexistent-config-file.yaml"],
            expect_error=True,
        )

        assert exit_code != 0
        output = stdout + stderr
        # Should indicate the file wasn't found
        assert any(
            phrase in output.lower()
            for phrase in [
                "not found",
                "no such file",
                "does not exist",
                "cannot find",
                "no configuration file",
            ]
        )


class TestEvalCommand:
    """Eval command smoke tests."""

    def test_basic_eval(self):
        """Test basic eval with echo provider."""
        config_path = CONFIGS_DIR / "basic.yaml"
        stdout, stderr, exit_code = run_promptfoo(["eval", "-c", str(config_path), "--no-cache"])

        assert exit_code == 0, f"Eval failed:\nSTDOUT: {stdout}\nSTDERR: {stderr}"
        # Should show evaluation results
        assert "pass" in stdout.lower() or "✓" in stdout or "success" in stdout.lower()

    def test_json_output(self):
        """Test eval outputs valid JSON."""
        config_path = CONFIGS_DIR / "basic.yaml"
        output_path = OUTPUT_DIR / "output.json"

        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", str(config_path), "-o", str(output_path), "--no-cache"]
        )

        assert exit_code == 0, f"Eval failed:\nSTDOUT: {stdout}\nSTDERR: {stderr}"
        assert output_path.exists(), "Output file was not created"

        # Verify it's valid JSON with expected structure
        with open(output_path) as f:
            data = json.load(f)

        assert "results" in data
        assert "results" in data["results"]
        assert isinstance(data["results"]["results"], list)
        assert len(data["results"]["results"]) > 0

        # Verify echo provider returns the prompt
        first_result = data["results"]["results"][0]
        assert "response" in first_result
        assert "output" in first_result["response"]
        output_text = first_result["response"]["output"]
        assert "Hello" in output_text
        assert "World" in output_text

    def test_yaml_output(self):
        """Test eval outputs YAML format."""
        config_path = CONFIGS_DIR / "basic.yaml"
        output_path = OUTPUT_DIR / "output.yaml"

        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", str(config_path), "-o", str(output_path), "--no-cache"]
        )

        assert exit_code == 0
        assert output_path.exists()

        # Verify it contains YAML-like content
        with open(output_path) as f:
            content = f.read()

        assert "results:" in content

    def test_csv_output(self):
        """Test eval outputs CSV format."""
        config_path = CONFIGS_DIR / "basic.yaml"
        output_path = OUTPUT_DIR / "output.csv"

        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", str(config_path), "-o", str(output_path), "--no-cache"]
        )

        assert exit_code == 0
        assert output_path.exists()

        # Verify it's CSV format (has header row with columns)
        with open(output_path) as f:
            content = f.read()

        lines = content.strip().split("\n")
        assert len(lines) > 0
        # CSV should have comma-separated values
        assert "," in lines[0]

    def test_max_concurrency_flag(self):
        """Test --max-concurrency flag."""
        config_path = CONFIGS_DIR / "basic.yaml"

        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", str(config_path), "--max-concurrency", "1", "--no-cache"]
        )

        assert exit_code == 0

    def test_repeat_flag(self):
        """Test --repeat flag runs tests multiple times."""
        config_path = CONFIGS_DIR / "basic.yaml"
        output_path = OUTPUT_DIR / "repeat-output.json"

        stdout, stderr, exit_code = run_promptfoo(
            [
                "eval",
                "-c",
                str(config_path),
                "--repeat",
                "2",
                "-o",
                str(output_path),
                "--no-cache",
            ]
        )

        assert exit_code == 0

        # Verify we got repeated results
        with open(output_path) as f:
            data = json.load(f)

        # With repeat=2 and 1 test case, we should have 2 results
        assert len(data["results"]["results"]) == 2

    def test_verbose_flag(self):
        """Test --verbose flag."""
        config_path = CONFIGS_DIR / "basic.yaml"

        stdout, stderr, exit_code = run_promptfoo(["eval", "-c", str(config_path), "--verbose", "--no-cache"])

        assert exit_code == 0
        # Verbose mode should produce output
        assert len(stdout) > 0 or len(stderr) > 0

Copilot AI Jan 11, 2026


Test methods in this class are missing return type annotations. For consistency with the rest of the codebase (see tests/test_cli.py), test methods should include '-> None' return type annotation.

Comment on lines 278 to 305
    def test_success_exit_code(self):
        """Test exit code 0 when all assertions pass."""
        config_path = CONFIGS_DIR / "basic.yaml"

        stdout, stderr, exit_code = run_promptfoo(["eval", "-c", str(config_path), "--no-cache"])

        assert exit_code == 0

    def test_failure_exit_code(self):
        """Test exit code 100 when assertions fail."""
        config_path = CONFIGS_DIR / "failing-assertion.yaml"

        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", str(config_path), "--no-cache"],
            expect_error=True,
        )

        # Exit code 100 indicates test failures
        assert exit_code == 100, f"Expected exit code 100, got {exit_code}"

    def test_config_error_exit_code(self):
        """Test exit code 1 for config errors."""
        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", "nonexistent-file.yaml", "--no-cache"],
            expect_error=True,
        )

        assert exit_code == 1

Copilot AI Jan 11, 2026


Test methods in this class are missing return type annotations. For consistency with the rest of the codebase (see tests/test_cli.py), test methods should include '-> None' return type annotation.

Comment on lines 311 to 352
    def test_echo_provider_basic(self):
        """Test echo provider returns the prompt."""
        config_path = CONFIGS_DIR / "basic.yaml"
        output_path = OUTPUT_DIR / "echo-test.json"

        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", str(config_path), "-o", str(output_path), "--no-cache"]
        )

        assert exit_code == 0

        # Verify echo provider returns the prompt
        with open(output_path) as f:
            data = json.load(f)

        first_result = data["results"]["results"][0]

        # Echo provider should return the prompt in the response
        output = first_result["response"]["output"]
        assert "Hello" in output
        assert "World" in output

    def test_echo_provider_with_multiple_vars(self):
        """Test echo provider with multiple variables."""
        config_path = CONFIGS_DIR / "assertions.yaml"
        output_path = OUTPUT_DIR / "echo-multi-var.json"

        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", str(config_path), "-o", str(output_path), "--no-cache"]
        )

        assert exit_code == 0

        with open(output_path) as f:
            data = json.load(f)

        first_result = data["results"]["results"][0]
        output = first_result["response"]["output"]

        # Should contain all variable values
        assert "Alice" in output
        assert "Wonderland" in output

Copilot AI Jan 11, 2026


Test methods in this class are missing return type annotations. For consistency with the rest of the codebase (see tests/test_cli.py), test methods should include '-> None' return type annotation.

Comment on lines 358 to 389
    def test_contains_assertion(self):
        """Test contains assertion."""
        config_path = CONFIGS_DIR / "basic.yaml"

        stdout, stderr, exit_code = run_promptfoo(["eval", "-c", str(config_path), "--no-cache"])

        assert exit_code == 0
        # All assertions should pass
        assert "pass" in stdout.lower() or "✓" in stdout or "success" in stdout.lower()

    def test_multiple_assertions(self):
        """Test multiple assertions in single test."""
        config_path = CONFIGS_DIR / "assertions.yaml"

        stdout, stderr, exit_code = run_promptfoo(["eval", "-c", str(config_path), "--no-cache"])

        assert exit_code == 0

    def test_failing_assertion(self):
        """Test failing assertion."""
        config_path = CONFIGS_DIR / "failing-assertion.yaml"

        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", str(config_path), "--no-cache"],
            expect_error=True,
        )

        # Should fail with exit code 100
        assert exit_code == 100
        output = stdout + stderr
        # Should indicate failure
        assert "fail" in output.lower() or "✗" in output or "error" in output.lower()

Copilot AI Jan 11, 2026


Test methods in this class are missing return type annotations. For consistency with the rest of the codebase (see tests/test_cli.py), test methods should include '-> None' return type annotation.

These tests verify the core evaluation pipeline works correctly
using the echo provider (no external API dependencies).

These tests run against the installed promptfoo package via npx,

Copilot AI Jan 11, 2026


The comment states tests run "via npx" but they actually run via the Python wrapper which may use either a globally installed promptfoo or fall back to npx. Consider updating to "via the Python wrapper (using either global promptfoo or npx)" for accuracy.

Suggested change
These tests run against the installed promptfoo package via npx,
These tests run against the installed promptfoo package via the Python wrapper
(using either a globally installed promptfoo CLI or falling back to npx),


Smoke tests are high-level integration tests that verify the most critical functionality works end-to-end. They:

- Run against the actual installed CLI (via `npx promptfoo`)

Copilot AI Jan 11, 2026


The documentation states tests run "via npx promptfoo" but they actually run via the Python wrapper which may use either a globally installed promptfoo or fall back to npx. Consider updating to "via the Python wrapper (using either global promptfoo or npx)" for accuracy.

Suggested change
- Run against the actual installed CLI (via `npx promptfoo`)
- Run against the actual installed CLI via the Python wrapper (using either global promptfoo or npx)

mldangelo and others added 4 commits January 11, 2026 02:06

style: add return type annotations and fix documentation wording

- Add `-> None` return type annotations to all smoke test methods
- Add Generator return type to setup_and_teardown fixture
- Update documentation to clarify tests run via Python wrapper
  (not just npx)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

fix: resolve Windows CI test failures

- Add os.path.isfile mock to unit test to prevent _find_windows_promptfoo()
  from finding real promptfoo installations on Windows CI runners
- Add UTF-8 encoding with error replacement to smoke tests to handle
  Windows cp1252 encoding issues with npx output
- Add warmup_npx fixture to pre-download promptfoo via npx before tests,
  preventing timeout on first test when npx needs to download package

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

fix: mock telemetry in CLI unit tests

Add record_wrapper_used mock to tests that mock subprocess.run to prevent
PostHog telemetry calls from interfering with mock call counts.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@mldangelo mldangelo merged commit 8e653c4 into main Jan 11, 2026
10 checks passed
mldangelo added a commit that referenced this pull request Feb 24, 2026
* feat: add smoke tests for CLI integration testing

- Add smoke tests that verify end-to-end CLI functionality
- Test basic CLI operations (--version, --help, error handling)
- Test eval command with echo provider (no external dependencies)
- Test output formats (JSON, YAML, CSV)
- Test CLI flags (--repeat, --max-concurrency, --verbose, --no-cache)
- Test exit codes (0 for success, 100 for failures, 1 for errors)
- Test assertions (contains, icontains, failing assertions)
- Add pytest configuration with 'smoke' marker for selective testing
- Add comprehensive README documenting smoke test purpose and usage

Total: 20 smoke tests, all passing ✅

Smoke tests run against the installed promptfoo CLI via subprocess,
testing the Python wrapper integration with the Node.js CLI.

Run smoke tests:
  pytest tests/smoke/              # Run all smoke tests
  pytest tests/ -m smoke           # Run only smoke-marked tests
  pytest tests/ -m 'not smoke'     # Skip smoke tests (unit tests only)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* ci: run unit tests and smoke tests in CI

Previously the CI was only testing CLI invocation but not running pytest.

Changes:
- Install dev dependencies (pytest, mypy, ruff) in test jobs
- Run unit tests with: pytest tests/ -v -m 'not smoke'
- Run smoke tests with: pytest tests/smoke/ -v
- Both 'test' and 'test-npx-fallback' jobs now run full test suite

This ensures:
✅ Unit tests run on all platforms (ubuntu, windows) and Python versions (3.9, 3.13)
✅ Smoke tests verify end-to-end CLI functionality
✅ Both global install and npx fallback paths are tested

* fix: use Optional for Python 3.9 compatibility in smoke tests

* fix: make platform-specific tests work on both Unix and Windows

- Split test_split_path into platform-specific versions (Unix/Windows)
- Split test_find_external_promptfoo_prevents_recursion for platform paths
- Use platform-appropriate node path in test_main_exits_when_neither_external_nor_npx_available
- Tests now skip appropriately on incompatible platforms

* fix: increase smoke test timeout for npx fallback scenarios

The first npx call can be slow as it downloads promptfoo.
Increased timeout from 60s to 120s to accommodate this.

* fix: handle None stdout/stderr in smoke tests

Add safety checks for None values from subprocess.run() output,
which can occur on Windows in certain error conditions.

* fix: address linting issues and add temp output to gitignore

- Fix line too long (123 > 120) in test_cli.py
- Run ruff format on test files
- Add tests/smoke/.temp-output/ to .gitignore

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: update AGENTS.md with smoke test documentation

- Add comprehensive testing strategy section with unit vs smoke tests
- Document test directory structure
- Add smoke test details and commands
- Update CI/CD section to mention both test types
- Update project structure to include tests directory

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* style: add return type annotations and fix documentation wording

- Add `-> None` return type annotations to all smoke test methods
- Add Generator return type to setup_and_teardown fixture
- Update documentation to clarify tests run via Python wrapper
  (not just npx)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: resolve Windows CI test failures

- Add os.path.isfile mock to unit test to prevent _find_windows_promptfoo()
  from finding real promptfoo installations on Windows CI runners
- Add UTF-8 encoding with error replacement to smoke tests to handle
  Windows cp1252 encoding issues with npx output
- Add warmup_npx fixture to pre-download promptfoo via npx before tests,
  preventing timeout on first test when npx needs to download package

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: mock telemetry in CLI unit tests

Add record_wrapper_used mock to tests that mock subprocess.run to prevent
PostHog telemetry calls from interfering with mock call counts.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
@mldangelo mldangelo deleted the feat/add-smoke-tests branch February 24, 2026 23:22