feat: add smoke tests for CLI integration testing #14
Conversation
- Add smoke tests that verify end-to-end CLI functionality
- Test basic CLI operations (--version, --help, error handling)
- Test eval command with echo provider (no external dependencies)
- Test output formats (JSON, YAML, CSV)
- Test CLI flags (--repeat, --max-concurrency, --verbose, --no-cache)
- Test exit codes (0 for success, 100 for failures, 1 for errors)
- Test assertions (contains, icontains, failing assertions)
- Add pytest configuration with 'smoke' marker for selective testing
- Add comprehensive README documenting smoke test purpose and usage

Total: 20 smoke tests, all passing ✅

Smoke tests run against the installed promptfoo CLI via subprocess, testing the Python wrapper integration with the Node.js CLI.

Run smoke tests:

    pytest tests/smoke/           # Run all smoke tests
    pytest tests/ -m smoke        # Run only smoke-marked tests
    pytest tests/ -m 'not smoke'  # Skip smoke tests (unit tests only)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
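As a rough sketch of how selective running via the `smoke` marker fits together (the real tests/smoke/test_smoke.py may apply the marker differently, for example through per-class decorators or conftest hooks), a module-level `pytestmark` is the simplest form:

```python
# Hypothetical sketch only: one way the "smoke" marker can be applied so that
# `pytest -m smoke` selects these tests and `pytest -m "not smoke"` skips them.
import pytest

# Tag every test collected from this module as a smoke test.
pytestmark = pytest.mark.smoke


def test_cli_prints_a_version() -> None:
    # Placeholder body; the real smoke tests shell out to the promptfoo CLI.
    assert True
```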
Previously the CI was only testing CLI invocation but not running pytest.

Changes:
- Install dev dependencies (pytest, mypy, ruff) in test jobs
- Run unit tests with: pytest tests/ -v -m 'not smoke'
- Run smoke tests with: pytest tests/smoke/ -v
- Both 'test' and 'test-npx-fallback' jobs now run the full test suite

This ensures:
✅ Unit tests run on all platforms (ubuntu, windows) and Python versions (3.9, 3.13)
✅ Smoke tests verify end-to-end CLI functionality
✅ Both global install and npx fallback paths are tested
- Split test_split_path into platform-specific versions (Unix/Windows)
- Split test_find_external_promptfoo_prevents_recursion for platform paths
- Use platform-appropriate node path in test_main_exits_when_neither_external_nor_npx_available
- Tests now skip appropriately on incompatible platforms
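A minimal sketch of the platform-split pattern described here, assuming pytest's `skipif` marker; the test names and bodies below are illustrative, not the actual contents of tests/test_cli.py:

```python
# Illustrative only: the real split tests in tests/test_cli.py differ in detail.
import sys

import pytest


@pytest.mark.skipif(sys.platform == "win32", reason="uses POSIX PATH semantics")
def test_split_path_unix() -> None:
    assert "/usr/local/bin" in "/usr/local/bin:/usr/bin".split(":")


@pytest.mark.skipif(sys.platform != "win32", reason="uses Windows PATH semantics")
def test_split_path_windows() -> None:
    assert r"C:\nodejs" in r"C:\nodejs;C:\Windows\system32".split(";")
```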
The first npx call can be slow as it downloads promptfoo. Increased timeout from 60s to 120s to accommodate this.
Add safety checks for None values from subprocess.run() output, which can occur on Windows in certain error conditions.
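Taken together with the timeout change above, the guard might look roughly like this; `run_cli` below is a hypothetical stand-in for the suite's run_promptfoo helper, whose real signature may differ:

```python
# Sketch of the defensive handling described in the two commits above.
import subprocess
from typing import List, Tuple


def run_cli(args: List[str], timeout: float = 120.0) -> Tuple[str, str, int]:
    proc = subprocess.run(
        ["promptfoo", *args],
        capture_output=True,
        text=True,
        timeout=timeout,  # generous: the first npx call may need to download promptfoo
    )
    # stdout/stderr have been observed as None on Windows in some error paths,
    # so normalize to empty strings before callers concatenate or lowercase them.
    return proc.stdout or "", proc.stderr or "", proc.returncode
```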
Resolved conflict in tests/test_cli.py by keeping platform-appropriate node_path implementation from feature branch.
- Fix line too long (123 > 120) in test_cli.py
- Run ruff format on test files
- Add tests/smoke/.temp-output/ to .gitignore

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add comprehensive testing strategy section with unit vs smoke tests
- Document test directory structure
- Add smoke test details and commands
- Update CI/CD section to mention both test types
- Update project structure to include tests directory

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pull request overview
This PR adds a comprehensive smoke test suite to verify end-to-end CLI functionality for the Python wrapper. The tests use the echo provider to avoid external dependencies and test the integration between the Python wrapper and the Node.js CLI.
Changes:
- Added 20 smoke tests covering CLI operations, eval command, exit codes, echo provider functionality, and assertions
- Split platform-specific unit tests in tests/test_cli.py for better cross-platform testing
- Added pytest configuration with smoke test marker for selective test execution
- Updated CI workflows to run unit tests and smoke tests separately
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| tests/test_cli.py | Split platform-specific tests for Unix and Windows PATH handling and recursion prevention |
| tests/smoke/test_smoke.py | New comprehensive smoke test suite with 20 tests for CLI integration |
| tests/smoke/fixtures/configs/basic.yaml | Basic test configuration with passing assertions |
| tests/smoke/fixtures/configs/failing-assertion.yaml | Test configuration with intentionally failing assertion |
| tests/smoke/fixtures/configs/assertions.yaml | Test configuration with multiple assertions including case-insensitive matching |
| tests/smoke/__init__.py | Package initialization for smoke tests |
| tests/smoke/README.md | Comprehensive documentation for smoke test suite |
| pyproject.toml | Added pytest configuration with smoke test marker |
| .gitignore | Added smoke test temporary output directory |
| .github/workflows/test.yml | Updated to run unit tests and smoke tests separately with proper dev dependencies |
tests/smoke/test_smoke.py
Outdated
    def test_version_flag(self):
        """Test --version flag outputs version."""
        stdout, stderr, exit_code = run_promptfoo(["--version"])

        assert exit_code == 0
        # Should output a version number (semver format)
        assert stdout.strip(), "Version output should not be empty"
Test method is missing a return type annotation. For consistency with the rest of the codebase (see tests/test_cli.py), test methods should include a '-> None' return type annotation.
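For illustration, the requested style (the class and test names here are placeholders, not the actual code):

```python
class TestBasicCLI:
    def test_version_flag(self) -> None:  # explicit None return type, as in tests/test_cli.py
        """Test --version flag outputs version."""
```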
tests/smoke/test_smoke.py
Outdated
    def test_help_flag(self):
        """Test --help flag outputs help."""
        stdout, stderr, exit_code = run_promptfoo(["--help"])

        assert exit_code == 0
        assert "promptfoo" in stdout.lower()
        assert "eval" in stdout.lower()

    def test_eval_help(self):
        """Test 'eval --help' outputs eval command help."""
        stdout, stderr, exit_code = run_promptfoo(["eval", "--help"])

        assert exit_code == 0
        assert "--config" in stdout or "-c" in stdout
        assert "--output" in stdout or "-o" in stdout

    def test_unknown_command(self):
        """Test unknown command returns error."""
        stdout, stderr, exit_code = run_promptfoo(
            ["unknowncommand123"],
            expect_error=True,
        )

        assert exit_code != 0
        output = stdout + stderr
        assert "unknown" in output.lower() or "not found" in output.lower()

    def test_missing_config_file(self):
        """Test missing config file returns error."""
        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", "nonexistent-config-file.yaml"],
            expect_error=True,
        )

        assert exit_code != 0
        output = stdout + stderr
        # Should indicate the file wasn't found
        assert any(
            phrase in output.lower()
            for phrase in [
                "not found",
                "no such file",
                "does not exist",
                "cannot find",
                "no configuration file",
            ]
        )


class TestEvalCommand:
    """Eval command smoke tests."""

    def test_basic_eval(self):
        """Test basic eval with echo provider."""
        config_path = CONFIGS_DIR / "basic.yaml"
        stdout, stderr, exit_code = run_promptfoo(["eval", "-c", str(config_path), "--no-cache"])

        assert exit_code == 0, f"Eval failed:\nSTDOUT: {stdout}\nSTDERR: {stderr}"
        # Should show evaluation results
        assert "pass" in stdout.lower() or "✓" in stdout or "success" in stdout.lower()

    def test_json_output(self):
        """Test eval outputs valid JSON."""
        config_path = CONFIGS_DIR / "basic.yaml"
        output_path = OUTPUT_DIR / "output.json"

        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", str(config_path), "-o", str(output_path), "--no-cache"]
        )

        assert exit_code == 0, f"Eval failed:\nSTDOUT: {stdout}\nSTDERR: {stderr}"
        assert output_path.exists(), "Output file was not created"

        # Verify it's valid JSON with expected structure
        with open(output_path) as f:
            data = json.load(f)

        assert "results" in data
        assert "results" in data["results"]
        assert isinstance(data["results"]["results"], list)
        assert len(data["results"]["results"]) > 0

        # Verify echo provider returns the prompt
        first_result = data["results"]["results"][0]
        assert "response" in first_result
        assert "output" in first_result["response"]
        output_text = first_result["response"]["output"]
        assert "Hello" in output_text
        assert "World" in output_text

    def test_yaml_output(self):
        """Test eval outputs YAML format."""
        config_path = CONFIGS_DIR / "basic.yaml"
        output_path = OUTPUT_DIR / "output.yaml"

        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", str(config_path), "-o", str(output_path), "--no-cache"]
        )

        assert exit_code == 0
        assert output_path.exists()

        # Verify it contains YAML-like content
        with open(output_path) as f:
            content = f.read()

        assert "results:" in content

    def test_csv_output(self):
        """Test eval outputs CSV format."""
        config_path = CONFIGS_DIR / "basic.yaml"
        output_path = OUTPUT_DIR / "output.csv"

        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", str(config_path), "-o", str(output_path), "--no-cache"]
        )

        assert exit_code == 0
        assert output_path.exists()

        # Verify it's CSV format (has header row with columns)
        with open(output_path) as f:
            content = f.read()

        lines = content.strip().split("\n")
        assert len(lines) > 0
        # CSV should have comma-separated values
        assert "," in lines[0]

    def test_max_concurrency_flag(self):
        """Test --max-concurrency flag."""
        config_path = CONFIGS_DIR / "basic.yaml"

        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", str(config_path), "--max-concurrency", "1", "--no-cache"]
        )

        assert exit_code == 0

    def test_repeat_flag(self):
        """Test --repeat flag runs tests multiple times."""
        config_path = CONFIGS_DIR / "basic.yaml"
        output_path = OUTPUT_DIR / "repeat-output.json"

        stdout, stderr, exit_code = run_promptfoo(
            [
                "eval",
                "-c",
                str(config_path),
                "--repeat",
                "2",
                "-o",
                str(output_path),
                "--no-cache",
            ]
        )

        assert exit_code == 0

        # Verify we got repeated results
        with open(output_path) as f:
            data = json.load(f)

        # With repeat=2 and 1 test case, we should have 2 results
        assert len(data["results"]["results"]) == 2

    def test_verbose_flag(self):
        """Test --verbose flag."""
        config_path = CONFIGS_DIR / "basic.yaml"

        stdout, stderr, exit_code = run_promptfoo(["eval", "-c", str(config_path), "--verbose", "--no-cache"])

        assert exit_code == 0
        # Verbose mode should produce output
        assert len(stdout) > 0 or len(stderr) > 0
Test methods in this class are missing return type annotations. For consistency with the rest of the codebase (see tests/test_cli.py), test methods should include '-> None' return type annotation.
tests/smoke/test_smoke.py
Outdated
    def test_success_exit_code(self):
        """Test exit code 0 when all assertions pass."""
        config_path = CONFIGS_DIR / "basic.yaml"

        stdout, stderr, exit_code = run_promptfoo(["eval", "-c", str(config_path), "--no-cache"])

        assert exit_code == 0

    def test_failure_exit_code(self):
        """Test exit code 100 when assertions fail."""
        config_path = CONFIGS_DIR / "failing-assertion.yaml"

        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", str(config_path), "--no-cache"],
            expect_error=True,
        )

        # Exit code 100 indicates test failures
        assert exit_code == 100, f"Expected exit code 100, got {exit_code}"

    def test_config_error_exit_code(self):
        """Test exit code 1 for config errors."""
        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", "nonexistent-file.yaml", "--no-cache"],
            expect_error=True,
        )

        assert exit_code == 1
Test methods in this class are missing return type annotations. For consistency with the rest of the codebase (see tests/test_cli.py), test methods should include '-> None' return type annotation.
tests/smoke/test_smoke.py
Outdated
    def test_echo_provider_basic(self):
        """Test echo provider returns the prompt."""
        config_path = CONFIGS_DIR / "basic.yaml"
        output_path = OUTPUT_DIR / "echo-test.json"

        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", str(config_path), "-o", str(output_path), "--no-cache"]
        )

        assert exit_code == 0

        # Verify echo provider returns the prompt
        with open(output_path) as f:
            data = json.load(f)

        first_result = data["results"]["results"][0]

        # Echo provider should return the prompt in the response
        output = first_result["response"]["output"]
        assert "Hello" in output
        assert "World" in output

    def test_echo_provider_with_multiple_vars(self):
        """Test echo provider with multiple variables."""
        config_path = CONFIGS_DIR / "assertions.yaml"
        output_path = OUTPUT_DIR / "echo-multi-var.json"

        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", str(config_path), "-o", str(output_path), "--no-cache"]
        )

        assert exit_code == 0

        with open(output_path) as f:
            data = json.load(f)

        first_result = data["results"]["results"][0]
        output = first_result["response"]["output"]

        # Should contain all variable values
        assert "Alice" in output
        assert "Wonderland" in output
Test methods in this class are missing return type annotations. For consistency with the rest of the codebase (see tests/test_cli.py), test methods should include '-> None' return type annotation.
tests/smoke/test_smoke.py
Outdated
    def test_contains_assertion(self):
        """Test contains assertion."""
        config_path = CONFIGS_DIR / "basic.yaml"

        stdout, stderr, exit_code = run_promptfoo(["eval", "-c", str(config_path), "--no-cache"])

        assert exit_code == 0
        # All assertions should pass
        assert "pass" in stdout.lower() or "✓" in stdout or "success" in stdout.lower()

    def test_multiple_assertions(self):
        """Test multiple assertions in single test."""
        config_path = CONFIGS_DIR / "assertions.yaml"

        stdout, stderr, exit_code = run_promptfoo(["eval", "-c", str(config_path), "--no-cache"])

        assert exit_code == 0

    def test_failing_assertion(self):
        """Test failing assertion."""
        config_path = CONFIGS_DIR / "failing-assertion.yaml"

        stdout, stderr, exit_code = run_promptfoo(
            ["eval", "-c", str(config_path), "--no-cache"],
            expect_error=True,
        )

        # Should fail with exit code 100
        assert exit_code == 100
        output = stdout + stderr
        # Should indicate failure
        assert "fail" in output.lower() or "✗" in output or "error" in output.lower()
Test methods in this class are missing return type annotations. For consistency with the rest of the codebase (see tests/test_cli.py), test methods should include '-> None' return type annotation.
tests/smoke/test_smoke.py
Outdated
These tests verify the core evaluation pipeline works correctly
using the echo provider (no external API dependencies).

These tests run against the installed promptfoo package via npx,
The comment states tests run "via npx" but they actually run via the Python wrapper which may use either a globally installed promptfoo or fall back to npx. Consider updating to "via the Python wrapper (using either global promptfoo or npx)" for accuracy.
Suggested change:
- These tests run against the installed promptfoo package via npx,
+ These tests run against the installed promptfoo package via the Python wrapper
+ (using either a globally installed promptfoo CLI or falling back to npx),
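The resolution order this suggestion describes (prefer a globally installed promptfoo, otherwise fall back to npx) can be sketched as follows; this is an assumption about the wrapper's behavior, not its actual code:

```python
# Rough sketch of the "global install or npx fallback" idea, not the wrapper's real logic.
import shutil
from typing import List


def build_promptfoo_command(args: List[str]) -> List[str]:
    exe = shutil.which("promptfoo")  # globally installed CLI, if any
    if exe:
        return [exe, *args]
    # Fall back to npx, which downloads promptfoo on first use.
    return ["npx", "--yes", "promptfoo", *args]
```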
tests/smoke/README.md
Outdated
Smoke tests are high-level integration tests that verify the most critical functionality works end-to-end. They:

- Run against the actual installed CLI (via `npx promptfoo`)
The documentation states tests run "via npx promptfoo" but they actually run via the Python wrapper which may use either a globally installed promptfoo or fall back to npx. Consider updating to "via the Python wrapper (using either global promptfoo or npx)" for accuracy.
Suggested change: replace "Run against the actual installed CLI (via `npx promptfoo`)" with "Run against the actual installed CLI via the Python wrapper (using either global promptfoo or npx)".
- Add `-> None` return type annotations to all smoke test methods
- Add Generator return type to setup_and_teardown fixture
- Update documentation to clarify tests run via Python wrapper (not just npx)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add os.path.isfile mock to unit test to prevent _find_windows_promptfoo() from finding real promptfoo installations on Windows CI runners
- Add UTF-8 encoding with error replacement to smoke tests to handle Windows cp1252 encoding issues with npx output
- Add warmup_npx fixture to pre-download promptfoo via npx before tests, preventing a timeout on the first test when npx needs to download the package

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
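A sketch of the warm-up and encoding ideas from this commit; the fixture name warmup_npx matches the description, but the body is an assumption rather than the real implementation:

```python
# Illustrative conftest-style fixture; details of the real warmup_npx fixture may differ.
import subprocess

import pytest


@pytest.fixture(scope="session", autouse=True)
def warmup_npx() -> None:
    # Pay npx's one-time download cost up front so the first real smoke test
    # does not hit its timeout while promptfoo is being fetched.
    subprocess.run(
        ["npx", "--yes", "promptfoo", "--version"],
        capture_output=True,
        encoding="utf-8",  # avoid cp1252 decode errors on Windows runners
        errors="replace",  # replace undecodable bytes instead of raising
        timeout=300,
        check=False,
    )
```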
Add record_wrapper_used mock to tests that mock subprocess.run to prevent PostHog telemetry calls from interfering with mock call counts.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
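The telemetry mock might look along these lines; record_wrapper_used comes from the commit message, while the promptfoo.cli patch targets and the test name are assumed for illustration:

```python
# Sketch only: the real patch targets and assertions in tests/test_cli.py may differ.
from unittest import mock


@mock.patch("promptfoo.cli.record_wrapper_used")  # keep PostHog telemetry out of the way
@mock.patch("promptfoo.cli.subprocess.run")
def test_cli_invoked_once(run: mock.MagicMock, telemetry: mock.MagicMock) -> None:
    run.return_value = mock.Mock(returncode=0, stdout="", stderr="")
    # ... call the wrapper entry point here (omitted) ...
    # With telemetry mocked, nothing besides the CLI invocation touches
    # subprocess.run, so run.call_count reflects only the CLI call.
```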
* feat: add smoke tests for CLI integration testing
* ci: run unit tests and smoke tests in CI
* fix: use Optional for Python 3.9 compatibility in smoke tests
* fix: make platform-specific tests work on both Unix and Windows
* fix: increase smoke test timeout for npx fallback scenarios
* fix: handle None stdout/stderr in smoke tests
* fix: address linting issues and add temp output to gitignore
* docs: update AGENTS.md with smoke test documentation
* style: add return type annotations and fix documentation wording
* fix: resolve Windows CI test failures
* fix: mock telemetry in CLI unit tests

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Summary
This PR adds comprehensive smoke tests to verify end-to-end CLI functionality using the echo provider.
Smoke tests are high-level integration tests that run against the actual installed promptfoo CLI via subprocess, testing the Python wrapper integration with the Node.js CLI.
What's Added
Smoke Test Suite (`tests/smoke/test_smoke.py`)
20 smoke tests covering:

- Basic CLI Operations (5 tests)
- Eval Command (7 tests)
  - `--max-concurrency`, `--repeat`, `--verbose`, `--no-cache`
- Exit Codes (3 tests)
- Echo Provider (2 tests)
- Assertions (3 tests)
  - `contains` assertion
  - `icontains` assertion (case-insensitive)

Test Fixtures

- `fixtures/configs/basic.yaml` - Simple test with passing assertions
- `fixtures/configs/failing-assertion.yaml` - Test with failing assertion
- `fixtures/configs/assertions.yaml` - Multiple assertions test

Configuration

- `smoke` marker for selective testing
- `pyproject.toml` with pytest options

Why Echo Provider?
The echo provider is perfect for smoke tests because it requires no API keys or external services, returns the rendered prompt verbatim (so results are deterministic and easy to assert on), and runs quickly.
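As a self-contained sketch of why this works well (the inline config below is an assumption about what a basic echo fixture looks like, not a copy of basic.yaml), the whole round trip can be asserted on deterministically:

```python
# Self-contained sketch: writes an assumed echo-provider config (not the actual
# basic.yaml fixture) and checks the deterministic echoed output.
import json
import subprocess
import tempfile
from pathlib import Path

CONFIG = """\
prompts:
  - "Hello {{name}}"
providers:
  - echo
tests:
  - vars:
      name: World
    assert:
      - type: contains
        value: Hello
"""

with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "smoke.yaml"
    output_path = Path(tmp) / "out.json"
    config_path.write_text(CONFIG)

    subprocess.run(
        ["npx", "--yes", "promptfoo", "eval",
         "-c", str(config_path), "-o", str(output_path), "--no-cache"],
        check=True,
    )

    data = json.loads(output_path.read_text())
    # The echo provider returns the rendered prompt, so output is fully predictable.
    first = data["results"]["results"][0]
    assert "Hello" in first["response"]["output"]
    assert "World" in first["response"]["output"]
```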
Running Smoke Tests

    pytest tests/smoke/           # Run all smoke tests
    pytest tests/ -m smoke        # Run only smoke-marked tests
    pytest tests/ -m 'not smoke'  # Skip smoke tests (unit tests only)
Test Results
All 20 smoke tests pass ✅
Notes
Inspired By
These smoke tests are inspired by the Node.js promptfoo project's smoke test suite in `test/smoke/`, adapted for the Python wrapper with similar structure and coverage.

🤖 Generated with Claude Code