Merged
20 changes: 16 additions & 4 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
@@ -126,8 +126,14 @@ jobs:
- name: Pin Python version
run: uv python pin ${{ matrix.python-version }}

- name: Install package
run: uv sync
- name: Install package with dev dependencies
run: uv sync --extra dev

- name: Run unit tests
run: uv run pytest tests/ -v -m 'not smoke'

- name: Run smoke tests
run: uv run pytest tests/smoke/ -v

- name: Test CLI can be invoked
run: uv run promptfoo --version
@@ -192,8 +198,14 @@ jobs:
- name: Pin Python version
run: uv python pin ${{ matrix.python-version }}

- name: Install package
run: uv sync
- name: Install package with dev dependencies
run: uv sync --extra dev

- name: Run unit tests
run: uv run pytest tests/ -v -m 'not smoke'

- name: Run smoke tests (with npx fallback)
run: uv run pytest tests/smoke/ -v

- name: Test CLI fallback to npx (no global install)
run: uv run promptfoo --version
1 change: 1 addition & 0 deletions .gitignore
@@ -42,6 +42,7 @@ htmlcov/
.tox/
.mypy_cache/
.ruff_cache/
tests/smoke/.temp-output/

# Distribution
dist/
77 changes: 70 additions & 7 deletions AGENTS.md
@@ -135,9 +135,12 @@ Runs on every PR and push to main:
- **Lint**: Ruff linting (`uv run ruff check src/`)
- **Format Check**: Ruff formatting (`uv run ruff format --check src/`)
- **Type Check**: mypy static analysis (`uv run mypy src/promptfoo/`)
- **Tests**: pytest on multiple Python versions (3.9, 3.13) and OSes (Ubuntu, Windows)
- **Unit Tests**: Fast tests with mocked dependencies (`uv run pytest -m 'not smoke'`)
- **Smoke Tests**: Integration tests against real CLI (`uv run pytest tests/smoke/`)
- **Build**: Package build validation

Tests run on multiple Python versions (3.9, 3.13) and OSes (Ubuntu, Windows).

### Release Workflow (`.github/workflows/release-please.yml`)

Triggered on push to main:
@@ -214,7 +217,38 @@ uv run pytest

### Test Structure

Tests are located in the root directory (not yet created, but should be in `tests/` when added).
Tests are organized in the `tests/` directory:

```
tests/
├── __init__.py
├── test_cli.py              # Unit tests for CLI wrapper logic
├── test_environment.py      # Unit tests for environment detection
├── test_instructions.py     # Unit tests for installation instructions
└── smoke/
    ├── __init__.py
    ├── README.md            # Smoke test documentation
    ├── test_smoke.py        # Integration tests against real CLI
    └── fixtures/
        └── configs/         # YAML configs for smoke tests
            ├── basic.yaml
            ├── assertions.yaml
            └── failing-assertion.yaml
```

### Test Types

**Unit Tests** (`tests/test_*.py`):
- Fast, isolated tests for individual functions
- Mock external dependencies
- Run on every PR

**Smoke Tests** (`tests/smoke/`):
- Integration tests that run the actual CLI via subprocess
- Use the `echo` provider (no external API dependencies)
- Test the full Python → Node.js integration
- Slower but verify end-to-end functionality
- Marked with `@pytest.mark.smoke`

### Test Matrix

@@ -229,16 +263,36 @@ CI tests across:
# Install dependencies with dev extras
uv sync --extra dev

# Run all tests
# Run all tests (unit + smoke)
uv run pytest

# Run only unit tests (fast)
uv run pytest -m 'not smoke'

# Run only smoke tests (slow, requires Node.js)
uv run pytest tests/smoke/

# Run with coverage
uv run pytest --cov=src/promptfoo

# Run specific test class
uv run pytest tests/test_cli.py::TestMainFunction

# Run specific test
uv run pytest tests/test_cli.py::test_wrapper_detection
uv run pytest tests/smoke/test_smoke.py::TestEvalCommand::test_basic_eval
```

### Smoke Test Details

Smoke tests verify critical CLI functionality:
- **Basic CLI**: `--version`, `--help`, unknown commands, missing files
- **Eval Command**: Output formats (JSON, YAML, CSV), flags (`--repeat`, `--verbose`)
- **Exit Codes**: 0 for success, 100 for assertion failures, 1 for errors
- **Echo Provider**: Variable substitution, multiple variables
- **Assertions**: `contains`, `icontains`, failing assertions

The smoke tests use a 120-second timeout to accommodate the first `npx` call, which downloads promptfoo.
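
As a rough illustration of the timeout behaviour described above, a subprocess helper along the following lines could be used. This is only a sketch: `run_cli`, `base_cmd`, and `DEFAULT_TIMEOUT_SECONDS` are hypothetical names, and the actual helper in `tests/smoke/test_smoke.py` is not shown in this diff and may differ.

```python
import subprocess
import sys
from typing import Sequence

# Assumed value, taken from the 120-second timeout mentioned above.
DEFAULT_TIMEOUT_SECONDS = 120


def run_cli(
    args: Sequence[str],
    base_cmd: Sequence[str] = ("promptfoo",),
    timeout: float = DEFAULT_TIMEOUT_SECONDS,
) -> "subprocess.CompletedProcess[str]":
    """Run a command and capture its output without raising on non-zero exit.

    subprocess.TimeoutExpired is raised if the command exceeds `timeout`.
    """
    return subprocess.run(
        [*base_cmd, *args],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


if __name__ == "__main__":
    # Demonstrated with the Python interpreter so the sketch runs without Node.js.
    result = run_cli(["-c", "print('ok')"], base_cmd=(sys.executable,))
    print(result.returncode, result.stdout.strip())
```

Because non-zero exit codes do not raise here, a test can assert on `returncode`, `stdout`, and `stderr` directly, which suits the exit-code checks listed above.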

## Security Practices

### 1. No Credentials in Repository
@@ -365,14 +419,23 @@ promptfoo-python/
├── src/
│ └── promptfoo/
│ ├── __init__.py # Package exports
│ └── cli.py # Main wrapper implementation
│ ├── cli.py # Main wrapper implementation
│ ├── environment.py # Environment detection
│ └── instructions.py # Node.js installation instructions
├── tests/
│ ├── test_cli.py # Unit tests for CLI
│ ├── test_environment.py # Unit tests for environment detection
│ ├── test_instructions.py # Unit tests for instructions
│ └── smoke/
│ ├── test_smoke.py # Integration smoke tests
│ └── fixtures/configs/ # Test configuration files
├── AGENTS.md # This file (agent documentation)
├── CHANGELOG.md # Auto-generated by release-please
├── CLAUDE.md # Points to AGENTS.md
├── LICENSE # MIT License
├── README.md # User-facing documentation
├── pyproject.toml # Package configuration
├── release-please-config.json # Release-please configuration
├── release-please-config.json # Release-please configuration
└── .release-please-manifest.json # Release version tracking
```

@@ -443,5 +506,5 @@ git push --force

---

**Last Updated**: 2026-01-05
**Last Updated**: 2026-01-11
**Maintained By**: @promptfoo/engineering
13 changes: 13 additions & 0 deletions pyproject.toml
@@ -102,3 +102,16 @@ show_error_codes = true
pretty = true
check_untyped_defs = true
disallow_incomplete_defs = true

[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = ["test_*.py"]
python_classes = ["Test*"]
python_functions = ["test_*"]
addopts = [
    "-v",
    "--strict-markers",
]
markers = [
    "smoke: smoke tests that run the full CLI (slow, requires Node.js)",
]
88 changes: 88 additions & 0 deletions tests/smoke/README.md
@@ -0,0 +1,88 @@
# Smoke Tests

These smoke tests verify that the core promptfoo CLI functionality works correctly through the Python wrapper.

## What are Smoke Tests?

Smoke tests are high-level integration tests that verify the most critical functionality works end-to-end. They:

- Run against the actual installed CLI via the Python wrapper (using either global promptfoo or npx)
- Test the Python wrapper integration with the Node.js CLI
- Use the `echo` provider to avoid external API dependencies
- Verify command-line arguments, file I/O, and output formats
- Check exit codes and error handling

## Running Smoke Tests

```bash
# Run all smoke tests
pytest tests/smoke/

# Run with verbose output
pytest tests/smoke/ -v

# Run a specific test class
pytest tests/smoke/test_smoke.py::TestEvalCommand

# Run a specific test
pytest tests/smoke/test_smoke.py::TestEvalCommand::test_basic_eval
```

## Test Structure

- `test_smoke.py` - Main smoke test suite
- `fixtures/` - Test configuration files
  - `configs/` - YAML configuration files for testing

## Test Coverage

### Basic CLI Operations
- Version flag (`--version`)
- Help output (`--help`, `eval --help`)
- Unknown command handling
- Missing file error handling

### Eval Command
- Basic evaluation with echo provider
- Output formats (JSON, YAML, CSV)
- Command-line flags (`--max-concurrency`, `--repeat`, `--verbose`)
- Cache control (`--no-cache`)

### Exit Codes
- Exit code 0 for success
- Exit code 100 for assertion failures
- Exit code 1 for configuration errors
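
To make the three exit codes above easier to reason about in test code, they could be captured in a small enum. This is an illustrative sketch, not part of the test suite: `ExitCode` and `describe_exit_code` are hypothetical names, and the values are simply those listed in this section.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Exit codes as listed in this README (names are illustrative)."""

    SUCCESS = 0
    ERROR = 1
    ASSERTION_FAILURE = 100


def describe_exit_code(code: int) -> str:
    """Return a short human-readable label for a CLI exit code."""
    labels = {
        ExitCode.SUCCESS: "success",
        ExitCode.ERROR: "configuration or runtime error",
        ExitCode.ASSERTION_FAILURE: "one or more assertions failed",
    }
    try:
        return labels[ExitCode(code)]
    except ValueError:
        # Any code not listed in the README is reported rather than guessed at.
        return f"unexpected exit code {code}"


print(describe_exit_code(100))
```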

### Echo Provider
- Basic prompt echoing
- Variable substitution
- Multiple variable handling

### Assertions
- `contains` assertion
- `icontains` assertion (case-insensitive)
- Multiple assertions per test
- Failing assertion behavior

## Why Echo Provider?

The `echo` provider is perfect for smoke tests because:

1. **No external dependencies** - Doesn't require API keys or network calls
2. **Deterministic** - Always returns the same output for the same input
3. **Fast** - No network latency
4. **Predictable** - Easy to write assertions against
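
The determinism argument can be seen in a toy model of the echo flow. The sketch below is a simplified stand-in written for illustration only: `render_prompt`, `echo_provider`, `contains`, and `icontains` are hypothetical Python functions, not promptfoo's implementation, and the placeholder handling covers only plain `{{name}}` variables.

```python
import re
from typing import Dict


def render_prompt(template: str, variables: Dict[str, str]) -> str:
    """Replace {{name}} placeholders; raises KeyError for a missing variable."""

    def substitute(match: "re.Match[str]") -> str:
        return variables[match.group(1)]

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


def echo_provider(prompt: str) -> str:
    """Mimic an echo provider: the output is the rendered prompt itself."""
    return prompt


def contains(output: str, value: str) -> bool:
    """Case-sensitive substring check."""
    return value in output


def icontains(output: str, value: str) -> bool:
    """Case-insensitive substring check."""
    return value.lower() in output.lower()


output = echo_provider(
    render_prompt(
        "Hello {{name}}, welcome to {{place}}",
        {"name": "Alice", "place": "Wonderland"},
    )
)
print(output)  # Hello Alice, welcome to Wonderland
```

Since the output depends only on the prompt template and the variables, assertions such as `contains(output, "Alice")` or `icontains(output, "WELCOME")` give the same result on every run.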

## Adding New Smoke Tests

1. Create a new test config in `fixtures/configs/` if needed
2. Add test methods to the appropriate test class in `test_smoke.py`
3. Use the `run_promptfoo()` helper to execute CLI commands
4. Make assertions on stdout, stderr, exit codes, and output files

## Notes

- Smoke tests run slower than unit tests (they spawn subprocesses)
- They require Node.js; promptfoo itself can be installed globally or fetched on demand via `npx`
- They test the integration between Python and Node.js
- They should be kept focused on critical functionality
1 change: 1 addition & 0 deletions tests/smoke/__init__.py
@@ -0,0 +1 @@
"""Smoke tests for promptfoo CLI."""
22 changes: 22 additions & 0 deletions tests/smoke/fixtures/configs/assertions.yaml
@@ -0,0 +1,22 @@
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
description: 'Smoke test - multiple assertions'

providers:
  - echo

prompts:
  - 'Hello {{name}}, welcome to {{place}}'

tests:
  - vars:
      name: Alice
      place: Wonderland
    assert:
      - type: contains
        value: Hello
      - type: contains
        value: Alice
      - type: contains
        value: Wonderland
      - type: icontains
        value: WELCOME
17 changes: 17 additions & 0 deletions tests/smoke/fixtures/configs/basic.yaml
@@ -0,0 +1,17 @@
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
description: 'Smoke test - basic config validation'

providers:
  - echo

prompts:
  - 'Hello {{name}}'

tests:
  - vars:
      name: World
    assert:
      - type: contains
        value: Hello
      - type: contains
        value: World
17 changes: 17 additions & 0 deletions tests/smoke/fixtures/configs/failing-assertion.yaml
@@ -0,0 +1,17 @@
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
description: 'Smoke test - config with failing assertion'

providers:
  - echo

prompts:
  - 'Hello {{name}}'

tests:
  - vars:
      name: World
    assert:
      # This assertion will fail because echo returns "Hello World",
      # which does not contain "IMPOSSIBLE_STRING_NOT_IN_OUTPUT_12345"
      - type: contains
        value: IMPOSSIBLE_STRING_NOT_IN_OUTPUT_12345