Prevent agents from running entire test suites inadvertently by steering them toward targeted testing.
The Pytest Full-Suite Steering feature monitors tool calls from LLMs and issues a steering message when they attempt to run the entire test suite without selecting specific tests. This avoids time-consuming full-suite runs when targeted testing would be more efficient.
When an agent tries to run pytest without selecting specific tests, the proxy intercepts the command, returns a steering message instead of executing it, and remembers the attempt. If the agent re-issues the same command in the same session, the proxy lets it through, on the assumption that the full suite is genuinely needed.
- Automatic Detection: Recognizes full-suite pytest commands across various formats
- Smart Steering: Blocks first attempt, allows second attempt (per session)
- Session Isolation: Each session has independent state
- Configurable Messages: Customize the steering message
- Wrapper Recognition: Detects pytest through pipenv, poetry, uv, etc.
- Priority Handling: Runs at priority 95 (after dangerous commands, before generic steering)
The feature is disabled by default and can be enabled via CLI flag, environment variable, or YAML configuration.
Configuration priority (highest first): CLI > Environment Variable > YAML Configuration
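The priority order can be illustrated with a minimal sketch. The function `resolve_enabled` and its parameters are hypothetical names chosen for this example and are not taken from the project's code; the sketch also assumes the environment variable accepts only `true` or `false` (case-insensitive) and that unparseable values are ignored.

```python
from typing import Mapping, Optional

ENV_VAR = "PYTEST_FULL_SUITE_STEERING_ENABLED"
YAML_KEY = "pytest_full_suite_steering_enabled"


def resolve_enabled(
    cli_flag: Optional[bool],
    env: Mapping[str, str],
    yaml_session: Mapping[str, object],
) -> bool:
    """Resolve the setting: CLI first, then environment, then YAML, else False."""
    # 1. An explicit --enable/--disable flag wins over everything else.
    if cli_flag is not None:
        return cli_flag
    # 2. Environment variable (assumption: only "true"/"false" are meaningful).
    raw = env.get(ENV_VAR, "").strip().lower()
    if raw == "true":
        return True
    if raw == "false":
        return False
    # 3. YAML value under the `session` section.
    value = yaml_session.get(YAML_KEY)
    if isinstance(value, bool):
        return value
    # 4. The feature is disabled by default.
    return False
```

Passing `os.environ` as `env` would mirror a real launch; taking the mappings as arguments keeps the sketch easy to test.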
CLI flags:
    --enable-pytest-full-suite-steering   # Enable the feature
    --disable-pytest-full-suite-steering  # Explicitly disable
Environment variable:
    export PYTEST_FULL_SUITE_STEERING_ENABLED=true  # or false
YAML configuration:
    session:
      pytest_full_suite_steering_enabled: true  # Default: false
      # Optional: Custom steering message
      pytest_full_suite_steering_message: |
        You requested to run the whole test suite. This may be a lengthy process.
        Please consider running only selected tests for optimal speed. If you still
        believe you need to run the whole test suite, please re-send your tool call
        and it will be executed.
Enable via CLI:
    .venv/Scripts/python.exe -m src.core.cli \
      --enable-pytest-full-suite-steering \
      --default-backend openai
Create config/my_config.yaml:
    session:
      pytest_full_suite_steering_enabled: true
      pytest_full_suite_steering_message: |
        FULL TEST SUITE DETECTED
        You're about to run the entire test suite, which may take several minutes.
        Consider running specific tests instead:
        - pytest tests/unit/test_file.py
        - pytest tests/integration/
        - pytest -k "test_name_pattern"
        If you really need the full suite, re-send the same command.
Then run:
    .venv/Scripts/python.exe -m src.core.cli --config config/my_config.yaml
Enable via environment variable:
    export PYTEST_FULL_SUITE_STEERING_ENABLED=true
    .venv/Scripts/python.exe -m src.core.cli --default-backend openai
The handler detects these as full-suite commands:
- pytest (no arguments)
- python -m pytest
- py.test
- pytest . (current directory)
- pytest -q (only flags, no file selectors)
- pipenv run pytest
- poetry run pytest
- uv run pytest
These commands are recognized as targeted and pass through:
- pytest tests/unit/test_file.py (specific file)
- pytest tests/ (specific directory)
- pytest -k marker (keyword-expression filtering)
- pytest --lf (last failed tests)
- pytest tests/unit/test_file.py::TestClass::test_method (node selection)
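The distinction between full-suite and targeted commands can be approximated with a small classifier. This is a sketch built only from the examples listed above, not the handler's actual detection logic; `is_full_suite` and the helper names are hypothetical. It treats any positional argument other than `.` as a test selector, so a flag that takes a separate value (for example `-n 4`) would be misread as targeted.

```python
import shlex
from typing import List, Optional

# Wrapper prefixes taken from the examples above.
WRAPPERS = (["pipenv", "run"], ["poetry", "run"], ["uv", "run"])


def _pytest_args(tokens: List[str]) -> Optional[List[str]]:
    """Return the arguments passed to pytest, or None if this is not pytest."""
    for wrapper in WRAPPERS:
        if tokens[:2] == wrapper:
            tokens = tokens[2:]
            break
    if not tokens:
        return None
    if tokens[0] in ("pytest", "py.test"):
        return tokens[1:]
    if tokens[0] == "python" and tokens[1:3] == ["-m", "pytest"]:
        return tokens[3:]
    return None


def is_full_suite(command: str) -> bool:
    """True when the command runs pytest without narrowing the test set."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes: leave the command alone
    args = _pytest_args(tokens)
    if args is None:
        return False
    for arg in args:
        if arg == "--lf" or arg.startswith("-k"):
            return False  # selection flags narrow the run
        if arg.startswith("-"):
            continue  # other flags such as -q do not select tests
        if arg in (".", "./"):
            continue  # current directory still means everything
        return False  # file, directory, or node ID selector
    return True
```

For example, `is_full_suite("poetry run pytest -q")` evaluates to `True`, while `is_full_suite("pytest tests/")` evaluates to `False`.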
First attempt:
- Agent sends tool call: pytest
- Handler intercepts and swallows the command
- Returns steering message instead of executing
- Remembers this command for the current session
Second attempt:
- Agent re-issues the same command: pytest
- Handler recognizes it's the second attempt
- Allows the command to execute
- Assumes the agent genuinely needs the full suite
Session isolation:
- New session starts
- Agent sends tool call: pytest
- Handler swallows it (independent session state)
- Process repeats for this new session
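The flow above can be sketched as a small gate keyed by session and command. `TwoAttemptGate` and `STEERING_MESSAGE` are hypothetical names for illustration only. The sketch assumes the command string must match exactly and that a command stays allowed for the rest of the session once it has been steered; the real handler may treat later attempts differently.

```python
from collections import defaultdict
from typing import DefaultDict, Optional, Set

STEERING_MESSAGE = (
    "You requested to run the whole test suite. "
    "Re-send your tool call if you still need it."
)


class TwoAttemptGate:
    """Steer the first full-suite attempt, allow the repeat, per session."""

    def __init__(self) -> None:
        # session id -> full-suite commands already steered in that session
        self._seen: DefaultDict[str, Set[str]] = defaultdict(set)

    def check(self, session_id: str, command: str) -> Optional[str]:
        """Return a steering message to swallow the call, or None to run it."""
        seen = self._seen[session_id]
        if command in seen:
            return None  # second attempt: trust the agent's judgment
        seen.add(command)
        return STEERING_MESSAGE  # first attempt: swallow and steer


gate = TwoAttemptGate()
first = gate.check("session-a", "pytest")    # steering message
second = gate.check("session-a", "pytest")   # None, command executes
other = gate.check("session-b", "pytest")    # steering message again
```

Keeping the state in a per-session set is what makes `session-b` start from scratch even though `session-a` already repeated the command.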
The handler monitors these tool names for pytest commands:
- bash, cmd, shell
- exec, exec_command, execute, execute_command
- run_command, run_shell_command, run_terminal_cmd
- powershell, pwsh, executepwsh
- python (for -m pytest invocations)
- terminal, local_shell
- container.exec
The handler extracts commands from various argument formats:
- String: "pytest"
- Dict with command: {"command": "pytest"}
- Dict with input: {"input": "pytest"}
- List/tuple: ["pytest", "-q"]
- Nested structures: {"body": {"command": "pytest"}}
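One way to normalize these argument shapes into a single command string is a short recursive helper. This is a sketch under assumptions: `extract_command` is a hypothetical name, only the `command` and `input` keys shown above are treated as command carriers, and nested dicts are searched depth-first. The handler's real parsing may cover more cases.

```python
import shlex
from typing import Any, Optional

COMMAND_KEYS = ("command", "input")


def extract_command(arguments: Any) -> Optional[str]:
    """Return the shell command found in tool-call arguments, if any."""
    if isinstance(arguments, str):
        return arguments.strip() or None
    if isinstance(arguments, (list, tuple)):
        # An argv-style list is joined back into one shell-quoted string.
        if arguments and all(isinstance(part, str) for part in arguments):
            return shlex.join(arguments)
        return None
    if isinstance(arguments, dict):
        # Prefer the well-known keys, then fall back to nested dicts.
        for key in COMMAND_KEYS:
            if key in arguments:
                found = extract_command(arguments[key])
                if found:
                    return found
        for value in arguments.values():
            if isinstance(value, dict):
                found = extract_command(value)
                if found:
                    return found
    return None
```

With this helper, `["pytest", "-q"]` becomes the string `pytest -q`, and `{"body": {"command": "pytest"}}` yields `pytest`.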
The handler maintains session state with:
- TTL: 1800 seconds (30 minutes) by default
- Max Sessions: 1024 sessions cached
- Automatic Cleanup: Expired sessions are pruned automatically
Each session tracks which full-suite commands have been attempted, ensuring the second attempt is allowed through.
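A bounded, expiring session store along these lines could look like the sketch below. `SessionStateCache` is a hypothetical name, and two behaviors are assumptions rather than documented facts: the TTL is refreshed on every access (sliding expiry), and the least recently used session is evicted when the limit is exceeded.

```python
import time
from collections import OrderedDict
from typing import Callable, Set, Tuple


class SessionStateCache:
    """Per-session sets of attempted commands with TTL and a size cap."""

    def __init__(
        self,
        ttl_seconds: float = 1800.0,
        max_sessions: int = 1024,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._max = max_sessions
        self._clock = clock
        # session id -> (last access time, attempted commands), oldest first
        self._entries: "OrderedDict[str, Tuple[float, Set[str]]]" = OrderedDict()

    def _prune(self, now: float) -> None:
        expired = [
            sid for sid, (ts, _) in self._entries.items() if now - ts > self._ttl
        ]
        for sid in expired:
            del self._entries[sid]
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # drop least recently used

    def commands_for(self, session_id: str) -> Set[str]:
        """Return the mutable command set for a session, creating it if needed."""
        now = self._clock()
        self._prune(now)
        _, commands = self._entries.pop(session_id, (now, set()))
        self._entries[session_id] = (now, commands)  # move to newest position
        self._prune(now)  # enforce the size cap after the insert
        return commands

    def __len__(self) -> int:
        return len(self._entries)
```

Injecting `clock` makes expiry testable without sleeping; in normal use the default `time.monotonic` avoids surprises from wall-clock adjustments.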
During iterative development, agents may default to running the full test suite:
# Agent's first attempt
Agent: "Let me run pytest to check all tests"
Proxy: [Steering message suggesting targeted tests]
# Agent's second attempt (if needed)
Agent: "Run pytest"
Proxy: [Executes the full suite]
The steering message educates agents about targeted testing:
Consider running specific tests instead:
- pytest tests/unit/test_file.py
- pytest tests/integration/
- pytest -k "test_name_pattern"
The two-attempt mechanism provides:
- First attempt: Gentle steering toward efficiency
- Second attempt: Respects agent's judgment if full suite is needed
Feature not working:
- Verify the feature is enabled (check config/env/CLI)
- Check logs for "Steering full-suite pytest command" messages
- Ensure the tool name is recognized (see "Recognized Shell Tools")
- Verify the command format matches detection patterns
Commands not being detected:
- Test the detection logic manually (see implementation)
- Check if the command includes file/directory selectors
- Verify the tool name is in the recognized list
- Review logs for parsing errors
Steering message not appearing:
- Confirm the feature is enabled in configuration
- Check that the command is recognized as full-suite
- Verify session state is being maintained
- Review handler registration in logs
Second attempt not working:
- Ensure the exact same command is being re-issued
- Check session TTL hasn't expired (default: 30 minutes)
- Verify session ID is consistent across attempts
- Review session state in logs
The handler runs at priority 95 by default. This can be adjusted if needed:
    session:
      tool_call_reactor:
        pytest_full_suite_steering_enabled: true
        pytest_full_suite_handler_priority: 95  # Adjust if needed
Adjust the session state TTL if needed:
    session:
      pytest_full_suite_steering_enabled: true
      session_state_ttl: 3600  # 1 hour instead of default 30 minutes
Related features:
- Pytest Context Saving - Automatically add helpful pytest flags
- Test Execution Reminder - Remind agents to run tests before completing tasks
- Dangerous Command Protection - Block destructive commands
- Tool Access Control - Fine-grained control over tool usage
- Handler: src/core/services/tool_call_handlers/pytest_full_suite_handler.py
- Tests: tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py
- Configuration: src/core/config/app_config.py
- CLI: src/core/cli.py
- DI Registration: src/core/di/services.py