End-to-end tests for the StackRox MCP server using gevals.
- Go 1.25+
- OpenAI API Key (for AI agent and LLM judge)
- StackRox API Token
cd e2e-tests
./scripts/build-gevals.sh

Create .env file:
OPENAI_API_KEY=sk-your-key-here
STACKROX_API_TOKEN=your-token-here

./scripts/run-tests.sh

Results are saved to gevals-stackrox-mcp-e2e-out.json.
# Summary
jq '.tasks[] | {name, passed}' gevals-stackrox-mcp-e2e-out.json
# Tool calls
jq '.tasks[].callHistory[] | {toolName, arguments}' gevals-stackrox-mcp-e2e-out.json

| Test | Description | Tool |
|---|---|---|
| list-clusters | List all clusters | list_clusters |
| cve-affecting-workloads | CVE impact on deployments | get_deployments_for_cve |
| cve-affecting-clusters | CVE impact on clusters | get_clusters_for_cve |
| cve-nonexistent | Handle non-existent CVE | get_clusters_for_cve |
| cve-cluster-scooby | CVE with cluster filter | get_clusters_for_cve |
| cve-cluster-maria | CVE with cluster filter | get_clusters_for_cve |
| cve-clusters-general | General CVE query | get_clusters_for_cve |
| cve-cluster-list | CVE across clusters | get_clusters_for_cve |
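
To see only the tests that failed in a given run, the summary query can be narrowed with jq's select. This is a minimal sketch: the failed_tasks helper name is hypothetical, and it assumes that passed is a boolean on each entry of tasks, as the summary query above suggests.

```shell
# Hypothetical helper: print the name of every task whose "passed" field
# is not true. Assumes a top-level "tasks" array with "name" and "passed"
# fields, as used by the summary query in this README.
# Usage: failed_tasks RESULTS_FILE
failed_tasks() {
    jq -r '.tasks[] | select(.passed != true) | .name' "$1"
}
```

For example, `failed_tasks gevals-stackrox-mcp-e2e-out.json` prints one task name per line, or nothing if every task passed.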
- gevals/eval.yaml: Main test configuration, agent settings, assertions
- gevals/mcp-config.yaml: MCP server configuration
- gevals/tasks/*.yaml: Individual test task definitions
Gevals uses a proxy architecture to intercept MCP tool calls:
- AI agent receives task prompt
- Agent calls MCP tool
- Gevals proxy intercepts and records the call
- Call forwarded to StackRox MCP server
- Server executes and returns result
- Gevals validates assertions and response quality
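
The calls recorded by the proxy are what the callHistory query reads. As a rough illustration of a "tool was called" check, done by hand rather than by gevals itself, the sketch below tests whether a given task called a given tool. The tool_was_called helper is hypothetical, and the field names (tasks, name, callHistory, toolName) are taken from the jq queries in this README rather than from a documented schema.

```shell
# Hypothetical helper: exits 0 if the named task recorded at least one call
# to the named tool, non-zero otherwise.
# Usage: tool_was_called RESULTS_FILE TASK_NAME TOOL_NAME
tool_was_called() {
    # "any" yields true/false; -e turns a false result into a non-zero exit.
    # The "?" keeps jq from erroring on tasks without a callHistory array.
    jq -e --arg task "$2" --arg tool "$3" \
        'any(.tasks[] | select(.name == $task) | .callHistory[]?; .toolName == $tool)' \
        "$1" > /dev/null
}
```

For example, `tool_was_called gevals-stackrox-mcp-e2e-out.json list-clusters list_clusters && echo ok` prints ok only if that task's call history contains a list_clusters call.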
Tests fail - no tools called
- Verify StackRox Central is accessible
- Check API token permissions
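
Before looking at Central itself, it may be worth ruling out an incomplete .env file. The sketch below only checks that both keys from the setup step are present and non-empty; it does not validate the token or its permissions. The check_env_file helper name is hypothetical.

```shell
# Hypothetical helper: report any required key that is missing or empty in
# the given env file (defaults to .env). Returns non-zero if any key is
# missing, so it can be used as a guard before running the tests.
check_env_file() {
    file=${1:-.env}
    status=0
    for key in OPENAI_API_KEY STACKROX_API_TOKEN; do
        # Require "KEY=" at the start of a line followed by at least one character.
        if ! grep -Eq "^${key}=.+" "$file"; then
            echo "missing or empty: $key" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_env_file .env && ./scripts/run-tests.sh` runs the tests only when both keys are set.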
Build errors
go mod tidy
./scripts/build-gevals.sh