[Backlog]: Standardized Reporting and Severity Scoring for GenAI Red Teaming #31
Open
Labels
backlog (New backlog entry)
Checklist
- Backlog entry requires creating new sandboxes.
- Backlog entry requires creating new exploitation code and/or tutorials.
CVE List
No response
Description
While current exploitation modules demonstrate vulnerabilities, there is no standardized way to represent or communicate the severity and impact of findings.
This creates challenges in:
- comparing results across different tools (e.g., garak, promptfoo, agent-based attacks)
- prioritizing remediation
- integrating findings into real-world workflows
Proposal
Define a standard reporting and scoring model for GenAI red teaming results:
- Severity classification framework, including:
  - critical (e.g., sensitive data exfiltration, RCE via agent tools)
  - high (e.g., prompt injection leading to policy bypass)
  - medium/low (e.g., hallucination without direct impact)
- Impact dimensions, such as:
  - confidentiality (PII leakage, secrets exposure)
  - integrity (prompt manipulation, data poisoning)
  - availability (system misuse, denial of service patterns)
- Output format:
  - standardized report structure
  - compatible with evaluation framework outputs
- Optional alignment with:
  - existing risk scoring models (e.g., CVSS-inspired approach for GenAI)
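To make the proposal more concrete, here is a minimal Python sketch of what a finding record and report could look like. Everything in it is an assumption for discussion, not an agreed design: the `Finding`, `Impact`, and `Severity` names, the `schema_version` field, and especially the rule that derives severity from the three impact dimensions, which is a placeholder and not a calibrated or CVSS-compatible score.

```python
import json
from dataclasses import dataclass
from enum import Enum


class Severity(str, Enum):
    """Severity levels from the proposal (medium/low split into two values)."""
    CRITICAL = "critical"
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


class Impact(int, Enum):
    """Hypothetical three-step scale for each impact dimension."""
    NONE = 0
    LOW = 1
    HIGH = 2


# Report ordering: most severe findings first.
SEVERITY_RANK = {s: i for i, s in enumerate(Severity)}


@dataclass
class Finding:
    title: str
    tool: str  # source tool, e.g. "garak" or "promptfoo"
    confidentiality: Impact = Impact.NONE
    integrity: Impact = Impact.NONE
    availability: Impact = Impact.NONE

    def severity(self) -> Severity:
        # Placeholder rule (an assumption, not a calibrated model):
        # one HIGH dimension plus any other impact -> critical,
        # one HIGH dimension alone -> high,
        # two or more LOW dimensions -> medium, otherwise low.
        levels = [self.confidentiality, self.integrity, self.availability]
        total = sum(int(level) for level in levels)
        worst = max(levels)
        if worst == Impact.HIGH and total >= 3:
            return Severity.CRITICAL
        if worst == Impact.HIGH:
            return Severity.HIGH
        if total >= 2:
            return Severity.MEDIUM
        return Severity.LOW

    def to_record(self) -> dict:
        """Flatten the finding into a JSON-serializable dict."""
        return {
            "title": self.title,
            "tool": self.tool,
            "severity": self.severity().value,
            "impact": {
                "confidentiality": self.confidentiality.name.lower(),
                "integrity": self.integrity.name.lower(),
                "availability": self.availability.name.lower(),
            },
        }


def build_report(findings: list[Finding]) -> str:
    """Serialize findings to one JSON report, sorted by severity."""
    ordered = sorted(findings, key=lambda f: SEVERITY_RANK[f.severity()])
    report = {
        "schema_version": "0.1",  # hypothetical version field
        "findings": [f.to_record() for f in ordered],
    }
    return json.dumps(report, indent=2)


example = [
    Finding("Hallucinated citation", "promptfoo"),
    Finding("Secrets exposed via agent tool", "garak",
            confidentiality=Impact.HIGH, integrity=Impact.LOW),
]
print(build_report(example))
```

Keeping the tool name and the per-dimension impact in every record is what would let results from different tools be compared side by side; the severity rule itself is the part most likely to change if the model is aligned with an existing scoring scheme.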
Value
- Makes findings actionable and comparable
- Enables integration with security workflows and dashboards
- Provides a bridge between technical testing and risk management