[Backlog]: Standardized Reporting and Severity Scoring for GenAI Red Teaming

### Checklist

- [ ] Backlog entry requires creating new sandboxes.
- [x] Backlog entry requires creating new exploitation code and/or tutorials.

### CVE List

_No response_

### Description

While current exploitation modules demonstrate vulnerabilities, there is no standardized way to represent or communicate the **severity and impact** of findings.

This creates challenges in:
- comparing results across different tools (e.g., garak, promptfoo, agent-based attacks)
- prioritizing remediation
- integrating findings into existing security workflows and dashboards

### Proposal

Define a **standard reporting and scoring model** for GenAI red teaming results:

- **Severity classification framework**, including:
  - critical (e.g., sensitive data exfiltration, RCE via agent tools)
  - high (e.g., prompt injection leading to policy bypass)
  - medium/low (e.g., hallucination without direct impact)

- **Impact dimensions**, such as:
  - confidentiality (PII leakage, secrets exposure)
  - integrity (prompt manipulation, data poisoning)
  - availability (system misuse, denial-of-service patterns)

- **Output format**:
  - standardized report structure
  - compatible with evaluation framework outputs

- **Optional alignment** with existing risk scoring models (e.g., a CVSS-inspired approach for GenAI)
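
To make the proposal above more concrete, here is a minimal Python sketch of what a finding record, a CVSS-inspired score, and a JSON report could look like. Everything in it is an assumption for discussion rather than an agreed design: the `Level`, `Finding`, and `build_report` names, the 0-3 rating per impact dimension, the 70/30 weighting, and the `schema_version` field are all hypothetical. Only the score bands (0.1-3.9 low, 4.0-6.9 medium, 7.0-8.9 high, 9.0-10.0 critical) are borrowed from the CVSS v3 qualitative rating scale.

```python
import json
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical 0-3 rating for a single impact dimension."""
    NONE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Finding:
    """One red-teaming result. All field names are illustrative assumptions."""
    title: str
    tool: str  # source tool, e.g. "garak" or "promptfoo"
    confidentiality: Level
    integrity: Level
    availability: Level

    def score(self) -> float:
        """Return a 0-10 score dominated by the worst-rated dimension."""
        dims = [self.confidentiality, self.integrity, self.availability]
        # Assumed weighting: 70% worst dimension, 30% mean of all three.
        blended = 0.7 * max(dims) + 0.3 * sum(dims) / len(dims)
        return round(blended / Level.HIGH * 10, 1)

    def severity(self) -> str:
        """Map the score to a label using the CVSS v3 qualitative bands."""
        s = self.score()
        if s >= 9.0:
            return "critical"
        if s >= 7.0:
            return "high"
        if s >= 4.0:
            return "medium"
        return "low" if s > 0 else "none"

    def to_dict(self) -> dict:
        """Serialize to a flat, tool-agnostic record."""
        return {
            "title": self.title,
            "tool": self.tool,
            "impact": {
                "confidentiality": self.confidentiality.name.lower(),
                "integrity": self.integrity.name.lower(),
                "availability": self.availability.name.lower(),
            },
            "score": self.score(),
            "severity": self.severity(),
        }


def build_report(findings: list[Finding]) -> str:
    """Render findings as a JSON report, most severe first."""
    ordered = sorted(findings, key=lambda f: f.score(), reverse=True)
    report = {"schema_version": "0.1", "findings": [f.to_dict() for f in ordered]}
    return json.dumps(report, indent=2)


if __name__ == "__main__":
    # Example findings mirroring the severity tiers listed in the proposal.
    findings = [
        Finding("Hallucination without direct impact", "promptfoo",
                Level.NONE, Level.LOW, Level.NONE),
        Finding("Secrets exfiltration via agent tool", "garak",
                Level.HIGH, Level.HIGH, Level.LOW),
        Finding("Prompt injection leading to policy bypass", "garak",
                Level.NONE, Level.HIGH, Level.NONE),
    ]
    print(build_report(findings))
```

With this weighting, the three example findings land in the critical, high, and low bands respectively, which matches the tiers sketched in the proposal. Letting the worst dimension dominate is a deliberate choice here, so that a single severe impact (for instance a confidentiality breach) is not diluted by benign ratings on the other two dimensions; the weights themselves would need calibration against real findings.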

### Value

- Makes findings **actionable and comparable**
- Enables integration with **security workflows and dashboards**
- Provides a bridge between **technical testing and risk management**
