[Backlog]: Standardized Reporting and Severity Scoring for GenAI Red Teaming #31

@vishaljindal1990

Checklist

  • Backlog entry requires creating new sandboxes.
  • Backlog entry requires creating new exploitation code and/or tutorials.

CVE List

No response

Description

While current exploitation modules demonstrate vulnerabilities, there is no standardized way to represent or communicate the severity and impact of findings.

This creates challenges in:

  • comparing results across different tools (e.g., garak, promptfoo, agent-based attacks)
  • prioritizing remediation
  • integrating findings into real-world workflows

Proposal

Define a standard reporting and scoring model for GenAI red teaming results:

  • Severity classification framework, including:

    • critical (e.g., sensitive data exfiltration, RCE via agent tools)
    • high (e.g., prompt injection leading to policy bypass)
    • medium/low (e.g., hallucination without direct impact)
  • Impact dimensions, such as:

    • confidentiality (PII leakage, secrets exposure)
    • integrity (prompt manipulation, data poisoning)
    • availability (system misuse, denial-of-service patterns)
  • Output format:

    • standardized report structure
    • compatible with evaluation framework outputs
  • Optional alignment with:

    • existing risk scoring models (e.g., CVSS-inspired approach for GenAI)
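
To make the proposal concrete, here is a minimal sketch of what a finding record and report could look like. Everything in it is an assumption for discussion rather than an agreed schema: the `Finding` fields, the 0 to 3 impact scale, and the rule that the worst impact dimension decides the severity label are all hypothetical, and a real CVSS-inspired model would likely weigh more factors (exploitability, scope, and so on).

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class Severity(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


# Hypothetical mapping: index = worst impact level (0-3) across all dimensions.
_LEVEL_TO_SEVERITY = [Severity.LOW, Severity.MEDIUM, Severity.HIGH, Severity.CRITICAL]


@dataclass
class Finding:
    """One red-teaming result, normalized across tools (illustrative fields)."""
    title: str
    source_tool: str          # e.g. "garak", "promptfoo"
    confidentiality: int = 0  # 0 = no impact ... 3 = severe (assumed scale)
    integrity: int = 0
    availability: int = 0

    def __post_init__(self) -> None:
        for name in ("confidentiality", "integrity", "availability"):
            level = getattr(self, name)
            if not 0 <= level <= 3:
                raise ValueError(f"{name} must be between 0 and 3, got {level}")

    def severity(self) -> Severity:
        # Simplifying assumption: the worst single dimension sets the label.
        worst = max(self.confidentiality, self.integrity, self.availability)
        return _LEVEL_TO_SEVERITY[worst]


def to_report(findings: list[Finding]) -> str:
    """Serialize findings to JSON, most severe first."""
    ordered = sorted(
        findings,
        key=lambda f: _LEVEL_TO_SEVERITY.index(f.severity()),
        reverse=True,
    )
    payload = {
        "findings": [{**asdict(f), "severity": f.severity().value} for f in ordered]
    }
    return json.dumps(payload, indent=2)


if __name__ == "__main__":
    findings = [
        Finding("Hallucinated citation", "promptfoo", integrity=1),
        Finding("Secrets exfiltration via agent tool", "garak", confidentiality=3),
        Finding("Prompt injection bypasses policy", "garak", integrity=2),
    ]
    print(to_report(findings))
```

A flat JSON structure like this is only one option. The point is that each tool's output gets mapped into the same fields, so results can be sorted, compared, and fed into dashboards without tool-specific parsing.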

Value

  • Makes findings actionable and comparable
  • Enables integration with security workflows and dashboards
  • Provides a bridge between technical testing and risk management
