FEAT: Add JailBreakV-28K dataset loader #1548

Open
diamond8658 wants to merge 1 commit into microsoft:main from diamond8658:feature/jailbreakv-28k-loader

diamond8658 commented Mar 27, 2026

FEAT: Add JailBreakV-28K remote dataset loader

Description

This PR adds a remote dataset loader for the JailBreakV-28K (V0.2) benchmark. It ingests over 28,000 jailbreak prompts directly from the source CSV hosted on the SaFo-Lab repository. Resolves issue #1007.

Key Changes:

  • Asynchronous Ingestion: Implements _JailBreakV28KDataset, which inherits from _RemoteDatasetLoader and fetches the remote data asynchronously.
  • CSV Version 0.2 Support: Specifically targets the versioned CSV release to ensure reproducibility and stability.
  • Policy Mapping: Maps raw SaFo-Lab codes (P1-P5) to human-readable harm categories (e.g., Somatic Safety, Public Interest) to maintain consistency with existing PyRIT datasets.
  • Defensive Filtering: Includes a safety check to skip prompts containing Jinja2 syntax ({{, {%) to prevent unintended orchestrator execution.
  • Metadata Enrichment: Preserves raw policy codes and categories within the SeedPrompt metadata for granular downstream analysis.
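
The policy mapping, Jinja2 filtering, and metadata steps listed above can be sketched roughly as follows. This is a minimal standalone illustration using only the standard library, not the loader's actual code: the column names (`jailbreak_query`, `policy`), the pairing of codes to category names, the record layout, and the helper `rows_to_records` are all assumptions made for the example. The PR names the categories but does not state which code maps to which.

```python
import csv
import io

# Hypothetical pairing of policy codes to category names. The PR names these
# categories but does not say which code maps to which.
POLICY_CATEGORIES = {
    "P1": "Somatic Safety",
    "P2": "Public Interest",
}
UNKNOWN_POLICY = "Unknown Policy"
JINJA_MARKERS = ("{{", "{%")


def rows_to_records(csv_text, prompt_column="jailbreak_query", policy_column="policy"):
    """Convert CSV text into plain dict records (hypothetical helper).

    Column names are assumptions for this sketch.
    """
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        prompt = (row.get(prompt_column) or "").strip()
        # Skip blank prompts (an extra guard in this sketch) and prompts
        # that contain Jinja2 syntax.
        if not prompt or any(marker in prompt for marker in JINJA_MARKERS):
            continue
        code = (row.get(policy_column) or "").strip()
        # Unknown codes fall back to a fixed label.
        category = POLICY_CATEGORIES.get(code, UNKNOWN_POLICY)
        records.append(
            {
                "value": prompt,
                "harm_categories": [category],
                # Keep both the raw code and the readable category.
                "metadata": {"policy_code": code, "policy_category": category},
            }
        )
    return records


sample = (
    "jailbreak_query,policy\n"
    "Plain prompt text,P1\n"
    "Prompt with {{ template }},P2\n"
    "Another prompt,P9\n"
)
records = rows_to_records(sample)
```

In this sketch the second sample row is dropped because it contains `{{`, and the third keeps its raw code `P9` in the metadata while its category falls back to "Unknown Policy".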

Tests and Documentation

Unit Testing:

  • Added comprehensive unit tests in tests/unit/datasets/test_jailbreakv_28k_dataset.py.
  • Verified 100% test coverage for the new loader class.
  • Handled edge cases including:
    • Malformed/missing CSV columns.
    • Jinja2 safety filtering logic.
    • Unknown policy code fallbacks (defaulting to "Unknown Policy").
    • Empty remote responses (raising ValueError).
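
The edge cases above could be exercised along these lines. This is a hedged sketch rather than the PR's test file: `read_rows` and `raises_value_error` are hypothetical helpers, the required column names are assumed, and raising ValueError for missing columns is an assumption (the description only confirms ValueError for empty responses).

```python
import csv
import io

# Assumed column names; the PR description does not list them.
REQUIRED_COLUMNS = {"jailbreak_query", "policy"}


def read_rows(csv_text):
    """Parse CSV text, raising ValueError for empty or malformed input."""
    if not csv_text or not csv_text.strip():
        raise ValueError("Remote response was empty")
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        # Assumption: missing columns are treated as a hard error.
        raise ValueError(f"CSV is missing columns: {sorted(missing)}")
    return list(reader)


def raises_value_error(csv_text):
    """Return True if read_rows rejects the given text with ValueError."""
    try:
        read_rows(csv_text)
    except ValueError:
        return True
    return False


empty_rejected = raises_value_error("")
missing_column_rejected = raises_value_error("id,policy\n1,P1\n")
valid_rows = read_rows("jailbreak_query,policy\nhello,P1\n")
```

With a test runner such as pytest, the same checks would typically be written with `pytest.raises(ValueError)` instead of the small try/except helper used here.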

Linting:

  • Verified that the code passes all flake8 and ruff checks.
  • Follows the Microsoft PyRIT coding standards and naming conventions.

JupyText & Documentation:

  • This PR implements a backend data loader and does not modify existing .ipynb or .md documentation files.
  • No JupyText synchronization was required as no new notebook-based tutorials were introduced in this PR.
  • Verified the loader's output format is compatible with the SeedDataset model used throughout the library's existing documentation and orchestrators.

Implements V0.2 CSV ingestion with policy mapping and Jinja2 safety filtering.
diamond8658 marked this pull request as ready for review March 27, 2026 20:50