
FEAT: GCG public API - GCG + GCGConfig + ExperimentalWarning, shifts module to experimental status #1792

Open
romanlutz wants to merge 12 commits into microsoft:main from romanlutz:romanlutz/gcg-config-api

@romanlutz
Summary

Phase 2 of the GCG (Greedy Coordinate Gradient) refactor (umbrella: #960; closes the work started in #1717).

Introduces a clean, supported public API for the upstream GCG paper implementation that PyRIT currently embeds:

from pyrit.auxiliary_attacks.gcg import GCG, GCGConfig, GCGModelConfig

generator = GCG(config=GCGConfig(...))
await generator.execute_async()

GCG is an alias for GCGGenerator, which is a PromptGeneratorStrategy. The legacy GreedyCoordinateGradientAdversarialSuffixGenerator entry point and its YAML-driven experiment runner are removed — see "Why no deprecation cycle" below.

What's in this PR

  • New public API surface (pyrit/auxiliary_attacks/gcg/):
    • GCG / GCGGenerator: PromptGeneratorStrategy subclass that owns the GCG run lifecycle.
    • GCGConfig (+ GCGModelConfig, GCGDataConfig, related dataclasses) — typed, programmatic config to replace the previous YAML-only entry point.
    • data.py — the data-transport layer for the AML compute side.
  • ExperimentalWarning(FutureWarning) (pyrit/exceptions/) emitted on import pyrit.auxiliary_attacks. Subclasses FutureWarning so it is visible by default (unlike DeprecationWarning), and tells users to pin if they depend on this module.
  • AML Dockerfile fix: Ubuntu 22.04's apt python3.11 is frozen at 3.11.0rc1, which is missing sys.get_int_max_str_digits (added for CVE-2020-10735 before the 3.11.0 final release). Modern torch._dynamo.polyfills.sys references that attribute, so import torch fails. Switched to uv python install 3.11 (python-build-standalone), which gives us a final 3.11.x build.
  • Removal of the deprecated wrapper — see below.
  • Docs:
    • doc/code/auxiliary_attacks/0_auxiliary_attacks.py/.ipynb — adds an "Experimental module" callout.
    • doc/code/auxiliary_attacks/1_gcg_azure_ml.py/.ipynb — rewritten around the new GCG / GCGConfig API; outputs regenerated from a successful AML run.
  • Tests: test_generator.py, test_config.py, test_public_api.py, test_experimental_warning.py. Removed test_attack_wiring.py, test_gcg_core.py, and test_lifecycle.py, which covered the old shim.

Why this module is moving to "experimental" status

pyrit/auxiliary_attacks/ is being formally treated as experimental going forward. The ExperimentalWarning is the user-visible expression of that.

This matters because the GCG configuration story is not finished — and we want the freedom to evolve it without a deprecation cycle each time. In particular, GCGDataConfig currently takes a URL string that the AML compute then reads via pd.read_csv(...) (today: AdvBench from raw.githubusercontent.com). That's fine for the bundled public dataset but doesn't generalize:

  • We want to support arbitrary PyRIT-fetched datasets (i.e. anything reachable via pyrit.datasets fetchers / SeedDataset[SeedObjective]).
  • We want to support internal / private datasets that aren't fetchable from a public URL and need to be materialized locally and shipped to the compute as part of the job.

Standardizing on SeedDataset and building a dataset-transfer adapter on top is the natural next step — but it's explicitly out of scope for this PR. The ExperimentalWarning lets us land that change cleanly in a follow-up instead of needing a deprecation path now. This is also why we removed the old shim outright (next section).
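
For context, the current URL-to-CSV path described above amounts to roughly the following. This is a minimal sketch, not the PR's load_goals_and_targets: it uses the stdlib csv module instead of pd.read_csv so it runs without extra dependencies, and the function name, the n_train parameter, and the "goal" / "target" column names are assumptions for illustration.

```python
import csv
import io
import random


def load_goals_and_targets_sketch(csv_text, n_train=None, random_seed=None):
    """Parse CSV text with 'goal' and 'target' columns into two parallel lists."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if random_seed is not None:
        # Shuffle reproducibly: the same seed always yields the same row order.
        random.Random(random_seed).shuffle(rows)
    if n_train is not None:
        # Keep only the first n_train rows (after the optional shuffle).
        rows = rows[:n_train]
    goals = [row["goal"] for row in rows]
    targets = [row["target"] for row in rows]
    return goals, targets


# Hypothetical two-row dataset; quoted fields may contain commas.
SAMPLE_CSV = (
    "goal,target\n"
    '"Say hi","Sure, here is a greeting"\n'
    '"Say bye","Sure, here is a farewell"\n'
)

goals, targets = load_goals_and_targets_sketch(SAMPLE_CSV)
first_shuffle = load_goals_and_targets_sketch(SAMPLE_CSV, random_seed=7)
second_shuffle = load_goals_and_targets_sketch(SAMPLE_CSV, random_seed=7)
```

A private dataset cannot take this path unchanged because the compute side has no URL to read from, which is the gap the follow-up work is meant to close.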

Why no deprecation cycle for the old API

The previous shim entry point (GreedyCoordinateGradientAdversarialSuffixGenerator) is removed in this PR rather than deprecated. With auxiliary_attacks now explicitly marked experimental, a deprecation cycle doesn't add value — the contract is "pin to a specific PyRIT version if you depend on this module," not "we will preserve names across releases."

Validation

  • Unit tests: 113/113 pass.
  • Integration tests: 12/12 pass.
  • Pre-commit clean (ruff, ruff-format, nbstripout, sanitize-notebook-paths, ty).
  • End-to-end on Azure ML: jobs strong_steelpan_2gyz6c2txv, nice_sponge_pb776chjvh, nice_plow_gw2mw89bmj, and great_vulture_lwn9y2fs10 all Completed against gcg-gpu-a100. The notebook outputs committed here are from great_vulture_lwn9y2fs10 (30 steps, final loss 1.189, suffix generated). The last run also validated the managed-identity (MI) ACR push/pull path (the ACR admin user has since been disabled in our workspace; that work was handled separately).

What's next (not in this PR)

  • Dataset standardization — a SeedDataset-backed GCGDataConfig with both a "fetch via pyrit.datasets" path and a "materialize-then-embed" path for private datasets.
  • Extension protocols + opinionated defaults for GCGModelConfig.
  • Doc polish + an "extending GCG" guide.

Refs: #960 (umbrella), #1717 (Phase 1, closed).

romanlutz and others added 12 commits May 15, 2026 16:12
…nfig

Phase 2 of the GCG refactor (umbrella issue microsoft#960). Replaces the
33-parameter generate_suffix() entry point with a typed
PromptGeneratorStrategy that follows the same lifecycle/identity pattern
as FuzzerGenerator and AnecdoctorGenerator.

NEW PUBLIC API (pyrit/auxiliary_attacks/gcg/):

  GCGGenerator(PromptGeneratorStrategy[GCGContext, GCGResult], Identifiable)
    __init__:        models, test_models, algorithm, strategy, output, hf_token
    execute_async:   goals, targets, test_goals, test_targets, memory_labels
    _setup_async:    spawn worker subprocesses
    _perform_async:  build attack, run optimization loop, parse log
    _teardown_async: stop workers (runs even on failure - fixes leak bug)

  GCGContext(PromptGeneratorStrategyContext): goals, targets, ...
  GCGResult(PromptGeneratorStrategyResult): final_suffix, final_loss,
                                            step_count, loss_history,
                                            control_history, log_path,
                                            memory_labels

  Sub-configs (init-time strategy configuration):
    GCGModelConfig:     HF identifier + device + kwargs per model
    GCGAlgorithmConfig: n_steps, batch_size, topk, weights, seed, ...
    GCGStrategyConfig:  transfer / progressive / anneal / stop_on_success
    GCGOutputConfig:    result_prefix, logfile, verbose

  GCGConfig: AML-transport-only bag (sub-configs + GCGDataConfig + hf_token)
             with to_json/from_json. Used by the notebook to ship a config
             into an AML job; library callers go straight to GCGGenerator.

  load_goals_and_targets(data, random_seed): CSV -> goal/target lists,
             decoupled from the runtime path.

Minimal call:

    generator = GCGGenerator(
        models=[GCGModelConfig(name="meta-llama/Llama-2-7b-chat-hf")],
    )
    result = await generator.execute_async(goals=[...], targets=[...])
    print(result.final_suffix)
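
The to_json/from_json transport described for GCGConfig can be pictured with a small nested-dataclass round trip. This is only a sketch under stated assumptions: the class names, the device field, and every default value below are hypothetical stand-ins, and only the field names n_steps, batch_size, and topk come from the listing above.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ModelConfigSketch:
    name: str
    device: str = "cuda:0"  # assumed default, for illustration only


@dataclass
class AlgorithmConfigSketch:
    n_steps: int = 30       # assumed defaults, for illustration only
    batch_size: int = 512
    topk: int = 256


@dataclass
class ConfigSketch:
    models: list[ModelConfigSketch] = field(default_factory=list)
    algorithm: AlgorithmConfigSketch = field(default_factory=AlgorithmConfigSketch)

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses and lists of dataclasses.
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def from_json(cls, text: str) -> "ConfigSketch":
        raw = json.loads(text)
        # Rebuild the nested dataclasses explicitly; json.loads returns plain dicts.
        return cls(
            models=[ModelConfigSketch(**m) for m in raw["models"]],
            algorithm=AlgorithmConfigSketch(**raw["algorithm"]),
        )


original = ConfigSketch(models=[ModelConfigSketch(name="meta-llama/Llama-2-7b-chat-hf")])
payload = original.to_json()
restored = ConfigSketch.from_json(payload)
```

Because the payload is plain JSON, it can be written to a file and handed to a remote job as an input, which matches the notebook-to-AML use the commit message describes.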

CHANGES:

- Add pyrit/auxiliary_attacks/gcg/{__init__.py, config.py, data.py,
  generator.py} - the new typed public API with full lifecycle.
- _build_identifier exposes model names + key hyperparams as behavioral
  params for memory traceability.
- _teardown_async fixes the previously-known worker-leak-on-failure case
  (the test that characterized that bug now asserts the positive
  behaviour: workers ARE stopped on failure).
- Rewrite experiments/train.py: GreedyCoordinateGradientAdversarialSuffixGenerator
  becomes a deprecated thin shim that builds a GCGGenerator and calls
  execute_async via asyncio.run. DeprecationWarning emitted. Dead args
  (gbda_deterministic, model_name, num_train_models translation) accepted
  for backcompat.
- Rewrite experiments/run.py as a thin --config CLI wrapper: deserialize
  GCGConfig, build GCGGenerator, load goals/targets, execute_async.
  Accepts --output-dir for AML output mounting.
- Delete all 12 experiments/configs/*.yaml files (the only useful info in
  them was the HF model name; everything else has a sensible default in
  the dataclass or is an orthogonal concern).
- Drop _MODEL_NAMES / _ALL_MODELS friendly-name aliasing - users pass the
  HF path directly.
- Drop ml-collections dependency (and transitive absl-py) from the gcg
  extra and the dev group; the dataclasses replace its only use.
- Migrate doc/code/auxiliary_attacks/1_gcg_azure_ml.{py,ipynb} to build a
  GCGConfig and ship it to AML as a uri_file input. (Notebook .ipynb
  regenerated from .py but not re-executed against AML in this commit;
  re-execution should happen before merge.)
- Refresh experiments/README.md to document the new API.
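
The worker-leak fix in the _teardown_async item above comes down to running teardown in a finally block. The sketch below illustrates that pattern only; the class is a hypothetical stand-in that records events instead of spawning workers, and it does not reproduce how PromptGeneratorStrategy actually sequences its hooks.

```python
import asyncio


class LifecycleSketch:
    """Records lifecycle events; a stand-in for a generator that owns workers."""

    def __init__(self, fail=False):
        self.fail = fail
        self.events = []

    async def _setup_async(self):
        self.events.append("setup")  # real code would start workers here

    async def _perform_async(self):
        self.events.append("perform")
        if self.fail:
            raise RuntimeError("optimization failed")
        return "suffix"

    async def _teardown_async(self):
        self.events.append("teardown")  # real code would stop workers here

    async def execute_async(self):
        await self._setup_async()
        try:
            return await self._perform_async()
        finally:
            # Runs on success and on failure, so workers are never leaked.
            await self._teardown_async()


ok_run = LifecycleSketch()
ok_result = asyncio.run(ok_run.execute_async())

failed_run = LifecycleSketch(fail=True)
try:
    asyncio.run(failed_run.execute_async())
    failure_propagated = False
except RuntimeError:
    failure_propagated = True
```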

TESTS:

- tests/unit/auxiliary_attacks/gcg/test_generator.py (new): GCGGenerator
  init/identifier/validate_context, execute_async lifecycle (with mocked
  workers), worker cleanup on success and on failure, real-attack-class
  wiring tests, _read_result.
- tests/unit/auxiliary_attacks/gcg/test_config.py (existing): config
  dataclass validation + JSON round-trip; unchanged.
- tests/unit/auxiliary_attacks/gcg/test_data_and_config.py: rewritten to
  cover the load_goals_and_targets helper, _resolve_output, and the new
  --config CLI wrapper.
- tests/unit/auxiliary_attacks/gcg/test_lifecycle.py: shrunk to 3 tests
  verifying the generate_suffix shim still translates kwargs and emits
  the deprecation warning. Worker-lifecycle behaviour is now covered in
  test_generator.py.
- tests/unit/auxiliary_attacks/gcg/test_attack_wiring.py: kept the two
  real-class wiring tests; the third (which tested through _create_attack
  on the legacy Generator) moved to test_generator.py.
- tests/unit/auxiliary_attacks/gcg/test_gcg_core.py: removed
  TestToLegacyParams/TestApplyTargetAugmentation/TestCreateAttack (now in
  test_generator.py).

109 gcg unit tests pass; pre-commit clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Introduce `pyrit.exceptions.ExperimentalWarning` (subclass of `FutureWarning`)
and emit it on import of `pyrit.auxiliary_attacks` so users are clearly
informed that the module's APIs may change in any release without a
deprecation cycle.

* Add `ExperimentalWarning` class in `pyrit/exceptions/exception_classes.py`
  and export from `pyrit.exceptions`.
* Emit the warning in `pyrit/auxiliary_attacks/__init__.py` (stacklevel=2 so
  it points at the user's import statement).
* Add unit tests covering: warning emission, FutureWarning subclassing, and
  silencing via warnings.filterwarnings.
* Update `0_auxiliary_attacks` and `1_gcg_azure_ml` notebook sources to
  include a callout explaining the experimental status and how to silence
  the warning.
* Re-execute `0_auxiliary_attacks.ipynb` to refresh outputs.
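
The behaviour covered by those unit tests (emission, FutureWarning subclassing, silencing) can be reproduced with the stdlib warnings module alone. This is a sketch: the class and function names are stand-ins, and the function call simulates what a module-level warnings.warn in a package __init__.py would do on import.

```python
import warnings


class ExperimentalWarningSketch(FutureWarning):
    """Stand-in for the warning category described above."""


def import_experimental_module():
    # Simulates the warnings.warn call that would sit at module level.
    warnings.warn(
        "This module is experimental and may change in any release.",
        ExperimentalWarningSketch,
        stacklevel=2,  # attribute the warning to the caller, not this function
    )


# Emission: the warning is recorded like any other category.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    import_experimental_module()
emitted = [w for w in caught if issubclass(w.category, ExperimentalWarningSketch)]

# Silencing: a category-specific "ignore" filter added afterwards is checked first.
with warnings.catch_warnings(record=True) as silenced:
    warnings.simplefilter("always")
    warnings.filterwarnings("ignore", category=ExperimentalWarningSketch)
    import_experimental_module()
```

Users who depend on the module would add the equivalent filterwarnings call before the import, using the real category exported from pyrit.exceptions.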

The `1_gcg_azure_ml.ipynb` outputs will be refreshed in a follow-up commit
once its AML execution completes (~30-45 minute paid GPU run).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sync the notebook with the experimental-warning callout added to the .py
source. The notebook is committed without cell outputs (same as prior
HEAD state) — running it submits a paid AML GPU job and the existing
AML Dockerfile currently fails at import time due to a torch _dynamo
polyfill expecting sys.get_int_max_str_digits to be present.

This failure is environment-side (preexisting; tracked separately) and
unrelated to the ExperimentalWarning change: unit tests (190/190) and
local GCG integration tests (12/12 with RUN_ALL_TESTS=true) all pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Ubuntu 22.04 jammy ships python3.11 frozen at 3.11.0rc1 (pre-release),
which predates the addition of `sys.get_int_max_str_digits` in Python
3.11.0 final. Modern torch (>=2.7) references this attribute at import
time via `torch._dynamo.polyfills.sys.substitute_in_graph`, so any
`import torch` inside the apt-built container raises:

  AttributeError: module 'sys' has no attribute 'get_int_max_str_digits'

This cascaded to `import transformers` (which imports torch) and broke
the GCG AML notebook job during model load.

Switch to `uv python install 3.11` (python-build-standalone) and create
the venv with `--python-preference only-managed` so apt's broken rc1 is
never picked up.

Verified locally:
  * Python is 3.11.15 with `sys.get_int_max_str_digits() == 4300`
  * `import torch` (2.12.0) and `import transformers` (5.9.0) succeed
  * `from pyrit.auxiliary_attacks.gcg.data import load_goals_and_targets`
    succeeds and the ExperimentalWarning fires as expected
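
A check along the following lines is one way to confirm that a container's interpreter is not the broken release candidate. It is a sketch rather than the verification script used for this commit; the helper names are hypothetical, and the fake interpreter object only simulates what 3.11.0rc1 reports.

```python
import sys
from types import SimpleNamespace


def has_int_max_str_digits(sys_module=sys):
    """Return True when the interpreter exposes sys.get_int_max_str_digits."""
    return callable(getattr(sys_module, "get_int_max_str_digits", None))


def describe_interpreter(sys_module=sys):
    info = sys_module.version_info
    version = f"{info.major}.{info.minor}.{info.micro} ({info.releaselevel})"
    status = "ok" if has_int_max_str_digits(sys_module) else "missing get_int_max_str_digits"
    return f"Python {version}: {status}"


# Report on the interpreter actually running this script.
print(describe_interpreter())

# Simulated release-candidate interpreter that lacks the attribute.
fake_rc1 = SimpleNamespace(
    version_info=SimpleNamespace(major=3, minor=11, micro=0, releaselevel="candidate"),
)
rc1_report = describe_interpreter(fake_rc1)

# Simulated final build that exposes the attribute.
fake_final = SimpleNamespace(
    version_info=SimpleNamespace(major=3, minor=11, micro=15, releaselevel="final"),
    get_int_max_str_digits=lambda: 4300,
)
final_report = describe_interpreter(fake_final)
```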

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
End-to-end run on the fixed Dockerfile (uv-managed Python 3.11) completed
successfully on AML A100. Captures live job-status progression and the
generated GCG suffix after 30 steps (loss 1.107).

AML job: strong_steelpan_2gyz6c2txv (Completed)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Promote the public surface of `pyrit.auxiliary_attacks.gcg` so that callers
can import everything they need from the package root rather than reaching
into `.config`, `.generator`, etc.

* Add `GCG = GCGGenerator` as a short alias so user code reads:

      from pyrit.auxiliary_attacks.gcg import GCG, GCGConfig, GCGModelConfig

      generator = GCG(models=[GCGModelConfig(name="...")])

* Add `GCG` to `__all__` and update the package docstring example.
* Update the AML notebook (`1_gcg_azure_ml.py`/`.ipynb`) to import the
  configs from the package root instead of `.config`. Outputs are preserved
  from the previous successful AML run.
* Refresh the experiments README so the top example uses the modern API
  and the short import path; drop the deprecated long-named class.
* Add `tests/unit/auxiliary_attacks/gcg/test_public_api.py` covering:
  the alias identity, full `__all__` membership, and import smoke-tests.

All unit tests pass (116/116).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…Generator shim

`pyrit.auxiliary_attacks` is now flagged as experimental on import (see the
`ExperimentalWarning` introduced in the previous commits) and the new public
API is the GCG / GCGGenerator class. A deprecation cycle is unnecessary for
an experimental surface, so just drop the legacy shim.

* Delete `pyrit/auxiliary_attacks/gcg/experiments/train.py`
  (the `GreedyCoordinateGradientAdversarialSuffixGenerator` wrapper that
  emitted `DeprecationWarning` from `generate_suffix`).
* Delete `tests/unit/auxiliary_attacks/gcg/test_lifecycle.py` — its only
  coverage was the deprecated shim.
* Drop the now-dead `train_mod` import from
  `tests/unit/auxiliary_attacks/gcg/test_attack_wiring.py`.
* Refresh the `pyrit/auxiliary_attacks/gcg/config.py` module docstring so it
  no longer describes the removed entry point; the docstring now just
  documents the typed dataclasses.
* Refresh the `pyrit/auxiliary_attacks/gcg/generator.py` module docstring;
  drop the "Replaces the legacy ..." line.
* Rename internal `_to_legacy_params` -> `_to_attack_params` (the name
  "legacy" only referred to the now-deleted shim — the dotted-attribute
  namespace it builds is what the internal GCG attack helpers actually
  expect, so a neutral name reads cleaner).

113 unit tests + 12 integration tests pass (RUN_ALL_TESTS=true).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…api-resume

# Conflicts:
#	doc/code/auxiliary_attacks/0_auxiliary_attacks.ipynb
…wn9y2fs10

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…_method

Two bugs surfaced on PR microsoft#1792's CI matrix that the local single-extra run
did not catch:

1. `pyrit/auxiliary_attacks/gcg/__init__.py` eagerly imported
   `data.py` and `generator.py`, both of which transitively
   `import torch`. That made the whole package unimportable on the
   `dev` extra (no torch), so every `main-job` matrix entry failed at
   test collection time with `ModuleNotFoundError: No module named
   'torch'` for `test_data_and_config.py` and `test_public_api.py`.

   Fix: keep the short import path `from pyrit.auxiliary_attacks.gcg
   import GCG` but resolve the torch-dependent names lazily via PEP 562
   `__getattr__`. Config dataclasses stay eager (pure stdlib). Touching
   any of `GCG` / `GCGGenerator` / `GCGContext` / `GCGResult` /
   `load_goals_and_targets` triggers the underlying import on first
   access. If torch is missing, that access raises the expected
   `ModuleNotFoundError` pointing at torch, and users who only want the
   config types no longer hit an import-time crash.

2. `GCGGenerator.__init__` called
   `torch.multiprocessing.set_start_method('spawn')` whenever the
   current method was not already 'spawn'. In the `coverage` job the
   default Linux `fork` method is set (implicitly or by an earlier
   test) before any GCG test runs, and `set_start_method` without
   `force=True` raises `RuntimeError: context has already been set`.
   13 GCG tests failed for this reason. Constructing a generator should
   not touch global multiprocessing state anyway -- workers only get
   spawned later inside `_setup_async`.

   Fix: move the start-method guard out of `__init__` into a new
   `_ensure_spawn_start_method` helper invoked from `_setup_async`.
   Make it idempotent: set `spawn` only when nothing has been set yet;
   warn (do not crash) if some earlier code locked in a different
   context, since silently flipping the global out from under unrelated
   code would be worse than running with the existing setting.
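
The lazy-resolution approach in fix 1 relies on PEP 562, which lets a module define `__getattr__` for names not found in its namespace. The sketch below builds a throwaway module object so the example is self-contained; in a real package the function would simply be defined at the top level of `__init__.py`. The names `pkg_sketch`, `heavy_sqrt`, and `needs_missing_dep` are hypothetical, and `math` stands in for the heavy dependency.

```python
import importlib
import types


def make_lazy_getattr(module, lazy_names):
    """Build a PEP 562 module-level __getattr__ for the given lazy names."""

    def __getattr__(name):
        if name in lazy_names:
            module_name, attr = lazy_names[name]
            # The import happens here, on first access, not at package import.
            value = getattr(importlib.import_module(module_name), attr)
            setattr(module, name, value)  # cache so __getattr__ is not hit again
            return value
        raise AttributeError(f"module {module.__name__!r} has no attribute {name!r}")

    return __getattr__


pkg = types.ModuleType("pkg_sketch")
pkg.EagerConfig = dict  # eager, dependency-free names are assigned as usual
pkg.__getattr__ = make_lazy_getattr(
    pkg,
    {
        "heavy_sqrt": ("math", "sqrt"),
        "needs_missing_dep": ("module_that_does_not_exist_xyz", "anything"),
    },
)

loaded_before = "heavy_sqrt" in vars(pkg)
sqrt_result = pkg.heavy_sqrt(9.0)
loaded_after = "heavy_sqrt" in vars(pkg)

try:
    pkg.needs_missing_dep
    missing_error = None
except ModuleNotFoundError as exc:
    # The error names the missing dependency, not the package that wraps it.
    missing_error = exc.name
```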

Test adjustments:
- `test_data_and_config.py`: move `from ...gcg.data import
  load_goals_and_targets` below the existing `pytest.importorskip`
  for `attack_manager` and add a sibling `importorskip` for the
  data module so the file is properly skipped on torch-less installs.
- `test_public_api.py`: gate the whole file with
  `pytest.importorskip('torch')`; the public surface it exercises
  pulls in torch-dependent lazy attributes.
- `test_generator.py`: add regression tests covering both fixes -
  `__init__` no longer touches global mp state, and the new helper
  sets / no-ops / warns correctly across the three start-method states.
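
The three start-method states mentioned above (set, no-op, warn) can be sketched as follows. This is an illustration, not the PR's `_ensure_spawn_start_method`: it uses the stdlib `multiprocessing` module rather than `torch.multiprocessing`, takes the module as a parameter so the demo does not touch real global state, and the returned status strings and the `FakeMP` class are additions for demonstration.

```python
import multiprocessing
import warnings


def ensure_spawn_start_method(mp=multiprocessing):
    """Set 'spawn' only if no start method has been chosen yet."""
    current = mp.get_start_method(allow_none=True)
    if current is None:
        mp.set_start_method("spawn")
        return "set"
    if current == "spawn":
        return "noop"
    # Another component already fixed a different method; do not override it.
    warnings.warn(
        f"multiprocessing start method is already {current!r}; leaving it unchanged",
        RuntimeWarning,
        stacklevel=2,
    )
    return "warned"


class FakeMP:
    """Minimal stand-in exposing the two multiprocessing calls used above."""

    def __init__(self, method):
        self.method = method

    def get_start_method(self, allow_none=False):
        return self.method

    def set_start_method(self, method):
        self.method = method


unset = FakeMP(None)
unset_status = ensure_spawn_start_method(unset)

already_spawn = FakeMP("spawn")
spawn_status = ensure_spawn_start_method(already_spawn)

forked = FakeMP("fork")
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fork_status = ensure_spawn_start_method(forked)
```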

Local: 117/117 GCG unit tests pass (4 new), 8071/8071 in full unit
suite, pre-commit clean. Validated dev-only import path via subprocess
that pre-blocks torch on sys.meta_path: package imports successfully,
config-only access works, lazy attr raises ModuleNotFoundError pointing
at torch as expected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>