Use this checklist to add a new scanner safely and consistently.
- Identify exact file formats/extensions.
- Document the concrete security risks you are detecting (CVE, known exploit class, or clearly documented abuse pattern).
- Decide whether dependencies are optional and how scanner behavior degrades when missing.
References: docs/agents/architecture.md and
docs/maintainers/scanner-cve-coverage.md
Create modelaudit/scanners/<format>_scanner.py with:
- name, description, and supported_extensions
- can_handle() that is strict and deterministic
- scan() that uses result.add_check(...) with clear severity and rationale
- Path/size validation via base helpers before heavy parsing
For large scanners, move reusable parser/state helpers into
modelaudit/scanners/<format>_support/ and keep <format>_scanner.py as the
public class entrypoint plus orchestration layer.
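As a rough illustration of the "strict and deterministic" requirement for can_handle(), the sketch below accepts a file only when both its extension and its leading bytes match. The extension, the magic bytes, and the helper name looks_like_example are hypothetical placeholders, not part of modelaudit.

```python
from __future__ import annotations

import os

# Hypothetical values for illustration; a real format defines its own.
EXAMPLE_EXTENSIONS = (".example",)
EXAMPLE_MAGIC = b"EXMP"


def looks_like_example(path: str) -> bool:
    """Return True only when both the extension and the leading bytes match."""
    if not os.path.isfile(path):
        return False
    if not path.lower().endswith(EXAMPLE_EXTENSIONS):
        return False
    try:
        with open(path, "rb") as fh:
            return fh.read(len(EXAMPLE_MAGIC)) == EXAMPLE_MAGIC
    except OSError:
        # Unreadable files are simply not handled; routing should never raise.
        return False
```

The same input always yields the same answer, and no heavy parsing happens during routing, which is the property the checklist item asks for.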
Skeleton:
from __future__ import annotations

from typing import Any, ClassVar

from .base import BaseScanner, IssueSeverity, ScanResult


class ExampleScanner(BaseScanner):
    name = "example"
    description = "Scans Example model artifacts for security issues"
    supported_extensions: ClassVar[list[str]] = [".example"]

    @classmethod
    def can_handle(cls, path: str) -> bool:
        # extension + minimal signature checks
        ...

    def scan(self, path: str) -> ScanResult:
        result = self._create_scan_result_after_preflight(path)
        if not result.success:
            return result
        # Add checks here
        result.finish(success=not result.has_errors)
        return result

Update modelaudit/scanner_registry_metadata.py:
- Add one scanner descriptor entry with module/class metadata
- Set priority, direct extensions, and any descriptor-owned header_formats/content_routed_extensions carefully
- Declare dependency names for load-time diagnostics
- Document intentional descriptor/class extension differences with scanner_only_extensions instead of leaving silent drift
- If the extension is authoritative enough for extension-only format validation, add it to the descriptor-owned EXTENSION_FORMAT_MAP; leave generic text/config extensions out
- Do not add a second class map in __getattr__; lazy exports are resolved from descriptor metadata
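The idea of documenting descriptor/class extension differences instead of letting them drift can be sketched as follows. ExampleDescriptor and undeclared_drift are hypothetical stand-ins written for this illustration; the real descriptor type in scanner_registry_metadata.py may use different field names and structure.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ExampleDescriptor:
    """Hypothetical stand-in for one registry entry; field names are assumptions."""

    scanner_id: str
    module: str
    class_name: str
    priority: int
    extensions: tuple[str, ...]
    dependencies: tuple[str, ...] = ()
    scanner_only_extensions: tuple[str, ...] = ()


def undeclared_drift(desc: ExampleDescriptor, class_extensions: list[str]) -> set[str]:
    """Extensions the class claims that the descriptor neither routes nor documents."""
    declared = set(desc.extensions) | set(desc.scanner_only_extensions)
    return set(class_extensions) - declared


desc = ExampleDescriptor(
    scanner_id="example",
    module="modelaudit.scanners.example_scanner",
    class_name="ExampleScanner",
    priority=50,  # arbitrary illustrative value
    extensions=(".example",),
    scanner_only_extensions=(".exm",),
)
```

A regression test could then assert that undeclared_drift(desc, ExampleScanner.supported_extensions) is empty, so any new class-level extension has to be declared one way or the other.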
- Optional deps must fail gracefully with actionable messages.
- Never crash at import time when an optional package is missing.
- If adding a new dependency, update pyproject.toml extras and justify it in PR notes.
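One possible shape for graceful degradation is to import the optional package lazily and return an actionable hint instead of raising. The helper name load_optional and the modelaudit[...] extra name in the message are assumptions for illustration; substitute whatever extra pyproject.toml actually defines.

```python
from __future__ import annotations

import importlib
from types import ModuleType


def load_optional(module_name: str, extra: str) -> tuple[ModuleType | None, str | None]:
    """Import an optional package lazily; return (module, None) or (None, hint)."""
    try:
        return importlib.import_module(module_name), None
    except Exception as exc:  # broken installs can raise more than ImportError
        hint = (
            f"{module_name} is unavailable ({type(exc).__name__}: {exc}). "
            f"Install it with: pip install 'modelaudit[{extra}]'"  # extra name is hypothetical
        )
        return None, hint
```

Calling this inside scan() rather than at module top level keeps the scanner importable, and the hint can be surfaced through a check so users see what to install.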
Add focused tests under tests/:
- Safe/benign sample: scanner passes expected checks
- Malicious sample: scanner emits expected findings and severities
- Corrupt input: parser errors are handled cleanly
- Missing dependency path (if optional)
- Regression tests for edge cases and previously reported bypasses
- Registry/routing regressions: add scan_file() and ScannerRegistry.get_scanner_for_path() positives/negatives for extension collisions, header aliases, and content-routed archive names
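The benign, malicious, and corrupt cases above might look roughly like this in pytest style. toy_scan, its check names, and the MAGIC signature are stand-ins so the example runs on its own; real tests would call the actual scanner and assert on its result checks.

```python
from __future__ import annotations

from pathlib import Path

MAGIC = b"EXMP"  # hypothetical format signature


def toy_scan(path: Path) -> list[tuple[str, str]]:
    """Stand-in scanner returning (check name, severity) findings."""
    try:
        data = path.read_bytes()
    except OSError:
        return [("File Read", "critical")]
    if not data.startswith(MAGIC):
        return [("Format Validation", "warning")]
    if b"os.system" in data:
        return [("Dangerous Call Detection", "critical")]
    return []


def test_benign_sample_has_no_findings(tmp_path: Path) -> None:
    sample = tmp_path / "ok.example"
    sample.write_bytes(MAGIC + b"weights")
    assert toy_scan(sample) == []


def test_malicious_sample_reports_named_check(tmp_path: Path) -> None:
    sample = tmp_path / "bad.example"
    sample.write_bytes(MAGIC + b"os.system('id')")
    assert ("Dangerous Call Detection", "critical") in toy_scan(sample)


def test_corrupt_input_is_reported_not_raised(tmp_path: Path) -> None:
    sample = tmp_path / "corrupt.example"
    sample.write_bytes(b"\x00\x01")
    assert toy_scan(sample) == [("Format Validation", "warning")]
```

Each test asserts a specific check name and severity, which catches silent downgrades that a bare "result exists" assertion would miss.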
uv run ruff format modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run ruff check --fix modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1

If the change touches packages/modelaudit-picklescan, also run:
cargo fmt --manifest-path packages/modelaudit-picklescan/Cargo.toml -- --check
cargo check --manifest-path packages/modelaudit-picklescan/Cargo.toml
cargo clippy --manifest-path packages/modelaudit-picklescan/Cargo.toml --all-targets -- -D warnings
cargo test --manifest-path packages/modelaudit-picklescan/Cargo.toml
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest packages/modelaudit-picklescan/tests -q

- Scanner implementation + tests are included
- Registry wiring and dependency metadata are correct
- User-facing docs updated if behavior is visible to end users
- No security checks were downgraded without explicit rationale
Adding a CVE detection differs from adding a whole new scanner — you wire into existing scanners and shared detector infrastructure. Here is the typical multi-file workflow:
- modelaudit/detectors/suspicious_symbols.py: Add regex pattern list, register it in CVE_COMBINED_PATTERNS, and update validate_patterns().
- modelaudit/detectors/cve_patterns.py: Add _check_cve_XXXX_multiline() detection function + _create_cve_XXXX_attribution() helper, then wire into analyze_cve_patterns().
- modelaudit/scanners/<format>_scanner.py: Add version check method (if version-gated) and wire into the scanner's vulnerability checks.
- modelaudit/config/explanations.py: Add explanation function with type-specific messages.
- tests/detectors/test_cve_detection.py: Positive detection + false positive prevention + bypass prevention tests.
- tests/scanners/test_<format>_scanner.py: Version check tests (vulnerable + fixed).
- tests/conftest.py: Add test filenames to allowed_test_files.
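The first two steps of that workflow (a pattern list plus a detection function with an attribution helper) could be sketched as below. Every name, pattern, and field here is a placeholder invented for illustration; it does not reproduce the real contents of suspicious_symbols.py or cve_patterns.py, and requiring all indicators to match is one assumed way to limit false positives.

```python
from __future__ import annotations

import re

# Hypothetical pattern list for a made-up identifier; not a real CVE signature.
CVE_EXAMPLE_PATTERNS = [
    r"example_lib\.load_unsafe",
    r"allow_code_execution\s*=\s*True",
]


def check_cve_example_multiline(content: str) -> bool:
    """Flag content only when every indicator appears somewhere in it."""
    return all(re.search(p, content, re.MULTILINE) for p in CVE_EXAMPLE_PATTERNS)


def create_cve_example_attribution() -> dict[str, str]:
    """Describe the finding; all values are placeholders."""
    return {
        "cve_id": "CVE-XXXX-XXXXX",
        "severity": "critical",
        "description": "Placeholder description of the vulnerable call pattern",
    }


def analyze(content: str) -> list[dict[str, str]]:
    """Return one attribution per detected CVE pattern (here, at most one)."""
    if check_cve_example_multiline(content):
        return [create_cve_example_attribution()]
    return []
```

Tests for such a function would pair a positive sample with a near miss, for example content that calls the loader without the dangerous flag.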
- Use _is_primarily_documentation() for doc guards, not substring checks like "#" in content.
- STACK_GLOBAL opcodes have arg=None; reconstruct by walking backwards to preceding SHORT_BINUNICODE/BINUNICODE ops.
- Always assert specific check names/messages in tests, not just result is not None.
- Handle PEP 440 prerelease tags in version comparisons (2.10.0a0 is still vulnerable).
- Use except Exception (not except ImportError) for framework version-check imports.
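The STACK_GLOBAL point can be illustrated with the standard library's pickletools. This is a minimal sketch, assuming the module and attribute names were pushed as literal strings; it does not resolve strings fetched from the memo (for example via BINGET), which a production scanner would also need to handle. The function name stack_global_targets is hypothetical.

```python
from __future__ import annotations

import pickle
import pickletools

STRING_OPS = {"SHORT_BINUNICODE", "BINUNICODE"}


def stack_global_targets(data: bytes) -> list[tuple[str, str]]:
    """Return (module, name) pairs for each STACK_GLOBAL in a pickle stream."""
    ops = list(pickletools.genops(data))
    targets: list[tuple[str, str]] = []
    for i, (opcode, _arg, _pos) in enumerate(ops):
        if opcode.name != "STACK_GLOBAL":
            continue
        # arg is None here, so collect the two nearest preceding string pushes.
        strings: list[str] = []
        for prev_opcode, prev_arg, _ in reversed(ops[:i]):
            if prev_opcode.name in STRING_OPS:
                strings.append(prev_arg)
                if len(strings) == 2:
                    break
        if len(strings) == 2:
            name, module = strings  # collected nearest-first: name, then module
            targets.append((module, name))
    return targets


payload = pickle.dumps(len, protocol=4)
print(stack_global_targets(payload))  # [('builtins', 'len')]
```

Intervening opcodes such as MEMOIZE are skipped during the backward walk, which is why the lookup filters on opcode name rather than assuming fixed offsets.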