fix: reduce false positives and add global rule configuration by satoridev01 · Pull Request #49 · ParzivalHack/PySpector

satoridev01 · 2026-05-09T14:58:48Z

Summary

Introduces two new mechanisms for controlling rule behavior globally, then applies them to fix a class of false positives that surface when scanning mature Python codebases.

Problem

Scanning a large Django-based project produced 864 findings with -s HIGH. Manual verification showed that ~720 were false positives:

453 from PY515/SHELL645/SHELL670 — flagging every re.compile() call as "code compilation"
101 from PY102 — the taint engine firing on generic callable invocations
78 from PY511/JSON612 — json.loads() flagged as dangerous deserialization
dozens more from AUTH711, CSRF747, SESS744, PATH813, etc. — firing on test fixtures and well-known safe patterns

Without filtering, scanning also revealed ~9,000 additional Low-severity findings from rules that flag every Python built-in (isinstance, super, len, f-strings, getattr, etc.).

Solution

1. `[defaults]` section in rules TOML

[defaults]
exclude_file_patterns = ["*tests*", "*fixtures*", "*testdata*", "*conftest*"]
disabled_rule_ids     = ["ISINSTANCE855", "LEN1101", "SUPER1128", ...]

exclude_file_patterns — glob list applied to all rules; eliminates the need to repeat exclude_file_pattern on every individual rule
disabled_rule_ids — completely disables noisy rules without deleting their definitions (easy to re-enable per project)

2. Per-rule `exclude_pattern`

Regex matched against the flagged line; suppresses the finding if it matches. Fixes rules that are correct in general but have well-known safe variants:

[[rule]]
id = "PY107"
ast_match = "Call(func.value.id=yaml, func.attr=load)"
# Safe when Loader=SafeLoader is explicitly passed
exclude_pattern = "Loader\\s*=\\s*(yaml\\.)?(Safe|Base)Loader"

Results on Django 6.1-alpha

	Before	After
Total findings (all severities)	864 (-s HIGH)	383
`re.compile()` false positives	453	0
`yaml.load(SafeLoader)` false positives	2	0
Test fixture credential false positives	9	0
Session fixation false positives	1	0
Python built-in noise rules	~9,000 (hidden by -s HIGH)	0
`pickle.loads()` true positives	57 → 8 (test files excluded)	✓ present
`exec()` true positives	3	✓ 3

Files changed

src/pyspector/_rust_core/src/rules.rs — Defaults struct, Rule::is_file_excluded(), exclude_pattern field
src/pyspector/_rust_core/src/analysis/mod.rs — apply disabled_rule_ids before scan
src/pyspector/_rust_core/src/analysis/ast_analysis.rs — use is_file_excluded() and exclude_pattern
src/pyspector/_rust_core/src/analysis/config_analysis.rs — same
src/pyspector/rules/built-in-rules.toml — [defaults] section + per-rule fixes
tests/unit/test_false_positive_reductions.py — 26 tests (each fix has a suppression test + true-positive retention test)

Test plan

python -m pytest tests/unit/test_false_positive_reductions.py — 26 tests pass
Scan a Django project: pyspector scan /path/to/django — confirm re.compile(), yaml.load(SafeLoader), and test-fixture findings are gone
Confirm pickle.loads() and exec() findings still present in production code

Addresses a class of false positives identified through real-world scanning of a large Python codebase (Django 6.1-alpha). Introduces two new mechanisms for controlling rule behavior without modifying individual rule definitions. ## Changes ### New: `[defaults]` section in rules TOML A single top-level section that applies to every rule: ```toml [defaults] exclude_file_patterns = ["*tests*", "*fixtures*", "*testdata*", "*conftest*"] disabled_rule_ids = ["ISINSTANCE855", "LEN1101", "SUPER1128", ...] ``` - `exclude_file_patterns` — glob patterns excluded from ALL rules (replaces repeating `exclude_file_pattern` on each individual rule) - `disabled_rule_ids` — completely disable rules that produce noise without taint context (Python built-in calls, etc.) ### New: per-rule `exclude_pattern` field Regex applied to the matched line; if it matches, the finding is suppressed. Useful for rules that are correct in general but have well-known safe variants: ```toml [[rule]] id = "PY107" ast_match = "Call(func.value.id=yaml, func.attr=load)" exclude_pattern = "Loader\\s*=\\s*(yaml\\.)?(Safe|Base)Loader" ``` ### Rust core changes - `rules.rs`: adds `Defaults` struct with `exclude_file_patterns`, `disabled_rule_ids`; adds `Rule::is_file_excluded()` helper; adds `exclude_pattern: Option<Regex>` to `Rule` - `analysis/mod.rs`: applies `disabled_rule_ids` before scanning - `analysis/ast_analysis.rs`: uses `Rule::is_file_excluded()` and `exclude_pattern` - `analysis/config_analysis.rs`: same ### Rule fixes applied | Rule | Fix | |---|---| | PY107 / PY302 | `exclude_pattern` suppresses `yaml.load(..., Loader=SafeLoader)` | | PY515 / SHELL645 / SHELL670 | `exclude_pattern` suppresses `re.compile()` and `.compile()` method calls | | AUTH711 / ADMIN795 | `exclude_file_pattern` suppresses test fixtures | | CSRF747 / IMPORT825 / PATH813 / PY001 | `exclude_file_pattern` suppresses test files | | SESS744 | Pattern narrowed to `session.session_key = request.*` (actual fixation); writing data to a session is not fixation | | PY511 / JSON612 | Severity downgraded to Low; `json.loads()` alone is not a code execution risk | | 95 Low/Medium rules | Added to `disabled_rule_ids`; rules flagging every Python built-in call (`isinstance`, `super`, `len`, f-strings, etc.) have no security value without taint analysis | ### Test coverage `tests/unit/test_false_positive_reductions.py` — 26 tests covering: - Each suppressed false positive with a concrete code fixture - True-positive counterpart for each rule (ensure real findings still fire) - Global `[defaults]` behavior (test-file exclusion via global config, not per-rule)

ParzivalHack added the enhancement New feature or request label May 9, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: reduce false positives and add global rule configuration#49

fix: reduce false positives and add global rule configuration#49
satoridev01 wants to merge 1 commit intoParzivalHack:mainfrom
satoridev01:fix/reduce-false-positives-django

satoridev01 commented May 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

satoridev01 commented May 9, 2026

Summary

Problem

Solution

1. [defaults] section in rules TOML

2. Per-rule exclude_pattern

Results on Django 6.1-alpha

Files changed

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

1. `[defaults]` section in rules TOML

2. Per-rule `exclude_pattern`