34 changes: 34 additions & 0 deletions src/AI/AI-Assisted-Fuzzing-and-Vulnerability-Discovery.md
@@ -133,6 +133,37 @@ Implement a queue where confirmed PoV-validated patches and *speculative* patche

---

## 6. Deterministic File-by-File AI Code Review

A frequent failure mode in AI-assisted review is asking one agent to inspect a whole repository and hoping it chooses the right files and grep terms. A more reliable pattern is to **force repository coverage**:

1. Enumerate source files.
2. Send **one file at a time** plus minimal context (entrypoint, imports, nearby routes/callers).
3. Require a **structured report** for each file: source, sink, missing validation, exploit preconditions, and confidence.
4. Deduplicate reports by **source→sink pattern** and manually validate the high-risk ones.

This is token-heavy and noisy, but it is very effective at surfacing **simple high-impact source/sink bugs** that broad agentic reviews often miss.
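
The enumeration and deduplication steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: the `Finding` fields mirror the structured report in step 3, while `SOURCE_EXTENSIONS`, `SKIP_DIRS`, `enumerate_sources()`, and `dedupe()` are hypothetical names that assume a PHP target. The LLM call itself is deliberately left out.

```python
from __future__ import annotations

import os
from collections import defaultdict
from dataclasses import dataclass
from pathlib import Path
from typing import Iterator

# Assumptions for illustration: a PHP code base with third-party code in vendor/.
SOURCE_EXTENSIONS = {".php", ".inc"}
SKIP_DIRS = {".git", "vendor", "node_modules"}


@dataclass(frozen=True)
class Finding:
    """One structured report entry returned for a single reviewed file."""
    file: str
    source: str
    sink: str
    missing_validation: str
    preconditions: str
    confidence: float


def enumerate_sources(root: str) -> Iterator[Path]:
    """Yield every source file under root in a stable order, so each run covers the same files."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place and sort for deterministic traversal.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(filenames):
            if Path(name).suffix.lower() in SOURCE_EXTENSIONS:
                yield Path(dirpath) / name


def dedupe(findings: list[Finding]) -> dict[tuple[str, str], list[Finding]]:
    """Group findings by a normalised (source, sink) key, highest-confidence group first."""
    groups: dict[tuple[str, str], list[Finding]] = defaultdict(list)
    for finding in findings:
        key = (finding.source.strip().lower(), finding.sink.strip().lower())
        groups[key].append(finding)
    ranked = sorted(groups.items(),
                    key=lambda item: max(f.confidence for f in item[1]),
                    reverse=True)
    return dict(ranked)
```

Grouping on a normalised `(source, sink)` pair is only one possible key; a real pipeline may need fuzzier matching because models rarely phrase the same sink identically across files.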

### Triage patterns worth prioritising

- **Dynamic PHP include/dispatch**: attacker-controlled route/controller/module names flowing into `require_once()`, `include()`, or `include_once()` without strict allowlisting and path canonicalisation.
- **Shell execution in admin/account workflows**: usernames, domains, FTP accounts, or other identifiers reaching `exec()`, `system()`, `shell_exec()`, `passthru()`, `proc_open()`, or `popen()`.
- **Structured input fan-in**: the same parameter accepted from `$_GET`, `$_POST`, JSON, XML, or framework body parsers and later reused in filesystem or OS-command sinks.
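
A cheap lexical pre-pass can help decide which per-file reports to read first. The sketch below is an assumption-laden heuristic, not a detector: `TRIAGE_PATTERNS`, `triage()`, and `priority()` are hypothetical names, the regexes only approximate the three patterns listed above, and `$_REQUEST`, `$_COOKIE`, and `php://input` are added here as further common PHP input sources. Expect false positives (for example, a method named `exec()` on a database handle).

```python
import re

# Line-level heuristics; each regex is a rough approximation of one triage pattern.
TRIAGE_PATTERNS = {
    # include/require (optionally *_once) followed by a variable before the statement ends
    "dynamic_include": re.compile(r"\b(?:require|include)(?:_once)?\b[^;]*\$", re.I),
    # the shell-execution functions named in the list above
    "shell_exec": re.compile(
        r"\b(?:exec|system|shell_exec|passthru|proc_open|popen)\s*\(", re.I),
    # request superglobals and the raw request body stream
    "request_input": re.compile(r"\$_(?:GET|POST|REQUEST|COOKIE)\b|php://input"),
}


def triage(php_source: str) -> dict[str, list[int]]:
    """Return {pattern label: [1-based line numbers]} for every heuristic hit."""
    hits: dict[str, list[int]] = {}
    for lineno, line in enumerate(php_source.splitlines(), start=1):
        for label, pattern in TRIAGE_PATTERNS.items():
            if pattern.search(line):
                hits.setdefault(label, []).append(lineno)
    return hits


def priority(hits: dict[str, list[int]]) -> int:
    """Score a file: number of sink classes present, doubled if request input is also read."""
    sinks = {"dynamic_include", "shell_exec"} & hits.keys()
    return len(sinks) * (2 if "request_input" in hits else 1)
```

The score is only a reading order for human triage; files that score zero should still go through the per-file review so that coverage stays complete.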

### Two practical bug classes this method finds well

- **PHP controller-dispatch LFI/RCE**: if a request-controlled controller name is concatenated into `require_once()` with no allowlist/path normalisation, traversal sequences such as `../` can make PHP include an unintended local `.php` file. If the attacker can point the include at a planted or otherwise useful PHP file, the LFI becomes code execution. See [File Inclusion / Path Traversal](../pentesting-web/file-inclusion/README.md).
- **Authenticated command injection in hosting/admin panels**: if account-management fields such as usernames are embedded into shell commands, a low-privileged authenticated user may turn a normal create/delete action into RCE. The impact increases when the panel executes the command as a more privileged service account. See [Command Injection](../pentesting-web/command-injection.md).
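
When validating a reported controller-dispatch finding, it can help to confirm that a candidate value really escapes the intended directory. The sketch below covers only the first bug class and models the string concatenation lexically with `posixpath`; the function name `resolves_outside()`, the example paths, and the `.php` suffix are assumptions for illustration. It does not resolve symlinks or reproduce PHP's `include_path` behaviour.

```python
import posixpath


def resolves_outside(base_dir: str, controller: str, suffix: str = ".php") -> bool:
    """Return True if base_dir + '/' + controller + suffix normalises to a path outside base_dir.

    This mirrors naive concatenation such as: require_once($base . '/' . $name . '.php')
    """
    base = posixpath.normpath(base_dir)
    # Plain concatenation, then lexical normalisation of '.' and '..' segments.
    candidate = posixpath.normpath(base + "/" + controller + suffix)
    return posixpath.commonpath([base, candidate]) != base


# Hypothetical layout: controllers live in /var/www/app/controllers.
BASE = "/var/www/app/controllers"
print(resolves_outside(BASE, "users"))                 # stays inside the controller directory
print(resolves_outside(BASE, "../../uploads/shell"))   # traversal leaves the controller directory
```

A `True` result only shows that the path leaves the base directory; whether that yields code execution still depends on the attacker being able to reach a useful `.php` file, as described above.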

### Practical review notes

- Use the LLM as a **source/sink reviewer first**, not as an exploit generator.
- Ask it to explicitly list the **attacker-controlled field**, the **normalisation/validation gap**, and the **dangerous sink**.
- For large repos, deterministic per-file review is often more reliable than a single autonomous agent run, even when the latter uses a stronger model.
- Expect weaker results on **Broken Access Control** and other business-logic bugs where exploitability depends on cross-file assumptions, role semantics, or product-specific threat models.
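
One way to apply the first two notes is to fix the prompt and the expected reply shape per file. The sketch below is illustrative only: the JSON field names, `build_prompt()`, and `parse_report()` are assumptions rather than an established schema, and no model API is called. Replies that are not valid JSON, or findings that omit a required field, are dropped instead of being guessed at.

```python
import json

# Hypothetical field names for the structured reply.
REQUIRED_FIELDS = (
    "attacker_controlled_field",
    "validation_gap",
    "dangerous_sink",
    "confidence",
)


def build_prompt(path: str, source: str, context: str = "") -> str:
    """Build a source/sink review prompt for exactly one file."""
    parts = [
        "You are reviewing ONE file for source-to-sink vulnerabilities.",
        "Do not write exploits. Report only what the code shows.",
        "Reply with a JSON array; each element must contain the keys: "
        + ", ".join(REQUIRED_FIELDS) + ".",
        "Reply with [] if the file has no findings.",
        "File: " + path,
    ]
    if context:
        parts.append("Context (entrypoint, imports, callers):\n" + context)
    parts.append("Source:\n" + source)
    return "\n\n".join(parts)


def parse_report(raw_reply: str) -> list[dict]:
    """Parse a model reply and keep only findings that carry every required field."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    return [
        item for item in data
        if isinstance(item, dict)
        and all(item.get(key) not in (None, "") for key in REQUIRED_FIELDS)
    ]
```

Rejecting malformed replies keeps the downstream deduplication step simple, at the cost of occasionally discarding a real finding that the model formatted badly.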

---

## Putting It All Together
An end-to-end CRS (Cyber Reasoning System) may wire the components like this:

@@ -152,6 +183,9 @@ graph TD
---

## References
* [Project Black - Local AI for Cyber Security: Finding phpIPAM LFI and myVesta Authenticated RCE](https://projectblack.io/blog/local-ai-for-cyber-security)
* [Strix](https://github.com/usestrix/strix)
* [GitHub Copilot community security-review skill](https://github.com/github/awesome-copilot/blob/main/skills/security-review/SKILL.md)
* [Trail of Bits – AIxCC finals: Tale of the tape](https://blog.trailofbits.com/2025/08/07/aixcc-finals-tale-of-the-tape/)
* [CTF Radiooo AIxCC finalist interviews](https://www.youtube.com/@ctfradiooo)
{{#include ../banners/hacktricks-training.md}}