Commit 941eaf3

author
wuerror
committed
fix(analysis): reduce taint false positives
Tighten receiver and setter-style taint propagation to cut noisy sink matches while preserving the JDBC dbtest flow. Reclassify DriverManager connection sinks as JDBC_Driver_RCE and align the architecture and roadmap docs with the current detection policy.
1 parent 1ca012a commit 941eaf3

10 files changed

Lines changed: 267 additions & 100 deletions

ARCHITECTURE.md

Lines changed: 24 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -119,14 +119,30 @@ The tool supports two modes to balance precision and performance:
119119
* *Reference*: *Lhoták, O., & Hendren, L. (2003). Scaling Java points-to analysis using Spark.*
120120

121121
#### C. Taint Analysis (Vulnerability Detection)
122-
The engine implements a **Forward Taint Propagation** algorithm.
123-
1. **Source Identification**: Based on `api.txt` and `rules.yaml`, mark return values of sources (e.g., `request.getParameter()`) as "Tainted".
124-
2. **Propagation**:
125-
* **Intra-procedural**: Use Soot's `SmartLocalDefs` or `FlowAnalysis` to track taint within a method body via assignments (`y = x` where x is tainted).
126-
* **Inter-procedural**: When a tainted variable is passed as an argument to a method call, query the Call Graph to find callee methods and map the argument to the callee's parameters, continuing the analysis recursively.
127-
3. **Sink Matching**: If a tainted variable reaches a sink method (e.g., `Runtime.exec(tainted)`), a vulnerability is flagged.
128-
129-
*Reference*: *Vallée-Rai, R., Co, P., Gagnon, E., Hendren, L., Lam, P., & Sundaresan, V. (1999). Soot - a Java optimization framework.*
122+
The engine implements a **Forward Taint Propagation** algorithm combining intra- and inter-procedural analysis.
123+
124+
1. **Source Identification**: Based on `api.txt` and `rules.yaml`, all parameters of API entry-point methods are marked as "Tainted" at method entry.
125+
2. **Intra-procedural Propagation** (`IntraTaintAnalysis` → `ForwardBranchedFlowAnalysis<FlowSet<Value>>`):
126+
* **Direct assignment**: `y = x` → `y` tainted if `x` tainted.
127+
* **Binary/cast**: `y = x + z`, `y = (T) x` → `y` tainted if operand tainted.
128+
* **Instance field read**: `y = obj.f` → `y` tainted if `obj` tainted.
129+
* **Static field read**: `y = Cls.f` → `y` tainted if `Cls.f` was previously written with tainted data (tracked in `taintedStaticFields`).
130+
* **Array read**: `y = arr[i]` → `y` tainted if `arr` tainted.
131+
* **Instance field write**: `obj.f = x` → `obj` tainted if `x` tainted.
132+
* **Static field write**: `Cls.f = x` → `Cls.f` added to `taintedStaticFields` if `x` tainted.
133+
* **Method return (instance)**: `y = obj.m(...)` → `y` tainted if `obj` tainted; `arg → return` is additionally applied only to setter-like instance methods to reduce taint explosion.
134+
* **Method return (static/any)**: `y = Cls.m(...)` → `y` tainted if any arg tainted.
135+
* **Setter/constructor receiver**: `obj.set(x)` or `new Obj(x)` → `obj` tainted if any arg tainted, but this receiver-tainting heuristic is restricted to setter-like methods and constructors. This enables the `setter → field → getter → sink` chain without broadly tainting service objects.
136+
* **Path sensitivity**: Null-check branches (`if x == null`) kill taint on the null path.
137+
3. **Inter-procedural Propagation** (`WorklistEngine`):
138+
* Tainted arguments are mapped to callee parameter locals before scheduling.
139+
* Tainted receiver (`obj` in `obj.m(...)`) is mapped to callee `this` local.
140+
* `AnalysisState` (method + tainted-param-bitset + `thisTainted`) is used for memoization to avoid redundant re-analysis.
141+
4. **Sink Matching**: A vulnerability is flagged when:
142+
* Any **argument** of a sink method call is tainted, OR
143+
* The **receiver** of an instance sink call is tainted for sink categories that enable receiver-based triggering. This receiver-based check is intentionally disabled for `sqli` to avoid false positives on tainted `Statement` / `Connection` objects.
144+
145+
*Reference*: *Vallée-Rai et al. (1999). Soot - a Java optimization framework.*
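
The sink-matching rule in step 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's `WorklistEngine.checkSink()`: the class name `SinkCheckSketch`, the method `isFinding`, and the case-insensitive category comparison are hypothetical and exist only for this example.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch of the argument-or-receiver sink check described above. */
final class SinkCheckSketch {
    // Categories where a tainted receiver alone must not raise a finding.
    // Only "sqli" is named in the text; the set type is an assumption.
    private static final Set<String> RECEIVER_TRIGGER_DISABLED =
            Collections.singleton("sqli");

    /**
     * @param category        sink category, e.g. "sqli" or "SSRF"
     * @param anyArgTainted   true if any argument of the sink call is tainted
     * @param receiverTainted true if the receiver of an instance sink call is tainted
     *                        (pass false for static sink calls)
     */
    static boolean isFinding(String category, boolean anyArgTainted, boolean receiverTainted) {
        if (anyArgTainted) {
            return true; // a tainted argument always triggers
        }
        String key = category.toLowerCase(Locale.ROOT);
        return receiverTainted && !RECEIVER_TRIGGER_DISABLED.contains(key);
    }
}
```

With this shape, a tainted `Statement` receiver and a constant SQL string yield no `sqli` finding, while a tainted receiver on an `SSRF` sink still does.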
130146

131147
### 3.3 Internal Process Flows
132148

CHANGELOG.md

Lines changed: 46 additions & 0 deletions
@@ -2,6 +2,52 @@
22

33
All notable changes to this project will be documented in this file.
44

5+
## [1.6.1] - 2026-03-09
6+
7+
### Changed
8+
- **JDBC Sink Taxonomy**:
9+
- Reclassified `java.sql.DriverManager.getConnection(...)` from generic `SSRF` to `JDBC_Driver_RCE` in `default_rules.yaml`.
10+
- Scoring now treats `jdbc_driver_rce` as a critical category; exploitability remains a human-triage decision based on actual driver and URL semantics.
11+
12+
### Fixed
13+
- **False Positive: Taint Explosion via Broad Setter Tainting**:
14+
- `IntraTaintAnalysis.flowThrough`: Receiver tainting for `InvokeStmt` is now restricted to **setter-like methods** (`set*`, `add*`, `put*`, `append*`, `insert*`, `with*`, `push*`, `enqueue*`, `load*`, `init*`, `configure*`, `update*`, `register*`) and constructors. Previously, ANY `InstanceInvokeExpr` with a tainted argument would taint the receiver, causing taint to explode through service-layer objects (repositories, HTTP clients, APM agents) and generate hundreds of false positives.
15+
- **False Positive: `URL.<init>` / `URI.<init>` as SSRF Sinks**:
16+
- Removed `<java.net.URL: void <init>(java.lang.String)>` and `<java.net.URI: void <init>(java.lang.String)>` from SSRF sinks in `default_rules.yaml`. Object construction alone does not make a network request; the real sinks (`URL.openStream()`, `URL.openConnection()`) are retained.
17+
- **False Positive: `ObjectMapper.readValue(String, Class)` as Deserialization Sink**:
18+
- Removed `<com.fasterxml.jackson.databind.ObjectMapper: java.lang.Object readValue(java.lang.String,java.lang.Class)>` from deserialization sinks. Jackson's `readValue` with an explicit target class is standard safe JSON parsing, not arbitrary deserialization. Dangerous Jackson deserialization requires `enableDefaultTyping()` + polymorphic type handling, which is not modeled by this sink.
19+
- **False Positive: Overly Broad Instance-Method Return Value Taint**:
20+
- `IntraTaintAnalysis.applyDefinition`: For `InstanceInvokeExpr` (instance method calls whose return value is assigned), `arg tainted → return tainted` is now restricted to **setter-like methods** (same `isSetterLike()` predicate as the receiver-tainting rule). General pass-through instance calls are handled correctly by the inter-procedural scheduler (callee is analyzed with tainted params). The `receiver tainted → return tainted` rule (for getters and chain calls) is unchanged.
21+
- Static invocations (`StaticInvokeExpr`) retain full `arg → return` propagation, which is correct for transformation functions (`String.format`, `Paths.get`, etc.).
22+
- **False Positive: Receiver-Tainted Sink Check for SQL Injection**:
23+
- `WorklistEngine.checkSink()` and `InterproceduralTaintAnalysis`: Receiver-based sink triggering (`taintedObj.sinkMethod()`) is now **disabled for the `sqli` category**. SQLi requires a tainted SQL string argument; triggering on a tainted `Statement` or `Connection` receiver generates false positives from taint reaching database objects via field propagation. SSRF, Path_Traversal, RCE, and other categories retain receiver-based detection.
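
The setter-like restriction described in the entries above can be sketched as a simple name-prefix predicate. The prefix list is copied from the changelog entry; the class `SetterLikeSketch`, the helper `taintsReceiver`, and the plain `startsWith` matching are assumptions, and the real `isSetterLike()` may be implemented differently.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a setter-like predicate based on method-name prefixes. */
final class SetterLikeSketch {
    // Prefixes as listed in the changelog entry.
    private static final List<String> PREFIXES = Arrays.asList(
            "set", "add", "put", "append", "insert", "with", "push",
            "enqueue", "load", "init", "configure", "update", "register");

    /** Constructors are named "<init>" in JVM bytecode. */
    static boolean isConstructor(String methodName) {
        return "<init>".equals(methodName);
    }

    static boolean isSetterLike(String methodName) {
        for (String prefix : PREFIXES) {
            if (methodName.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** The receiver becomes tainted only for setter-like calls or constructors. */
    static boolean taintsReceiver(String methodName, boolean anyArgTainted) {
        return anyArgTainted && (isSetterLike(methodName) || isConstructor(methodName));
    }
}
```

Under this sketch `repo.findById(tainted)` no longer taints `repo`, while `cfg.setUrl(tainted)` and `new Obj(tainted)` still taint their receivers. Pure prefix matching also accepts names such as `settle`, so a stricter check (for example, requiring an upper-case letter after the prefix) may be preferable.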
24+
25+
## [1.6.0] - 2026-03-09
26+
27+
### Added
28+
- **Field Taint Propagation (Phase 8.5)**:
29+
- `IntraTaintAnalysis.flowThrough`: Any `InstanceInvokeExpr` (virtual, interface, or special/constructor) with a tainted argument now taints the receiver object. This covers the common setter pattern: `obj.setUrl(tainted)` propagates taint onto `obj` so that subsequent reads (`obj.getUrl()`, `obj.field`) stay tainted through the rest of the method.
30+
- `IntraTaintAnalysis`: Added `taintedStaticFields: Set<SootField>` to track writes of the form `SomeClass.field = tainted`. Subsequent reads `x = SomeClass.field` within the same method body are now correctly tainted. The set is monotone (only grows) consistent with MAY-analysis semantics.
31+
- `WorklistEngine.checkSink` + `InterproceduralTaintAnalysis`: Sink detection now also fires when the **receiver** of an instance-method sink is tainted (e.g., `taintedStmt.execute()`), in addition to the existing argument check. This closes a class of false negatives for builder/fluent API sink patterns.
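
The propagation rules added in this release can be illustrated on a toy model in which locals and static fields are plain strings. This is a sketch under stated assumptions and not the Soot-based `IntraTaintAnalysis`: the class `FieldTaintSketch` and all of its method names are hypothetical, and kills on reassignment are left out.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical toy model of receiver and static-field taint propagation. */
final class FieldTaintSketch {
    private final Set<String> taintedLocals = new HashSet<>();
    // Monotone: entries are only ever added, matching MAY-analysis semantics.
    private final Set<String> taintedStaticFields = new HashSet<>();

    void markSource(String local) {
        taintedLocals.add(local);
    }

    boolean isTainted(String local) {
        return taintedLocals.contains(local);
    }

    /** receiver.m(args...): a tainted argument taints the receiver (1.6.0 rule). */
    void invoke(String receiver, List<String> args) {
        for (String arg : args) {
            if (isTainted(arg)) {
                taintedLocals.add(receiver);
                return;
            }
        }
    }

    /** target = receiver.m(): a tainted receiver taints the returned value. */
    void assignFromInstanceCall(String target, String receiver) {
        if (isTainted(receiver)) {
            taintedLocals.add(target);
        }
    }

    /** Cls.f = value: remember the field if the written value is tainted. */
    void staticFieldWrite(String field, String value) {
        if (isTainted(value)) {
            taintedStaticFields.add(field);
        }
    }

    /** target = Cls.f: tainted if the field was previously written with tainted data. */
    void staticFieldRead(String target, String field) {
        if (taintedStaticFields.contains(field)) {
            taintedLocals.add(target);
        }
    }
}
```

Calling `invoke("obj", ...)` with a tainted argument and then `assignFromInstanceCall("y", "obj")` models the `obj.setUrl(tainted)` followed by `obj.getUrl()` chain. Because `taintedStaticFields` never shrinks, a later clean write to the same field does not clear the taint.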
32+
33+
### Fixed
34+
- **False Negatives: Interprocedural Receiver Taint (Phase 8.5 pre-work, commit `1ca012a`)**:
35+
- `AnalysisState`: Added `thisTainted` boolean to the memoization key (`equals`/`hashCode`). Previously, a method entered with a tainted receiver and without a tainted receiver mapped to the same state; the second visit was silently skipped, dropping real flows.
36+
- `WorklistEngine.scheduleCallee` + `InterproceduralTaintAnalysis`: When an `InstanceInvokeExpr` base object is tainted, the callee's `this` local is now seeded as tainted before scheduling. This enables `source → obj (tainted) → callee(this tainted) → sink` chains.
37+
- `IntraTaintAnalysis.applyDefinition`: Constructor calls (`SpecialInvokeExpr.isConstructor()`) with tainted arguments now taint the base object (previously missed, leaving `new Obj(tainted)` chains broken).
38+
- `IntraTaintAnalysis.applyDefinition`: Static method calls (`StaticInvokeExpr`) with tainted arguments now taint the return-value local, covering `x = Utils.process(tainted)` chains that were previously dropped.
39+
- **False Negatives: Missing JDBC URL / Connection Sinks (Phase 8.4, commit `1ca012a`)**:
40+
- Added `java.sql.DriverManager.getConnection(String)`, `getConnection(String, Properties)`, and `getConnection(String, String, String)` as SSRF/JDBC-URL-Injection sinks in `default_rules.yaml`.
41+
- **Config Merge (commit `1ca012a`)**:
42+
- `ConfigManager`: When a workspace `rules.yaml` exists, its source/sink rules are now **merged** with the bundled defaults instead of replacing them. Rules present in the workspace file take precedence; rules only in defaults are appended. This prevents sink/source coverage regressions when users customize their rule files.
43+
- **Graph Stability (commits `b9fc2ef`, `6e8f79b`, `ef8bc48`)**:
44+
- Auto-resolve dangling Soot classes during call-graph construction to reduce phantom-class noise.
45+
- Bulk resolution pass for recurring dangling packages.
46+
- Tightened retry budget for dangling resolution to avoid runaway retries.
47+
- **JAX-RS Route Extraction**:
48+
- `RouteExtractor`: POJO parameters annotated with JAX-RS path/query annotations are now correctly captured in `api.txt`.
49+
- Added support for `@Path`, `@GET/@POST/@PUT/@DELETE/@PATCH` on JAX-RS controllers.
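
The `AnalysisState` fix listed above comes down to including `thisTainted` in `equals`/`hashCode`. The sketch below shows one way such a memoization key could look; the class name `AnalysisStateSketch`, its fields, and the use of a method-signature string are assumptions, not the project's actual class.

```java
import java.util.BitSet;
import java.util.Objects;

/** Hypothetical memoization key: method + tainted-parameter bitset + thisTainted. */
final class AnalysisStateSketch {
    private final String methodSignature;
    private final BitSet taintedParams;
    private final boolean thisTainted;

    AnalysisStateSketch(String methodSignature, BitSet taintedParams, boolean thisTainted) {
        this.methodSignature = methodSignature;
        // Defensive copy so later changes by the caller cannot alter the key.
        this.taintedParams = (BitSet) taintedParams.clone();
        this.thisTainted = thisTainted;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof AnalysisStateSketch)) {
            return false;
        }
        AnalysisStateSketch other = (AnalysisStateSketch) o;
        // thisTainted is part of the key, so tainted-receiver and
        // untainted-receiver visits are distinct states.
        return thisTainted == other.thisTainted
                && methodSignature.equals(other.methodSignature)
                && taintedParams.equals(other.taintedParams);
    }

    @Override
    public int hashCode() {
        return Objects.hash(methodSignature, taintedParams, thisTainted);
    }
}
```

With a visited `HashSet` of such keys, a method reached first with a clean receiver and later with a tainted one is analyzed twice, whereas an identical state is skipped on the second visit.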
50+
551
## [1.5.0] - 2026-02-10
652

753
### Added

README.md

Lines changed: 10 additions & 2 deletions
@@ -25,6 +25,7 @@
2525
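* **Worklist Engine**: Iterative taint analysis that avoids stack overflows.
2626
* **Leaf Optimization**: Smart summary generation that substantially speeds up analysis.
2727
* **Strict Isolation**: Strictly isolates application code from third-party libraries to keep the analysis engine from crashing.
28+
* **Field-Sensitive Propagation**: Supports field taint propagation, covering the `setter → field → getter → sink`, `new Obj(tainted)`, and static-field read/write chains; receiver-based sink detection is enabled per vulnerability category, and by default a tainted `Statement/Connection` alone does not raise an SQLi finding.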
2829

2930
---
3031

@@ -81,6 +82,8 @@ java -jar JByteScanner-1.0-SNAPSHOT.jar -m api --filter-annotation AnonymousVali
8182

8283
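For sinks, edit the generated rules.yaml directly. On a second run, or on a later full scan, the rules.yaml under the current project's .jbytescanner directory is loaded first. A rules file can also be specified with the `-c` option.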
8384

85+
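In the default rules, SSRF focuses on generic URL/HTTP APIs that actually open an outbound connection, such as `openConnection`, `openStream`, and HTTP client `execute(...)`. `DriverManager.getConnection(...)` is classified separately as `JDBC_Driver_RCE`, and whether it is actually exploitable is left to manual review. Calls that only construct an object, such as `new URL(...)` / `new URI(...)`, are not treated as high-confidence SSRF sinks by default.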
86+
8487
**Full scan (vulnerability discovery):**
8588

8689
Use -m scan or pass no mode option. If api.txt already exists under the .jbytescanner directory, phase 2 is skipped.
@@ -111,14 +114,19 @@ java -jar target/JByteScanner-1.0-SNAPSHOT-shaded.jar /path/to/app.jar --interac
111114
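- [x] **Phase 1-5**: Core architecture, configuration management, asset discovery, Soot integration, SARIF reporting.
112115
- [x] **Phase 6**: Performance optimization (structured state, backward pruning, strict dependency isolation).
113116
- [x] **Phase 7**: Advanced analysis engine (Worklist iterative engine, method summaries, leaf-node optimization).
114-
- [x] **Phase 8: Tactical Intelligence**: Secret scanning, vulnerability scoring, Smart PoC generation.
117+
- [x] **Phase 8: Tactical Intelligence**:
118+
- [x] 8.1 Secret scanning (configuration files, constant pool, Base64 encoding).
119+
- [x] 8.2 Vulnerability scoring (R-S-A-C model) and authentication detection.
120+
- [x] 8.3 Smart PoC generation (directly importable into Burp Suite).
121+
- [x] 8.4 Sink coverage expansion (JDBC URL / DriverManager.getConnection).
122+
- [x] 8.5 Field taint propagation (setter pattern, static fields, sink receiver detection).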
115123

116124
### In Progress (Advanced Exploitation)
125+
- [ ] **Phase 8.6**: Summary completion (generating and consuming `param→this` and `this→return` summaries).
117126
- [ ] **Phase 9: Deep Exploit Chains**
118127
- [ ] **Auth Bypass**: Authorization bypass detection (Config vs Code).
119128
- [ ] **Gadget Miner**: Deserialization gadget-chain mining.
120129

121130
- [ ] **Phase 10: Interaction and SCA**
122131
- [ ] **Offensive SCA**: Offensive component fingerprinting.
123132
- [ ] **Interactive Shell**: REPL for querying the in-memory call graph.
124-

ROADMAP.md

Lines changed: 9 additions & 6 deletions
@@ -46,18 +46,21 @@ This document tracks the evolution of JByteScanner into a specialized Red Team t
4646

4747
### Current Known Gaps
4848
- [ ] **False Negative: JDBC URL / Connection Sinks**:
49-
- Case study: `GET /setup/dbtest` in the Qiyuesuo target is reachable from `com.qiyuesuo.setup.SetupController.dbtest(...)` to `java.sql.DriverManager.getConnection(...)`, but the engine reports 0 findings.
49+
- Case study: `GET /setup/dbtest` in a target application is reachable from `com.example.setup.SetupController.dbtest(...)` to `java.sql.DriverManager.getConnection(...)`, but the engine reports 0 findings.
5050
- Root cause 1: `default_rules.yaml` models SQL execution sinks such as `Statement.execute*` and `JdbcTemplate.execute/query`, but does not model JDBC connection-establishment sinks such as `DriverManager.getConnection(...)` or URL-setting APIs on common `DataSource` implementations.
5151
- Root cause 2: the current worklist engine only propagates taint through callee parameters and does not propagate taint into instance receivers (`this`), constructors, object fields, or return values, so flows like `param -> field -> this.method() -> sink` are dropped.
5252
- Root cause 3: method summaries define placeholders for `paramsToThis` and `thisToReturn`, but the summary generator and worklist engine do not yet produce and consume these facts.
5353

5454
### Planned Fix Plan
55-
- [ ] **8.4 Sink Coverage Expansion**:
55+
- [x] **8.4 Sink Coverage Expansion** [COMPLETED]:
5656
- Add JDBC URL / connection sinks to `default_rules.yaml`, starting with `java.sql.DriverManager.getConnection(...)` and common `DataSource` URL setters.
57-
- Revisit sink taxonomy so JDBC URL control can be scored as SSRF / JDBC URL injection / connection abuse instead of being limited to SQL execution only.
58-
- [ ] **8.5 Receiver/Object Taint Propagation**:
59-
- Extend `WorklistEngine.scheduleCallee()` to propagate tainted instance receivers into callee `this` for `InstanceInvokeExpr`.
60-
- Add constructor and field-backed object flow support for chains like `source -> new Obj(tainted) -> this.field -> sink`.
57+
- Revisit sink taxonomy so JDBC URL control is reported as `JDBC_Driver_RCE`, with exploitability left to analyst validation rather than downgraded into generic SSRF.
58+
- Keep SSRF focused on generic connection-establishing APIs such as `openConnection()`, `openStream()`, and HTTP client `execute(...)`; JDBC `DriverManager.getConnection(...)` is reported separately as `JDBC_Driver_RCE`. `URL(String)` / `URI(String)` constructors are intentionally not treated as high-confidence SSRF sinks because they create excessive noise.
59+
- [x] **8.5 Receiver/Object Taint Propagation** [COMPLETED]:
60+
- `IntraTaintAnalysis.flowThrough`: tainted arg to any `InstanceInvokeExpr` (setter/constructor) now taints the receiver (`obj.setUrl(t)` → `obj` tainted).
61+
- `IntraTaintAnalysis.applyDefinition`: added `StaticFieldRef` read/write tracking via `taintedStaticFields` (`StaticClass.f = t` → field remembered; `x = StaticClass.f` → `x` tainted).
62+
- `WorklistEngine.checkSink` + `InterproceduralTaintAnalysis`: sink check can fire when the receiver itself is tainted, but this receiver-based trigger is now disabled for `sqli` sinks to avoid false positives on tainted `Statement` / `Connection` objects.
63+
- Covers chains: `param → setter(param) → obj tainted → obj.get() → sink` and `param → static field → read → sink`.
6164
- [ ] **8.6 Summary Completion**:
6265
- Implement `param -> this`, `this -> return`, and return-value propagation in `SummaryGenerator` and `WorklistEngine`.
6366
- Upgrade memoization state to capture object/receiver taint facts in addition to tainted parameter indices.
