feat: LLM review stage for enhanced reachability detection by joshbouncesecurity · Pull Request #50 · knostic/OpenAnt

joshbouncesecurity · 2026-05-04T18:59:58Z

Summary

Adds a new LLM review stage (off by default, enabled via --llm-reachability on openant scan) that uses a strong model (Opus by default) to surface additional reachability signals beyond what the structural analysis catches:

Likely entry points the structural pass missed (framework-specific handlers, plugin/CLI registrations, message handlers).
External content ingestion (HTTP request bodies, file/network reads, env/argv, stdin, untrusted IPC).
Cross-process or async data flows.

Signals are advisory and only promote a unit's reachability — they never demote one that the structural analysis already kept. This matches the issue's "complements, not replaces" intent.

Output:

llm_reachability.json in the scan dir with the full signal list.
llm_reachability_signals: [...] field on each unit in dataset.json.
High-confidence entry_point signals set is_entry_point: true on the target unit.

Cost & rate-limit safety: reuses the existing GlobalRateLimiter via AnthropicClient, opt-in only, and prompts are batched (default 25 units/call).

Addresses #17 (does not close — let the maintainer review the prompt + heuristics first).

Test plan

Unit tests for analyze_reachability with mocked LLM: fixed JSON, malformed JSON, exception in client, app_context threading, batch chunking.
Unit tests for apply_signals: promote-only semantics (high-confidence promotes, medium does not, never demotes), per-unit signal accumulation, unknown-id rejection.
CLI plumbing tests: --llm-reachability appears in openant scan --help; default does not pass the flag through to scan_repository; setting it threads llm_reachability=True.
Existing pytest suite passes: 112 passed, 23 skipped (env-dependent, e.g. Go binary).
Manual: enable on a small Express fixture; verify route handlers gain entry-point signals.

joshbouncesecurity · 2026-05-04T20:46:38Z

Manual verification

Off by default. Requires API key. Cost note: enabling adds approximately one Opus call per 25 units.

openant scan --help shows --llm-reachability with the cost note.
Default behavior: openant scan <repo> (no flag) — pipeline unchanged from current behavior; no LLM reachability cost incurred.
Enabled: openant scan <repo> --llm-reachability — emits llm_reachability.json in the scan dir; merged signals appear on dataset.json units as llm_reachability_signals.
Promote-only: high-confidence entry_point signals on a previously-unreachable unit promote it to is_entry_point: true. Existing entry points are NEVER demoted.
Pipeline ordering: app-context generation runs BEFORE the LLM reachability stage, so app context is threaded into the reachability prompt. Verify with --quiet step ordering.
Markdown-wrapped JSON: model occasionally returns the response inside a fenced json ... block — the parser strips the fences and recovers the signals.
Empty units: openant scan <empty-or-failed-parse> --llm-reachability: stage skips gracefully (no LLM call).

joshbouncesecurity · 2026-05-04T21:14:26Z

Local test results

Reinstalled openant-core from this branch and ran openant scan --llm-reachability end-to-end on the in-tree sample_python_repo (5 reachable units after the structural pass). Skipped enhance/verify/report/dynamic-test to keep cost minimal.

Commands run:

go build -o openant.exe ./
pip install -e libs/openant-core/

openant scan <sample_python_repo> --output <out> \
  --llm-reachability \
  --no-context --no-enhance --no-report --skip-dynamic-test \
  --limit 1 --model sonnet

Outcome (against the manual-verification checklist):

openant scan --help lists --llm-reachability with the cost note: "Off by default — enabling this may incur additional LLM cost (one Opus call per ~25 units)" ✅
Enabled run: llm_reachability.json written to the scan dir; the new pipeline step llm-reachability ran successfully and emitted llm-reachability.report.json with cost_usd: 0.01245, token_usage.total_tokens: 782, units_reviewed: 5 ✅
Promote-only semantics: Opus reviewed 5 units (the structural pass had already correctly tagged the two Flask endpoints as is_entry_point: True); model returned signals: []. signals_added: 0, entry_points_promoted: 0, units_touched: 0. Existing entry points were not demoted ✅
Pipeline ordered as documented: parse → llm-reachability → analyze. llm-reachability.report.json is timestamped before analyze.report.json ✅
Reachability stage runs Opus (per the report) regardless of --model sonnet (which only governs analyze) — matches cost note ✅
Did not exercise the markdown-fence recovery path or the empty-units skip path here — covered by the unit tests in the diff. Did not separately confirm app-context-threading because I passed --no-context.

Total cost: $0.024 (reachability $0.012 + analyze 1 unit $0.012). Well under the budget.

ar7casper · 2026-05-10T08:49:59Z

🔴 High — Architecture · core/scanner.py:228 + core/parser_adapter.py:351

Issue: The LLM stage runs after parse, but parse already applied the structural
reachability filter when processing_level != "all" (default is "reachable"). So by the
time the LLM sees the dataset, units that weren't reachable from the structural entry
points are already discarded. The headline value prop — "Likely entry points the structural
pass missed (framework-specific handlers, plugin/CLI registrations, message handlers)" —
cannot be delivered as written: if the structural pass missed an Express route handler, the
unit for that handler was filtered out at parse time. The LLM never sees it. Promoting
is_entry_point on a unit that's no longer in the dataset is impossible.

What the stage does still deliver in current form:

Attaches llm_reachability_signals to surviving units (metadata for reports)
Promotes is_entry_point on surviving units, which flows into
agentic_enhancer/prompts.py:106 (so the enhancer treats those units as entry points when
prompting)

What it cannot do:

Bring back units the structural pass dropped
"Surface missed entry points" — those units are gone

The stage was running after the structural filter had already discarded units that weren't reachable from heuristic entry points — meaning the LLM could never find the missed entry points that are the feature's main value prop. Fix: - When --llm-reachability is set, parse with processing_level="all" so every unit is visible to the LLM. - After apply_signals promotes LLM-identified entry points, re-run the structural reachability filter (apply_reachability_filter) with those unit IDs added as extra BFS seeds. - The final dataset is filtered to the user's requested processing_level but expanded by any entry points the LLM found. Also: - Expose apply_reachability_filter as a public function in parser_adapter.py with an extra_entry_points parameter; preserve any is_entry_point=True already set by the LLM stage when re-stamping. - Update help text in cli.py and scan.go to reflect that cost scales with total repo size, not the filtered unit count. - Update llm_reachability.py docstring to document the correct pipeline ordering. Addresses review comment on PR knostic#50. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

joshbouncesecurity · 2026-05-12T05:31:58Z

Good catch — fixed in faa8948.

The stage was running against the already-filtered dataset, so any unit the structural pass dropped was gone before the LLM could see it. The fix:

Parse with processing_level="all" when --llm-reachability is set, so the LLM reviews the full codebase.
After signals are applied, re-run apply_reachability_filter with the LLM-promoted entry point IDs added as extra BFS seeds, so the final dataset is still filtered to the user's requested processing_level — but now expanded by whatever the LLM found.
apply_reachability_filter is exposed as a public function in parser_adapter.py with an extra_entry_points parameter; it also preserves any is_entry_point=True already set by the LLM stage when re-stamping units.
Help text updated in both cli.py and scan.go — cost now scales with total repo size, not the filtered unit count, which is a meaningful difference on large repos.

All 29 existing tests still pass.

joshbouncesecurity · 2026-05-12T05:33:13Z

@ar7casper responded to the comment

ar7casper

Round-2 review after the faa8948 fix — the architectural rework for the Python path is correct. One High-severity finding I missed in the prior review, plus two Mediums and a few Lows.

🟠 High — Architecture · `core/scanner.py:283-294` + `core/parser_adapter.py:247-252`

The post-LLM re-filter call to apply_reachability_filter requires call_graph.json on disk. Only the Python parser (parse_repository.py:173) and the Zig parser (zig/test_pipeline.py:123) write this file. JS, Go, C, Ruby, and PHP parsers do their reachability filtering internally inside their per-language test_pipeline.py and never persist call_graph.json.

When a user runs openant scan <ts-repo> --llm-reachability (default --level reachable):

effective_parse_level = "all" → JS parser parses the full unfiltered codebase ✅
LLM reviews all units, promotes some via is_entry_point=True ✅
scanner.py:289 calls apply_reachability_filter(dataset, output_dir, "reachable", extra_entry_points=...) ❌
Function looks for output_dir/call_graph.json → not found
parser_adapter.py:248-252 logs [Warning] call_graph.json not found — skipping reachability filter to stderr and returns the dataset unchanged
Dataset remains the FULL unfiltered codebase
Downstream stages (enhance, analyze) process every function in the repo

Net effect: --llm-reachability silently disables --level reachable on every non-Python (and non-Zig) codebase. A typical Express scan that should have processed ~30 reachable units processes ~500 instead — 10–20× cost spike in the analyze stage. The user has no easy way to know --level was ignored.

This is the same architectural class as the round-1 issue (LLM stage interacting with the structural filter), but the round-1 fix only addresses the Python path.

Suggestion (any one of these):

Most thorough: make all parsers write call_graph.json to output_dir (uniform), so apply_reachability_filter works for every language.
Pragmatic: skip the post-LLM re-filter for non-Python languages and print a loud warning that --llm-reachability doesn't yet support per-language re-filtering, including the cost implication.
Minimum acceptable: in parser_adapter.py:247-252, raise a RuntimeError (or escalate the warning to a hard failure) when called with non-empty extra_entry_points and call_graph.json is missing — fail loudly so the bug is visible at scan time.

🟡 Medium — Quality · `core/llm_reachability.py:51-53`

MODEL_PRIMARY = "claude-opus-4-20250514" (Opus 4 from May 2025) is older than the codebase convention. analyzer.py, enhancer.py, reporter.py, finding_verifier.py all use the newer claude-opus-4-6. The source comment claims "matches the convention in core/analyzer.py / utilities/llm_client.py" but analyzer.py:316 is model_id = "claude-opus-4-6" if model == "opus" else "claude-sonnet-4-20250514" — the comment is misleading.

Suggestion: align MODEL_PRIMARY to claude-opus-4-6 to match the rest of the Opus-using stages, and fix the comment.

🟡 Medium — Coupling · `core/scanner.py:267`

analyze_reachability(..., max_units=limit) couples the LLM reachability stage's coverage to the analyze-stage --limit flag. A user running openant scan --limit 50 --llm-reachability to constrain analyze cost will silently limit LLM reachability to the first 50 units. If the entry point that would have promoted units 1–50 sits at unit 51, those units never enter the reachable set.

Suggestion: decouple — either a separate --llm-reachability-max-units flag, or don't apply limit to LLM reachability (let it always review the full dataset; cost is controlled by batch_size).

🟢 Lows (non-blocking, listed for completeness)

Unused import — core/llm_reachability.py:47: field imported but never used. PR 56's ruff rules don't catch this (F401 isn't enabled). Drop it.
Always-null prompt field — core/llm_reachability.py:174: unit.get("reachable_from_entry") always returns None because nothing in the project writes that field. The reachability filter writes reachable: bool. Either remove from the prompt projection or rename to reachable.
Aggressive code trimming — core/llm_reachability.py:61: MAX_CODE_BYTES = 1500 (~30-50 lines) often misses entry-point indicators in handlers with embedded business logic. Consider 4-8 KB or env-var configurable.
Style inconsistency — core/scanner.py:248,259,272,299: new with open(..., encoding="utf-8") as f: calls work but don't follow PR 56's new convention of read_json / write_json from utilities.file_io.
Observability gap — core/scanner.py:245: step_context inputs={"model": "opus"} is hardcoded string, not actual model ID. LLM-promoted units have empty entry_point_reason (the structural detector only sets it for its own detections) — hard to tell from a unit's metadata that the LLM promoted it without cross-referencing llm_reachability_signals.

What's good

Promote-only semantics with regression test against demotion is exactly right.
Failure isolation per batch in analyze_reachability (try/except per batch) is defensive in the right place.
5 commits with self-iteration including 3 fixes from prior review feedback. Engineering hygiene.
The architecture refactor commit (3d83787e) self-caught a real app_context_path = None ordering bug.
Test coverage is excellent — 29 tests across 6 classes including the never-demote semantic and the non-positive batch-size guard.
Help text honestly mentions cost scales with total repo size, not the filtered count.

Verdict: the High finding is the actionable ask — once that's resolved (or scoped to Python+Zig only with a loud warning), the rest are nits / follow-ups.

joshbouncesecurity · 2026-05-12T13:43:59Z

Addressed the call_graph.json gap and CI runtime coverage in two commits:

feat: write call_graph.json for all parser languages (417883b)

All five parsers that were missing call_graph.json now write it:

Parser	Approach
C	`builder.export()` result written directly (already correct snake_case format)
Ruby	Same — `builder.export()`
PHP	Same — `builder.export()`
JavaScript	Reads back `analyzer_output.json` after subprocess, normalises camelCase → snake_case (`callGraph` → `call_graph`, `reverseCallGraph` → `reverse_call_graph`)
Go	Reconstructs `call_graph`/`reverse_call_graph` from `direct_calls`/`direct_callers` in unit metadata (same fields `apply_reachability_filter` already reads when `call_graph.json` is absent)

Also added tests/test_call_graph_output.py with 17 tests:

5 always-run unit tests for apply_reachability_filter public API (synthetic fixture, no external deps)
2 always-run integration tests for Python parser
2 tests each for JS/Go/C/Ruby/PHP (skip-guarded by availability checks)

ci: ensure go_parser binary is built in python-tests job (c82498f)

The python-tests CI job previously had no Go toolchain, so Go parser tests always skipped (and by extension the call_graph.json test for Go). Fixed:

Added setup-go step in python-tests using parsers/go/go_parser/go.mod
Added per-platform go build steps (outputs go_parser on Linux/macOS, go_parser.exe on Windows)
Updated _go_parser_available() in the test file to try-execute the binary (catches WinError 193 when a Linux ELF is present on Windows) and check both go_parser and go_parser.exe candidates
Changed test run command to pytest tests/ -v (all tests) to match the go-tests job pattern

joshbouncesecurity · 2026-05-12T13:45:48Z

Addressed all findings from this review. Summary by severity:

🔴 High — call_graph.json missing for JS/Go/C/Ruby/PHP

Went with the most thorough fix: all five parsers now write call_graph.json to output_dir.

Parser	Approach
C / Ruby / PHP	`builder.export()` written directly — already correct snake_case format
JavaScript	Reads back `analyzer_output.json` after subprocess, normalises `callGraph` → `call_graph` / `reverseCallGraph` → `reverse_call_graph`
Go	Reconstructs from `direct_calls`/`direct_callers` in unit metadata (the same fields `apply_reachability_filter` already falls back to when the file is absent)

Added tests/test_call_graph_output.py with 17 tests covering all 6 languages plus 5 unit tests for the apply_reachability_filter public API. CI (python-tests job) now installs Go and builds go_parser per-platform so those tests don't silently skip. The _go_parser_available() guard now try-executes the binary to catch WinError 193 (Linux ELF on Windows).

The os.path.exists guard + loud WARNING in scanner.py is kept as a safety net for any future parser that doesn't yet write the file.

🟡 Medium — MODEL_PRIMARY string

Fixed: MODEL_PRIMARY = "claude-opus-4-6" to match analyzer.py / enhancer.py / etc. Updated step_context inputs to use the constant rather than a hardcoded string.

🟡 Medium — --limit coupling

Fixed: removed max_units=limit from the analyze_reachability call. LLM reachability now always reviews the full parsed dataset; --limit only constrains the downstream analyse stage as intended.

🟢 Lows

Unused field import — removed.
reachable_from_entry → reachable — fixed field name in the prompt projection.
is_entry_point_reason → entry_point_reason — fixed field name; set in apply_signals when LLM promotes a unit.
step_context inputs model string — now uses _LLM_REACH_MODEL constant, not hardcoded "opus".

(Low 3 — MAX_CODE_BYTES — left unchanged for now; that's a tuning call better made with empirical data on real repos rather than guesswork.)

Adds an opt-in LLM review stage (off by default, enabled via the new `--llm-reachability` flag on `openant scan`) that uses a strong model (Opus by default) to surface additional reachability signals beyond what the structural pass catches: - Likely entry points the structural analysis may miss (framework hooks, plugin/CLI registrations, message handlers). - External-input sites (HTTP request bodies, file/network reads, env/argv, stdin, untrusted IPC). - Cross-process / async data-flow indicators. Signals are advisory and *promote-only*: high-confidence entry-point signals can set `is_entry_point=True` on a unit, but no signal ever demotes a unit that the structural analysis already kept. This matches the "complements, does not replace" intent in issue #17. Output: - `llm_reachability.json` written to the scan dir with the full signal list. - Each unit gains an `llm_reachability_signals` array on the dataset. Cost & rate-limit safety: opt-in only, prompts are batched, and the client integration goes through the existing `AnthropicClient` (which respects `GlobalRateLimiter`). Refs #17.

The Python CLI defines --llm-reachability for the LLM reachability stage (issue #17), but the Go CLI proxy did not expose it. The test TestHelp::test_scan_help_advertises_llm_reachability inspects 'openant scan --help' (Go cobra output) and was failing on all 3 OS targets. Register --llm-reachability as a Bool flag on the Go scan command and pass it through to the Python invocation when set.

- scanner.py: forward-declare app_context_path before step 1.5 so the LLM reachability block doesn't hit a NameError when --llm-reachability is enabled (the block ran before the app-context step that defined it). - llm_reachability._chunk: non-positive batch_size used to reference an unbound loop variable; now collapses to a single batch covering all items. Adds a regression test. - Help text (Python CLI + Go CLI): note that --llm-reachability may incur additional LLM cost, per cost-safety review.

The LLM reachability stage threads app_context into its prompt to help the model reason about expected entry points (web_app vs cli_tool, etc). The previous ordering ran it before app-context generation, so the app_context_path was always None at the call site — the prompt threading silently no-op'd. Reordering the steps makes the threading actually work. This also retires the temporary forward-declaration introduced in the previous commit; app_context_path is now defined naturally by the preceding step before the LLM reachability block reads it.

The stage was running after the structural filter had already discarded units that weren't reachable from heuristic entry points — meaning the LLM could never find the missed entry points that are the feature's main value prop. Fix: - When --llm-reachability is set, parse with processing_level="all" so every unit is visible to the LLM. - After apply_signals promotes LLM-identified entry points, re-run the structural reachability filter (apply_reachability_filter) with those unit IDs added as extra BFS seeds. - The final dataset is filtered to the user's requested processing_level but expanded by any entry points the LLM found. Also: - Expose apply_reachability_filter as a public function in parser_adapter.py with an extra_entry_points parameter; preserve any is_entry_point=True already set by the LLM stage when re-stamping. - Update help text in cli.py and scan.go to reflect that cost scales with total repo size, not the filtered unit count. - Update llm_reachability.py docstring to document the correct pipeline ordering. Addresses review comment on PR knostic#50. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

High — JS/Go/C/Ruby/PHP don't persist call_graph.json so the post-LLM re-filter was silently falling back to returning the full unfiltered dataset. Guard the re-filter call with an os.path.exists check on call_graph.json; when absent, print a prominent WARNING with the unit count and cost implication so the user knows --level was not applied. Also add refilter_supported to the step report summary. Medium 1 — align MODEL_PRIMARY from "claude-opus-4-20250514" to "claude-opus-4-6" to match analyzer.py and the rest of the Opus-using stages. Fix the misleading comment. Medium 2 — remove max_units=limit from the analyze_reachability call. --limit governs the analyze stage; the LLM reachability pass must review the full codebase to find missed entry points. Coverage is controlled by batch_size. Low 1 — remove unused `field` import from llm_reachability.py. Low 2 — fix prompt projection: reachable_from_entry (never written by any path) → reachable (the actual field name written by the reachability filter). Low 5 — set entry_point_reason on LLM-promoted units in apply_signals so the promotion source is visible without cross-referencing llm_reachability_signals. Also surface the actual MODEL_PRIMARY constant in the step_context inputs instead of the hardcoded string "opus". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Enables the post-LLM reachability re-filter to work on JS, Go, C, Ruby, and PHP repositories, not just Python and Zig. C / Ruby / PHP (trivial — 3 lines each): call_graph.json is written immediately after CallGraphBuilder.export() in test_pipeline.py. graph_result already contains functions, call_graph, and reverse_call_graph in the format apply_reachability_filter expects. JavaScript (easy): call_graph.json is written at the end of run_typescript_analyzer(), right after analyzer_output.json is available. Keys are normalised from camelCase (callGraph, reverseCallGraph) to snake_case so the Python-side filter can read them without any extra handling. Go (moderate): call_graph.json is written at the end of run_go_parser_all(), after both analyzer_output.json and dataset.json are available. Functions are normalised to the camelCase shape EntryPointDetector expects (same conversion already done in apply_reachability_filter). Call graph edges are reconstructed from unit metadata.direct_calls / direct_callers (the same source apply_reachability_filter used). Tests (test_call_graph_output.py): - TestApplyReachabilityFilterPublicAPI (5 tests, always run): verifies the public apply_reachability_filter API — basic filtering, extra entry points expand the reachable set, is_entry_point stamping, LLM-promoted flag preservation, missing-file graceful return. - TestPythonCallGraphOutput (2 tests, always run): parse_repository with processing_level=all and reachable both produce call_graph.json. - TestJavaScript/Go/C/Ruby/PHP CallGraphOutput (2 tests each, skip-guarded): same assertions, skip when the parser's runtime dependency is absent. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Add Go toolchain setup and per-platform go_parser build steps to the python-tests CI job so call_graph.json tests for Go and JS don't silently skip on all platforms. Also harden _go_parser_available() to try-execute the binary (catching WinError 193 for cross-platform Linux ELF) and check both go_parser and go_parser.exe. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

joshbouncesecurity · 2026-05-12T14:48:22Z

@ar7casper ready for any further review you want to do

ar7casper

Round-3 review — thanks for the substantive response to round-2.5. You nailed the High + both Mediums + 3 of 5 Lows.

Verified each fix:

Round-2.5	Status
🟠 High — multi-language correctness gap	✅ Fixed via the most thorough option. All 5 missing parsers (C/Ruby/PHP/JS/Go) now write `call_graph.json` (`2b0dddf`). PLUS defensive `os.path.exists` guard in scanner.py with a loud cost-impact WARNING if file is ever missing (`3df8f0b`). 422-line `test_call_graph_output.py` with 17 cross-parser tests locks in the contract. Belt-and-suspenders.
🟡 Medium — model version	✅ `MODEL_PRIMARY = "claude-opus-4-6"`, comment fixed
🟡 Medium — `--limit` coupling	✅ Removed `max_units=limit`. Comment explains why.
🟢 Low — unused `field` import	✅ Removed
🟢 Low — always-null `reachable_from_entry` field	✅ Prompt projection now uses `reachable` (actually written by filter)
🟢 Low — observability	✅ `step_context` uses `MODEL_PRIMARY` constant; LLM-promoted units get `entry_point_reason: "llm_reachability: ..."`
🟢 Low — `MAX_CODE_BYTES = 1500`	⏭️ Deferred (carry-over)
🟢 Low — bare `with open(...)` style	⏭️ Deferred (carry-over)

Plus the CI tweak (cf005c6) to ensure go_parser binary is built in the python-tests job — necessary for the new cross-parser tests to actually run. All 9 CI checks green.

The commit messages on 3df8f0b and 2b0dddf are exemplary — itemize every finding addressed and the per-parser approach. Easy to verify against round-2.5.

Three small Lows for awareness (none blocking):

🟢 Low — Import-before-sys-path-fix in C/Ruby/PHP `test_pipeline.py`

2b0dddf added from utilities.file_io import ... to parsers/{c,ruby,php}/test_pipeline.py to support the new call_graph.json writes. But the order is wrong — the import comes before the sys.path setup:

# parsers/php/test_pipeline.py (current)
from utilities.file_io import open_utf8, read_json, run_utf8, write_json   # line 45 — import FIRST
# ...
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))   # line 48 — sys.path LATER

Compare to JS/Go (which got the correct order in #56):

# parsers/javascript/test_pipeline.py
_core_root = _parser_dir.parent.parent
if str(_core_root) not in sys.path:
    sys.path.insert(0, str(_core_root))   # sys.path FIRST
# ...
from utilities.file_io import open_utf8, read_json, run_utf8, write_json   # import AFTER

CI passes because the python-tests workflow does pip install -e . which makes utilities globally importable from the venv (subprocess inherits). Production also works for the same reason. But local dev without pip install -e . hits ModuleNotFoundError: No module named 'utilities.file_io' when the parser subprocess is spawned — I confirmed this locally; 6 cross-parser tests fail without PYTHONPATH=. set; pass with it.

Suggestion: swap the order in C/Ruby/PHP test_pipeline.py to match the JS/Go pattern. Trivial 3-line change per file. Low impact (works in CI / production already), but consistency with #56's JS/Go pattern would close the local-dev gap.

🟢 Low — Cross-parser test contract doesn't verify `unit_type` field shape

_assert_call_graph_valid in test_call_graph_output.py:88-100 only checks that call_graph.json exists with the 3 required top-level keys. It doesn't verify the functions value shape — JS and Go's unitType (camelCase) versus Python/C/Ruby/PHP's unit_type (snake_case) inconsistency persists and isn't caught.

This is the documented pre-existing parser-issues #23-26 / Room-for-Improvement #10 issue. PR 50 didn't introduce it, but the new cross-parser test contract was a natural place to lock down the schema. Doesn't block this PR — for the LLM reachability functionality, LLM-promoted IDs flow through correctly via extra_entry_points. The structural-detection gap means JS/Go scans get fewer Check-1 (unit_type) entry points than they should, same as before this PR.

Suggestion (post-merge follow-up): add a contract assertion that unit_type exists in each function entry, surfacing the camelCase/snake_case drift if it ever causes a test to fail. Or normalize at the test level.

🟢 Low — PR body stale

Body still describes the original scope. Doesn't mention:

The cross-parser call_graph.json expansion (5 more parsers now write it)
The defensive WARNING for unsupported languages
That parse runs with processing_level="all" when --llm-reachability is set

Suggestion: add a "Round-2.5 fix summary" section or rewrite the Output bullet list to include the new behavior.

Verdict: ✅ ready to merge. The 3 Lows are housekeeping. Approving via UI.

Covers the seven PRs in this release: - #35: parse --level default → reachable (CLI consistency with scan + Python CLI) - #36: auto-detect dep changes via ~/.openant/venv/.deps-hash - #37: lazy JS parser npm bootstrap on first use - #39: TypeScript/NestJS DI-aware call resolution (constructor + field + functional inject()) - #40: --language auto opt-in for openant init + non-git path support + shared config/languages.json - #49: Express anonymous route handler extraction (route_handler / route_middleware) - #50: --llm-reachability opt-in stage + cross-parser call_graph.json contract Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

joshbouncesecurity marked this pull request as ready for review May 12, 2026 05:32

joshbouncesecurity requested review from ar7casper, dgeyshis, shahar-davidson, sounil and yotamleo as code owners May 12, 2026 05:32

ar7casper changed the base branch from master to release/2026-05-12 May 12, 2026 12:47

ar7casper reviewed May 12, 2026

View reviewed changes

joshbouncesecurity marked this pull request as draft May 12, 2026 13:16

joshbouncesecurity and others added 8 commits May 12, 2026 17:32

joshbouncesecurity force-pushed the feat/issue17-llm-reachability branch from c515fa4 to cf005c6 Compare May 12, 2026 14:42

joshbouncesecurity marked this pull request as ready for review May 12, 2026 14:48

ar7casper reviewed May 13, 2026

View reviewed changes

ar7casper merged commit 8be2cd2 into knostic:release/2026-05-12 May 13, 2026
9 checks passed

ar7casper mentioned this pull request May 13, 2026

feat(Release): 2026-05-12 #58

Merged

3 tasks

ar7casper mentioned this pull request May 13, 2026

Normalize unit_type field shape across parsers so EntryPointDetector's unit-type check works for JS/Go #63

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: LLM review stage for enhanced reachability detection#50

feat: LLM review stage for enhanced reachability detection#50
ar7casper merged 8 commits into
knostic:release/2026-05-12from
joshbouncesecurity:feat/issue17-llm-reachability

joshbouncesecurity commented May 4, 2026

Uh oh!

joshbouncesecurity commented May 4, 2026 •

edited

Loading

Uh oh!

joshbouncesecurity commented May 4, 2026

Uh oh!

ar7casper commented May 10, 2026

Uh oh!

joshbouncesecurity commented May 12, 2026

Uh oh!

joshbouncesecurity commented May 12, 2026

Uh oh!

ar7casper left a comment

Uh oh!

joshbouncesecurity commented May 12, 2026

Uh oh!

joshbouncesecurity commented May 12, 2026

Uh oh!

joshbouncesecurity commented May 12, 2026

Uh oh!

ar7casper left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

joshbouncesecurity commented May 4, 2026

Summary

Test plan

Uh oh!

joshbouncesecurity commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Manual verification

Uh oh!

joshbouncesecurity commented May 4, 2026

Local test results

Uh oh!

ar7casper commented May 10, 2026

Uh oh!

joshbouncesecurity commented May 12, 2026

Uh oh!

joshbouncesecurity commented May 12, 2026

Uh oh!

ar7casper left a comment

Choose a reason for hiding this comment

🟠 High — Architecture · core/scanner.py:283-294 + core/parser_adapter.py:247-252

🟡 Medium — Quality · core/llm_reachability.py:51-53

🟡 Medium — Coupling · core/scanner.py:267

🟢 Lows (non-blocking, listed for completeness)

What's good

Uh oh!

joshbouncesecurity commented May 12, 2026

Uh oh!

joshbouncesecurity commented May 12, 2026

Uh oh!

joshbouncesecurity commented May 12, 2026

Uh oh!

ar7casper left a comment

Choose a reason for hiding this comment

🟢 Low — Import-before-sys-path-fix in C/Ruby/PHP test_pipeline.py

🟢 Low — Cross-parser test contract doesn't verify unit_type field shape

🟢 Low — PR body stale

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

joshbouncesecurity commented May 4, 2026 •

edited

Loading

🟠 High — Architecture · `core/scanner.py:283-294` + `core/parser_adapter.py:247-252`

🟡 Medium — Quality · `core/llm_reachability.py:51-53`

🟡 Medium — Coupling · `core/scanner.py:267`

🟢 Low — Import-before-sys-path-fix in C/Ruby/PHP `test_pipeline.py`

🟢 Low — Cross-parser test contract doesn't verify `unit_type` field shape