Skip to content

fix(perf): consolidate extractor walks and scope native post-passes (v3.12.0 publish gate)#1436

Merged
carlos-alm merged 3 commits into
mainfrom
fix/postpass-scope-and-extractor-walks
Jun 11, 2026
Merged

fix(perf): consolidate extractor walks and scope native post-passes (v3.12.0 publish gate)#1436
carlos-alm merged 3 commits into
mainfrom
fix/postpass-scope-and-extractor-walks

Conversation

@carlos-alm

Copy link
Copy Markdown
Contributor

Why

The v3.12.0 publish failed its benchmark regression gate (run 27260875859):

build/native:        1-file rebuild:  86 → 1657  (+1827%, threshold 50%)
incremental/native:  Full build:    2231 → 4423  (+98%,   threshold 25%)
                     1-file rebuild:  83 → 1585  (+1810%, threshold 50%)
incremental/wasm:    Full build:    9433 → 18354 (+95%,   threshold 75%)

Not CI variance — two real regressions accumulated across the Phase 8.x resolver PRs since 3.11.2 (invisible until release because benchmarks only run at publish — #1433):

Root cause 1 — extractor walk pile-up (both engines' parse cost)

The JS/TS extractor grew from 5 to 15 full-tree walks per file; each feature PR added its own DFS. On WASM trees every node access (.type, child(i), childForFieldName) marshals across the JS↔WASM boundary, so traversal count dominates extraction cost. Measured +135%/file (7.5 → 17.7ms on this corpus); CPU profile self-time was dominated by marshalNode/setValue.

Root cause 2 — unscoped native post-pass (native 1-file rebuild 20×)

runPostNativePrototypeMethods (#1331) ran on every native build: read all ~590 JS/TS files, regex-filtered, WASM-re-parsed all 18 matches through the worker pool — ~1s flat tax per build. On a 1-file rebuild that's 46ms of real work + ~1.1s of post-pass (result.phases summed 48ms while wall was 1232ms — #1434).

What

  • runCollectorWalk — one dispatch pass replacing 9 stateless walks (dynamic imports, this/call/apply bindings, param/array-elem/object-prop bindings, new X() names, defineProperty receivers, class members, prototype + func-prop assignments). Per-walk semantics preserved exactly (incl. the import()-subtree suppression and the TS type_identifier classStack quirk — see comments).
  • runContextCollectorWalk — one context-tracking pass replacing the 3 class-context walks (typeMap, object-rest params, spread/for-of), each keeping its own context register because their reset rules deliberately differ. extractReturnTypeMapWalk stays separate and runs first: the declarator handler reads the complete per-file map for inter-procedural propagation (forward references).
  • Scoped prototype-methods post-pass — definition scan restricted to changedFiles on incremental builds (mirrors runPostNativeThisDispatch); the caller-relink sweep still covers all files but only fires when a changed file actually added func-prop definitions.
  • symbolsOnly backfill parses — the proto and this-dispatch passes skip worker-side AST/complexity/CFG/dataflow visitors + their serialization (they only consume definitions/calls/typeMap). Dropped-language backfill keeps full analysis.

Numbers (local M-series, codegraph corpus, vs 3.11.2 baseline)

metric 3.11.2 broken this PR
native 1-file rebuild 58–59ms 1134–1232ms 99–104ms
native full build 1309–1474ms 3045–3238ms 2354–2576ms
wasm full build 7.0–7.8s 14.4–15.9s 8.2–9.2s (+17%)
extraction /file 7.5ms 17.7ms ~12ms

Remaining deltas vs 3.11.2 are the genuine cost of the new Phase 8.x resolution features plus corpus growth (628 → 670 files), not bugs. If the publish gate still trips on native full build, the residual is root-caused above and the next lever is #1432 (native func-prop extraction deletes the post-pass entirely).

Verification

  • Full suite: 3042 passed / 0 failed (with a freshly built native addon — the stale local binary issue is unrelated)
  • Resolution precision/recall fixture gate: 176 passed
  • Parser suites: 582 passed
  • Repro methodology: timed buildGraph full/no-op/1-file for both engines on this repo vs a v3.11.2 worktree, same machine

Follow-ups filed

#1432 (Rust func-prop extraction), #1433 (per-PR perf canary + benchmark methodology unification), #1434 (post-pass phase timings), #1435 (pool per-file overhead)

After merge: re-run the v3.12.0 publish workflow.

Fixes the two regressions that failed the v3.12.0 publish benchmark gate
(native 1-file rebuild 86ms -> 1657ms +1827%; native full build +98%;
wasm full build +95% vs 3.11.2):

1. JS/TS extractor had grown from 5 to 15 full-tree walks per file as
   each Phase 8.x feature added its own DFS. On WASM trees every node
   access marshals across the JS<->WASM boundary, so traversal count
   dominated extraction cost (+135%/file). Nine stateless walks are now
   one dispatch pass (runCollectorWalk) and the three class-context
   walks (typeMap, object-rest params, spread/for-of) are one
   context-tracking pass (runContextCollectorWalk) with per-concern
   context registers preserving each walk's distinct reset rules.
   extractReturnTypeMapWalk stays separate: the declarator handler
   reads the complete per-file map for inter-procedural propagation.

2. runPostNativePrototypeMethods ran unscoped on every native build:
   it read every JS/TS file in the repo, regex-filtered, and re-parsed
   all matches (18 files here) through the WASM worker pool -- a ~1s
   flat tax on every build including 1-file rebuilds. The definition
   scan is now scoped to changedFiles on incremental builds (mirroring
   runPostNativeThisDispatch); the caller-relink second pass still
   sweeps all files but only when a changed file actually added new
   func-prop definitions. Backfill parses for this pass and the
   this-dispatch pass also skip the worker-side AST/complexity/CFG/
   dataflow visitors via the new symbolsOnly option (they only consume
   definitions/calls/typeMap).

Local before/after (M-series, codegraph corpus, vs 3.11.2 baseline):
  native 1-file rebuild  1134-1232ms -> 99-104ms   (baseline 58-59ms)
  native full build      3045-3238ms -> 2354-2576ms (baseline 1309-1474ms)
  wasm full build        14.4-15.9s  -> 8.2-9.2s    (baseline 7.0-7.8s)
  per-file extraction    17.7ms      -> ~12ms       (baseline 7.5ms)

Remaining deltas vs 3.11.2 are the feature cost of the Phase 8.x
resolution passes plus corpus growth, not bugs.

Follow-ups: #1432 (native func-prop extraction removes the post-pass
entirely), #1433 (per-PR perf canary), #1434 (post-pass phase timings),
#1435 (worker-pool per-file overhead).

Impact: 42 functions changed, 25 affected
@claude

claude Bot commented Jun 10, 2026

Copy link
Copy Markdown

Claude finished @carlos-alm's task —— View job


PR Review: fix(perf): consolidate extractor walks and scope native post-passes

Tasks:

  • Gather context and understand the PR
  • Read and analyze all 3 changed files
  • Phase 1–5 analysis complete

Root Cause Analysis

Problem Being Solved: Two independent performance regressions accumulated during Phase 8.x resolver PRs, invisible until v3.12.0's publish benchmark gate fired:

  1. JS/TS extractor grew from 5 → 15 full-tree walks per file (each walk = full WASM tree traversal with per-node marshalling cost)
  2. runPostNativePrototypeMethods ran on all ~590 JS/TS files every build, not just changed files — ~1s flat tax even on 1-file rebuilds

Why This Approach: Correct. The fix consolidates walks into runCollectorWalk (one dispatch DFS replacing 9 stateless walks) and runContextCollectorWalk (one context-tracking pass for 3 walks requiring different class-context reset rules), plus restricts the prototype-methods post-pass definition scan to changedFiles on incremental builds. This mirrors the existing runPostNativeThisDispatch scoping pattern.

Risk Assessment: Medium. The walk consolidation is a significant refactor of the hottest code path in the extractor. Correctness depends on precisely preserving the semantics of 12 individual walks that were separately authored and tested. The test suite passing (3042 tests, 176 resolution precision/recall fixtures) is strong evidence, but any missed edge case in the combined walk would silently regress extraction quality.


Phase 2: Configuration Change Detection

No lint rules relaxed, no test timeouts increased, no TypeScript strict mode changes. No config changes at all. Clean.


Phase 3: Test Integrity

No test files modified. No assertions weakened. The PR description cites 3042 passed / 0 failed with resolution fixture gate (176 passed). No red flags.


Phase 4: Code Quality

src/extractors/javascript.ts

Concern 1 — symbolsOnly flag silently skips WASM workers for small batches (parser.ts:1248-1251)

// parser.ts:1248-1251
if (filePaths.length <= INLINE_BACKFILL_THRESHOLD) {
  return parseFilesWasmInline(filePaths, rootDir);  // EXTRACT_ONLY never passed!
}
return parseFilesWasm(filePaths, rootDir, opts.symbolsOnly ? EXTRACT_ONLY : FULL_ANALYSIS);

When filePaths.length <= INLINE_BACKFILL_THRESHOLD, symbolsOnly is silently ignoredparseFilesWasmInline always runs full analysis. The runPostNativePrototypeMethods comment says "symbolsOnly: this pass reads definitions/calls/typeMap — skip the worker-side AST/complexity/CFG/dataflow visitors" (native-orchestrator.ts:638-641), but that skip only actually applies to the large-batch path. In practice, for repos where the changed JS/TS file count fits within the inline threshold, the intended optimization is a no-op. This is a latent bug — not introduced in this PR (the comment and parseFilesWasmForBackfill signature existed), but the PR now routes both post-passes through parseFilesWasmForBackfill(..., { symbolsOnly: true }) and documents the perf benefit, so the gap between intent and reality becomes more noticeable.

Concern 2 — runCollectorWalk: classMemberDefs aliased to definitions without guard (javascript.ts:394-395)

// javascript.ts:392-395 (query path)
runCollectorWalk(tree.rootNode, {
  ...
  classMemberDefs: definitions,  // same array reference as `definitions`
});

In runCollectorWalk, both handleFieldDef and handleStaticBlock push to targets.classMemberDefs (lines 3004, 3007). On the query path this is the same array as definitions, so field_definition/class_static_block nodes get emitted. On the walk path (extractSymbolsWalk, line 702-711), classMemberDefs is not passed:

// javascript.ts:702-711 (walk path)
runCollectorWalk(tree.rootNode, {
  definitions: ctx.definitions,
  ...
  funcPropDefs: ctx.definitions,
  // classMemberDefs: NOT passed
});

The walk path's walkJavaScriptNode handles field_definition/class_static_block directly via handleFieldDef/handleStaticBlock (lines 729-735), so this is intentional and correctly asymmetric. However, the interface comment at line 2919-2926 calls this out, making it clear this is by design. No bug, but asymmetry is non-obvious.

Concern 3 — inDynamicImport suppression covers only imports target (javascript.ts:2971-2972)

case 'call_expression': {
  if (targets.imports && !inDynamicImport && collectDynamicImport(node, targets.imports)) {
    childInDynamicImport = true;
  }
  if (targets.calls && targets.thisCallBindings) {
    collectThisCallAndBindings(node, targets.calls, targets.thisCallBindings);
  }
  collectParamBindings(node, targets.paramBindings);  // runs even inside import() subtree
  collectDefinePropertyReceiver(node, targets.definePropertyReceivers);

The inDynamicImport guard only suppresses collectDynamicImport from re-firing on nested nodes. All other collectors (collectThisCallAndBindings, collectParamBindings, collectDefinePropertyReceiver) continue processing nodes within an import() argument subtree. The PR description says this "preserving the former standalone walk's 'don't recurse into import() children' behaviour" — but the former walk had an early return for the entire subtree, whereas here only the dynamic-import collection is suppressed while paramBindings, definePropertyReceivers, and thisCallBindings collection continues into the subtree. This is a behavioral change relative to the former individual walks for those collectors (which did recurse into all nodes). If the original separate walks were correct, this change is neutral (they didn't have any special import() handling). The comment is slightly misleading since only dynamic-import re-collection is suppressed, not the other collectors.

src/domain/graph/builder/stages/native-orchestrator.ts

Concern 4 — Caller-relink second pass (allJsFilesFromDb()) still potentially large (native-orchestrator.ts:719)

const callerScanFiles = isFullBuild || !changedFiles ? jsFiles : allJsFilesFromDb();

On an incremental build where a changed file added func-prop definitions, the caller-relink sweep calls allJsFilesFromDb() (all JS/TS files in DB) and regex-scans all of them. The comment acknowledges this is intentional (to relink unchanged callers that lost edges when the definition file's nodes got new IDs). This is the correct behavior, but it means the second pass cost is still proportional to the corpus size when any func-prop definition changes. The PR description notes this is tracked in #1432 (native func-prop extraction). Acceptable for now.

Concern 5 — Minor: newExpressions collected but not gated on path (javascript.ts:381-394)

On the query path, newExpressions is passed to runCollectorWalk as a fresh array and returned. On the walk path (line 700-712), same. Neither path gates new_expression collection differently. This is fine — just noting for completeness.


Phase 5: Backlog Compliance

  • Zero-dep: ✓ — No new runtime dependencies added
  • Foundation-aligned: ✓ — Pure performance optimization in the local parse pipeline; no cloud calls, no external infra, no code generation; P1/P4/P8 untouched
  • Problem-fit: 5 — Extraction accuracy and build performance directly reduce token waste (slower builds → larger context windows → more re-orientation) and the walk consolidation makes future extractor work safer (one DFS to audit instead of 15)
  • Breaking: No — Internal extraction pipeline change; no CLI output format, API signature, DB schema, or MCP contract changes

Tier: 1 (zero-dep ✓ + foundation-aligned ✓)


Critical Concerns

symbolsOnly not applied to inline backfill pathparser.ts:1248-1249 calls parseFilesWasmInline which always runs full analysis regardless of opts.symbolsOnly. For the optimized intent to hold consistently, the inline path needs the same flag. This doesn't cause incorrect results — it's a missed optimization — but it means the claimed post-pass speedup only materializes on repos large enough to exceed INLINE_BACKFILL_THRESHOLD. Consider fixing before or shortly after merge (or opening a tracking issue if scope discipline applies here).


Final Recommendation

  • Rating: ⭐⭐⭐⭐ (4/5)
  • Action: APPROVE WITH CONDITIONS
  • Reasoning: The root cause is correctly diagnosed, the approach is architecturally sound (walk consolidation is the right pattern for WASM marshalling overhead), and the test evidence is thorough. The single noteworthy gap is the symbolsOnly flag being silently dropped on the inline-batch path (parser.ts:1248-1249), which makes the documented perf benefit of symbolsOnly partial rather than universal. This doesn't affect correctness. The walk semantics are carefully preserved (import()-subtree suppression, per-context register separation in runContextCollectorWalk, query vs. walk path asymmetry for class members), and the scoping of the post-pass definition scan mirrors the existing runPostNativeThisDispatch pattern. The remaining 17% WASM overhead delta vs. 3.11.2 is correctly attributed to new Phase 8.x features + corpus growth, not bugs, and follow-ups are filed.

@github-actions

github-actions Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Codegraph Impact Analysis

42 functions changed25 callers affected across 11 files

  • runPostNativePrototypeMethods in src/domain/graph/builder/stages/native-orchestrator.ts:579 (4 transitive callers)
  • allJsFilesFromDb in src/domain/graph/builder/stages/native-orchestrator.ts:591 (3 transitive callers)
  • runPostNativeThisDispatch in src/domain/graph/builder/stages/native-orchestrator.ts:869 (4 transitive callers)
  • tryNativeOrchestrator in src/domain/graph/builder/stages/native-orchestrator.ts:1465 (5 transitive callers)
  • parseFilesWasm in src/domain/parser.ts:1156 (13 transitive callers)
  • parseFilesWasmForBackfill in src/domain/parser.ts:1243 (10 transitive callers)
  • extractSymbolsQuery in src/extractors/javascript.ts:324 (1 transitive callers)
  • collectDynamicImport in src/extractors/javascript.ts:573 (8 transitive callers)
  • extractSymbolsWalk in src/extractors/javascript.ts:657 (1 transitive callers)
  • collectDefinePropertyReceiver in src/extractors/javascript.ts:1668 (8 transitive callers)
  • ContextCollectorOutputs.typeMap in src/extractors/javascript.ts:1717 (0 transitive callers)
  • ContextCollectorOutputs.returnTypeMap in src/extractors/javascript.ts:1718 (0 transitive callers)
  • ContextCollectorOutputs.callAssignments in src/extractors/javascript.ts:1719 (0 transitive callers)
  • ContextCollectorOutputs.fnRefBindings in src/extractors/javascript.ts:1720 (0 transitive callers)
  • ContextCollectorOutputs.objectRestParamBindings in src/extractors/javascript.ts:1721 (0 transitive callers)
  • ContextCollectorOutputs.spreadArgBindings in src/extractors/javascript.ts:1722 (0 transitive callers)
  • ContextCollectorOutputs.forOfBindings in src/extractors/javascript.ts:1723 (0 transitive callers)
  • ContextCollectorOutputs.arrayCallbackBindings in src/extractors/javascript.ts:1724 (0 transitive callers)
  • runContextCollectorWalk in src/extractors/javascript.ts:1766 (3 transitive callers)
  • walk in src/extractors/javascript.ts:1770 (8 transitive callers)

@greptile-apps

greptile-apps Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR addresses a v3.12.0 publish-gate benchmark regression by consolidating 15 independent AST walks into two unified passes (runCollectorWalk and runContextCollectorWalk) and scoping the native prototype-methods post-pass to changed files only on incremental builds.

  • Walk consolidation (javascript.ts): Nine stateless collectors and three context-tracking collectors are merged into two single-DFS passes, reducing WASM node-marshal overhead from ~15 traversals to 3 per file. The dynamic-import subtree suppression is preserved using an inDynamicImport flag propagated to children.
  • Scoped post-pass (native-orchestrator.ts): runPostNativePrototypeMethods now accepts changedFiles and isFullBuild; on incremental builds the definition scan is restricted to changed files, while the caller-relink sweep still covers all DB files when new method nodes are actually added.
  • symbolsOnly flag (parser.ts): parseFilesWasmForBackfill gains an opts parameter forwarded to parseFilesWasm; the inline path (≤16 files) silently ignores this flag — acknowledged and filed as fix(perf): pass symbolsOnly to parseFilesWasmInline for small-batch incremental builds #1437.

Confidence Score: 5/5

Safe to merge — the walk consolidation faithfully preserves the semantics of all 12 prior independent traversals, the scoped post-pass correctly handles both full and incremental paths, and the 3042-test suite passed at 0 failures.

The refactoring is purely mechanical consolidation of stateless per-node handlers into shared DFS loops. Each collector's context-dependency (class stack resets, funcStack push/pop, dynamic-import subtree suppression) is carried over exactly, and the walk-path vs. query-path asymmetries (classMemberDefs, funcPropDefs, calls) are correctly represented in the new CollectorWalkTargets interface. The incremental-build scoping in the post-pass has the right fallback (caller-relink still sweeps all files when new method nodes are inserted), and the lazy allJsFilesFromDb() is called after the insert transaction commits. The known symbolsOnly limitation on the inline path predates this PR and is tracked separately.

No files require special attention beyond the acknowledged symbolsOnly inline-path limitation in parser.ts (tracked as #1437).

Important Files Changed

Filename Overview
src/extractors/javascript.ts Consolidates 12 independent AST walks into runCollectorWalk (9 stateless collectors) and runContextCollectorWalk (3 context-tracking collectors); semantics are faithfully preserved including dynamic-import subtree suppression, class/function stack management, and per-path differences between the query path and walk path.
src/domain/graph/builder/stages/native-orchestrator.ts Scopes the prototype-methods post-pass definition scan to changedFiles on incremental builds; caller-relink sweep correctly falls back to allJsFilesFromDb() when new method nodes are actually inserted. The allJsFilesFromDb lazy function is called after the batch-insert transaction commits, so it sees the updated DB state.
src/domain/parser.ts Adds symbolsOnly option to parseFilesWasmForBackfill forwarded to the worker-pool path only; the inline path (≤16 files) still runs full analysis regardless of the flag — acknowledged limitation tracked in #1437.
tests/search/embedding-regression.test.ts Adds recursive cause-chain traversal to network error detection, handling wrapped errors (e.g. ConnectTimeoutError with code UND_ERR_CONNECT_TIMEOUT wrapped inside an EngineError). The circular-cause guard (cause !== e) is a safe defensive check.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[extractSymbolsQuery / extractSymbolsWalk] --> B[extractReturnTypeMapWalk\ncomplete per-file map for forward refs]
    B --> C[runContextCollectorWalk\ntypeMap · objectRestParams · spread/forOf\nthree context registers with different reset rules]
    B --> D[extractDestructuredBindingsWalk\nscope-pruning walk]
    C --> E[runCollectorWalk\ndynamicImports · protoMethods · funcProps\nparamBindings · arrayElems · objectProps\nnewExprs · definePropertyReceivers\nthisCall/classMember on query path only]

    F[tryNativeOrchestrator] --> G{isFullBuild?}
    G -- yes --> H[allJsFilesFromDb\nscans ALL JS/TS files]
    G -- no --> I[changedFiles.filter\nscans only changed files]
    H --> J[regex pre-filter\nfunc-prop pattern match]
    I --> J
    J --> K[parseFilesWasmForBackfill\nsymbolsOnly:true on pool path]
    K --> L{new method nodes\ninserted?}
    L -- no --> M[return]
    L -- yes --> N[build newMethodSuffixes]
    N --> O[callerScanFiles\nfull build: jsFiles\nincremental: allJsFilesFromDb]
    O --> P[text-search callers\nregex suffix match]
    P --> Q[parseFilesWasmForBackfill\nsymbolsOnly:true]
    Q --> R[resolve call edges\nto new method nodes]
Loading

Reviews (4): Last reviewed commit: "Merge branch 'main' into fix/postpass-sc..." | Re-trigger Greptile

Comment thread src/domain/parser.ts
Comment on lines 1243 to 1252
export async function parseFilesWasmForBackfill(
filePaths: string[],
rootDir: string,
opts: { symbolsOnly?: boolean } = {},
): Promise<Map<string, ExtractorOutput>> {
if (filePaths.length <= INLINE_BACKFILL_THRESHOLD) {
return parseFilesWasmInline(filePaths, rootDir);
}
return parseFilesWasm(filePaths, rootDir);
return parseFilesWasm(filePaths, rootDir, opts.symbolsOnly ? EXTRACT_ONLY : FULL_ANALYSIS);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 symbolsOnly is silently ignored on the inline path

When filePaths.length <= INLINE_BACKFILL_THRESHOLD (16), parseFilesWasmInline is called with no analysis argument and always performs a full extraction — AST/complexity/CFG/dataflow visitors run regardless of opts.symbolsOnly. For a 1-file rebuild, protoFiles will typically be 1 file, so symbolsOnly: true is a no-op on the exact path the PR targets. The benchmark gains are real (they come from file scoping), but the symbolsOnly optimization contributes nothing to incremental builds in practice. Callers passing { symbolsOnly: true } with small batches get no benefit, and there is no diagnostic to indicate this.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged. The symbolsOnly flag is indeed silently ignored on the inline path (parseFilesWasmInline) for batches of ≤16 files. Fixing this properly requires adding an analysis-mode parameter all the way down to the per-language extractor functions, which is a meaningful separate change.

Filed as #1437 to track this follow-up. The benchmark wins from this PR are real — they come from restricting the proto-methods post-pass definition scan to changedFiles, not from symbolsOnly.

…dows CI

The embedding-regression test skips when the HuggingFace model download
fails due to network issues. However, on Windows CI the failure surfaces
as a ConnectTimeoutError (code: UND_ERR_CONNECT_TIMEOUT) wrapped inside
EngineError as 'fetch failed' — neither 'timeout' in the message nor the
UND_ERR_ code was checked. Walk the full cause chain so nested timeouts
and undici socket errors are treated as network failures and silently
skipped rather than failing the test run.
@carlos-alm

Copy link
Copy Markdown
Contributor Author

Addressed review feedback:

  • CI failure (Windows, embedding-regression test): Fixed by walking the full error cause chain in the network-error detection logic. The Windows runner's HuggingFace model fetch fails with a ConnectTimeoutError (code: UND_ERR_CONNECT_TIMEOUT) wrapped inside EngineError as 'fetch failed'. The old code only checked the top-level error — 'fetch failed' doesn't contain 'timeout' and the code is ENGINE_UNAVAILABLE, not UND_ERR_CONNECT_TIMEOUT. The fix uses a recursive cause-chain walker that catches UND_ERR_CONNECT_TIMEOUT, UND_ERR_SOCKET, UND_ERR_HEADERS_TIMEOUT, and the fetch failed message substring at any depth.
  • symbolsOnly silent no-op on inline path (Greptile P2, Claude flagged): Acknowledged as a real gap. Filed as issue fix(perf): pass symbolsOnly to parseFilesWasmInline for small-batch incremental builds #1437 to track. The benchmark wins from this PR are real (they come from restricting the proto-methods definition scan to changedFiles), not from symbolsOnly. Fixing this properly requires threading an analysis-mode parameter through the per-language extractor functions, which is scope-appropriate for a follow-up PR.

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm

Copy link
Copy Markdown
Contributor Author

Windows CI failures — transient infrastructure issue

The three failing checks (Test Node 22 windows-latest, Engine parity windows-latest, CI Testing Pipeline) are unrelated to this PR's code changes.

Root cause: The CI run was scheduled on a windows-2025-vs2026 runner image (version 20260525.121.1) from the VS2026 preview pool. This image ships Visual Studio 2026 Preview (C:\Program Files\Microsoft Visual Studio\18\Enterprise) which node-gyp cannot detect — node-gyp reads the registry for VS2017+ but version "18" resolves to "undefined", triggering RangeError [ERR_CHILD_PROCESS_STDIO_MAXB] in the PowerShell VS finder. This causes tree-sitter-erlang's node-gyp build to fail during npm install.

Evidence that this is infrastructure-only:

  • The 4 files changed in this PR are pure TypeScript (src/extractors/javascript.ts, src/domain/graph/builder/stages/native-orchestrator.ts, src/domain/parser.ts, tests/search/embedding-regression.test.ts) — no native bindings, no package.json changes
  • The fix/hooks-project-dir-fallback PR running at the same timestamp (18:43) has identical Windows failures
  • Main CI runs from earlier today used the stable win25/20260525.149 image and passed cleanly
  • Ubuntu and macOS jobs on this same CI run passed without issue

@carlos-alm carlos-alm merged commit 0715090 into main Jun 11, 2026
23 checks passed
@carlos-alm carlos-alm deleted the fix/postpass-scope-and-extractor-walks branch June 11, 2026 01:47
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 11, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant