feat(jvm): resolve Java/Kotlin imports by fully-qualified name #412
Before/after validation on two Kotlin/Java repos

Indexed both repos with

Coverage of in-project import statements

¹ Ground truth:

The mall result is the more meaningful one: it's the central failure mode this PR addresses, multi-module Maven where filenames don't sit at

Collision disambiguation (mall)

mall has aggressive name collisions. MyBatis-generated examples:

Sample of 5 collision-targeted imports: each caller-package resolved to the correct target-package (no cross-package leak):

Spot-check 20 random resolved imports on mall: 20 / 20 plausibly correct on package-flow inspection (mapper→model, service→mapper, controller→dto/common.api).

Delta accounting: no edge inflation

Edge counts per kind on mall before/after (this is the part I want a sanity check on; the only thing that should change is
Concrete
Followup on the CLI-undercount note from the previous comment: opened a separate fix, #413. That PR is off

Regression test included (
Force-pushed: ca172b7 to 77dde7e, then 77dde7e to b40205b.
Rebased onto main (conflicts in

Also fixed a pre-existing test the namespace wrapping broke:

Validation

Deterministic correctness on 3 real JVM repos:

Disambiguation spot-check on mall: 76 distinct

Agent A/B per the dynamic-dispatch coverage playbook (3 repos × 3 flow prompts × 2 runs/arm × 2 arms = 36 runs, claude-opus, headless,

Playbook pass bar (~0 Read/Grep, faster than without): met on Spring (the target). Guava is mixed: the residual reads aren't FQN-import related, they're a functional-interface (lambda → SAM) coverage gap:

Follow-up commit on top

Added

Concrete effect on guava: 3608 anon classes extracted, +2534 synthesized

Tried explore-output ranking tweaks (symbol diversification in

Tests: 1069 passed, 2 skipped on the rebased tip.

Verdict

Lands cleanly on Spring (the explicit target of this PR). Guava's residual is a separate lambda-SAM coverage gap; filing a follow-up issue. Squash-merging.
Adds optional `packageTypes` + `extractPackage` hooks to the `LanguageExtractor` interface; the core extractor looks for one such child of the root node and wraps every top-level declaration in an implicit `namespace` node carrying the FQN.

For Kotlin (`package_header`) and Java (`package_declaration`) this means a class `Bar` in `package com.example.foo` is indexed with qualifiedName `com.example.foo::Bar` instead of just `Bar`. That is the prerequisite for resolving `import com.example.foo.Bar` by qualified name regardless of filename (Kotlin filename ≠ class name, top-level functions, extension functions).

Files without a package declaration are unchanged; the namespace wrapping is opt-in per extractor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
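A minimal sketch of the qualified-name scheme this commit describes, not the extractor's actual code. The `Decl` shape and the `qualify` helper are hypothetical names introduced for illustration; only the `::` separator and the example names (`com.example.web`, `Handler`, `use`) come from the commit messages in this PR.

```typescript
// Hypothetical declaration shape; the real extractor's node type is not shown in this PR.
interface Decl {
  name: string;
  children?: Decl[];
}

// Build qualified names for a file's declarations. When a package is present
// it acts as an implicit outer namespace; files without one are left unchanged.
function qualify(pkg: string | null, decls: Decl[]): string[] {
  const out: string[] = [];
  const walk = (prefix: string[], d: Decl): void => {
    const path = [...prefix, d.name];
    out.push(path.join("::"));
    for (const child of d.children ?? []) walk(path, child);
  };
  const root = pkg ? [pkg] : [];
  for (const d of decls) walk(root, d);
  return out;
}

const handlerDecl: Decl = { name: "Handler", children: [{ name: "use" }] };

// With a package: "com.example.web::Handler" and "com.example.web::Handler::use".
console.log(qualify("com.example.web", [handlerDecl]));
// Without a package: "Handler" and "Handler::use".
console.log(qualify(null, [handlerDecl]));
```

Keeping the package as a single dotted segment in front of `::`-joined declaration names is what lets an import's package part be compared against the index directly.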
Adds `resolveJvmImport`, a JVM-gated strategy invoked before framework/name-matcher in `resolveOne`. For Java/Kotlin `imports` references it splits the FQN (`com.example.foo.Bar`) at the last dot and looks up `com.example.foo::Bar` via the qualifiedName index populated by the new package-namespace wrapping. Wildcard imports (`com.example.*`) deliberately fall through to the existing name-matcher fallback. Same-name classes across packages (the central failure mode in multi-module Spring / Android codebases) now resolve to the correct one.

Also fixes `hasAnyPossibleMatch` so the pre-filter doesn't strip JVM FQNs before resolution gets to see them: the previous logic only checked the first segment of a dotted name, missing FQN tails like the `Bar` in `com.example.foo.Bar`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
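The lookup described in this commit can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the PR: `splitFqn`, `resolveImportByFqn`, and `tailCouldMatch` are hypothetical names, a plain `Map` stands in for the qualifiedName index, and the file paths are made up.

```typescript
// Stand-in for the qualifiedName index: qualified name -> file of the definition.
type QualifiedIndex = Map<string, string>;

// Split "com.example.foo.Bar" at the last dot into package and simple name.
// Wildcards and undotted names return null so the caller can fall through
// to its name-based fallback.
function splitFqn(fqn: string): { pkg: string; name: string } | null {
  const dot = fqn.lastIndexOf(".");
  if (dot <= 0 || dot === fqn.length - 1) return null;
  const name = fqn.slice(dot + 1);
  if (name === "*") return null;
  return { pkg: fqn.slice(0, dot), name };
}

// Look the import up as "<package>::<name>"; null means "not resolved here".
function resolveImportByFqn(fqn: string, index: QualifiedIndex): string | null {
  const parts = splitFqn(fqn);
  if (!parts) return null;
  return index.get(`${parts.pkg}::${parts.name}`) ?? null;
}

// Pre-filter sketch: test the last segment of a dotted name against known
// simple names, rather than only the first segment.
function tailCouldMatch(ref: string, knownNames: Set<string>): boolean {
  const tail = ref.slice(ref.lastIndexOf(".") + 1);
  return knownNames.has(tail);
}

// Two classes named Bar in different packages (illustrative paths).
const fqnIndex: QualifiedIndex = new Map([
  ["com.example.foo::Bar", "module-a/src/Models.kt"],
  ["com.example.other::Bar", "module-b/src/Bar.java"],
]);

console.log(resolveImportByFqn("com.example.foo.Bar", fqnIndex));
console.log(resolveImportByFqn("com.example.other.Bar", fqnIndex));
console.log(resolveImportByFqn("com.example.*", fqnIndex));
console.log(tailCouldMatch("com.example.foo.Bar", new Set(["Bar"])));
```

Each `Bar` import lands on the file its own package declares, and the wildcard returns `null` so a later strategy can handle it.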
Indexes a temp project with real Kotlin/Java sources and asserts the
imports edges resolve through the qualifiedName index. Covers the
failure modes that motivated the feature:
- filename ≠ class name (Bar declared in Models.kt)
- top-level function imports (`import com.example.util`)
- cross-language Kotlin importing a Java class
- same-name class collision across packages — each caller resolves
to ITS Bar, not the other one (the multi-module Spring trap that
name-matcher alone cannot disambiguate)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The namespace wrapping introduced in the previous extraction commit changes Java qualifiedName from `Handler::use` to `com.example.web::Handler::use`. The pre-existing colbymchenry#314 test was looking up the old format and failed; update it to the new one. The semantic check (import resolves to the correct service-package FooConverter.java) is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The lambda-returned anonymous class pattern in guava-style libraries
(`Splitter.on(',')` returns `new Splitter((s, t) -> new SplittingIterator()
{ @Override int separatorStart(int s) {...} })`) produced an `instantiates`
edge only — the overriding methods inside the anon body were invisible
to the graph, so Phase 5.5 interface-impl synthesis had nothing to bridge
and an agent investigating `SplittingIterator.separatorStart` had to Read
the file to find the real implementation.
Extract `object_creation_expression` with a `class_body`/`declaration_list`
child as a `<TypeName$anon@line>` class node with an `extends` reference
to the named base, then walk the body — its `method_declaration` members
land as method nodes under the anon class. The existing IFACE_OVERRIDE_LANGS
synthesizer then links each base abstract method → the anon override by
name, exactly like it does for normal `class Impl implements Iface { ... }`
declarations.
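As a rough illustration of the two steps above (naming the anonymous class, then linking base methods to same-named overrides), here is a sketch under simplifying assumptions. `ClassNode`, `Edge`, `anonName`, and `synthesizeOverrideEdges` are hypothetical names, the line number 42 is made up, and the real synthesizer's data model is not shown in this PR.

```typescript
// Hypothetical, simplified graph shapes; not the project's real node types.
interface ClassNode {
  qualifiedName: string;
  extendsName?: string; // simple name of the base type, if any
  methods: string[];    // method names declared in the class body
}

interface Edge {
  from: string;
  to: string;
  kind: "calls";
}

// Name an anonymous subclass after its base type and source line,
// following the `<TypeName$anon@line>` pattern from the commit message.
function anonName(baseType: string, line: number): string {
  return `<${baseType}$anon@${line}>`;
}

// For every class extending `base`, link each base method to the same-named
// method in the subclass. Matching is by name only.
function synthesizeOverrideEdges(base: ClassNode, all: ClassNode[]): Edge[] {
  const simple = base.qualifiedName.split("::").pop();
  const edges: Edge[] = [];
  for (const cls of all) {
    if (cls.extendsName !== simple) continue;
    for (const m of cls.methods) {
      if (base.methods.indexOf(m) === -1) continue;
      edges.push({
        from: `${base.qualifiedName}::${m}`,
        to: `${cls.qualifiedName}::${m}`,
        kind: "calls",
      });
    }
  }
  return edges;
}

const baseIterator: ClassNode = {
  qualifiedName: "Splitter::SplittingIterator",
  methods: ["separatorStart"],
};
const anonOverride: ClassNode = {
  qualifiedName: anonName("SplittingIterator", 42),
  extendsName: "SplittingIterator",
  methods: ["separatorStart"],
};

console.log(synthesizeOverrideEdges(baseIterator, [baseIterator, anonOverride]));
```

Because the anonymous class is an ordinary class node with an `extends` reference, the same by-name pass that serves named implementations covers it with no special casing.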
Validated on google/guava (3,227 .java files): 3,608 anonymous classes
extracted, +2,534 synthesized `calls` edges to overrides hidden inside
`new T() { ... }` blocks (including those nested in lambda bodies).
codegraph_node `Splitter::SplittingIterator::separatorStart` now lists
the four anon overrides in its trail.
Re-ran the playbook on CacheBuilder.build (guava q2): with-arm reads
1.5 → 0 (mean across 2 runs), tools 6.5 → 4.5. No regression on
the Spring repos (petclinic-kt q1: 0 reads, mall q3: 0 reads, both still
−27%+ wall-clock vs the no-codegraph arm). Splitter (q3) reads still
fire because explore-tool selection misses anon classes in its ranking —
a separate explore-output tweak, not an extraction gap.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Force-pushed: 4c7229c to 168de30.
Summary
- Wraps top-level declarations in `.kt`/`.java` files in an implicit `namespace` node carrying the file's `package`, so a class `Bar` in `package com.example.foo` is indexed with qualifiedName `com.example.foo::Bar`.
- New `resolveJvmImport` strategy (gated to Java/Kotlin, `imports`-kind refs only) looks up FQNs through the qualifiedName index: `import com.example.foo.Bar` now resolves by package, not by filename, with confidence 0.95.
- Fixes the `hasAnyPossibleMatch` pre-filter so it doesn't strip JVM FQNs (`com.example.foo.Bar`) before resolution gets a chance; the previous logic only consulted the first segment.

Motivation
Java's existing filesystem-based path lookup (`import com.example.Foo` → `com/example/Foo.java`) works only because Java enforces class-name = filename for public top-level classes. Kotlin doesn't: `Bar` can live in `Models.kt`, `Utils.kt`, or any file under `package com.example`. Top-level functions and extension functions have no class to name a file after at all. The fallback path (name-matcher) finds some match for collisions but cannot disambiguate same-name classes across packages, the central failure mode in multi-module Spring / Android codebases (see the `mall`/`halo` notes in the dynamic-dispatch playbook).

This change makes `imports` edges land on the exact target the package declares, which feeds `callers`/`callees`/`context` precision and lays groundwork for binding-precise Dagger2 / Hilt resolution later.

What's covered
- `package com.example.foo` → class qualifiedName `com.example.foo::Bar`: `extraction.test.ts` (Kotlin / Java)
- `extraction.test.ts`
- `extraction.test.ts`
- `resolution.test.ts` (unit, mocked context)
- `Bar`
- `com.example.*` deliberately falls through to name-matcher
- Non-`imports` reference kinds skip the strategy
- `imports` edge end-to-end, filename ≠ class name: `frameworks-integration.test.ts`

19 new tests; all existing Kotlin (8), Java (2), Spring (3), pr19 (33), symbol-lookup (9), resolution (23), frameworks (86), frameworks-integration (8) suites stay green.
Out of scope
- Wildcard imports (`import com.example.*`): name-matcher still handles these.
- Import aliases (`import com.example.Bar as Baz`): the Kotlin extractor doesn't currently surface the alias; orthogonal to this PR.