Skip to content

Fix #335 fix #3: tighten canonical-repo match in queryTargetsDependingOnModules#354

Merged
tinder-maxwellelliott merged 2 commits into
masterfrom
claude/fix-issue-335-loose-substring-match
May 18, 2026
Merged

Fix #335 fix #3: tighten canonical-repo match in queryTargetsDependingOnModules#354
tinder-maxwellelliott merged 2 commits into
masterfrom
claude/fix-issue-335-loose-substring-match

Conversation

@tinder-maxwellelliott
Copy link
Copy Markdown
Collaborator

Summary

Implements the "loose substring match" tail of #335 (fix #3 in the issue body) and supersedes #346. This branch contains the reproducer commit from #346 with @Ignore dropped, plus the production fix that makes it pass.

The bug

CalculateImpactedTargetsInteractor.queryTargetsDependingOnModules resolved each changed module key to canonical bzlmod repos with:

val matched = allTargets.keys
    .filter { it.startsWith("@@") && it.contains(moduleName) }
    .map { it.substring(2).substringBefore("//") }

it.contains(moduleName) is far too loose. A single changed module like aspect_bazel_lib matched every canonical repo whose name began with that string — @@aspect_bazel_lib+, @@aspect_bazel_lib++toolchains+bats_toolchains, @@aspect_bazel_lib++toolchains+yq_toolchains, etc. On older revisions each match became its own serial bazel query rdeps(//..., @@<repo>//...) subprocess; on the workspace in #335 that produced ~5,000 serial subprocesses and took multiple hours. After the union-into-a-single-query change the rdeps subprocess count dropped, but the over-large rdeps(//..., @@a+//... + @@a++toolchains+x//... + @@a++toolchains+y//...) expression still over-fans-out inside Bazel and pollutes the impacted set.

The fix

Tighten the match to the module's base canonical repo only. The base repo has the shape <moduleName><sep><versionSegment> where <sep> is + or ~ and the remainder contains no further separators. Module-extension repos always layer additional +/~ segments on top and are rejected.

Forms recognised as base:

  • Bazel 7+ default: aspect_bazel_lib+
  • Bazel <7 / multi-version: aspect_bazel_lib~2.22.5
  • Bazel 7+ with version: aspect_bazel_lib+1.0.0

Forms rejected as extension repos:

  • aspect_bazel_lib++toolchains+bats_toolchains
  • rules_jvm_external~6.0~maven~maven

The narrowing is safe because extension-repo content changes already propagate to main-repo consumers through the per-target hash dep chain: computeSimpleImpactedTargets runs unconditionally for @@* labels and the build-graph hasher folds dep hashes into each rule's hash, so a consumer of a changed extension repo flips its own hash and is picked up without needing an rdeps query targeting the extension repo. The rdeps query exists to catch the harder case — MODULE.bazel-only bumps that don't flip any hash but do change which version of a base module repo is selected — and that's exactly what the base-repo match handles.

Implementation lives in a new private helper canonicalRepoIsBaseForModule(canonical, moduleName) with a doc comment that lays out the canonical-name forms and the safety argument.

Latent test bug fixed alongside

The reproducer's @Before had a Mockito misuse — the second whenever(queryService.query(any<String>())) stub omitted the default-argument matcher and was throwing InvalidUseOfMatchersException before any assertion ran. Because the test was @Ignored the bug was never observed. Dropped the redundant second stub; the first one (whenever(queryService.query(any<String>(), eq(false)))) covers the production caller, which always passes useCquery = false explicitly.

Verification

  • bazel test //cli:CalculateImpactedTargetsInteractorTestPASSED (21/21, including testUnionsRdepsAcrossChangedModules which exercises the ~ form through the same code path).
  • bazel test //cli:CalculateImpactedTargetsInteractorIssue335TestPASSED (the un-@Ignored reproducer now passes).
  • Reverted just the production change locally — the reproducer failed with expected to not contain:<"toolchains+bats_toolchains"> but was:<"rdeps(//..., @@aspect_bazel_lib+//... + @@aspect_bazel_lib++toolchains+bats_toolchains//... + @@aspect_bazel_lib++toolchains+yq_toolchains//...)">, confirming the regression test guards the fix.

Test plan

  • bazel test //cli:CalculateImpactedTargetsInteractorTest passes.
  • bazel test //cli:CalculateImpactedTargetsInteractorIssue335Test passes with the fix; fails as expected without it.
  • Spot-checked the helper against both + and ~ canonical forms (base and extension).
  • Close #346 in favour of this PR once merged.

🤖 Generated with Claude Code

tinder-maxwellelliott and others added 2 commits May 18, 2026 13:31
…endingOnModules

Issue #335 lists multiple potential fixes; my first PR (#340) covered
fix #1 (parser robustness). This PR adds a reproducer for fix #3: the
loose substring match in
CalculateImpactedTargetsInteractor.queryTargetsDependingOnModules.

The current code is:

  val moduleRepos = allTargets.keys
      .filter { it.startsWith("@@") && it.contains(moduleName) }
      .map { it.substring(2).substringBefore("//") }
      .toSet()

`it.contains(moduleName)` is a very loose match. A single changed
module like `aspect_bazel_lib` matches every canonical repo whose name
starts with that string -- e.g. `@@aspect_bazel_lib+`,
`@@aspect_bazel_lib++toolchains+bats_toolchains`, etc. Each match
becomes its own serial `bazel query rdeps(//..., @@<repo>//...)`
subprocess. On the workspace in #335 that fan-out produced ~5,000
subprocesses and took multiple hours.

New test class CalculateImpactedTargetsInteractorIssue335Test:
- Wires its own koin module with a mock BazelQueryService that captures
  every query expression.
- Simulates a workspace with one module (aspect_bazel_lib) plus two
  module-extension repos whose canonical names begin with the same
  apparent name.
- Asserts that resolving the changed module yields exactly ONE rdeps
  query, scoped to the actual changed repo.

@ignore'd until the match is tightened (match canonical repo key
exactly, or look up via `bazel mod dump_repo_mapping`). cli/BUILD wires
up the new kt_jvm_test target.

Refs #335

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…gOnModules

`queryTargetsDependingOnModules` resolved each changed module key to its
canonical bzlmod repos by scanning `allTargets.keys` with
`it.startsWith("@@") && it.contains(moduleName)`. The `contains` match
is far too loose: a single changed module like `aspect_bazel_lib` matched
every canonical repo whose name began with that string -- e.g.
`@@aspect_bazel_lib+`, `@@aspect_bazel_lib++toolchains+bats_toolchains`,
`@@aspect_bazel_lib++toolchains+yq_toolchains`, etc. Each match was then
unioned into the rdeps query expression (after the per-repo loop became
a single unioned query in an earlier change), but the same fan-out
behaviour on older revisions produced ~5,000 serial subprocesses on the
workspace in #335 and took multiple hours.

Tighten the match to the module's *base* canonical repo only:
`<moduleName><sep><versionSegment>` where `<sep>` is `+` or `~` and the
remainder contains no further separators. Module-extension repos always
layer additional `+`/`~` segments on top and are rejected here.

That narrowing is safe because extension-repo content changes already
propagate to main-repo consumers through the per-target hash dep chain:
`computeSimpleImpactedTargets` runs unconditionally for `@@*` labels and
the build-graph hasher folds dep hashes into each rule's hash, so a
consumer of a changed extension repo flips its own hash and is picked up
without needing an rdeps query targeting the extension repo. The rdeps
query exists to catch the harder case -- MODULE.bazel-only bumps that
don't flip any hash but do change which version of a base module repo is
selected -- and that is exactly the case the base-repo match handles.

The reproducer test class from PR #346 had a latent Mockito misuse in
its `@Before`: the second `whenever(queryService.query(any<String>()))`
stub omits the default-argument matcher and was throwing
`InvalidUseOfMatchersException` before any assertion ran. The test was
`@Ignore`d so the bug was never observed. Dropping the redundant stub
and dropping `@Ignore` -- the first stub already covers the production
caller, which always passes `useCquery = false` explicitly.

Verification:
- `bazel test //cli:CalculateImpactedTargetsInteractorTest` -- PASSED
  (21 pre-existing tests still pass, including
  `testUnionsRdepsAcrossChangedModules` which exercises the `~` form
  through the same code path).
- `bazel test //cli:CalculateImpactedTargetsInteractorIssue335Test` --
  PASSED (the un-`@Ignore`d reproducer now passes; reverting just the
  production change reproduces the bug with the loose match yielding
  `rdeps(//..., @@aspect_bazel_lib+//... + @@aspect_bazel_lib++toolchains+bats_toolchains//... + @@aspect_bazel_lib++toolchains+yq_toolchains//...)`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tinder-maxwellelliott tinder-maxwellelliott merged commit dfdc269 into master May 18, 2026
15 checks passed
@tinder-maxwellelliott tinder-maxwellelliott deleted the claude/fix-issue-335-loose-substring-match branch May 18, 2026 18:28
rdark added a commit to rdark/bazel-diff that referenced this pull request May 19, 2026
Layered on top of Tinder#335 fixes (Tinder#353 parse-asymmetry guard, Tinder#354
base-canonical-repo predicate), this commit hardens several edge cases
in the surrounding code.

* Restructure detectChangedModules to return Set<Module> instead of
  module-key strings, threading module.name straight to the
  canonical-match predicate.

* Fix `computeSimpleImpactedTargets(emptyMap(), allTargets)` returning
  allTargets.keys, silently unioning every target back into the rdeps
  result. Thread `from` so the union uses the real hash diff.

* Reject empty module.name in ModuleGraphParser.extractModules. The
  unnamed-root MODULE.bazel case (where `bazel mod graph --output=json`
  emits name="") would otherwise reach canonicalRepoIsBaseForModule
  and over-match every canonical.

* Downgrade per-module "no external repository found" from WARN to
  INFO and emit a single aggregate at the end.

* Make the rdeps-failure fallback shape-aware for bzlmod-only
  workspaces. The unconditional !startsWith("@@") filter left only
  //external:* bridges, which the downstream excludeExternalTargets
  default (Tinder#334) then strips to empty. Replace with a shape-aware
  fallback: emit the buildable subset (//... minus //external:) when
  non-empty, otherwise every hashed label. Extracted as
  isBuildableWorkspaceTarget.

* New tests in CalculateImpactedTargetsInteractorModuleQueryTest cover
  substring no-over-match, sentinel for the execute path, both branches
  of the shape-aware fallback, canonical dedup on add+remove version
  bumps, and the excludeExternalTargets interaction with the
  catch-fallback. New parser test covers empty-name rejection.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant