Fix #184 transitive external repo change now reaches main-repo consumers#350
Conversation
6741202 to
61a3a30
Compare
|
Pushed cb20fa6 fixing the two go-test CI failures. Root cause was a circular dependency in the synthetic dep edges this PR adds. Emitting both directions as …so Fix: new Dropping the back-edge is safe because each repo in the cycle already has its own synthetic Local verification:
|
…cted The user reported that modifying a file inside an external repo C does not propagate to an internal target A, when A depends on external repo B and B in turn depends on C. The original example uses rules_python's pip_parse where requirements.txt resolves into a separate external repo per package: A is a py_test rule, B is @Moto, C is @Cryptography. Bumping cryptography in requirements.txt does not surface A as impacted. This is functionally the same root behaviour as #197's still-unfixed case (external-repo wrapping), but the #184 user described it as a deeper transitive build-time chain, so a separate fixture documents that shape. New workspace `bzlmod_transitive_external`: //:consume genrule that depends on @outer//:lib @outer//:lib genrule in module `outer` that depends on @inner//:data @inner//:data filegroup wrapping inner/data.txt `outer` and `inner` are real bzlmod modules brought in via the root module's `local_path_override`s. outer's MODULE.bazel just declares `bazel_dep(name = "inner")` (only the root module is allowed to use local_path_override). `.bazelignore` excludes the two sub-trees so they are not also treated as packages of the main module. Verified by hand against the locally built CLI (Bazel 9.1): inner/data.txt = "hello" -> "world" $ bazel-diff generate-hashes -w A ... ; bazel-diff generate-hashes -w B ... $ diff <(jq -S '.hashes' A) <(jq -S '.hashes' B) # empty $ bazel-diff get-impacted-targets ... # impacted file is empty The expected behaviour: //:consume should be impacted because the build-time content it transitively consumes through @outer / @inner has changed. I also briefly removed @ignore locally to confirm the new test fails for the right reason on current bazel-diff: testTransitiveExternalRepoChangeImpactsConsumer_reproducerForIssue184 AssertionFailedError: //:consume should be impacted when inner/data.txt changes (transitive external dep via @outer -> @inner per #184); got: [] Restored @ignore so the test is skipped under `bazel test //cli:E2ETest`. Remove @ignore once the fix lands. Refs #184 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before this change, BazelQueryService.queryBzlmodRepos synthesised one //external:<repo> target per bzlmod-managed repo, hashed by repo *metadata* only. There was no rule_input edge between those synthetic targets and no content hash of the underlying directory, so a change inside @inner never propagated to @outer or to //:consume when: //:consume -> @outer//:lib @outer//:lib -> @inner//:data @inner//:data -> inner/data.txt The user in #184 saw this with rules_python's pip_parse where @Cryptography sits behind @Moto behind a py_test rule. The same shape is what's left unfixed in #197 (alias re-export through an external repo). Two changes in BazelQueryService: 1. queryBzlmodRepos now parses `bazel mod graph --output=json` (via BazelModService, newly injected) and, for each repo, computes the list of direct bzlmod deps. For each dep, it emits an addRuleInput("//external:<dep_apparent>") on the synthetic target so RuleHasher follows the chain //:consume -> //external:outer -> //external:inner during digest computation. We bridge module name -> canonical name by stripping the trailing `+<version>` suffix that bzlmod's canonical-name scheme uses; this works for the no-version-conflict case and skips extension-generated repos (e.g. "rules_jvm_external++maven+maven") because they do not appear in `bazel mod graph` anyway. (Build.Repository.module_key is not populated by current Bazel, so we cannot key off that.) 2. repositoryToTarget now content-hashes the on-disk directory of every `local_repository`-rule repo (which is what `local_path_override` lowers to) and attaches the digest as a synthetic `_bazel_diff_content_hash` attribute. MODULE.bazel.lock is skipped because bazel regenerates it indeterministically. BCR-fetched repos are left alone -- their version is already in the metadata, so a version bump flips the synthetic target's hash via that channel. ModuleGraphParser gains a parseModuleGraphDepEdges helper that returns module_name -> [dep_module_names] from the same JSON parseModuleGraph already consumes. Module name is used as the key because it is present on every entry in `bazel mod graph` output and on every Repository returned by `bazel mod show_repo`, while module_key is not. The reproducer test loses its @ignore, picks up the standard Bazel 8.6.0+ skip guard (the fix relies on mod show_repo --output=streamed_proto), and is renamed from testTransitiveExternalRepoChangeImpactsConsumer_reproducerForIssue184 to ...regressionForIssue184. Verified: * Manual end-to-end on Bazel 9.1 against the bzlmod_transitive_external fixture: editing inner/data.txt with no extra flags now lists //:consume in the impacted-targets output (was empty before). * bazel test //cli:E2ETest --test_filter=...regressionForIssue184 PASSED in 2.9s. * Full unit-test sweep (12 kt_jvm_test targets) PASSED. * Broad non-Maven E2ETest sweep (testTransitiveExternalRepo*, testBzlmodLocalPathOverride*, testModuleBazelCommentOnly*, testGoModUpdate*, testMacroBzlChange*, testBzlmodShowRepoDetects*, testExcludeExternalTargets*, testFineGrainedHashBzlMod, testBzlmodTransitiveDepsQuery, testGenerateHashesWithCquery, ...) PASSED in 113.6s. Refs #184 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Following up on #184's CI failure: `bazel mod graph` legitimately contains cycles when one module declares a `dev_dependency` on another that depends back on it (e.g. rules_go <-> gazelle). The previous commit emitted both directions as `rule_input`s on the synthetic `//external:*` targets, so RuleHasher recursed and threw `CircularDependencyException`, making generate-hashes fail with exit 1 on any workspace that pulls in gazelle. The CI failures on `testGoModUpdateImpactsGoTargets_regressionForIssue266` and `testExternalGoDepsAppearInHashes_regressionForIssue228` were exactly this: `cli.execute("generate-hashes", ...)` returning 1 instead of 0. Add `ModuleGraphParser.breakCycles` (DFS with sorted node + edge order, dropping any edge that targets a node currently on the DFS path) and call it from `BazelQueryService.queryBzlmodRepos` before deriving dep edges. The result is deterministic and acyclic; dropping the back-edge is conservative because each repo in the cycle still has its own synthetic hash, so main-repo consumers see content changes via their direct dependency on either side. Unit-tested in ModuleGraphParserTest: acyclic input pass-through, the real-world two-node cycle, a three-node cycle, a self-loop, and determinism across repeated calls. Verified locally: the two go-related E2E regressions now pass, plus testTransitiveExternalRepoChangeImpactsConsumer_regressionForIssue184 still passes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cb20fa6 to
240255b
Compare
Summary
Fixes #184. The same root cause sits behind the still-open part of #197 (alias-wrap chain), so this fix benefits both shapes.
Before.
BazelQueryService.queryBzlmodRepossynthesised one//external:<repo>target per bzlmod-managed repo, hashed by repo metadata only. There was norule_inputedge between those synthetic targets and no content hash of the underlying directory, so a change inside@innernever propagated to@outeror to//:consume:The user in #184 saw this with
rules_python'spip_parsewhere@cryptographysits behind@motobehind apy_testrule.After. Two changes in
BazelQueryService:queryBzlmodReposnow parsesbazel mod graph --output=json(viaBazelModService, newly injected) and, for each repo, computes the list of direct bzlmod deps. For each dep, it emitsaddRuleInput("//external:<dep_apparent>")on the synthetic target soRuleHasherfollows the chain//:consume → //external:outer → //external:innerduring digest computation.Module name → canonical name is bridged by stripping the trailing
+<version>suffix from canonical names (works for the no-version-conflict case; skips extension-generated repos likerules_jvm_external++maven+mavenbecause they don't appear inbazel mod graphanyway).Build.Repository.module_keyis not populated by current Bazel, so we cannot key off that.repositoryToTargetnow content-hashes the on-disk directory of everylocal_repository-rule repo (which is whatlocal_path_overridelowers to) and attaches the digest as a synthetic_bazel_diff_content_hashattribute.MODULE.bazel.lockis skipped because bazel regenerates it indeterministically. BCR-fetched repos are left alone — their version is already in the metadata, so a version bump flips the synthetic target's hash via that channel.ModuleGraphParsergains aparseModuleGraphDepEdgeshelper that returnsmodule_name → [dep_module_names]from the same JSONparseModuleGraphalready consumes.The reproducer test loses its
@Ignore, picks up the standard Bazel 8.6.0+ skip guard (the fix relies onmod show_repo --output=streamed_proto), and is renamed…_reproducerForIssue184→…_regressionForIssue184.Test plan
bzlmod_transitive_externalfixture: editinginner/data.txtwith no extra flags now lists//:consumein the impacted-targets output (was empty before).bazel test //cli:E2ETest --test_filter=testTransitiveExternalRepoChangeImpactsConsumer_regressionForIssue184→ PASSED in 2.9s.kt_jvm_testtargets → all 12 PASSED.E2ETestsweep (testTransitiveExternalRepo*,testBzlmodLocalPathOverride*,testModuleBazelCommentOnly*,testGoModUpdate*,testMacroBzlChange*,testBzlmodShowRepoDetects*,testExcludeExternalTargets*,testFineGrainedHashBzlMod,testBzlmodTransitiveDepsQuery,testGenerateHashesWithCquery, …) → PASSED in 113.6s.🤖 Generated with Claude Code