feat(graph): add suggest-collection command for collection optimization #1030
smoparth wants to merge 1 commit into python-wheel-build:main
Conversation
📝 Walkthrough
Adds a new CLI command.
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/fromager/commands/graph.py`:
- Around line 1031-1033: The current check in the onboarding output path always
prints a human message when toplevel_nodes is empty; update the block that
references toplevel_nodes so that when output_format == "json" it emits a valid
JSON empty list (e.g., using click.echo("[]")) and returns, otherwise keep the
human-friendly message "No top-level packages found in onboarding graph." This
change touches the conditional that checks toplevel_nodes in the function
handling the onboarding graph output and should reference the output_format
variable to decide which output to produce.
- Around line 916-918: The current sort used for selecting best_score
(scores.sort(...); best_score = scores[0]) only considers new_packages and
coverage_percentage, which leaves true ties to fall back to the original
collection_graphs order; make tie-breaking deterministic by adding a stable
filename-derived key as a tertiary sort field (e.g., change the sort key to
(score.new_packages, -score.coverage_percentage, score.filename) or, if Score
objects don’t carry filenames, derive the filename from the corresponding
collection_graphs entry when building scores) so best_score / best_fit is chosen
predictably by filename when the first two metrics tie.
In `@tests/test_suggest_collection.py`:
- Line 5: Replace the fragile regex-based JSON extraction in
tests/test_suggest_collection.py with direct JSON parsing of the entire CLI
output: remove the re import and any use of re.search that extracts a bracketed
block, and instead call json.loads(...) on the full output string produced by
the CLI (e.g., the variable holding stdout or `cli_output`/`result.stdout`) in
each affected test (the blocks around lines 24-29, 294-295, 327 etc.); ensure
the tests assert against the parsed dict/list returned by json.loads so the test
fails if extra bytes surround the JSON payload.
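The deterministic tie-break suggested for `best_score` can be sketched as follows. The `Score` dataclass and its fields are stand-ins inferred from the review comment, not the actual fromager types:

```python
from dataclasses import dataclass


@dataclass
class Score:
    # Minimal stand-in for the reviewed Score object; the real class may differ.
    collection: str
    new_packages: int
    coverage_percentage: float
    filename: str


scores = [
    Score("collection-b", 2, 80.0, "graphs/b.json"),
    Score("collection-a", 2, 80.0, "graphs/a.json"),
    Score("collection-c", 1, 50.0, "graphs/c.json"),
]

# Fewest new packages first, then highest coverage, then a stable
# filename-derived key so true ties resolve deterministically.
scores.sort(key=lambda s: (s.new_packages, -s.coverage_percentage, s.filename))
best_score = scores[0]
```

With this key, `collection-c` wins outright on `new_packages`, and the tie between the other two is broken by filename rather than by input order.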
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 13fe9c39-2bda-4331-802b-d350ab77b34e
📒 Files selected for processing (2)
- src/fromager/commands/graph.py
- tests/test_suggest_collection.py
Force-pushed 55faca2 to 1836b93
🧹 Nitpick comments (1)
src/fromager/commands/graph.py (1)
1048-1067: Consider loading collection graphs in parallel for large inputs.
For many collection graphs, sequential loading could be slow. This is fine for typical usage but worth noting if performance becomes a concern with many large graphs.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/fromager/commands/graph.py` around lines 1048 - 1067, Parallelize loading by extracting collection names first (call extract_collection_name for each graph_path, check for duplicates) and then use a concurrent.futures.ThreadPoolExecutor to call DependencyGraph.from_file(graph_path) for each unique path in parallel; for each future, on success call get_package_names(collection_graph) and populate collection_packages[collection_name], and on exception wrap and raise a click.ClickException that includes the graph_path and original error (preserve the existing logger.debug call per collection_name). Reference symbols: collection_graphs, extract_collection_name, DependencyGraph.from_file, get_package_names, collection_packages.
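A minimal sketch of that parallel-loading suggestion. Here `load_graph` and `get_package_names` are injected stand-ins for `DependencyGraph.from_file` and the real helper, and error handling is simplified to a plain `RuntimeError` instead of `click.ClickException`:

```python
import concurrent.futures


def load_collections_parallel(collection_graphs, load_graph, get_package_names):
    """Load each collection graph in a thread pool and map name -> package set.

    collection_graphs maps a collection name to its graph path; duplicate-name
    detection would still happen up front, before submitting work to the pool.
    """
    collection_packages = {}
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = {
            pool.submit(load_graph, path): name
            for name, path in collection_graphs.items()
        }
        for future, name in futures.items():
            try:
                graph = future.result()
            except Exception as err:
                raise RuntimeError(
                    f"failed to load {collection_graphs[name]}: {err}"
                ) from err
            collection_packages[name] = get_package_names(graph)
    return collection_packages
```

Because the work is I/O-bound file parsing, a thread pool is usually enough; a process pool would only pay off for very large graphs.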
jlarkin09 left a comment
LGTM other than the one suggestion by coderabbit for json.loads()
I'm not convinced that it makes sense to add this feature to Fromager upstream. Collections are a downstream concept, not a Fromager feature.
@tiran, I had the same thought when I picked this.
I want this feature here. Let's think of a better name for it.
How about
Force-pushed 1836b93 to 2ec1705
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/fromager/commands/graph.py`:
- Around line 903-923: The JSON being emitted in the results dict in
src/fromager/commands/graph.py must follow the documented schema: replace the
top-level
best_fit/new_packages/existing_packages/coverage_percentage/all_collections
shape with package, version, total_dependencies, and an ordered collections list
named "collections" where each entry has keys name (use score.collection),
new_packages, existing_packages, total_dependencies (compute per-score total if
available), and coverage_pct (rounded to 1 decimal); remove the flattened
top-level best_* fields from this serialized dict and, if you still need those
flattened values for the human table, compute them in
_print_suggest_collection_table() by taking the first-ranked collection from the
collections list rather than emitting a second JSON shape here.
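A sketch of the documented schema described above; the helper name and the way per-collection totals are passed in are illustrative assumptions, not the real fromager code:

```python
def build_result(package, version, total_dependencies, ranked_scores):
    """Serialize one suggestion using the documented schema.

    ranked_scores is an ordered list of (name, new_packages, existing_packages)
    tuples, best fit first; coverage_pct is derived and rounded to 1 decimal.
    """
    collections = [
        {
            "name": name,
            "new_packages": new,
            "existing_packages": existing,
            "total_dependencies": total_dependencies,
            "coverage_pct": round(existing / total_dependencies * 100, 1),
        }
        for name, new, existing in ranked_scores
    ]
    return {
        "package": package,
        "version": version,
        "total_dependencies": total_dependencies,
        "collections": collections,
    }


result = build_result("example-pkg", "1.0.0", 5, [("core", 1, 4), ("extras", 3, 2)])
best = result["collections"][0]  # the human table derives best-fit from here
```

Keeping a single ordered `collections` list avoids emitting a second, flattened JSON shape with top-level `best_*` fields.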
I think that's moving us in the right direction. What do you think, @smoparth?
@dhellmann, Thanks for confirming. I'll update the command name, correct the CLI argument types to clickext.ClickPath instead of str, and update the PR.
Force-pushed 2ec1705 to 15236a5
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/fromager/commands/lint_requirements.py (1)
82-97: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Guard resolution after parse failures and empty resolution results.
Line 90 can reference `requirement` before assignment when line 64 raises `InvalidRequirement`, which crashes linting under `--resolve-requirements`. Line 96 also assumes `results` is non-empty.
Suggested fix:
```diff
     except InvalidRequirement as err:
         msg = f"{path}: {line}: {err}"
         logger.error(msg)
         failures.append(msg)
+        continue

     # Resolve the requirement to ensure it can be found
     # Skip resolution for constraints files as they should only specify versions
     if resolve_requirements and not is_constraints:
         token = requirement_ctxvar.set(requirement)
         try:
             results = bt.resolve_versions(
                 req=requirement,
                 req_type=RequirementType.TOP_LEVEL,
             )
+            if not results:
+                raise RuntimeError("No versions returned by resolver")
             _, version = results[0]
             logger.info(f"{requirement} resolves to {version}")
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/fromager/commands/lint_requirements.py` around lines 82 - 97, The code attempts to resolve a requirement even when parsing failed or when resolve results are empty; update the flow so resolution only runs for successfully parsed requirements and guard against empty resolve results: inside the block handling parsing (where InvalidRequirement is caught) ensure you continue/skip resolution when an exception occurs (so the variable requirement isn't referenced after the except), and when calling bt.resolve_versions(req=requirement, ...) check that results is non-empty before doing results[0] (log an error and append to failures if empty). Refer to requirement, InvalidRequirement, resolve_requirements, is_constraints, requirement_ctxvar, bt.resolve_versions, and results when making the changes.
src/fromager/bootstrapper.py (1)
1112-1137: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Scan all cache hits before rejecting on build tag.
`find_all_matching_from_provider()` can return multiple wheel URLs for the same pinned version. This code only inspects `results[0]`, so a mismatched first wheel makes us miss a later candidate with the expected build tag and rebuild unnecessarily.
Suggested fix:
```diff
-        results = resolver.find_all_matching_from_provider(provider, pinned_req)
-        wheel_url, _ = results[0]
-        wheelfile_name = pathlib.Path(urlparse(wheel_url).path)
         pbi = self.ctx.package_build_info(req)
         expected_build_tag = pbi.build_tag(resolved_version)
-        # Log the expected build tag for debugging
         logger.info(f"has expected build tag {expected_build_tag}")
-        # Get changelogs for debug info
         changelogs = pbi.get_changelog(resolved_version)
         logger.debug(f"has change logs {changelogs}")
-
-        _, _, build_tag, _ = wheels.extract_info_from_wheel_file(
-            req, wheelfile_name
-        )
-        if expected_build_tag and expected_build_tag != build_tag:
-            logger.info(
-                f"found wheel for {resolved_version} in cache but build tag does not match. Got {build_tag} but expected {expected_build_tag}"
-            )
-            return None, None
+        results = resolver.find_all_matching_from_provider(provider, pinned_req)
+
+        wheel_url = None
+        for candidate_url, _ in results:
+            candidate_name = pathlib.Path(urlparse(candidate_url).path)
+            _, _, build_tag, _ = wheels.extract_info_from_wheel_file(
+                req, candidate_name
+            )
+            if not expected_build_tag or expected_build_tag == build_tag:
+                wheel_url = candidate_url
+                break
+
+        if wheel_url is None:
+            logger.info(
+                "found wheel(s) for %s in cache but none matched build tag %s",
+                resolved_version,
+                expected_build_tag,
+            )
+            return None, None
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/fromager/bootstrapper.py` around lines 1112 - 1137, The code inspects only results[0] from resolver.find_all_matching_from_provider(provider, pinned_req) and rejects on a build-tag mismatch too early; instead, iterate over all entries in results, for each wheel_url derive wheelfile_name, call wheels.extract_info_from_wheel_file(req, wheelfile_name) to get build_tag and compare against pbi.build_tag(resolved_version) (use the existing pbi and expected_build_tag), return the matching wheel_url and its metadata immediately when build_tag matches, and only return (None, None) after all candidates are checked and none match.
🧹 Nitpick comments (1)
src/fromager/context.py (1)
66-67: ⚡ Quick win
Keep `max_release_age` validation consistent between `__init__` and setter.
`set_max_release_age()` rejects non-positive values, but direct constructor input currently bypasses that rule.
Suggested fix:
```diff
-        self.cooldown: Cooldown | None = cooldown
-        self._max_release_age: datetime.timedelta | None = max_release_age
+        self.cooldown: Cooldown | None = cooldown
+        if max_release_age is not None and max_release_age <= datetime.timedelta(0):
+            raise ValueError(
+                f"max_release_age must be positive, got {max_release_age}"
+            )
+        self._max_release_age: datetime.timedelta | None = max_release_age
```
Also applies to: 116-127
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/fromager/context.py` around lines 66 - 67, The constructor for the Context class currently accepts max_release_age without applying the same validation as set_max_release_age; update the __init__ to validate max_release_age the same way set_max_release_age does (reject None? or non-positive timedeltas as set_max_release_age enforces) and raise the same exception/type on invalid values; locate the __init__ parameter max_release_age and the set_max_release_age method and apply identical checks/path and error messages so direct construction and setter behavior are consistent (also mirror this fix for the related validation between lines 116-127 referenced in the review).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/proposals/release-cooldown.md`:
- Line 236: The "Pre-built wheel exemption" line in the Phase 1 section
conflicts with the earlier "cooldown policy" that applies to pre-built wheels;
update the Phase 1 content so it aligns with the global cooldown rules—either
remove the "Pre-built wheel exemption" clause or clarify that pre-built wheels
remain subject to cooldown checks and describe any narrow, explicitly-scoped
exception; ensure you modify the Phase 1 heading/paragraph containing "Pre-built
wheel exemption" and mention the cooldown policy name or section so readers can
reconcile the two.
In `@e2e/test_bootstrap_cooldown_github.sh`:
- Around line 30-31: The install uses bare pip while the trap uninstall uses
python3 -m pip, which can target different interpreters; change the install
invocation so both use the same interpreter (mirror the uninstall) by running
the install via python3 -m pip with the same package path
("$SCRIPTDIR/github_override_example") to ensure teardown isolation and avoid
plugin leakage.
In `@e2e/test_bootstrap_cooldown_gitlab.sh`:
- Around line 34-35: The uninstall in the trap uses "python3 -m pip uninstall"
but the install uses a bare "pip install", which can target a different
interpreter; change the install invocation to use the same installer as the trap
(e.g., replace the pip install "$SCRIPTDIR/gitlab_override_example" call with
"python3 -m pip install \"$SCRIPTDIR/gitlab_override_example\"") so the install
and teardown use the identical Python/pip environment (refer to the existing
trap line and the current pip install invocation).
In `@e2e/test_mode_fallback.sh`:
- Around line 68-75: The script currently reads only the first match into
failures_file and ignores other test-mode-failures-*.json artifacts; update the
logic to iterate over all matches returned by find (e.g., using find ... -print0
with while read -r -d '' file or a for loop over results), compute the failures
for each file by running jq '.failures | length' on every matched file (or
accumulate a total), and if any file has failure_count > 0 print the error
including the file name and set pass=false (ensure you still send the message to
stderr with 1>&2); reference symbols: failures_file, failure_count, OUTDIR, and
pass.
In `@e2e/test_mode_resolution.sh`:
- Around line 39-40: The script currently reads the first failure entry
(`.failures[0]`) into failure_type which can return the wrong result when
multiple failures exist; update the jq expression that sets failure_type (the
assignment using failures_file and variable failure_type) to select the failure
object whose "dist" equals the shell variable $DIST and then extract its
"failure_type" (i.e., use a jq filter that finds .failures[] | select(.dist ==
env.DIST) | .failure_type and pass DIST into jq as an env/arg) so the test
checks the failure_type for the correct distribution rather than the first JSON
entry.
In `@pyproject.toml`:
- Around line 72-76: The coverage dependency constraint currently allows
versions down to 4.0 ("coverage[toml]!=4.4,>=4.0") but the test command uses
`coverage report --format=markdown` which requires coverage >=7.0; update the
dependency string to require at least 7.0 (for example replace the existing
"coverage[toml]!=4.4,>=4.0" entry with "coverage[toml]>=7.0" or
"coverage[toml]>=7.0,<8.0") so the resolver cannot pick an incompatible coverage
version.
In `@src/fromager/bootstrapper.py`:
- Around line 379-391: When a version bootstrap fails in the multiple_versions
branch, the requirement remains marked in _seen_requirements so later visits are
skipped; modify the failure handling in the block that appends to
_failed_versions (the branch guarded by self.multiple_versions) to also clear
the seen-state for that specific requirement/version (so _has_been_seen() will
not prevent a retry and _add_to_graph() won't reintroduce a failed node).
Specifically, after calling self.ctx.dependency_graph.remove_dependency(...) and
before returning, remove the corresponding entry from self._seen_requirements
(the same key/tuple used by _has_been_seen() and _bootstrap_impl()) and then
call self.ctx.write_to_graph_to_file().
In `@src/fromager/commands/graph.py`:
- Around line 804-810: The code incorrectly adds node.canonicalized_name into
dependency_names before iterating dependencies, causing packages with no deps to
report a dependency on themselves; remove the conditional that adds
node.canonicalized_name and rely solely on node.iter_all_dependencies()
(filtering out ROOT as already done) to build dependency_names so the closure
excludes the package itself. Ensure the code still initializes dependency_names:
set[NormalizedName] and only adds dependency.canonicalized_name for dependencies
where dependency.canonicalized_name != ROOT.
In `@src/fromager/dependency_graph.py`:
- Around line 387-405: After deleting the node and cleaning direct edges,
unreachable descendant nodes remain in self.nodes; update the removal logic in
dependency_graph.py (around deleted_node / self.nodes handling) to prune any
nodes no longer reachable from ROOT: compute the set of reachable nodes (e.g.,
BFS/DFS starting at ROOT) and then remove from self.nodes any key not in that
reachable set (or iteratively delete nodes with zero parents except ROOT), so
get_all_nodes() / get_build_topology() only see reachable packages.
In `@src/fromager/wheels.py`:
- Around line 552-560: The call to resolver.find_all_matching_from_provider is
missing the max_age_cutoff so prebuilt wheels can bypass ctx.max_release_age;
update the call to pass the max release age (e.g.
max_age_cutoff=ctx.max_release_age) so provider.cooldown (set via
resolver.resolve_package_cooldown) and age filtering are applied
consistently—ensure the same cutoff logic used by resolve_source is used here
and that the existing wheel_server_url/supports_upload_time handling remains
unchanged.
In `@tests/test_bootstrap.py`:
- Around line 546-639: The tests currently assert log text instead of the real
side effect; change them to verify constraints generation behavior by either (a)
asserting the Bootstrapper mock's write_constraints_file method was or was not
called (reference mock_bootstrapper.return_value.write_constraints_file) or (b)
checking the presence/absence of the generated constraints file
(constraints.txt) in the isolated filesystem after invoking bootstrap.bootstrap;
update test_multiple_versions_auto_disables_constraints to assert no
write_constraints_file call / no constraints.txt and update
test_multiple_versions_with_skip_constraints_no_duplicate_log similarly to
assert no write_constraints_file call / no constraints.txt so the tests validate
functionality rather than brittle log messages.
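The reachability pruning suggested for `src/fromager/dependency_graph.py` above can be sketched like this, with plain dicts standing in for the real node and edge structures:

```python
from collections import deque

ROOT = ""  # stand-in for the graph's ROOT sentinel


def prune_unreachable(nodes, children):
    """Drop every node that is no longer reachable from ROOT, via BFS."""
    reachable = {ROOT}
    queue = deque([ROOT])
    while queue:
        current = queue.popleft()
        for child in children.get(current, ()):
            if child not in reachable:
                reachable.add(child)
                queue.append(child)
    # Delete anything BFS never touched so traversal helpers only see
    # reachable packages.
    for key in list(nodes):
        if key not in reachable:
            del nodes[key]


nodes = {ROOT: object(), "a": object(), "b": object(), "orphan": object()}
children = {ROOT: ["a"], "a": ["b"], "orphan": ["b"]}
prune_unreachable(nodes, children)
```

Note that "b" survives because it is reachable through "a", even though its other parent ("orphan") is pruned; the iterative zero-parent deletion mentioned in the comment would reach the same end state.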
Force-pushed 15236a5 to 8f18545
Add fromager graph suggest-collection that analyzes dependency overlap between onboarding packages and existing collections to recommend the best-fit collection for each package. Helps pipeline maintainers make data-driven decisions when assigning new packages to permanent collections.

- Iterative DFS traversal over all edge types for full dependency closure
- Ranks collections by fewest new packages, then highest coverage
- Rich table (default) and JSON output formats
- Error handling for bad graph files and duplicate collection names
- Tests covering helpers, CLI, and analysis logic

Co-Authored-By: Claude-4.6-opus-high <claude@anthropic.com>
Closes: python-wheel-build#971
Signed-off-by: Shanmukh Pawan <smoparth@redhat.com>
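The iterative DFS closure mentioned in the commit message might look roughly like this; the flat `edges` mapping is a simplification of the real graph model, which walks several edge types:

```python
def dependency_closure(start, edges):
    """Return the set of packages reachable from start (start itself is
    excluded unless a cycle leads back to it).

    edges maps a package name to the names it depends on, across all edge types.
    """
    closure: set[str] = set()
    stack = [start]  # explicit stack instead of recursion, as in the PR
    while stack:
        for dep in edges.get(stack.pop(), ()):
            if dep not in closure:
                closure.add(dep)
                stack.append(dep)
    return closure
```

An explicit stack keeps deep dependency chains from hitting Python's recursion limit, which matters for large onboarding graphs.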
Force-pushed 8f18545 to 6e989dc
We may want to update the name in #1103 as well. Let me know what you think.
@coderabbitai resolve
✅ Actions performed: Comments resolved. Approval is disabled; enable
IMHO, I preferred the original names (suggest-collection and suggest-base). They were more intuitive, even though they used downstream terminology. find-best-fit does describe the action, but it lacks a clear noun and doesn't feel immediately intuitive.
The point about not having a noun is interesting, but we don't have nouns in all of the other graph-related subcommands so I'm not sure it's a requirement. I do agree that we might be able to come up with something more precise than "best fit". I'll leave the naming up to you two. You've seen what happens when I'm allowed to name things. :-)
In that case, we could go with find-best-fit as-is for #1030 and name suggest-base something like find-shared-packages or find-common-packages. I've also addressed all the previous review feedback -- I think it's ready to merge. PTAL when you get a chance. cc: @dhellmann, @rd4398
Pull Request Description
What
Add fromager graph suggest-collection that analyzes dependency overlap between onboarding packages and existing collections to recommend the best-fit collection for each package. Helps pipeline maintainers make data-driven decisions when assigning new packages to permanent collections.
Why
Closes: #971