feat(bootstrap): parallelize RESOLVE and PREPARE_SOURCE I/O with background threads#1199
feat(bootstrap): parallelize RESOLVE and PREPARE_SOURCE I/O with background threads#1199dhellmann wants to merge 3 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR introduces a background-threaded I/O prefetch pipeline to the bootstrapper. Estimated code review effort🎯 4 (Complex) | ⏱️ ~80 minutes 🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/fromager/commands/bootstrap.py (1)
121-149:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd a Sphinx version directive for the new CLI option.
Line 121 introduces a user-facing CLI flag, but the command docstring doesn’t record this with a Sphinx version marker.
Proposed change
def bootstrap( @@ ) -> None: """Compute and build the dependencies of a set of requirements recursively TOPLEVEL is a requirements specification, including a package name and optional version constraints. + .. versionchanged:: NEXT + Added ``--bg-threads`` to control background I/O prefetch worker count. + """As per coding guidelines,
**/*.{rst,py}requires Sphinxversionadded/versionchangeddirectives for user-facing changes.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/fromager/commands/bootstrap.py` around lines 121 - 149, The new CLI option `--bg-threads` introduced in the bootstrap function's click decorators is missing a Sphinx version directive in the function docstring. Add a `:versionadded::` directive to the docstring of the bootstrap function to document when this new user-facing CLI flag was introduced, following the coding guidelines requirement for marking changes in py and rst files.Source: Coding guidelines
🧹 Nitpick comments (2)
src/fromager/commands/bootstrap.py (1)
531-544: ⚡ Quick winExpose and forward
num_bg_threadsinbootstrap_parallelfor tunable performance.
bootstrap_parallelcurrently forces the default background-thread count when it invokesbootstrap, so callers of the parallel command cannot tune this knob.Proposed change
`@click.option`( + "--bg-threads", + "num_bg_threads", + type=click.IntRange(min=1), + default=max(1, (os.cpu_count() or 2) // 2), + show_default=True, + help="Number of background threads for parallel I/O pre-fetching (min 1).", +) +@click.option( "--max-release-age", @@ def bootstrap_parallel( @@ multiple_versions: bool, max_release_age: int | None, + num_bg_threads: int, toplevel: list[str], ) -> None: @@ ctx.invoke( bootstrap, @@ multiple_versions=multiple_versions, max_release_age=max_release_age, + num_bg_threads=num_bg_threads, toplevel=toplevel, )Also applies to: 562-572
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/fromager/commands/bootstrap.py` around lines 531 - 544, The `bootstrap_parallel` function does not expose a `num_bg_threads` parameter in its function signature, preventing callers from tuning the background thread count when it internally invokes `bootstrap`. Add `num_bg_threads` as a parameter to the `bootstrap_parallel` function signature, and then forward this parameter when calling `bootstrap` within the function body to allow callers to control this performance knob.tests/test_bootstrapper_iterative.py (1)
260-386: ⚡ Quick winAdd one
bg_futureexception-path test to cover threaded failure propagation.The updated suite validates resolved/empty futures, but not the case where
item.bg_future.result()re-raises an exception from the background resolver.Proposed change
def _make_resolved_future( result: typing.Any, ) -> concurrent.futures.Future[typing.Any]: @@ future.set_result(result) return future +def _make_failed_future( + err: Exception, +) -> concurrent.futures.Future[typing.Any]: + """Return an already-failed Future carrying *err*.""" + future: concurrent.futures.Future[typing.Any] = concurrent.futures.Future() + future.set_exception(err) + return future + @@ class TestPhaseResolve: + def test_future_exception_propagates(self, tmp_context: WorkContext) -> None: + bt = bootstrapper.Bootstrapper(tmp_context) + item = _make_resolve_item() + item.bg_future = _make_failed_future(RuntimeError("resolver failed")) + + with pytest.raises(RuntimeError, match="resolver failed"): + bt._phase_resolve(item) + def test_single_version(self, tmp_context: WorkContext) -> None:As per coding guidelines,
tests/**asks us to verify intended behavior and flag missing edge cases.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_bootstrapper_iterative.py` around lines 260 - 386, Add a new test method to the TestPhaseResolve class that covers the exception path when item.bg_future.result() raises an exception from the background resolver. The test should create a resolve item with a bg_future that raises an exception when result() is called, invoke _phase_resolve on the bootstrapper, and verify that the exception is properly propagated. This test should validate that exceptions from the background resolver thread are correctly surfaced, covering the missing edge case in exception handling.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/fromager/bootstrapper.py`:
- Around line 350-405: The background task functions _bg_resolve,
_bg_prepare_source, and _bg_prepare_prebuilt execute outside the main-thread
context and therefore lose per-requirement log attribution. Wrap each of these
functions with req_ctxvar_context() using the appropriate requirement parameter
to preserve requirement-scoped logging context. For _bg_resolve wrap with
req_ctxvar_context(req), for _bg_prepare_source wrap with
req_ctxvar_context(req), and for _bg_prepare_prebuilt wrap with
req_ctxvar_context(req).
- Around line 200-217: The ZipFile object created when opening wheel_filename is
not wrapped in a context manager, which can lead to file descriptor leaks
especially under parallel load. Wrap the zipfile.ZipFile call with a `with`
statement to ensure the file is properly closed and the file descriptor is
released immediately after use, moving the entire extraction logic inside the
context manager block.
---
Outside diff comments:
In `@src/fromager/commands/bootstrap.py`:
- Around line 121-149: The new CLI option `--bg-threads` introduced in the
bootstrap function's click decorators is missing a Sphinx version directive in
the function docstring. Add a `:versionadded::` directive to the docstring of
the bootstrap function to document when this new user-facing CLI flag was
introduced, following the coding guidelines requirement for marking changes in
py and rst files.
---
Nitpick comments:
In `@src/fromager/commands/bootstrap.py`:
- Around line 531-544: The `bootstrap_parallel` function does not expose a
`num_bg_threads` parameter in its function signature, preventing callers from
tuning the background thread count when it internally invokes `bootstrap`. Add
`num_bg_threads` as a parameter to the `bootstrap_parallel` function signature,
and then forward this parameter when calling `bootstrap` within the function
body to allow callers to control this performance knob.
In `@tests/test_bootstrapper_iterative.py`:
- Around line 260-386: Add a new test method to the TestPhaseResolve class that
covers the exception path when item.bg_future.result() raises an exception from
the background resolver. The test should create a resolve item with a bg_future
that raises an exception when result() is called, invoke _phase_resolve on the
bootstrapper, and verify that the exception is properly propagated. This test
should validate that exceptions from the background resolver thread are
correctly surfaced, covering the missing edge case in exception handling.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e01796a7-cf1c-434b-9f19-3b1fa7dfc08a
📒 Files selected for processing (5)
docs/proposals/background-tasks.mdsrc/fromager/bootstrap_requirement_resolver.pysrc/fromager/bootstrapper.pysrc/fromager/commands/bootstrap.pytests/test_bootstrapper_iterative.py
|
This pull request has merge conflicts that must be resolved before it can be merged. |
1df982f to
188b4a9
Compare
Claude Code Review:
|
Revised Claude Code Review:
|
Fix Plan: background-tasks branch review issuesContextThe Issue #2 — Pool resource leak (Important)Problem:
Fix: Make Changes
def __enter__(self) -> "Bootstrapper":
return self
def __exit__(self, exc_type: type | None, exc_val: BaseException | None, exc_tb: object) -> None:
if self._bg_pool is not None:
self._bg_pool.shutdown(wait=False, cancel_futures=True)
self._bg_pool = NoneNote:
# Before:
bt = bootstrapper.Bootstrapper(...)
...
bt.finalize()
# After:
with bootstrapper.Bootstrapper(...) as bt:
...
bt.finalize()
# Before:
bt = bootstrapper.Bootstrapper(...)
# ... (no finalize call)
# After:
with bootstrapper.Bootstrapper(...) as bt:
# ... (no finalize call needed — __exit__ handles cleanup)Issue #3 — Missing assertion in
|
| File | Purpose |
|---|---|
src/fromager/bootstrapper.py |
Add __enter__/__exit__; fix assertion (#3), docstring (#5), dead-code (#6) |
src/fromager/commands/bootstrap.py |
Wrap Bootstrapper with with statement |
src/fromager/commands/lint_requirements.py |
Wrap Bootstrapper with with statement |
tests/test_bootstrapper_iterative.py |
Add real-thread-pool test |
- Make `Bootstrapper` a context manager (`__enter__`/`__exit__`) so the thread pool is always cleaned up, even when an exception is raised before `finalize()` is called. `lint_requirements.py` never called `finalize()` at all, leaking the pool on every invocation. - Wrap `Bootstrapper` with `with` in `commands/bootstrap.py` and `commands/lint_requirements.py`. - Add `assert not req.url` guard in `_bg_resolve` to document-by-contract that git URL requirements must be pre-resolved before background dispatch. - Fix `PreparedSourceData` docstring — "exactly one field set" was wrong; replace with the three actual field-combination cases. - Replace the dead `if item.bg_future is not None:` fallback in `_phase_prepare_source` with `assert item.bg_future is not None`, consistent with `_phase_resolve`. Remove the unreachable inline-I/O fallback branches and update tests to supply pre-completed futures. - Add `TestThreadPoolSubmission` test that uses a real `ThreadPoolExecutor` to exercise the `_push_items` → `submit` → `future.result()` path. Closes: python-wheel-build#1199 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Doug Hellmann <dhellmann@redhat.com>
fec76f9 to
6d21e35
Compare
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (1)
tests/test_bootstrapper.py (1)
781-784: ⚡ Quick winNarrow the log-prefix assertion scope to target records only.
These loops assert the prefix on every captured message, so unrelated future INFO logs on
fromager.bootstrappercan fail the test despite correct behavior. Filter to the helper-specific messages (or assert at least one expected prefixed message) before checking prefixes.Suggested change
- for msg in messages: - assert msg.startswith("mypkg-1.2.3: "), ( - f"Expected 'mypkg-1.2.3: ' prefix, got: {msg!r}" - ) + target_messages = [m for m in messages if "preparing source" in m] + assert target_messages + for msg in target_messages: + assert msg.startswith("mypkg-1.2.3: "), ( + f"Expected 'mypkg-1.2.3: ' prefix, got: {msg!r}" + )- for msg in messages: - assert msg.startswith("mypkg-1.2.3: "), ( - f"Expected 'mypkg-1.2.3: ' prefix, got: {msg!r}" - ) + target_messages = [m for m in messages if "using pre-built wheel" in m] + assert target_messages + for msg in target_messages: + assert msg.startswith("mypkg-1.2.3: "), ( + f"Expected 'mypkg-1.2.3: ' prefix, got: {msg!r}" + )Based on learnings, this test suite should avoid brittle log-string assertions and prefer behavior-focused checks.
Also applies to: 822-824
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_bootstrapper.py` around lines 781 - 784, The assertion loops in the test are checking the log-prefix requirement on all captured messages, making the test brittle to unrelated future INFO logs on fromager.bootstrapper that would fail the test despite correct behavior. Filter the messages list to include only the helper-specific messages that are expected to have the mypkg-1.2.3 prefix before performing the assertion, rather than asserting on every message in the loop. Apply this filtering fix to both assertion blocks mentioned in the comment.Source: Learnings
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@e2e/test_bootstrap_cache.sh`:
- Around line 45-46: The grep assertions in the bootstrap cache test script use
regex mode which can match unintended strings when patterns contain dotted
versions like "78.1.0" where dots are interpreted as wildcards. Switch all grep
commands checking log assertions (at lines 45-46, 57, and 108-110) from standard
grep mode to fixed-string mode by adding the -F flag to each grep -q invocation,
so that version strings with dots are matched literally rather than as regex
patterns.
In `@e2e/test_bootstrap_prerelease.sh`:
- Line 23: The grep command uses regex pattern matching which treats dots as
metacharacters that match any character, causing unintended matches in version
strings. Replace `grep -q` with `grep -Fq` on both occurrences (the line
checking for "flit_core-2.0: new toplevel dependency flit_core<2.0.1 resolves to
2.0" and the other similar check at line 44) to use fixed-string literal
matching instead, ensuring the grep assertions match only the exact expected
version strings.
In `@src/fromager/bootstrap_requirement_resolver.py`:
- Around line 121-129: The cached resolution path does not handle the case where
get_cached_resolution() returns an empty list, causing an IndexError when trying
to access cached_result[0]. Add a check after retrieving the cached_result to
verify it is not empty before attempting to access its first element. If the
cached_result is empty, return an empty list to match the behavior of the
non-cached code path that handles empty matching results.
In `@src/fromager/commands/bootstrap.py`:
- Around line 223-229: Replace the manual context variable management using
requirement_ctxvar.set() and requirement_ctxvar.reset() with the
req_ctxvar_context() context manager to ensure the context is always reset even
if bt.resolve_and_add_top_level() raises an exception. Wrap the call to
bt.resolve_and_add_top_level() within the context manager so the per-requirement
logging context does not leak when an exception occurs. Apply the same fix to
the similar code block at lines 233-236 that uses the same pattern.
In `@src/fromager/commands/lint_requirements.py`:
- Around line 62-64: The try-except block around Requirement(line) does not
properly guard against execution continuing to later code blocks that reference
the requirement variable when InvalidRequirement is raised. When
Requirement(line) fails and raises an exception, the requirement variable
remains unassigned, but the resolve-requirements logic (around lines 81-103)
attempts to use it, causing UnboundLocalError. Fix this by ensuring that after
catching the exception for Requirement(line), the code either continues to the
next iteration of the loop or uses req_ctxvar_context() for proper error logging
and skips the resolve path entirely for that line so that undefined requirement
references are never reached.
---
Nitpick comments:
In `@tests/test_bootstrapper.py`:
- Around line 781-784: The assertion loops in the test are checking the
log-prefix requirement on all captured messages, making the test brittle to
unrelated future INFO logs on fromager.bootstrapper that would fail the test
despite correct behavior. Filter the messages list to include only the
helper-specific messages that are expected to have the mypkg-1.2.3 prefix before
performing the assertion, rather than asserting on every message in the loop.
Apply this filtering fix to both assertion blocks mentioned in the comment.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 7f916997-faa5-48df-bf6c-6e8e1c5b3399
📒 Files selected for processing (13)
e2e/test_bootstrap_build_tags.she2e/test_bootstrap_cache.she2e/test_bootstrap_cooldown_prebuilt.she2e/test_bootstrap_prerelease.shpyproject.tomlsrc/fromager/bootstrap_requirement_resolver.pysrc/fromager/bootstrapper.pysrc/fromager/commands/bootstrap.pysrc/fromager/commands/lint_requirements.pytests/test_bootstrap.pytests/test_bootstrap_test_mode.pytests/test_bootstrapper.pytests/test_bootstrapper_iterative.py
✅ Files skipped from review due to trivial changes (2)
- e2e/test_bootstrap_build_tags.sh
- e2e/test_bootstrap_cooldown_prebuilt.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- src/fromager/bootstrapper.py
|
Local testing of this with a representatively large set of input packages shows performance improvements:
All tests used bootstrap-parallel. The single-version mode test ran through to the end, including building all packages after the initial bootstrap phase. The multi-version mode stopped after bootstrapping and did not complete all of the builds, but the PR changes have no effect on that stage of the work. |
|
This pull request has merge conflicts that must be resolved before it can be merged. |
b5d104d to
b828b29
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/test_bootstrapper.py (1)
775-777: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winReduce brittleness in log-prefix assertions by targeting relevant records only.
On Line 781 and Line 823, asserting every captured message starts with an exact prefix can fail due to unrelated log lines. Filter to records emitted by the function-under-test path and assert prefix presence on those records (or assert at least one matching prefixed record), rather than requiring all captured messages to match.
Suggested test hardening
- for msg in messages: - assert msg.startswith("mypkg-1.2.3: "), ( - f"Expected 'mypkg-1.2.3: ' prefix, got: {msg!r}" - ) + relevant = [m for m in messages if "preparing source" in m or "wheel" in m] + assert relevant, "Expected at least one relevant bootstrapper log message" + assert all(m.split(": ", 1)[0] == "mypkg-1.2.3" for m in relevant)- assert len(messages) >= 1 - for msg in messages: - assert msg.startswith("mypkg-1.2.3: "), ( - f"Expected 'mypkg-1.2.3: ' prefix, got: {msg!r}" - ) + relevant = [m for m in messages if "pre-built wheel" in m or "wheel" in m] + assert relevant, "Expected at least one relevant bootstrapper log message" + assert all(m.split(": ", 1)[0] == "mypkg-1.2.3" for m in relevant)Based on learnings, in this repo’s tests avoid asserting exact log output strings because they are brittle; prefer behavior-oriented verification. As per path instructions for
tests/**, verify intended behavior and flag brittle tests.Also applies to: 781-784, 815-817, 821-824
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_bootstrapper.py` around lines 775 - 777, The test assertions at lines 781 and 823 are brittle because they require all captured messages to start with the exact prefix, which fails when unrelated log lines are emitted. Instead of collecting all messages from caplog.records without filtering, filter the caplog.records to only include records emitted by the specific function-under-test or logger being tested (using attributes like logger name or module), then assert that the prefix is present on those filtered records. This way the test verifies the intended behavior without breaking on unrelated log output. Apply the same filtering pattern to all similar assertions mentioned in the comment where exact log output strings are being verified.Sources: Path instructions, Learnings
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/test_bootstrapper.py`:
- Around line 775-777: The test assertions at lines 781 and 823 are brittle
because they require all captured messages to start with the exact prefix, which
fails when unrelated log lines are emitted. Instead of collecting all messages
from caplog.records without filtering, filter the caplog.records to only include
records emitted by the specific function-under-test or logger being tested
(using attributes like logger name or module), then assert that the prefix is
present on those filtered records. This way the test verifies the intended
behavior without breaking on unrelated log output. Apply the same filtering
pattern to all similar assertions mentioned in the comment where exact log
output strings are being verified.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 74ae9544-0642-4bd4-8970-35a53d528952
📒 Files selected for processing (14)
e2e/test_bootstrap_build_tags.she2e/test_bootstrap_cache.she2e/test_bootstrap_cooldown_prebuilt.she2e/test_bootstrap_prerelease.shpyproject.tomlsrc/fromager/bootstrap_requirement_resolver.pysrc/fromager/bootstrapper.pysrc/fromager/commands/bootstrap.pysrc/fromager/commands/lint_requirements.pysrc/fromager/resolver.pytests/test_bootstrap.pytests/test_bootstrap_test_mode.pytests/test_bootstrapper.pytests/test_bootstrapper_iterative.py
✅ Files skipped from review due to trivial changes (2)
- e2e/test_bootstrap_build_tags.sh
- e2e/test_bootstrap_cache.sh
🚧 Files skipped from review as they are similar to previous changes (11)
- e2e/test_bootstrap_prerelease.sh
- pyproject.toml
- tests/test_bootstrap.py
- src/fromager/resolver.py
- e2e/test_bootstrap_cooldown_prebuilt.sh
- src/fromager/commands/lint_requirements.py
- src/fromager/commands/bootstrap.py
- tests/test_bootstrap_test_mode.py
- src/fromager/bootstrap_requirement_resolver.py
- src/fromager/bootstrapper.py
- tests/test_bootstrapper_iterative.py
b828b29 to
b196f88
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/test_bootstrapper.py (1)
781-784: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd assertion that at least one message was logged.
For consistency with
test_bg_prepare_prebuilt_log_prefix_includes_version(line 821), add an assertion before the loop to ensure the test fails explicitly if no messages are logged.🔍 Proposed fix
finally: logging.setLogRecordFactory(old_factory) + assert len(messages) >= 1 for msg in messages: assert msg.startswith("mypkg-1.2.3: "), ( f"Expected 'mypkg-1.2.3: ' prefix, got: {msg!r}"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_bootstrapper.py` around lines 781 - 784, The test loop that checks message prefixes does not explicitly verify that at least one message was logged, which means the test could pass even if the messages list is empty. Add an assertion before the for loop iterating over messages to verify that at least one message exists in the messages collection, ensuring the test fails explicitly if no messages are logged, consistent with the pattern used in test_bg_prepare_prebuilt_log_prefix_includes_version.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/test_bootstrapper.py`:
- Around line 781-784: The test loop that checks message prefixes does not
explicitly verify that at least one message was logged, which means the test
could pass even if the messages list is empty. Add an assertion before the for
loop iterating over messages to verify that at least one message exists in the
messages collection, ensuring the test fails explicitly if no messages are
logged, consistent with the pattern used in
test_bg_prepare_prebuilt_log_prefix_includes_version.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ead43119-875f-472f-aefa-f86c6926c4b0
📒 Files selected for processing (14)
e2e/test_bootstrap_build_tags.she2e/test_bootstrap_cache.she2e/test_bootstrap_cooldown_prebuilt.she2e/test_bootstrap_prerelease.shpyproject.tomlsrc/fromager/bootstrap_requirement_resolver.pysrc/fromager/bootstrapper.pysrc/fromager/commands/bootstrap.pysrc/fromager/commands/lint_requirements.pysrc/fromager/resolver.pytests/test_bootstrap.pytests/test_bootstrap_test_mode.pytests/test_bootstrapper.pytests/test_bootstrapper_iterative.py
✅ Files skipped from review due to trivial changes (5)
- e2e/test_bootstrap_cooldown_prebuilt.sh
- e2e/test_bootstrap_prerelease.sh
- e2e/test_bootstrap_build_tags.sh
- e2e/test_bootstrap_cache.sh
- pyproject.toml
🚧 Files skipped from review as they are similar to previous changes (7)
- src/fromager/resolver.py
- tests/test_bootstrap_test_mode.py
- src/fromager/commands/bootstrap.py
- tests/test_bootstrap.py
- src/fromager/commands/lint_requirements.py
- tests/test_bootstrapper_iterative.py
- src/fromager/bootstrapper.py
…ground threads Submit version resolution and source download/unpack to a thread pool as items are pushed onto the bootstrap stack, overlapping I/O with the main thread's serial processing. Key design decisions: - Background callables are module-level functions (no self capture) so background threads cannot accidentally access mutable Bootstrapper state (self.why, self._seen_requirements, etc.) - _push_items() helper guarantees every RESOLVE and PREPARE_SOURCE item gets a bg_future, including the initial item in bootstrap() - _drain_background_pool() provides an exclusive-build barrier - BootstrapRequirementResolver cache protected by threading.Lock - --bg-threads CLI option (min 1, default: cpu_count // 2) - finalize() uses cancel_futures=True to avoid blocking on abort paths Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Doug Hellmann <dhellmann@redhat.com>
When bootstrapping finishes, overwrite the stack file with an empty list. Signed-off-by: Doug Hellmann <dhellmann@redhat.com>
Convert pypi_simple.errors.NoSuchProjectError to resolvelib.resolvers.ResolverException so callers only have to track one type of exception. Signed-off-by: Doug Hellmann <dhellmann@redhat.com>
b196f88 to
dd16257
Compare
|
This pull request has merge conflicts that must be resolved before it can be merged. |
Summary
Add a thread pool for processing some tasks from the bootstrap work item stack before they are popped. Keep the stack processing as it is, allowing the backgrounded phases to get the results from the future object created by the thread pool. Apply this approach to the phases for resolving requirements and preparing source.
Ensure that background callables are module-level functions (no
selfcapture) so threads cannot accidentally access mutableBootstrapperstate (self.why,self._seen_requirements, etc.)Add a
_push_items()helper to guarantee everyRESOLVEandPREPARE_SOURCEitem has abg_futureset, including the initial item inbootstrap()Set the minimum size of the thread pool to 1 to avoid special cases.
Test plan
hatch run test:test tests/test_bootstrapper.py tests/test_bootstrapper_iterative.py)TestPhaseResolvetests updated to use pre-completedFutureobjects instead of patchingresolve_versions_resolver.resolvedirectly (the path background threads use)./e2e/test_bootstrap_parallel_git_url.shpasses end-to-end🤖 Generated with Claude Code