perf: parallel pre-read of import regions with --setup #14032
Open
Kha wants to merge 2 commits into
Member
Author
!bench
Benchmark results for bfe1987 against 60967a7 are in. There are significant results. @Kha
New metrics (6✅, 5🟥)
Medium changes (2🟥)
Small changes (1✅, 49🟥) Too many entries to display here. View the full report on radar instead.
This PR reads each module's main `.olean` in parallel when `--setup` provides pre-resolved import artifacts, so the import traversal hits warm regions instead of faulting them in one module at a time. Before `importModulesCore` traverses the import graph, `preReadArtifacts` reads every module's main `.olean` across a small number of striped worker tasks. Only the main `.olean` is pre-read: it is loaded for every module in the transitive closure (the traversal needs each header to recurse), so there is no over-read, while the pruned `.olean.server`/`.olean.private`/`.ir` parts stay on the sequential path. The traversal then consults the pre-read cache. The worker count is capped low to bound `mmap` address-space lock contention.

Co-Authored-By: Claude <noreply@anthropic.com>
Member
Author
!bench
Benchmark results for 105935e against 60967a7 are in. There are significant results. @Kha
New metrics (5✅, 6🟥)
Large changes (2🟥)
Medium changes (1🟥)
Small changes (2✅, 38🟥) Too many entries to display here. View the full report on radar instead.
Mathlib CI status (docs):
Collaborator
Reference manual CI status:
Member
Author
!bench mathlib
Benchmark results for leanprover-community/mathlib4-nightly-testing@0c6e99b against leanprover-community/mathlib4-nightly-testing@d09d04c are in. No significant results found. @Kha
Small changes (1✅, 1🟥)
This PR parallelizes parts of the `.olean` loading process under e.g. `lake build`, accelerating importing whenever a significant amount of time is spent in the OS kernel and can be parallelized there as well.

Before `importModulesCore` traverses the import graph, `preReadArtifacts` reads every module's main `.olean` across a small number of striped worker tasks. Only the main `.olean` is pre-read: it is loaded for every module in the transitive closure (the traversal needs each header to recurse), so there is no over-read, while the pruned `.olean.server`/`.olean.private`/`.ir` parts stay on the sequential path. The traversal then consults the pre-read cache. The worker count is capped low to bound `mmap` address-space lock contention.
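For readers unfamiliar with the pattern, here is a minimal, language-agnostic sketch in Python of "striped pre-read, then consult the cache". It is not the Lean implementation from this PR: the names `stripe`, `pre_read_artifacts`, `load`, and the `MAX_WORKERS` value are hypothetical, and plain file reads stand in for the `mmap`-based `.olean` loading.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Assumed cap for illustration; the PR only says the worker count is "capped low".
MAX_WORKERS = 4


def stripe(items, n):
    """Split items into n stripes: stripe i gets items i, i+n, i+2n, ..."""
    return [items[i::n] for i in range(n)]


def pre_read_artifacts(paths, max_workers=MAX_WORKERS):
    """Read every file in `paths` with a few striped workers.

    Returns a dict mapping each path to its bytes (the "pre-read cache").
    """
    paths = list(paths)
    if not paths:
        return {}
    n = min(max_workers, len(paths))

    def read_stripe(chunk):
        # Each worker reads its own stripe sequentially.
        return [(p, Path(p).read_bytes()) for p in chunk]

    cache = {}
    with ThreadPoolExecutor(max_workers=n) as pool:
        for pairs in pool.map(read_stripe, stripe(paths, n)):
            cache.update(pairs)
    return cache


def load(path, cache):
    """Traversal-side lookup: prefer the pre-read bytes, else read sequentially."""
    data = cache.get(path)
    return data if data is not None else Path(path).read_bytes()
```

In this sketch the traversal would call `load(path, cache)` for each module, so files that were not pre-read (the analogue of the pruned `.olean.server`/`.olean.private`/`.ir` parts) still take the sequential path.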