Add a ResourceBudget mechanism which keeps disk usage in check during syncs #7649
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| Add a configurable ResourceBudget that prevents over-subscription of the disk during syncs. Adds a backpressure mechanism and a flushing mechanism to ensure that batches are fully processed even if minsize has not yet been reached. Allows previous performance-reducing mitigations to be removed. |
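The flushing idea in the changelog entry can be illustrated with a minimal sketch. It is not the PR's implementation: the `FLUSH` sentinel and the `batches` helper are hypothetical names, and the sketch only shows how a flush request can emit a batch that is still smaller than `minsize`.

```python
from typing import Iterable, Iterator, List

FLUSH = object()  # hypothetical sentinel meaning "emit whatever is buffered"


def batches(items: Iterable[object], minsize: int) -> Iterator[List[object]]:
    """Group items into batches of at least `minsize`, honoring FLUSH."""
    buffer: List[object] = []
    for item in items:
        if item is FLUSH:
            # A flush request emits a short batch instead of waiting for minsize.
            if buffer:
                yield buffer
                buffer = []
            continue
        buffer.append(item)
        if len(buffer) >= minsize:
            yield buffer
            buffer = []
    if buffer:
        # End of input: emit the remainder.
        yield buffer
```

Without the flush branch, items held back by backpressure could sit in a partially filled buffer indefinitely, which is the situation the entry describes.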
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -474,10 +474,38 @@ Defaults to `/var/lib/pulp/tmp/`. | |
|
|
||
| ### MAX\_CONCURRENT\_CONTENT | ||
|
|
||
| The size of the batch of content processed in one go when syncing content from | ||
| a remote. | ||
| The maximum number of concurrent downloads during sync. Controls how many HTTP | ||
| download tasks can run in parallel within the `ArtifactDownloader` pipeline stage. | ||
|
|
||
| Defaults to 25. | ||
| Defaults to 200. | ||
|
|
||
| !!! warning "Deprecated" | ||
| This setting is deprecated and may be removed in a future release. | ||
| Use `SYNC_MAX_IN_FLIGHT_ITEMS` instead, which provides similar | ||
| functionality. If `MAX_CONCURRENT_CONTENT` is set by the user and | ||
| `SYNC_MAX_IN_FLIGHT_ITEMS` is not, its value will be used as | ||
| `SYNC_MAX_IN_FLIGHT_ITEMS` automatically. | ||
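The fallback described in the deprecation note could be expressed roughly as follows. This is a sketch under assumptions: `resolve_in_flight_items` is a hypothetical helper, not a pulpcore function, and `None` stands for "not set by the user".

```python
from typing import Optional


def resolve_in_flight_items(
    sync_max_in_flight_items: Optional[int],
    max_concurrent_content: Optional[int],
) -> Optional[int]:
    """Return the effective item limit (hypothetical helper, not pulpcore API).

    Both arguments are user-supplied values; None means "not set by the user".
    """
    if sync_max_in_flight_items is not None:
        # The new setting wins whenever it is set explicitly.
        return sync_max_in_flight_items
    # Otherwise fall back to the deprecated setting, which may also be unset.
    return max_concurrent_content
```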
|
|
||
| ### SYNC\_MAX\_IN\_FLIGHT\_MB | ||
|
|
||
| The maximum total size (in megabytes) of downloaded artifacts that are waiting to be | ||
| saved. This limits the temporary disk space consumed by artifacts that have been | ||
| downloaded by `ArtifactDownloader` but not yet persisted by `ArtifactSaver`. | ||
|
|
||
| When set, small artifacts will download with high concurrency while large artifacts | ||
| will automatically throttle to avoid exhausting disk space. | ||
|
|
||
| Defaults to 5120 (5 GB). | ||
|
Member
Is this a hard limit? Will this option prevent syncs from succeeding with a single 6 GB file?
Contributor
Author
It should always let one file through at a time, no matter how large. In theory.
Member
Nice, then we don't need to mention it.
Member
IMO it's worth adding a note that this does not prevent bigger files from being synced. |
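The behaviour discussed in this thread (a soft budget that still admits one oversized file) can be sketched as below. This is an illustration under assumptions, not the PR's `ResourceBudget`: `ByteBudget`, `acquire`, `release`, and `demo` are hypothetical names, and sizes are abstract units rather than megabytes.

```python
import asyncio


class ByteBudget:
    """Hypothetical in-flight size budget; not the PR's ResourceBudget."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.in_flight = 0
        self._cond = asyncio.Condition()

    async def acquire(self, size: int) -> None:
        async with self._cond:
            # Admit when the item fits, or when nothing else is in flight so
            # that an item larger than the whole budget can still proceed.
            await self._cond.wait_for(
                lambda: self.in_flight == 0 or self.in_flight + size <= self.limit
            )
            self.in_flight += size

    async def release(self, size: int) -> None:
        async with self._cond:
            self.in_flight -= size
            # Wake waiters so they can re-check whether they fit now.
            self._cond.notify_all()


async def demo() -> list:
    budget = ByteBudget(limit=5)
    events = []

    async def handle(name: str, size: int) -> None:
        await budget.acquire(size)
        events.append(f"start {name}")
        await asyncio.sleep(0)  # stand-in for download + save
        events.append(f"done {name}")
        await budget.release(size)

    # "big" exceeds the budget of 5 but is admitted because nothing is in flight.
    await asyncio.gather(handle("big", 6), handle("small", 1))
    return events
```

In this sketch the size-6 item runs alone even though the budget is 5, and the size-1 item waits until it has been released, so the limit throttles concurrency without blocking an oversized file.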
||
|
|
||
| ### SYNC\_MAX\_IN\_FLIGHT\_ITEMS | ||
|
|
||
| The maximum number of downloaded artifacts that are waiting to be saved. Like | ||
| `SYNC_MAX_IN_FLIGHT_MB`, this limits the buffer between `ArtifactDownloader` and | ||
| `ArtifactSaver`, but counts items rather than bytes. | ||
|
|
||
| This is useful as a fallback when artifact sizes are not known ahead of time. | ||
|
|
||
| Defaults to `None` (no limit). | ||
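As an illustration, the two new settings might be set together in a Python settings file as shown here. The values are arbitrary examples, and how the two limits interact when both are set is not spelled out in this diff, so treat this as an assumption-laden sketch.

```python
# Hypothetical excerpt from a Pulp settings file; values are illustrative only.
SYNC_MAX_IN_FLIGHT_MB = 2048    # cap in-flight artifact data at about 2 GB
SYNC_MAX_IN_FLIGHT_ITEMS = 100  # also cap the number of in-flight artifacts
```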
|
|
||
| ## Redis Settings | ||
|
|
||
|
|
||
As this is user-facing documentation, I would prefer dropping specific references to
`ArtifactDownloader` and `ArtifactSaver`.