From #8288,
I do have one piece of feedback for the team responsible for the crt client. One of its requirements is to read/write a file in a filesystem, rather than a pipe. This requirement makes sense, since presumably the client is downloading chunks out of order and using something like mmap to write them directly to their target ranges when they arrive.
Aside from technical hurdles in implementation, I can imagine it's not clear to them that the user may want to keep all those chunks in memory (one risks blowing out RAM). Since the sizes involved in my use case permit it, and disk write throughput is precious, I definitely do want to write all those chunks to RAM and hold them there.
That's easy for me to do; I can use a tmpfs (perhaps with a -o size=limit option) to hold downloaded files, and hand them off to e.g. zstd | tar once they finish. The problem is I then need to wait for the downloads to finish before I can get at my data. My download+unpack process, which is now down to just under a minute (thank you!) could probably drop another 30% if I didn't have to wait for that first download to completely finish before starting the decompress. I can and will twiddle the count and sizes of my downloaded files to help pipeline that, but that is getting rather fiddly.
The ask for the crt team is to instead offer an option to hold downloaded chunks in memory (perhaps up to some configured limit) so they can once again be streamed out to a pipe, in order, as soon as they're available.
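
To make the ask concrete, here is a minimal sketch of the buffering behaviour being requested. It is not CRT or AWS CLI code: the class name `InOrderChunkWriter`, its `add_chunk` method, and the `max_buffered_bytes` limit are all hypothetical names made up for illustration. It assumes chunks arrive as `(offset, data)` pairs in arbitrary order and that the sink is any writable binary stream, such as a pipe.

```python
import io


class InOrderChunkWriter:
    """Hypothetical helper: hold out-of-order chunks in memory and
    write them to a stream strictly in offset order."""

    def __init__(self, sink, max_buffered_bytes):
        self.sink = sink                      # any writable binary stream (e.g. a pipe)
        self.max_buffered_bytes = max_buffered_bytes
        self.next_offset = 0                  # offset of the next byte the sink expects
        self.pending = {}                     # offset -> bytes, chunks that arrived early
        self.buffered_bytes = 0               # total size of chunks held in `pending`

    def add_chunk(self, offset, data):
        if offset < self.next_offset or offset in self.pending:
            raise ValueError(f"duplicate or stale chunk at offset {offset}")
        # A chunk that is next in line is written out immediately, so it never
        # counts against the limit; only early arrivals have to be held.
        if (offset != self.next_offset
                and self.buffered_bytes + len(data) > self.max_buffered_bytes):
            raise BufferError("in-memory chunk limit exceeded")
        self.pending[offset] = data
        self.buffered_bytes += len(data)
        # Flush every chunk that is now contiguous with what was already written.
        while self.next_offset in self.pending:
            chunk = self.pending.pop(self.next_offset)
            self.buffered_bytes -= len(chunk)
            self.sink.write(chunk)
            self.next_offset += len(chunk)
```

With `sys.stdout.buffer` as the sink, the output could be piped straight into something like zstd | tar while later chunks are still downloading. Raising `BufferError` at the limit is only the simplest choice for a sketch; a real client would more likely pause fetching further chunks until the buffer drains.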
The AWS CRT already supports this feature via the on_body config option of make_request, but the AWS CLI does not support it.