
Conversation


@dwo dwo commented Feb 9, 2026

Closes #841

Adds parallel byte-range GET support to _get_file and _cat_file, mirroring the existing _upload_file_part_concurrent pattern used by put_file.

When a file is larger than chunksize and max_concurrency > 1, downloads are split into byte-range requests executed in batches via asyncio.gather. Smaller files and explicit range requests (start/end on cat_file) fall through to the existing sequential path unchanged.
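
The batched range-request pattern described above can be sketched as follows. This is a simplified, self-contained illustration, not the actual s3fs code: `fetch_range` is a hypothetical stand-in for an S3 `GetObject` call with a `Range` header, and the parameter names (`chunksize`, `max_concurrency`) follow the PR's description.

```python
import asyncio

# Hypothetical stand-in for a byte-range GET; the real implementation
# issues GetObject requests with a Range header over the network.
async def fetch_range(blob: bytes, start: int, end: int) -> bytes:
    await asyncio.sleep(0)  # yield control, as a real network call would
    return blob[start:end]

async def parallel_download(blob: bytes, chunksize: int, max_concurrency: int) -> bytes:
    size = len(blob)
    # Split the object into byte ranges of at most `chunksize` bytes.
    ranges = [(off, min(off + chunksize, size)) for off in range(0, size, chunksize)]
    parts = []
    # Execute the range requests in batches of `max_concurrency` via gather.
    for i in range(0, len(ranges), max_concurrency):
        batch = ranges[i : i + max_concurrency]
        parts.extend(await asyncio.gather(*(fetch_range(blob, s, e) for s, e in batch)))
    return b"".join(parts)

data = bytes(range(256)) * 100
result = asyncio.run(parallel_download(data, chunksize=4096, max_concurrency=4))
```

With `chunksize=4096` the 25,600-byte payload is split into seven ranges and fetched four at a time; the reassembled bytes match the original.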

As for my motivation: boto3's download_file was a lot faster than s3fs get_file for a 20GB file in Cloudflare R2 from our cloud instances.

dwo added 2 commits February 9, 2026 12:31
Use parallel byte-range GET requests for large file downloads,
mirroring the existing _upload_file_part_concurrent pattern. Files
larger than chunksize with concurrency > 1 are downloaded in batches
of concurrent range requests; smaller files fall through to the
existing sequential path unchanged.
Use parallel byte-range GET requests for cat_file, mirroring the
approach in _get_file. When no start/end range is specified and
concurrency > 1, large files are read using batched concurrent
range requests and assembled in memory.

Also correct max_concurrency docstring to reflect that pipe()
already supports concurrency.
@martindurant
Member

As for my motivation: boto3's download_file was a lot faster than s3fs get_file for a 20GB file in Cloudflare R2 from our cloud instances.

Out of interest, do you have any measurements of what a difference this change makes for your specific usecase?

Member

@martindurant martindurant left a comment


I am generally positive about the changes here.

The problem with concurrency on slow networks is that connections will time out if they are launched at the same time but one then swamps the bandwidth. Therefore, it seems best to default to a concurrency of 1 unless we can reliably determine the bandwidth to the data, or whether the caller is co-resident with it.

s3.pipe(test_bucket_name + "/parallel_test", data)

test_file = str(tmpdir.join("parallel_test"))
cb = Callback()
Member


Overriding Callback's relative_update would give a way to measure how many chunks are actually read, to make sure everything is correct.
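
A minimal sketch of the counting idea suggested here. `CountingCallback` is an illustrative stand-in for subclassing fsspec's `Callback` (whose `relative_update(inc)` is invoked as each chunk is transferred); the download loop below merely simulates progress reports, it is not the s3fs code.

```python
# Illustrative stand-in for an fsspec Callback subclass that counts
# how many chunk updates a download reports, and how many bytes.
class CountingCallback:
    def __init__(self):
        self.calls = 0
        self.total = 0

    def relative_update(self, inc: int = 1) -> None:
        # Called once per chunk transferred; record count and byte total.
        self.calls += 1
        self.total += inc

cb = CountingCallback()
chunksize = 5 * 2**20
filesize = 5 * chunksize
# Simulate a download loop that reports progress once per chunk.
for offset in range(0, filesize, chunksize):
    cb.relative_update(min(chunksize, filesize - offset))
```

In a test, asserting on `cb.calls` verifies that the expected number of range requests were issued, independently of the final data comparison.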

@pytest.mark.parametrize("factor", [1, 5, 6])
def test_get_file_parallel_integrity(s3, tmpdir, factor):
chunksize = 5 * 2**20
data = os.urandom(chunksize * factor)
Member

@martindurant martindurant Feb 9, 2026


Could also include non-integer factors, e.g., a parameter to add zero or 1 to this data size.

]
)
for offset, data in results:
f0.seek(offset)
Member


I believe this should work on Windows, but I seem to recall that only POSIX has true sparse-file support: do we need to expand the file to full size and open it in update mode?

Should we ensure that the block size is a power of two, to align with disk block sizes? The default, at least, is.
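
The "expand to full size and open in update mode" alternative raised above can be sketched like this: preallocate the destination with `truncate`, then write the ranges out of order through a `r+b` handle, which avoids relying on sparse-file behaviour on any platform. A minimal, self-contained illustration (file names and sizes are arbitrary):

```python
import os
import tempfile

data = os.urandom(1 << 16)   # 64 KiB of test data
chunksize = 1 << 14          # 16 KiB ranges
path = os.path.join(tempfile.mkdtemp(), "out.bin")

# Preallocate the destination to its full size so out-of-order range
# writes need no sparse-file support (works on POSIX and Windows).
with open(path, "wb") as f:
    f.truncate(len(data))

# Write the ranges in reverse order, as concurrent downloads might complete.
with open(path, "r+b") as f:
    for off in reversed(range(0, len(data), chunksize)):
        f.seek(off)
        f.write(data[off : off + chunksize])
```

Reading the file back yields bytes identical to the original, despite the out-of-order writes.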

@dwo
Author

dwo commented Feb 9, 2026

As for my motivation: boto3's download_file was a lot faster than s3fs get_file for a 20GB file in Cloudflare R2 from our cloud instances.

Out of interest, do you have any measurements of what a difference this change makes for your specific usecase?

At least for a large file (20GB) in a cloud instance with a lot of bandwidth this reduced a download from ~5 to ~1.5 minutes.

I tested this locally on my home internet, and it made a difference, but not nearly as dramatic a one, so I think defaulting to 1 is probably fair.

@dwo
Author

dwo commented Feb 9, 2026

The problem with concurrency on slow networks is that connections will time out if they are launched at the same time but one then swamps the bandwidth. Therefore, it seems best to default to a concurrency of 1 unless we can reliably determine the bandwidth to the data, or whether the caller is co-resident with it.

Do you want to split the max_concurrency option somehow? FWIW, I believe boto3 also defaults to 10: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3.html#concurrent-transfer-operations

@martindurant
Member

s3fs is single-threaded, though. If parallelising on threads, each connection gets a turn (although the NIC can still get swamped by one of them).



Development

Successfully merging this pull request may close these issues.

Add transfer configuration to support concurrent downloading
