
feat: get_ranges 2x performance degradation #7380

Open
comphead wants to merge 13 commits into apache:main from comphead:main

Conversation

@comphead
Contributor

@comphead comphead commented Apr 12, 2026

Which issue does this PR close?

Closes #7383

In Apache DataFusion Comet we found that HDFS access slowed down dramatically, degrading HDFS task performance by 2x, likely after #7192

Original issue apache/datafusion-comet#3926

Rationale for this change

Summary

  • Concurrent range reads in get_ranges: Replace the sequential per-range read loop with concurrent reads using futures::stream::buffered(8), improving wall-clock latency for multi-range reads (e.g., Parquet rowgroups from HDFS)
  • No stat() call: The existing get_ranges already avoids stat(); this preserves that by using a single reader created once per call
  • Exact-sized fetches: Each range fetches only the requested bytes — no coalescing/over-fetching that would retain large backing buffers in memory
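The concurrent per-range approach described above can be sketched in std-only Rust. The PR itself uses `futures::stream::buffered(8)` over a single OpenDAL reader; scoped threads stand in for the async machinery here so the sketch is self-contained, and `read_range`/`get_ranges` are hypothetical names, not the PR's actual functions.

```rust
use std::ops::Range;
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for a remote file: each range read returns exactly
// the requested bytes (no coalescing, no over-fetch).
fn read_range(file: &Arc<Vec<u8>>, range: Range<usize>) -> Vec<u8> {
    file[range].to_vec()
}

// Issue all range reads concurrently; the PR does this with
// futures::stream::buffered(8), which also preserves request order.
fn get_ranges(file: &Arc<Vec<u8>>, ranges: &[Range<usize>]) -> Vec<Vec<u8>> {
    thread::scope(|s| {
        let handles: Vec<_> = ranges
            .iter()
            .cloned()
            .map(|r| s.spawn(move || read_range(file, r)))
            .collect();
        // Collect results in the same order the ranges were requested.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let file = Arc::new((0u8..=99).collect::<Vec<u8>>());
    let out = get_ranges(&file, &[0..4, 50..53]);
    assert_eq!(out, vec![vec![0u8, 1, 2, 3], vec![50u8, 51, 52]]);
}
```

With buffering capped (8 in the PR), at most that many reads are in flight at once, bounding memory while still overlapping network latency.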

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. releases-note/feat The PR implements a new feature or has a title that begins with "feat" size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Apr 12, 2026
@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Apr 12, 2026
@comphead comphead marked this pull request as draft April 13, 2026 00:40
@comphead
Contributor Author

Putting this into draft, as adding more parallelism significantly increases memory usage

@comphead comphead changed the title feat: read HDFS ranges concurrently feat: get_ranges 2x performance degradation Apr 13, 2026
@comphead
Contributor Author

The last revision seems to have restored performance to what it was before, perhaps slightly faster because stat() is no longer called, and memory usage is moderate

@Xuanwo @kszucs PTAL

@comphead comphead marked this pull request as ready for review April 14, 2026 03:07
.map_err(|err| format_object_store_error(err, &location_ref))
}
})
.buffered(10)
Contributor Author

A parallelism of 10 is also used in coalesce_ranges, which was called by the original object_store get_ranges

@morristai
Member

I think the current change improves get_ranges latency by issuing per-range reads concurrently, but it still performs one backend read per requested range.

reader.read(range) still goes through OpenDAL's normal read path, so for HDFS this still means a fresh file open / read pipeline per range. In other words, this changes N sequential reads into up to N concurrent reads, but it doesn't coalesce nearby ranges or reduce the total number of HDFS opens / NameNode RPCs.

That seems different from the PR description, which suggests this should turn something like 50 range reads into ~5 merged reads. The earlier regression looks more like we lost object_store's default get_ranges coalescing when #7192 added the custom get_range / get_ranges path to avoid stat().

So this patch may still help wall-clock time, but I don't think it fixes the root cause if the goal is to actually reduce HDFS opens. That likely needs a merged-range path again, e.g. something based on OpenDAL's range-fetch/merge logic instead of buffering individual reader.read(range) calls.
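The gap-based merging this comment refers to can be sketched as a pure function. This is only an illustration in the spirit of `object_store::coalesce_ranges`; the function name, signature, and the 1 MB threshold are assumptions drawn from this discussion, not the upstream implementation.

```rust
use std::ops::Range;

// Merge byte ranges whose gap is at most `max_gap`, trading a few larger
// fetches for fewer opens / round trips (the core idea behind
// object_store's default get_ranges coalescing).
fn coalesce(ranges: &[Range<u64>], max_gap: u64) -> Vec<Range<u64>> {
    let mut sorted: Vec<Range<u64>> = ranges.to_vec();
    sorted.sort_by_key(|r| r.start);
    let mut merged: Vec<Range<u64>> = Vec::new();
    for r in sorted {
        match merged.last_mut() {
            // Close enough to the previous fetch: extend it instead of
            // issuing a new read.
            Some(prev) if r.start <= prev.end + max_gap => {
                prev.end = prev.end.max(r.end);
            }
            _ => merged.push(r),
        }
    }
    merged
}

fn main() {
    const MB: u64 = 1 << 20;
    // Two nearby rowgroup reads merge; a distant one stays separate.
    let out = coalesce(&[0..4 * MB, 4 * MB + 100..8 * MB, 100 * MB..104 * MB], MB);
    assert_eq!(out, vec![0..8 * MB, 100 * MB..104 * MB]);
}
```

The trade-off debated in this thread follows directly: merging cuts NameNode RPCs and file opens, but each merged fetch materializes the gap bytes too.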

@morristai morristai self-requested a review April 14, 2026 04:24
results.push(data);
}
Ok(results)
let location_ref: Arc<str> = Arc::from(location.as_ref());
Member
Maybe we just call fetch + concurrent here?

@comphead
Contributor Author

Thanks @morristai and @Xuanwo, let me try to bring back coalesce_ranges as it was in the original object_store implementation

@comphead
Contributor Author

Thanks @morristai and @Xuanwo, let me try to bring back coalesce_ranges as it was in the original object_store implementation

coalesce_ranges at least doubles memory usage, however

@comphead
Contributor Author

Rolled back to the version without coalesce_ranges

object_store::coalesce_ranges merges nearby ranges (gap < 1MB) into larger fetches. For HDFS files with multiple rowgroups, this can merge separate rowgroups into a single near-full-file read. The returned Bytes::slice() views retain the full coalesced backing buffer via reference counting, causing ~2x memory usage.

On our cluster with 1200 files × 140MB across 30 machines, this increased total memory from ~0.5 TB to ~1 TB. The simple per-range concurrent approach fetches exactly what's needed with no over-allocation.
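The retention effect described here can be demonstrated with a small std-only sketch. `bytes::Bytes::slice` returns a refcounted view into its backing buffer; the hypothetical `View` type below mimics that behavior with `Arc` so the example needs no external crates.

```rust
use std::sync::Arc;

// A slice view that keeps the whole backing buffer alive via reference
// counting, analogous to how bytes::Bytes::slice behaves.
struct View {
    buf: Arc<Vec<u8>>, // the full coalesced fetch
    start: usize,
    len: usize,
}

impl View {
    fn as_slice(&self) -> &[u8] {
        &self.buf[self.start..self.start + self.len]
    }
    /// Bytes the view logically exposes to the caller.
    fn logical_len(&self) -> usize {
        self.len
    }
    /// Bytes actually retained in memory while this view is alive.
    fn retained_len(&self) -> usize {
        self.buf.len()
    }
}

fn main() {
    // A coalesced read, scaled down: the caller wanted 16 bytes…
    let coalesced = Arc::new(vec![0u8; 1024]);
    let view = View { buf: coalesced, start: 500, len: 16 };
    assert_eq!(view.logical_len(), 16);
    // …but the entire 1024-byte backing buffer stays allocated until the
    // view is dropped, which is the ~2x blow-up seen on the cluster.
    assert_eq!(view.retained_len(), 1024);
    assert_eq!(view.as_slice().len(), 16);
}
```

Copying the slice out (e.g. `Bytes::copy_from_slice`) would release the backing buffer at the cost of an extra memcpy; fetching exact ranges, as this PR does, avoids the problem entirely.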

This PR can act as a patch that fixes the performance while keeping roughly the same memory utilization, and I am happy to participate in a longer-term solution in a follow-up PR if needed

@comphead comphead requested a review from Xuanwo April 14, 2026 21:19

Labels

releases-note/feat The PR implements a new feature or has a title that begins with "feat" size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: HDFS get_ranges performance 2x degradation

3 participants