Conversation
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
Merging this PR will degrade performance by 10.43%
Performance Changes — Polar Signals profiling results posted; benchmark summaries for: Polar Signals Profiling, FineWeb NVMe, TPC-H SF=1 on NVMe, TPC-DS SF=1 on NVMe, TPC-H SF=10 on NVMe (detailed results tables omitted).
FWIW the morsels in apache/datafusion#20481 are very much an IO scheduler for Parquet scans...
Further benchmark summaries: TPC-H SF=1 on S3, FineWeb S3, Statistical and Population Genetics, ClickBench on NVMe, TPC-H SF=10 on S3 (detailed results tables omitted).
@alamb I haven't read through the PR in a few days and it has gone through a lot of changes since. This is mostly Claude-driven because I was curious whether there's any "free" perf here. I now see that the work-stealing queue is enabled by
I guess if we can figure out a suitable morsel size, there's also benefit in trying this approach, right?
Absolutely -- I am also trying to sketch out a slightly different API for morsels (see details in apache/datafusion#20529 (comment)). The idea of using this in Vortex is a great use case -- I'll ping you on the PR for your feedback.
Based on some iterative benchmarking on Parquet, a "projected compressed size" seems to be a relatively good proxy for avoiding splits of really narrow / small (e.g. all-primitive) row groups that are only a few MB (e.g. ~2-16 MB). A fixed split of 100K rows (which is commonly used) seems to create lots of small scans and too much overhead, e.g. one-column scans on primitive fields.
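To make the heuristic concrete, here is a minimal sketch of what a size-based split decision could look like. All names and the thresholds are hypothetical illustrations of the idea described above, not DataFusion or Vortex APIs:

```rust
/// Sum the compressed byte sizes of only the *projected* columns of a
/// row group (the data the scan will actually read).
fn projected_compressed_size(column_sizes: &[u64], projection: &[usize]) -> u64 {
    projection.iter().map(|&i| column_sizes[i]).sum()
}

/// Split a row group into smaller morsels only when the projected data is
/// large enough to amortize per-scan overhead; `min_split_bytes` might sit
/// around the ~2 MB floor of the 2-16 MB range mentioned above.
fn should_split(projected_bytes: u64, min_split_bytes: u64) -> bool {
    projected_bytes > min_split_bytes
}

fn main() {
    // A narrow, all-primitive row group: three small column chunks.
    let sizes: [u64; 3] = [512 * 1024, 256 * 1024, 128 * 1024];
    // The query only projects columns 0 and 2.
    let proj = projected_compressed_size(&sizes, &[0, 2]);
    assert_eq!(proj, 640 * 1024);
    // Well under a 2 MB floor, so keep the row group as a single scan
    // rather than producing many tiny one-column scans.
    assert!(!should_split(proj, 2 * 1024 * 1024));
}
```

The point of keying on *projected* compressed size rather than row count is that a 100K-row slice of a wide string column and of a single int64 column cost very different amounts of IO.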
Something that seems to work quite well for Parquet on a couple of benchmarks (on a local file system):
This splits both IO and decode at the row group level (i.e. it issues IO a number of times), but that doesn't seem to really hurt those benchmarks.
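As a rough illustration of row-group-level splitting, the partitioning step could look like the sketch below: row groups are packed into morsels by a target compressed size, and each morsel then does its own IO and decode. The function name and target value are hypothetical, not code from this PR:

```rust
/// Pack row groups (given by their compressed byte sizes) into morsels of
/// roughly `target` bytes each. Each returned Vec holds row-group indices
/// that one morsel will read and decode independently, so a file produces
/// several separate IO requests instead of one.
fn morsels(row_group_bytes: &[u64], target: u64) -> Vec<Vec<usize>> {
    let mut out = Vec::new();
    let mut current = Vec::new();
    let mut acc: u64 = 0;
    for (i, &bytes) in row_group_bytes.iter().enumerate() {
        current.push(i);
        acc += bytes;
        if acc >= target {
            out.push(std::mem::take(&mut current));
            acc = 0;
        }
    }
    if !current.is_empty() {
        out.push(current);
    }
    out
}

fn main() {
    let mb: u64 = 1024 * 1024;
    // Two small row groups get grouped with a large one; the trailing
    // small group becomes its own morsel.
    let groups = morsels(&[3 * mb, 3 * mb, 10 * mb, 1 * mb], 8 * mb);
    assert_eq!(groups, vec![vec![0, 1, 2], vec![3]]);
}
```

The extra IO comes from each morsel issuing its own reads; the observation above is that on a local file system this overhead appears negligible on the benchmarks tried.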
Basically two changes smashed together: