Skip to content

SedonaDB doesn't use the column statistics of bbox column? #617

@yutannihilation

Description

@yutannihilation

When I was trying to do the same thing as the DuckDB SQL in Overture Maps' docs, I found SedonaDB is strangely slow compared to DuckDB.

Benchmark results (seconds):

Engine Median Mean Min Max Runs row_count max_confidence
DuckDB 0.588484 2.243619 0.585020 5.557354 3 7471 0.999544084072
SedonaDB 54.524910 55.320437 51.254931 60.181470 3 7471 0.999544084072

This is the script for benchmarking: yutannihilation@a2c5a8f

I suspect it's because SedonaDB doesn't utilize the statistics of the bbox column for pushdown, because it's faster (but still 10x slower than DuckDB) if I replace these conditions on bbox with ST_Intersects(geometry, ...). I know this DuckDB SQL is a temporary one because bbox column will disappear, but I'm wondering if this is an expected result.

  WHERE
    categories.primary = 'pizza_restaurant'
    AND bbox.xmin BETWEEN -75 AND -73
    AND bbox.ymin BETWEEN 40 AND 41

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions