Snowflake CSV - use parallel scan settings for uncompressed CSV #408

@sfc-gh-xhuang

I noticed that for DuckDB the CSV file is decompressed before loading: https://github.com/ClickHouse/ClickBench/blob/main/duckdb/benchmark.sh#L18C1-L18C5

For Snowflake, we also support faster parallel scanning of uncompressed CSVs.

Can we modify the Snowflake data loading test so that it loads an uncompressed CSV with MULTI_LINE=FALSE and COMPRESSION=NONE?
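A minimal sketch of what such a load statement might look like, written as a small Python helper that assembles the SQL text. The table name `hits`, the stage path `@my_stage/hits.csv`, and the helper `build_copy_statement` are hypothetical placeholders, not names taken from the existing benchmark scripts. Only the two options requested above are set. Any other CSV options the data needs (quoting, delimiters) are left out and would have to be carried over from the current load.

```python
def build_copy_statement(table: str, stage_path: str) -> str:
    """Assemble a Snowflake COPY INTO statement for an uncompressed CSV.

    `table` and `stage_path` are supplied by the caller; the values used
    below are hypothetical placeholders.
    """
    # File format options requested in this issue.
    options = {
        "TYPE": "CSV",
        "COMPRESSION": "NONE",   # the staged file is not compressed
        "MULTI_LINE": "FALSE",   # no record spans more than one line
    }
    option_sql = " ".join(f"{key} = {value}" for key, value in options.items())
    return f"COPY INTO {table} FROM {stage_path} FILE_FORMAT = ({option_sql})"


# Hypothetical table and stage names, for illustration only.
statement = build_copy_statement("hits", "@my_stage/hits.csv")
print(statement)
```

The helper only builds the statement text. Running it still requires uploading the decompressed file to a stage and executing the statement through whatever client the benchmark already uses.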

https://medium.com/snowflake/recap-of-snowflake-ingestion-cost-and-performance-improvements-large-csv-demo-911e6588d626?source=friends_link&sk=38a754b71aa06f51f269c6974b24abd8
