Handle integer-backed Arrow decimals via logical type metadata #653
Snowflake returns decimals and integers as [...]
When Snowflake transmits metadata to Arrow-based tools or drivers (like ADBC, JDBC, or Power BI), it identifies all fixed-point data as a generic "fixed" type with a precision of up to 38. Because native Apache Arrow strictly distinguishes between fixed-width integers (like [...]
Set the client option [...]
@rahul-iyer - could you add the first two paragraphs to the summary as the motivation and the third paragraph as a potential workaround for older releases without this fix?
Summary
ADBC exposes Arrow schemas and arrays, but some Snowflake types are not fully recoverable from the Arrow
physical type alone. In particular, Snowflake NUMBER/DECIMAL columns may arrive with their semantics
encoded in field metadata rather than in a standard Arrow decimal type.
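The Snowflake ADBC driver documents two relevant metadata channels: a DATA_TYPE entry (for example
DATA_TYPE=NUMBER(12,4)) and Snowflake logicalType metadata with precision and scale (for example
logicalType=FIXED,precision=7,scale=2).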
Without interpreting those annotations, Ladybug can bind Snowflake decimals as plain numeric storage
types.
What changed
Added Snowflake-specific decoding for decimal types from DATA_TYPE metadata and from logicalType=FIXED
metadata (with precision and scale).
Added decimal scanning support for Snowflake FIXED values when the Arrow batch is integer-backed
(INT8/16/32/64, UINT8/16/32/64) or floating-point-backed (FLOAT, DOUBLE).
Refactored Arrow metadata handling into separate decoder components:
Design
The refactor separates three concerns:
The rest of the Arrow pipeline still consumes normalized logical type information only. This keeps
Snowflake-specific behavior out of the core binding and scan code paths except where the recovered
logical type is applied.
This structure is intended to make future support for other sources, such as Databricks/Spark-specific
metadata, straightforward to add as separate decoders rather than as scattered conditionals.
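As an illustration of the decoder separation described above, here is a minimal Python sketch (not the PR's actual code, which is not shown here). The names `decode_data_type`, `decode_snowflake_logical_type`, and `recover_decimal_type` are hypothetical, and the sketch assumes field metadata is available as a string-to-string mapping. The validity bounds are also assumptions: the 38-digit precision cap follows the limit mentioned in the conversation, and the rule that scale cannot exceed precision is added for illustration. Each decoder either returns a normalized `(precision, scale)` pair or `None`, and the caller only ever sees the normalized result.

```python
import re
from typing import Callable, Mapping, Optional, Sequence, Tuple

DecimalType = Tuple[int, int]  # normalized (precision, scale)
Decoder = Callable[[Mapping[str, str]], Optional[DecimalType]]

# Accepts NUMBER(p,s), NUMERIC(p,s), DECIMAL(p,s), and the scale-less form NUMBER(p).
_DATA_TYPE_RE = re.compile(
    r"^\s*(?:NUMBER|NUMERIC|DECIMAL)\s*\(\s*(\d+)\s*(?:,\s*(\d+)\s*)?\)\s*$",
    re.IGNORECASE,
)


def _checked(precision: int, scale: int) -> Optional[DecimalType]:
    # Assumed validity rule for this sketch: 1 <= precision <= 38, 0 <= scale <= precision.
    if 1 <= precision <= 38 and 0 <= scale <= precision:
        return (precision, scale)
    return None


def decode_data_type(metadata: Mapping[str, str]) -> Optional[DecimalType]:
    """Hypothetical decoder for the DATA_TYPE channel."""
    match = _DATA_TYPE_RE.match(metadata.get("DATA_TYPE", ""))
    if match is None:
        return None
    # A missing scale, as in NUMBER(18), is treated as scale 0.
    return _checked(int(match.group(1)), int(match.group(2) or 0))


def decode_snowflake_logical_type(metadata: Mapping[str, str]) -> Optional[DecimalType]:
    """Hypothetical decoder for logicalType=FIXED with precision and scale keys."""
    if metadata.get("logicalType", "").upper() != "FIXED":
        return None
    try:
        return _checked(int(metadata["precision"]), int(metadata["scale"]))
    except (KeyError, ValueError):
        return None


# DATA_TYPE is tried first; Snowflake logicalType parsing is tried next.
DEFAULT_DECODERS: Sequence[Decoder] = (decode_data_type, decode_snowflake_logical_type)


def recover_decimal_type(
    metadata: Mapping[str, str],
    decoders: Sequence[Decoder] = DEFAULT_DECODERS,
) -> Optional[DecimalType]:
    """Return the first decoder's result, or None if no decoder recognizes the metadata."""
    for decoder in decoders:
        result = decoder(metadata)
        if result is not None:
            return result
    return None
```

Under this layout, supporting another source would mean appending one more decoder function to the sequence, without touching the code that consumes the recovered type.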
https://arrow.apache.org/adbc/current/driver/snowflake.html
| Input metadata | Arrow storage | Decoded type | Notes |
|---|---|---|---|
| DATA_TYPE=NUMBER(12,4) | | DECIMAL(12,4) | |
| DATA_TYPE=NUMBER(18) | | DECIMAL(18,0) | DATA_TYPE with implicit scale; the scale is 0. |
| DATA_TYPE=NUMERIC(10,3) or DATA_TYPE=DECIMAL(10,3) | | DECIMAL(10,3) | |
| logicalType=FIXED,precision=7,scale=2 | INT8/16/32/64, UINT8/16/32/64 | DECIMAL(7,2) | |
| logicalType=FIXED,precision=9,scale=2 | FLOAT, DOUBLE | DECIMAL(9,2) | |
| DATA_TYPE that fails to parse, plus valid logicalType=FIXED metadata | | DECIMAL(p,s) from logicalType metadata | If DATA_TYPE parsing fails, Snowflake logicalType parsing is tried next. |
| DATA_TYPE=NUMBER(12,4) plus generic logicalType=DECIMAL,precision=9,scale=3 | | DECIMAL(12,4) | |
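For the integer-backed and floating-point-backed cases, the following Python sketch shows one way a stored value could be turned into a decimal once the scale has been recovered from metadata. It rests on two assumptions that the PR text does not confirm: that an integer-backed column stores the unscaled value (so the scale shifts the decimal point), and that a float-backed column stores the numeric value itself (so it only needs rounding to the recovered scale). The function names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def int_backed_to_decimal(unscaled: int, scale: int) -> Decimal:
    # Assumption: the integer is the unscaled value, e.g. 1234567 with scale 2 is 12345.67.
    return Decimal(unscaled).scaleb(-scale)


def float_backed_to_decimal(value: float, scale: int) -> Decimal:
    # Assumption: the float already holds the numeric value; round it to `scale` digits.
    # repr() gives the shortest string that round-trips the float, which avoids
    # carrying binary representation noise into the decimal.
    quantum = Decimal(1).scaleb(-scale)
    return Decimal(repr(value)).quantize(quantum, rounding=ROUND_HALF_EVEN)


print(int_backed_to_decimal(1234567, 2))   # DECIMAL(7,2) example from an integer-backed column
print(float_backed_to_decimal(123.45, 2))  # DECIMAL(9,2) example from a float-backed column
```

The float path is inherently lossy for values with more significant digits than a double can hold, which is one reason to treat it as a fallback rather than a preferred representation.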