Allow UTC & +00:00 to be written in parquet writer#9994
Conversation
| /// native array | ||
| fn types_compatible(a: &DataType, b: &DataType) -> bool { | ||
| // if the Arrow data types are equal, the types are deemed compatible | ||
| if a.equals_datatype(b) { |
There was a problem hiding this comment.
Open question if this new logic belongs in equals_datatype
There was a problem hiding this comment.
Moving to equals_datatype gets us the recursive checks for free so I think this might make more sense?
alamb
left a comment
There was a problem hiding this comment.
Thanks for this PR @xanderbailey
| }) | ||
| }) | ||
| } | ||
| // Timestamps with matching unit whose timezones are both recognized UTC |
There was a problem hiding this comment.
It seems to me that this cast / comparsion should be done at the application layer (aka the thing calling the parquet writer). My rationale is that there are many other places in the codbase that compare arrays based on exact equality of DataType, and don't do this timezone normalization.
Perhaps in the iceberg code you could add a call to cast for all columns before you pass it to the parquet writer . The cast is a noop for arrays of the already correct type
There was a problem hiding this comment.
Yeah I started drafting that but when I realised DataFusion has to do something similar I thought it might be worth pushing it lower in the stack.
Is there a case where we wouldn't want UTC to equal +00:00?
There was a problem hiding this comment.
Gave it a go in iceberg-rust here apache/iceberg-rust#2477 to see what it would look like
Which issue does this PR close?
Working towards: #3199
Rationale for this change
types_compatiblewill error today like so:When datafusion creates a UTC timestamp but iceberg-rust is expecting to see a "+00:00". These two are semantically the same so we should allow the parquet write.
What changes are included in this PR?
Slacken the
types_compatibleto allow UTC equivalent timestamps when writing parquet.Are these changes tested?
Are there any user-facing changes?