fix: add iceberg_type column for SqlCatalog #3263
rchowell wants to merge 2 commits into apache:main
geruh
left a comment
Thanks for raising this @rchowell! This is a matter of completing the implementation of our SqlCatalog. Here you are updating the table to have the column and handle the migration logic.
We will definitely need a follow-up here to filter against this column for table operations. It looks like the Java v1 implementation runs something like `WHERE (iceberg_type = 'TABLE' OR iceberg_type IS NULL)`. Otherwise, a view can bleed into table operations.
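For illustration, here is a minimal sketch of that filter using the standard-library `sqlite3` module rather than the SQLAlchemy code in this PR. The trimmed-down `iceberg_tables` schema, the sample rows, the `'VIEW'` marker value, and the `list_table_names` helper are assumptions made for the example, not the actual PyIceberg implementation.

```python
import sqlite3

# Hypothetical, simplified catalog table; the real one has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE iceberg_tables ("
    "catalog_name TEXT, table_namespace TEXT, table_name TEXT, iceberg_type TEXT)"
)
conn.executemany(
    "INSERT INTO iceberg_tables VALUES (?, ?, ?, ?)",
    [
        ("cat", "ns", "legacy_tbl", None),   # row written before the column existed
        ("cat", "ns", "new_tbl", "TABLE"),   # row written with the column populated
        ("cat", "ns", "some_view", "VIEW"),  # assumed marker value for a view
    ],
)


def list_table_names(conn, catalog, namespace):
    """Return table names only, treating a NULL iceberg_type as a table."""
    rows = conn.execute(
        "SELECT table_name FROM iceberg_tables "
        "WHERE catalog_name = ? AND table_namespace = ? "
        "AND (iceberg_type = 'TABLE' OR iceberg_type IS NULL) "
        "ORDER BY table_name",
        (catalog, namespace),
    ).fetchall()
    return [row[0] for row in rows]


names = list_table_names(conn, "cat", "ns")
```

The `IS NULL` branch is what keeps rows written before the migration visible as tables, while the view row is excluded.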
@geruh nice catch, added with a little unit test. Thanks.
geruh
left a comment
LGTM! I'll create the fast follow issue.
Thanks @geruh - one thing I would like to flag is that we shouldn't auto-migrate, as pointed out in apache/iceberg-rust#2380. I am OK with not merging this as-is and waiting until I update it to not auto-migrate. I can address #3337 easily as well, just like the iceberg-rust PR.
Rationale for this change
PyIceberg currently does not include the `iceberg_type` column in the `iceberg_tables` table for SQL-based catalogs. This caused an error when reading a SQL-based Catalog from iceberg-rust, which expected this column. I will also update iceberg-rust to be more defensive like this PR.
The fix here is done exactly how iceberg-java handles this. We do an idempotent `ALTER TABLE iceberg_tables ADD COLUMN iceberg_type` after ensuring `iceberg_tables` exists. We explicitly use the try/catch to make it idempotent because older sqlite versions do not support `IF NOT EXISTS` for column creation. For newly created tables, the addition to the sqlalchemy table will create the column. For existing tables, this backwards-compatible schema update will hit.
Are these changes tested?
Unit Testing
End-to-end Testing
Are there any user-facing changes?
No
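As a rough illustration of the idempotent `ALTER TABLE` described in the rationale, the sketch below uses the standard-library `sqlite3` module instead of SQLAlchemy. The `ensure_iceberg_type_column` helper, the `VARCHAR(5)` column type, the trimmed-down table schema, and the choice to match SQLite's "duplicate column name" message are all assumptions for the example; the PR's actual code may catch errors differently.

```python
import sqlite3


def ensure_iceberg_type_column(conn):
    """Add iceberg_type if it is missing; return True only when it was added."""
    try:
        conn.execute("ALTER TABLE iceberg_tables ADD COLUMN iceberg_type VARCHAR(5)")
        return True
    except sqlite3.OperationalError as exc:
        # SQLite reports an existing column as "duplicate column name: ...".
        # Anything else (for example a missing table) is re-raised.
        if "duplicate column name" not in str(exc).lower():
            raise
        return False


# Simulate a catalog created before the column existed (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE iceberg_tables ("
    "catalog_name TEXT, table_namespace TEXT, table_name TEXT)"
)
conn.execute("INSERT INTO iceberg_tables VALUES ('cat', 'ns', 'legacy_tbl')")

first_call = ensure_iceberg_type_column(conn)   # adds the column
second_call = ensure_iceberg_type_column(conn)  # no-op: column already exists

columns = [row[1] for row in conn.execute("PRAGMA table_info(iceberg_tables)")]
legacy_type = conn.execute("SELECT iceberg_type FROM iceberg_tables").fetchone()[0]
```

Rows that existed before the migration end up with a NULL `iceberg_type`, which is why a table filter on this column also needs to accept NULL.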