Open
Description
Apache Iceberg version
0.10.0
Please describe the bug 🐞
When performing an upsert operation after adding a new column via update_schema().union_by_name(), the operation fails with a ValueError indicating that the schema field names don't match.
To reproduce:
from pyiceberg.catalog import load_catalog
import polars as pl
catalog = load_catalog("default", **{"type": "in-memory"})
df = pl.DataFrame(
    [
        {"id": 1, "name": "Alice", "age": 30, "city": "São Paulo"},
        {"id": 2, "name": "Bob", "age": 25, "city": "Rio de Janeiro"},
        {"id": 3, "name": "Carol", "age": 35, "city": "Belo Horizonte"},
        {"id": 4, "name": "David", "age": 28, "city": "Curitiba"},
    ]
)
arrow = df.to_arrow()
catalog.create_namespace_if_not_exists("default")
catalog.create_table_if_not_exists("default.my_table", arrow.schema)
table = catalog.load_table("default.my_table")
try:
    table.append(arrow)
    # Add a new column
    arrow = df.with_columns(ping=pl.lit("pong")).to_arrow()
    # Update schema to include the new column
    with table.update_schema() as update_schema:
        update_schema.union_by_name(arrow.schema)
    table = table.refresh()
    # This fails with ValueError
    table.upsert(arrow, ["id"])
finally:
    catalog.drop_table("default.my_table")
Error:
ValueError: Target schema's field names are not matching the table's field names: ['id', 'name', 'age', 'city', 'ping'], ['id', 'name', 'age', 'city']
Stack trace:
  File "pyiceberg/table/__init__.py", line 1343, in upsert
    return tx.upsert(
  File "pyiceberg/table/__init__.py", line 825, in upsert
    rows_to_update = upsert_util.get_rows_to_update(df, rows, join_cols)
  File "pyiceberg/table/upsert_util.py", line 92, in get_rows_to_update
    source_table.cast(target_table.schema)
  File "pyarrow/table.pxi", line 4721, in pyarrow.lib.Table.cast
Expected:
The upsert operation should succeed after the schema has been updated to include the new column.
Willingness to contribute
- I can contribute a fix for this bug independently
- I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- I cannot contribute a fix for this bug at this time