Background
While a compaction is running, the Iceberg table may gain a new snapshot, in which case committing the compaction result causes a conflict. There are two situations:
- Situation 1: only non-replace snapshots were written during the compaction. The compaction result does not actually conflict with the latest snapshot and can be reapplied.
- Situation 2: another replace operation was written during the compaction, and the compaction input files no longer exist. The conflict must be detected and the compaction result abandoned to avoid data errors.
Solution
- For situation 1, add a retry to the commit. When the transaction fails to commit, reload the latest table snapshot and re-apply the action.
- For situation 2:
  - The SDK prohibits removing files that no longer exist.
  - Detect conflicts before committing, to reduce the number of commit attempts.
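
The two solutions can be illustrated with a minimal sketch. It uses a toy in-memory table, not the real Iceberg SDK: `Table`, `commit_compaction`, `CommitConflict`, and `InputFilesRemoved` are hypothetical names introduced only for this example, and the snapshot model is an assumption simplified to a counter plus a set of live files.

```python
from dataclasses import dataclass, field


class CommitConflict(Exception):
    """The table advanced past the snapshot the commit was based on."""


class InputFilesRemoved(Exception):
    """A compaction input no longer exists; the result must be abandoned."""


@dataclass
class Table:
    """Toy stand-in for a table: a snapshot counter plus the set of live files."""
    snapshot_id: int = 0
    files: set = field(default_factory=set)

    def append(self, path):
        # Non-replace write: adds a file and advances the snapshot.
        self.files.add(path)
        self.snapshot_id += 1

    def replace(self, base_snapshot_id, removed, added):
        # Optimistic commit: succeeds only against the snapshot it was based on.
        if base_snapshot_id != self.snapshot_id:
            raise CommitConflict("table has a newer snapshot")
        # SDK-side guard: refuse to remove files that do not exist.
        missing = set(removed) - self.files
        if missing:
            raise InputFilesRemoved(sorted(missing))
        self.files = (self.files - set(removed)) | set(added)
        self.snapshot_id += 1


def commit_compaction(table, base_snapshot_id, inputs, outputs, max_retries=3):
    """Commit a compaction result; return the number of commit attempts used."""
    attempts = 0
    while True:
        attempts += 1
        # Situation 2: detect missing inputs before committing, so a doomed
        # commit is never attempted.
        missing = set(inputs) - table.files
        if missing:
            raise InputFilesRemoved(sorted(missing))
        try:
            table.replace(base_snapshot_id, inputs, outputs)
            return attempts
        except CommitConflict:
            if attempts > max_retries:
                raise
            # Situation 1: reload the latest snapshot and re-apply the action.
            base_snapshot_id = table.snapshot_id
```

In this sketch, a concurrent `append` between planning and commit makes the first attempt fail with `CommitConflict`, and the retry against the reloaded snapshot succeeds. A concurrent `replace` that removes the inputs raises `InputFilesRemoved` before any commit is attempted, leaving the table unchanged.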