[AMORO-4163][ams] Fix CommitFailedException when loading legacy mixed-iceberg tables on Iceberg 1.7.2#4182

Open
lintingbin wants to merge 1 commit into apache:master from lintingbin:fix/legacy-mixed-iceberg-commit-on-load

lintingbin (Contributor) commented Apr 15, 2026

Search before asking

  • I have searched in the issues and found no similar issues.

What type of PR is this?

  • Improvement
  • Bug Fix
  • Feature
  • Refactoring

What does this PR do?

Found while investigating the CI failure in #4179.

Root Cause

Iceberg 1.7.x introduced a breaking change in HadoopTableOperations.commit(): it now uses reference equality (==) to compare the base argument against the internally cached currentMetadata. If they differ, it throws CommitFailedException("Cannot commit changes based on stale table metadata").

In InternalMixedIcebergHandler.newTableOperations(), when loading a legacy mixed-iceberg table (created before v0.7.0), the code called ops.current() to obtain the current metadata, then passed it as base to ops.commit(base, legacyCurrent). However, commit() internally calls versionAndMetadata() which may refresh the internal state and return a different object instance than what current() returned — even when the on-disk metadata has not changed. This causes the reference-equality check to fail, throwing CommitFailedException with no cause.
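
The failure mode described above can be reproduced in miniature without Iceberg. The sketch below is a toy model, not Iceberg's code: `StaleReferenceDemo`, `Meta`, `ToyOps`, and `reload()` are hypothetical names, and the exception type is a plain `IllegalStateException`. The only behavior taken from the description is a commit guard that compares `base` to the cached metadata with `==`, so two objects with identical content are still treated as different.

```java
import java.util.Objects;

public class StaleReferenceDemo {
    /** Hypothetical stand-in for table metadata: equal by content, distinct by identity. */
    static final class Meta {
        final int version;
        final String schema;

        Meta(int version, String schema) {
            this.version = version;
            this.schema = schema;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Meta
                && ((Meta) o).version == version
                && ((Meta) o).schema.equals(schema);
        }

        @Override
        public int hashCode() {
            return Objects.hash(version, schema);
        }
    }

    /** Hypothetical stand-in for table operations with a reference-equality commit guard. */
    static final class ToyOps {
        private Meta cached = new Meta(1, "id:int");

        Meta current() {
            return cached;
        }

        /** Simulates re-reading unchanged metadata: same content, new object. */
        void reload() {
            cached = new Meta(cached.version, cached.schema);
        }

        void commit(Meta base, Meta updated) {
            if (base != cached) { // identity check, not equals()
                throw new IllegalStateException(
                    "Cannot commit changes based on stale table metadata");
            }
            cached = updated;
        }
    }

    /** Returns true if the commit is rejected even though base and cache have equal content. */
    static boolean commitRejectedAfterReload() {
        ToyOps ops = new ToyOps();
        Meta base = ops.current();
        ops.reload(); // internal state is replaced between current() and commit()
        boolean sameContent = base.equals(ops.current());
        try {
            ops.commit(base, new Meta(2, "id:int"));
            return false;
        } catch (IllegalStateException e) {
            return sameContent;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected despite identical content: " + commitRejectedAfterReload());
    }
}
```

In this model the commit is rejected even though `base.equals(...)` holds against the cached object, which is the same shape of failure as the one reported for the legacy table load.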

Because the exception has no cause, it passes unwrapped through MixedHadoopTableOperations (which wraps a CommitFailedException only when it has a cause) and the Thrift layer, causing BasicMixedCatalog.createTableMeta() to throw IllegalStateException("update table meta failed"), which makes table creation fail.

This is the root cause of the flaky TestInternalMixedCatalogService.CompatibilityCatalogTests#testNewCatalogLoadHistorical regression observed after the Iceberg 1.7.2 upgrade (#4163).

Fix

Replace ops.current() with ops.refresh(). refresh() reads from disk and stores the result as the internal cached reference, returning that same object. When commit() then calls versionAndMetadata(), it finds the version unchanged on disk and returns the same cached reference — satisfying the reference-equality check without any additional disk I/O beyond the single refresh() call.
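
The reasoning behind the fix can be sketched with a similar toy model. Again, this is an illustration under assumptions, not Iceberg's implementation: `RefreshBeforeCommitDemo`, `Meta`, `ToyOps`, `diskVersion`, and `loadIfVersionChanged()` are hypothetical names. The model assumes that `refresh()` returns the very object it stores in the cache, and that the commit path reuses that cached object when the version has not changed.

```java
public class RefreshBeforeCommitDemo {
    /** Hypothetical stand-in for table metadata. */
    static final class Meta {
        final int version;

        Meta(int version) {
            this.version = version;
        }
    }

    /** Hypothetical stand-in for table operations. */
    static final class ToyOps {
        private int diskVersion = 1; // pretend on-disk version
        private Meta cached;         // null until the first load

        /** Loads metadata, stores it as the cached reference, and returns that same object. */
        Meta refresh() {
            return loadIfVersionChanged();
        }

        /** Replaces the cached object only when the version differs from the cached one. */
        private Meta loadIfVersionChanged() {
            if (cached == null || cached.version != diskVersion) {
                cached = new Meta(diskVersion);
            }
            return cached;
        }

        void commit(Meta base, Meta updated) {
            Meta current = loadIfVersionChanged(); // version unchanged: same cached reference
            if (base != current) { // identity check, as in the description above
                throw new IllegalStateException(
                    "Cannot commit changes based on stale table metadata");
            }
            diskVersion = updated.version;
            cached = updated;
        }
    }

    /** Obtains base via refresh(), commits, and returns the resulting version. */
    static int commitAfterRefresh() {
        ToyOps ops = new ToyOps();
        Meta base = ops.refresh(); // base is the cached reference itself
        ops.commit(base, new Meta(base.version + 1));
        return ops.refresh().version;
    }

    public static void main(String[] args) {
        System.out.println("version after commit: " + commitAfterRefresh());
    }
}
```

Because `base` is the same object the commit guard later reads from the cache, the identity check passes and the commit goes through.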

Checklist

  • I have added corresponding tests for my changes.
  • This PR only changes one thing, and it is clear from the title.
  • I have documented my changes in the relevant documentation.

github-actions bot added the module:ams-server (Ams server module) label Apr 15, 2026
…-iceberg tables on Iceberg 1.7.x

Found while investigating the CI failure in apache#4179.

Iceberg 1.7.x introduced a breaking change in HadoopTableOperations.commit(): it
now uses reference equality (==) to compare the `base` argument against the cached
currentMetadata. Previously, newTableOperations() called ops.current() to obtain the
current metadata, but versionAndMetadata() inside commit() may refresh the internal
state and return a different object instance. When the two references differ, commit()
throws CommitFailedException("Cannot commit changes based on stale table metadata")
even though the metadata content is identical, causing table loading to fail.

Fix: replace ops.current() with ops.refresh() so that the returned TableMetadata
reference is the same object stored in ops' internal cache. When commit() then calls
versionAndMetadata(), it finds the version unchanged on disk and returns the same
cached reference, satisfying the reference-equality check.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
lintingbin force-pushed the fix/legacy-mixed-iceberg-commit-on-load branch from e8176fa to 603e139 on April 15, 2026 11:54