Skip to content

HIVE-29587: Cleanup acid table dir after async drop#6459

Open
VenuReddy2103 wants to merge 1 commit intoapache:masterfrom
VenuReddy2103:HIVE-29587
Open

HIVE-29587: Cleanup acid table dir after async drop#6459
VenuReddy2103 wants to merge 1 commit intoapache:masterfrom
VenuReddy2103:HIVE-29587

Conversation

@VenuReddy2103
Copy link
Copy Markdown
Contributor

@VenuReddy2103 VenuReddy2103 commented Apr 30, 2026

What changes were proposed in this pull request?

Added a new compaction type(DEFERRED_CLEANUP). It is used while adding a row to compaction queue table when table is soft deleted.

Why are the changes needed?

When drop acid table is followed by drop database cascade operation, the acid table directory itself is never deleted. Hence managed database directory is also never deleted. Orphaned table and database directories remaining indefinitely in the filesystem, leading to potential storage bloat and inconsistencies between HMS metadata and filesystem state

Does this PR introduce any user-facing change?

No

How was this patch tested?

Tested manually

@VenuReddy2103 VenuReddy2103 changed the title [WIP]HIVE-29587: Cleanup acid table dir after async drop HIVE-29587: Cleanup acid table dir after async drop May 1, 2026
@deniskuzZ
Copy link
Copy Markdown
Member

thanks for the refactor @VenuReddy2103 , that looks much cleaner

isReplicated = isDbReplicationTarget(db);

EnvironmentContext context = request.getEnvContext();
if (!request.isDeleteData() && context != null && ReplChangeManager.isSourceOfReplication(db)) {
Copy link
Copy Markdown
Member

@deniskuzZ deniskuzZ May 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does request.isDeleteData() do? do we need it

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it is enough to set only when request.isDeleteData() is false(i.e., soft delete), have added this check. But there is no harm setting it in context irrespective of it since we do not use it. Do you suggest to remove it?

if (isSoftDelete) {
context = new EnvironmentContext();
context.putToProperties(hive_metastoreConstants.TXN_ID, String.valueOf(request.getTxnId()));
if (ReplChangeManager.isSourceOfReplication(db)) {
Copy link
Copy Markdown
Member

@deniskuzZ deniskuzZ May 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was it redundant?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is redundant to set here since we set in DropTableHandler. And it need to present in DropTableHandler. When drop soft table is followed by drop database cascade, we won't preseve this info in CQ_TBLPROPERTIES if we do not set in DropTableHandler.

@deniskuzZ
Copy link
Copy Markdown
Member

deniskuzZ commented May 4, 2026

with DEFFERED_CLEANUP type introduction we could probably simplify CompactionCleaner checks like if (isNull(location)) { and replace with type check

not in scope for this PR

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 6, 2026

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants