Disable delete optimization and exit ref loop faster#6219
Disable delete optimization and exit ref loop faster#6219ddanielr wants to merge 4 commits intoapache:2.1from
Conversation
When delete table is called, the delete marker code checks to see if any file references exist in other tables. However, only a single reference has to exist for delete markers to be created. Added break out of for loop once a single entry was found. Fixed log lines and removed a `getLogger` call. Removed a nested try block in favor of a single try-with-resources
Adds a property to allow the scan of the metadata table to be skipped for table deletes. This forces delete markers to always be created when deleting tables instead of the manager deleting the volumes immediately.
|
@keith-turner & @ctubbsii I'm also leaning towards just removing this delete marker optimization entirely. I wanted to be a bit more careful about it in 2.1 but if you don't see a reason to keep this code around then I'll push a change to just remove the scan and volume deletion shortcut entirely. |
Improve property description Co-authored-by: Dave Marion <dlmarion@apache.org>
I support removing this optimization in general. Do share the concerns about introducing bugs in 2.1. Looked at what test we have for delete table to see if any verified the files actually eventually go away. I could not find any test that verifies this. I looked at test method names that called TableOperations.delete and for test I was not sure about based on the name I looked at the test. Maybe part of removing this in 2.1 is adding some more test. Want to ensure the files exists before delete and that after delete the tables file and dir eventually go away. That would cover the risk that delete table leaves files behind w/ this change. If delete does more than its supposed to (like messes w/ another table) I suspect existing test would catch that as we create/delete a lot in the test. |
This PR adds a property to skip scanning the metadata table for table deletes.
DeleteMarker Creation Optimization
When the manager deletes a table it performs an optimization step by creating a batch scanner with 8 threads (not configurable) to scan all the other table file references on the metadata table and ensure that no file references are found for the given table volume.
The manager then directly deletes the volumes as opposed to writing delete markers and allowing the GC to handle the tablet file deletions.
This is a nice optimization to have when dealing with a small static set of tables. However, when table creation is dynamic these scans can cause unnecessary delays and/or hanging fate processes as all metadata tablets must be scanned in order to process a single table delete.
Unnecessary File Ref Counting
The batch scanner only needs to produce a single shared file ref result (
refCount) in order to trigger delete markers to be created as the code only checks ifrefCountis equal to zero.However, the existing code needlessly counts all of the found refs first.
This is unnecessary and a fast break was added to the iterator loop for the batch scanner.