Target --full-rerun to TaxonoPy-specific output files only via manifest #31
Closed
Previously --full-rerun called shutil.rmtree(output_dir), deleting the entire output directory including any non-TaxonoPy files. Now each run writes a taxonopy_manifest.json listing only the files it produced. --full-rerun reads that manifest and removes only the listed files (plus the manifest itself), leaving other content untouched. The existing_output detection also checks for the manifest file so that a previous run is reliably detected even if output filenames change. [AI-assisted session] Co-authored-by: thompsonmj <31709066+thompsonmj@users.noreply.github.com>
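The targeted cleanup described in this commit message could look roughly like the sketch below. It is an illustration under assumptions, not the PR's actual code: the helper name `remove_manifest_outputs`, its return value, and the choice to delete a corrupt manifest without touching any other file are all hypothetical. Only the manifest filename and the `"files"` key come from the PR text.

```python
import json
from pathlib import Path

MANIFEST_NAME = "taxonopy_manifest.json"  # filename as given in the PR description


def remove_manifest_outputs(output_dir):
    """Delete only the files listed in the manifest, then the manifest itself.

    Hypothetical helper for illustration; returns the paths that were removed.
    """
    output_dir = Path(output_dir)
    manifest_path = output_dir / MANIFEST_NAME
    if not manifest_path.is_file():
        return []  # no manifest: nothing is known to be TaxonoPy's, so delete nothing

    try:
        listed = json.loads(manifest_path.read_text(encoding="utf-8")).get("files", [])
    except (json.JSONDecodeError, AttributeError):
        listed = []  # corrupt manifest: treat it as empty rather than guess
    if not isinstance(listed, list):
        listed = []

    removed = []
    for rel in listed:
        if not isinstance(rel, str):
            continue  # skip malformed entries
        target = output_dir / rel
        if target.is_file():
            target.unlink()
            removed.append(target)

    manifest_path.unlink()  # the manifest is itself a TaxonoPy output
    return removed
```

Unlike `shutil.rmtree(output_dir)`, this never touches a file that is not named in the manifest, so unrelated files in the same directory survive.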
Copilot AI changed the title from "[WIP] Update target for full rerun to TaxonoPy-specific files" to "Target --full-rerun to TaxonoPy-specific output files only via manifest" on Feb 20, 2026
Contributor comment: Closing in favor of #32
--full-rerun called shutil.rmtree(output_dir), nuking the entire output directory and any non-TaxonoPy files that happened to live there.

Approach

Rather than pattern-matching filenames (fragile), each run now writes a taxonopy_manifest.json to the output directory recording exactly which files it produced. --full-rerun reads that manifest and removes only those files, nothing else.

Changes

- output_manager.py: two new public functions:
  - write_output_manifest(output_dir, files): writes taxonopy_manifest.json with paths relative to output_dir
  - read_output_manifest(output_dir): returns absolute paths from the manifest, or [] if absent/unparseable
- cli.py:
  - --full-rerun replaces shutil.rmtree with per-file unlink of manifest entries
  - --force-input path calls write_output_manifest after output is generated
  - existing_output detection extended to check for manifest presence, which is more reliable than glob-matching *.resolved.*
- tests/test_output_manager.py (new): covers manifest write/read, missing/corrupt manifest, and the key invariant that non-TaxonoPy files survive a simulated --full-rerun

Example manifest

    { "files": [ "data.resolved.parquet", "data.unsolved.parquet", "resolution_stats.json" ] }

Original prompt

--full-rerun to TaxonoPy-specific output files only #28
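For readers who want a concrete picture of the two manifest functions listed under Changes, here is a minimal sketch. The function names and signatures are taken from the PR description; the bodies are assumptions written for illustration and may differ from the code in output_manager.py. In particular, sorting the entries and requiring every file to live under output_dir are choices made for this sketch.

```python
import json
from pathlib import Path

MANIFEST_NAME = "taxonopy_manifest.json"  # filename as given in the PR description


def write_output_manifest(output_dir, files):
    """Record `files` in the manifest as paths relative to `output_dir`.

    Assumes every file lives under output_dir; relative_to() raises
    ValueError otherwise. Sorting is an illustrative choice.
    """
    output_dir = Path(output_dir).resolve()
    relative = sorted(
        Path(f).resolve().relative_to(output_dir).as_posix() for f in files
    )
    manifest_path = output_dir / MANIFEST_NAME
    manifest_path.write_text(
        json.dumps({"files": relative}, indent=2), encoding="utf-8"
    )
    return manifest_path


def read_output_manifest(output_dir):
    """Return absolute paths from the manifest, or [] if absent or unparseable."""
    output_dir = Path(output_dir).resolve()
    try:
        data = json.loads((output_dir / MANIFEST_NAME).read_text(encoding="utf-8"))
        return [output_dir / rel for rel in data["files"]]
    except (OSError, ValueError, KeyError, TypeError):
        # Missing file, invalid JSON, or an unexpected structure.
        return []
```

Storing relative paths keeps the manifest valid if the output directory is moved or renamed, while returning absolute paths gives callers something they can unlink directly.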