Target --full-rerun to TaxonoPy-specific output files only via manifest #31

Closed
Copilot wants to merge 2 commits into main from copilot/target-taxonopy-output-files

Copilot AI commented Feb 20, 2026

--full-rerun called shutil.rmtree(output_dir), nuking the entire output directory and any non-TaxonoPy files that happened to live there.

Approach

Rather than pattern-matching filenames (fragile), each run now writes a taxonopy_manifest.json to the output directory recording exactly which files it produced. --full-rerun reads that manifest and removes only those files — nothing else.

Changes

  • output_manager.py — two new public functions:
    • write_output_manifest(output_dir, files) — writes taxonopy_manifest.json with paths relative to output_dir
    • read_output_manifest(output_dir) — returns absolute paths from the manifest, or [] if absent/unparseable
  • cli.py
    • --full-rerun replaces shutil.rmtree with per-file unlink of manifest entries
    • Both the normal resolve path and the --force-input path call write_output_manifest after output is generated
    • existing_output detection extended to check for manifest presence — more reliable than glob-matching *.resolved.*
  • tests/test_output_manager.py (new) — covers manifest write/read, missing/corrupt manifest, and the key invariant that non-TaxonoPy files survive a simulated --full-rerun
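
The two helpers listed above might look roughly like the following. This is a minimal sketch under stated assumptions, not the code from this PR: the function names, the manifest filename, and the `{"files": [...]}` layout come from the description, while the argument types, the error handling, and the filtering of malformed entries are guesses. It also assumes every path passed in already lives inside output_dir.

```python
import json
from pathlib import Path

MANIFEST_NAME = "taxonopy_manifest.json"  # filename taken from the PR description


def write_output_manifest(output_dir, files):
    """Record the files produced by a run, as paths relative to output_dir.

    Assumption: each entry in `files` is a path inside output_dir;
    relative_to() raises ValueError otherwise.
    """
    output_dir = Path(output_dir).resolve()
    relative = sorted(
        Path(f).resolve().relative_to(output_dir).as_posix() for f in files
    )
    manifest_path = output_dir / MANIFEST_NAME
    manifest_path.write_text(
        json.dumps({"files": relative}, indent=2), encoding="utf-8"
    )
    return manifest_path


def read_output_manifest(output_dir):
    """Return absolute paths from the manifest, or [] if absent or unparseable."""
    output_dir = Path(output_dir).resolve()
    manifest_path = output_dir / MANIFEST_NAME
    try:
        data = json.loads(manifest_path.read_text(encoding="utf-8"))
        entries = data["files"]
    except (OSError, ValueError, KeyError, TypeError):
        # Missing file, invalid JSON, or an unexpected top-level shape.
        return []
    if not isinstance(entries, list):
        return []
    # Assumed leniency: silently skip entries that are not strings.
    return [output_dir / entry for entry in entries if isinstance(entry, str)]
```

Storing relative paths keeps the manifest valid if the output directory is moved or renamed, and returning `[]` for a missing or corrupt manifest means callers can treat "no manifest" and "nothing to remove" the same way.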

Example manifest

{
  "files": [
    "data.resolved.parquet",
    "data.unsolved.parquet",
    "resolution_stats.json"
  ]
}
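
Given a manifest like the one above, the targeted cleanup performed by --full-rerun could be sketched as follows. The helper name `remove_manifest_outputs` is hypothetical, and the check that skips entries resolving outside the output directory is an added assumption; the PR itself only states that listed files (plus the manifest) are unlinked individually instead of calling shutil.rmtree.

```python
import json
from pathlib import Path

MANIFEST_NAME = "taxonopy_manifest.json"  # filename taken from the PR description


def remove_manifest_outputs(output_dir):
    """Delete only the files listed in the manifest, then the manifest itself.

    Hypothetical helper: the PR describes this behavior, not this function.
    Returns the list of paths that were actually removed.
    """
    output_dir = Path(output_dir).resolve()
    manifest_path = output_dir / MANIFEST_NAME
    try:
        entries = json.loads(manifest_path.read_text(encoding="utf-8"))["files"]
    except (OSError, ValueError, KeyError, TypeError):
        return []  # no usable manifest: remove nothing
    if not isinstance(entries, list):
        return []

    removed = []
    for entry in entries:
        if not isinstance(entry, str):
            continue
        target = (output_dir / entry).resolve()
        # Assumed safety check: ignore entries that escape output_dir.
        if output_dir not in target.parents:
            continue
        if target.is_file():
            target.unlink()
            removed.append(target)

    # The manifest describes a run whose outputs are now gone, so drop it too.
    manifest_path.unlink(missing_ok=True)
    return removed
```

Files that were never recorded in the manifest, such as a user's own notes placed in the same directory, are left untouched, which is the invariant the new tests are described as covering.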
Original prompt

This section details the original issue you should resolve

<issue_title>Target --full-rerun to TaxonoPy-specific output files only</issue_title>
<issue_description>It is currently too aggressive and may delete unintended data. </issue_description>

Comments on the Issue (you are @copilot in this section)

@hlapp: A rogue agent that got implanted into the code? 😉

@thompsonmj: Haha, no, just some explosives that could benefit from friendly-fire avoidance ...

@thompsonmj: To deal with this, options considered include:

  • Pattern matching to known TaxonoPy output filenames (e.g. `**/*.resolved.*`), but this risks tech debt if naming conventions change and possible collateral damage. Simpler now, but could be improved.
  • Maintain a record of the exact relevant TaxonoPy output filenames under a directory resulting from a TaxonoPy command. The `--full-rerun` would only act on contents listed in that file for perfect targeting. Having such a manifest would also be a valuable piece of metadata. Better I think.

Previously --full-rerun called shutil.rmtree(output_dir), deleting the
entire output directory including any non-TaxonoPy files.

Now each run writes a taxonopy_manifest.json listing only the files it
produced. --full-rerun reads that manifest and removes only the listed
files (plus the manifest itself), leaving other content untouched.

The existing_output detection also checks for the manifest file so that
a previous run is reliably detected even if output filenames change.

[AI-assisted session]

Co-authored-by: thompsonmj <31709066+thompsonmj@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Update target for full rerun to TaxonoPy-specific files" to "Target --full-rerun to TaxonoPy-specific output files only via manifest" on Feb 20, 2026
Copilot AI requested a review from thompsonmj February 20, 2026 21:05
@thompsonmj (Contributor) commented

Closing in favor of #32

@thompsonmj thompsonmj closed this Feb 24, 2026

Development

Successfully merging this pull request may close these issues.

Target --full-rerun to TaxonoPy-specific output files only
