Add wp media find-orphans subcommand #252

Open
agenceKanvas wants to merge 3 commits into wp-cli:main from nouveauxterritoires:feature/find-orphans

Conversation

@agenceKanvas

What it does

wp media find-orphans scans a WordPress site for "orphan" media — files and attachments that are out of sync between the filesystem and the media library — and reports them. It is read-only by design: it never deletes, moves, or modifies anything.

It runs four independent detectors, selectable with --type:

  • filesystem — files present in wp-content/uploads that are not in the media library (File on disk not in media library)
  • database — attachments whose underlying file is missing from disk (Attachment file missing from disk)
  • thumbnails — generated thumbnail files (-WxH) whose parent attachment no longer exists (Thumbnail parent attachment missing)
  • usage — attachments that don't appear to be referenced in any post content (Attachment appears unused in content); featured images are not flagged
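At their core, the filesystem and database detectors are two directions of the same comparison. The sketch below models that idea with `comm` on two sorted path lists; it is not the PR's PHP implementation, and both input files (and the function names) are hypothetical.

```shell
# Sketch only: both inputs are hypothetical, sorted, newline-separated lists
# of paths relative to wp-content/uploads.
#   $1 = files found on disk
#   $2 = files the media library knows about

# "filesystem" candidates: present on disk but unknown to the library.
filesystem_orphans() {
  comm -23 "$1" "$2"
}

# "database" candidates: known to the library but missing on disk.
database_orphans() {
  comm -13 "$1" "$2"
}
```

`comm` expects both lists to be sorted in the same collation order, so in practice each list would go through something like `LC_ALL=C sort` first.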

Without --type, all detectors run. Output is a table by default and supports --format=table|json|csv|yaml|ids|count, plus --fields, --limit, and --include-thumbnails.

By default the command exits 0, even when orphans are found. Pass --error-on-orphans to exit 1 when any orphan is found, which is convenient for CI/cron checks.
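For CI or cron use, a small wrapper can turn the exit status into a pass/fail message. This is a minimal sketch: `audit_media` is a hypothetical helper name, and it simply runs whatever command it is given.

```shell
# Runs the audit command passed as arguments and reports on its exit status.
# Taking the command as arguments keeps the helper easy to dry-run.
audit_media() {
  if "$@"; then
    echo "media audit passed"
  else
    status=$?   # exit status of the command that just failed
    echo "media audit failed (exit $status)" >&2
    return "$status"
  fi
}
```

A job would call it as `audit_media wp media find-orphans --error-on-orphans`. A non-zero status may also come from an unrelated WP-CLI failure, so the job log is still worth reading before treating it as "orphans found".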

Extensibility

Two WordPress filters let themes/plugins teach the detectors about their own storage and references:

  • wp_cli_media_find_orphans_ignore_paths — upload subpaths to skip (defaults cover common generated dirs such as caches and form uploads), so plugin-generated assets aren't reported as filesystem orphans.
  • wp_cli_media_find_orphans_used_ids — additional attachment IDs to treat as "used" (receives the scanned post IDs and known attachment IDs as context), so custom references (e.g. postmeta, custom fields) don't yield false positives in the usage detector.
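One way to hook the first filter is from a must-use plugin. The sketch below writes such a file from the shell. It assumes the filter passes an array of upload subpaths and expects the array back; the file name, the helper name, and the `my-plugin-exports` directory are made up for illustration.

```shell
# Writes a hypothetical must-use plugin that adds one upload subpath to the
# ignore list. $1 is the directory to write into.
write_ignore_paths_plugin() {
  target_dir=$1
  mkdir -p "$target_dir"
  cat > "$target_dir/find-orphans-ignore.php" <<'PHP'
<?php
// Assumption: the filter receives an array of upload subpaths and expects one back.
add_filter( 'wp_cli_media_find_orphans_ignore_paths', function ( $paths ) {
	$paths[] = 'my-plugin-exports'; // hypothetical directory under uploads/
	return $paths;
} );
PHP
}
```

On a real site the target would typically be `wp-content/mu-plugins`, for example `write_ignore_paths_plugin wp-content/mu-plugins`.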

Examples

# Full audit
wp media find-orphans

# Only files on disk missing from the library, as IDs
wp media find-orphans --type=filesystem --format=ids

# Fail a CI job if anything is orphaned
wp media find-orphans --error-on-orphans

Tests

Acceptance coverage lives in features/media-find-orphans.feature — one scenario per capability, each following Arrange → Act → Assert on a fresh WP install (with uploads_use_yearmonth_folders disabled for deterministic paths):

  1. Consistent library → reports No orphan media found and exits 0.
  2. filesystem → a stray file dropped into uploads/ is detected.
  3. database (@require-wp-5.3) → an imported attachment whose file is then rm'd is detected.
  4. thumbnails → a -150x150 file with no parent attachment is detected.
  5. usage → an unused attachment is flagged, while an image used only as a post's featured image is not.
  6. --format=json → output is valid JSON containing the expected fields (type, attachment_id, file, issue, path).
  7. --error-on-orphans → returns a non-zero exit code when orphans exist.
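Scenario 4 relies on recognising the -WxH suffix that WordPress gives generated thumbnails. A minimal sketch of such a name check follows; the function name is hypothetical and the PR's actual matcher may differ.

```shell
# Succeeds when a file name carries a -<width>x<height> suffix right before
# its extension, e.g. photo-150x150.jpg. Sketch only.
is_thumbnail_name() {
  printf '%s\n' "$1" | grep -Eq -- '-[0-9]+x[0-9]+\.[A-Za-z0-9]+$'
}
```

A name check alone cannot tell a generated thumbnail from an original upload that happens to be named like one (say, `banner-300x250.png`), which is presumably why the detector also looks for the parent attachment.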

All 7 scenarios pass (72 steps) against the standard wp-cli/wp-cli-tests Behat harness.

Notes

@agenceKanvas agenceKanvas requested a review from a team as a code owner June 4, 2026 14:49
@github-actions
Contributor

github-actions Bot commented Jun 4, 2026

Hello! 👋

Thanks for opening this pull request! Please check out our contributing guidelines. We appreciate you taking the initiative to contribute to this project.

Contributing isn't limited to just code. We encourage you to contribute in the way that best fits your abilities, by writing tutorials, giving a demo at your local meetup, helping other users with their support questions, or revising our documentation.

Here are some useful Composer commands to get you started:

  • composer install: Install dependencies.
  • composer test: Run the full test suite.
  • composer phpcs: Check for code style violations.
  • composer phpcbf: Automatically fix code style violations.
  • composer phpunit: Run unit tests.
  • composer behat: Run behavior-driven tests.

To run a single Behat test, you can use the following command:

# Run all tests in a single file
composer behat features/some-feature.feature

# Run only a specific scenario (where 123 is the line number of the "Scenario:" title)
composer behat features/some-feature.feature:123

You can find a list of all available Behat steps in our handbook.

@github-actions github-actions Bot added the command:media and scope:testing labels Jun 4, 2026
@codecov

codecov Bot commented Jun 4, 2026

Codecov Report

❌ Patch coverage is 87.03704% with 42 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/Media_Command.php | 87.03% | 42 Missing ⚠️ |

Add a non-destructive subcommand that finds orphaned media candidates by
comparing the media library, the uploads directory, and content usage.

Detectors (via `--type`, all run by default):
- filesystem: files on disk not in the media library. Restricted to known
              media extensions (get_allowed_mime_types) and skips generated
              subdirectories (elementor, gravity_forms, cache, wpcf7_uploads)
              via the `wp_cli_media_find_orphans_ignore_paths` filter, so
              page-builder noise is excluded.
- database:   attachments whose file is missing from disk (both the
              get_attached_file() and raw _wp_attached_file paths checked
              to avoid `-scaled` false positives).
- thumbnails: generated thumbnails whose parent attachment is gone.
- usage:      attachments unreferenced in content (conservative; scans all
              registered post types; O(M+N) precomputed path->id lookup).

Options: --type, --format (table/json/csv/yaml/ids/count), --fields,
--include-thumbnails, --limit, --error-on-orphans (exit 1 when found).

Usage detection is extensible via the `wp_cli_media_find_orphans_used_ids`
filter so plugins can declare postmeta/ACF/page-builder references.

Adds a Behat feature covering all four types, JSON output, and the
error-on-orphans exit code. Registers the command in composer.json.

Implements wp-cli/ideas#216.
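The "O(M+N) precomputed path->id lookup" mentioned for the usage detector can be illustrated with a two-pass `awk` script: build the map once from M attachment rows, then do one hash lookup per each of the N references. This is a sketch of the technique only; both input files and the function name are hypothetical, and the PR does this in PHP.

```shell
# $1: hypothetical TSV of "attachment_id<TAB>relative_path" (M rows).
# $2: hypothetical list of paths referenced in content, one per line (N rows).
# Prints the IDs of attachments that are never referenced.
unused_attachment_ids() {
  awk -F '\t' '
    NR == FNR { id_for[$2] = $1; next }       # pass 1: build path -> id map
    ($1 in id_for) { used[id_for[$1]] = 1 }   # pass 2: one lookup per reference
    END {
      for (path in id_for)
        if (!(id_for[path] in used)) print id_for[path]
    }
  ' "$1" "$2" | sort -nu
}
```

The trailing `sort -nu` is only there to make the output order stable and to collapse IDs that own several paths; the matching itself stays linear in M + N.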
@agenceKanvas agenceKanvas force-pushed the feature/find-orphans branch from c0a9910 to 10a55e4 Compare June 4, 2026 16:04
@agenceKanvas
Author

I've tried my best to get 100% coverage on Codecov, but covering the missing lines would require a lot of fixtures and heavy coding for a small result. Is it mandatory?

@swissspidy
Member

> I've tried my best to get 100% coverage on Codecov, but covering the missing lines would require a lot of fixtures and heavy coding for a small result. Is it mandatory?

No, not mandatory at all. Thanks for keeping an eye on it though!

Contributor

Copilot AI left a comment

Pull request overview

Adds a new read-only WP-CLI subcommand, wp media find-orphans, to audit a WordPress site for media “orphan” candidates across filesystem, database, thumbnails, and content-usage signals, with configurable output formats and CI-friendly exit behavior.

Changes:

  • Implements wp media find-orphans in Media_Command with four detectors and two extensibility filters.
  • Adds Behat acceptance coverage for core scenarios, output formats, and flag validation.
  • Registers the new subcommand in composer.json so it’s discoverable/bundled with the command set.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/Media_Command.php | Adds the find-orphans subcommand and the detector implementations (filesystem/database/thumbnails/usage) plus helper utilities. |
| features/media-find-orphans.feature | Adds end-to-end Behat scenarios covering expected detection behavior, output, and exit codes. |
| composer.json | Registers media find-orphans in the commands list. |

agenceKanvas and others added 2 commits June 5, 2026 11:54
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
The post_parent scan collected parent post IDs into $used_ids, which is
a set of attachment IDs. This could misclassify attachments uploaded to
a post (but not referenced in content) as unused, and could shield an
unrelated attachment whose ID collided with a parent post ID. Select the
attachment IDs themselves instead, and cover the post_parent-only case
in the usage scenario.
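The difference between the two selections described in this commit message can be shown on a toy posts table. The sketch below uses `awk` over a hypothetical TSV export; the column layout and function names are assumptions, and the real change is a query inside the PHP command.

```shell
# Input: hypothetical TSV rows of "ID<TAB>post_type<TAB>post_parent".

# Before the fix (as described): collects the parent post IDs, which are
# post IDs rather than attachment IDs.
parent_ids_of_attached() {
  awk -F '\t' '$2 == "attachment" && $3 > 0 { print $3 }' "$1"
}

# After the fix: collects the IDs of the attachments that have a parent.
attached_attachment_ids() {
  awk -F '\t' '$2 == "attachment" && $3 > 0 { print $1 }' "$1"
}
```

With a post 7, an attachment 8 uploaded to post 7, and an unattached attachment 9, the first function yields 7 and the second yields 8, so only the second marks the right attachment as used.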
Labels

command:media (Related to 'media' command)
scope:testing (Related to testing)

3 participants