Use two-phase rename to safely handle collisions and cycles #35

@dolph

Problem

Naive serial renames can lose data whenever a destination name is also a source name. A chain applied in the wrong order clobbers a file that has not been moved yet, and a cycle has no safe order at all.

Chain — rename a → b then b → c:

mv a b   # original b is clobbered by a
mv b c   # what was a is now at c; original b is gone

Swap — rename a → b and b → a:

mv a b   # original b is clobbered by a
mv b a   # a's contents are back at a; original b is gone for good

These cases come up in any "shift everything" rename and in regex-driven renames where source and target sets overlap.
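
For anyone who wants to reproduce the swap case, here is a minimal sketch in Python. The helper name `naive_rename` is made up for illustration and is not part of the tool. It assumes a POSIX system, where `os.rename` silently replaces an existing destination file, just as `mv` does.

```python
import os
import tempfile


def naive_rename(root, pairs):
    """Apply (src, dst) renames one after another, like a series of `mv` calls."""
    for src, dst in pairs:
        # On POSIX this silently replaces an existing file at dst.
        os.rename(os.path.join(root, src), os.path.join(root, dst))


with tempfile.TemporaryDirectory() as root:
    for name in ("a", "b"):
        with open(os.path.join(root, name), "w") as f:
            f.write(f"contents of {name}")

    # Intended result: a and b trade places.
    naive_rename(root, [("a", "b"), ("b", "a")])

    # Collect whatever is left in the directory.
    survivors = {}
    for name in sorted(os.listdir(root)):
        with open(os.path.join(root, name)) as f:
            survivors[name] = f.read()
    print(survivors)
```

Only one file survives, under the name a and with a's original contents. Nothing of the original b remains.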

Proposed fix

Apply renames in two phases that decouple source and target namespaces:

  1. Phase 1 (quarantine): rename every source to a unique temp name in the same directory (e.g. .find-replace.tmp.{nonce}.{seq}).
  2. Phase 2 (install): rename every temp to its final destination.

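Temp names are disjoint from both the source set and the target set, so phase 1 never overwrites anything. Once phase 1 completes, every source name is vacant, which (given the pre-flight checks below) means every destination is free in phase 2. Ordering within each phase therefore doesn't matter: chains, swaps, and arbitrary permutations all work.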
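
The two phases can be sketched in a few lines of Python. This is an illustration under assumptions, not the proposed implementation: the function name `two_phase_rename` and the dict-based `mapping` argument are hypothetical, the nonce comes from `secrets.token_hex`, and none of the safety measures listed below (no-replace renames, fsync, rollback, journal) are included.

```python
import os
import secrets
import tempfile


def two_phase_rename(root, mapping):
    """Rename files inside `root` according to {source: destination}."""
    nonce = secrets.token_hex(4)  # hypothetical choice of nonce
    staged = []  # (temp, destination) pairs awaiting phase 2

    # Phase 1 (quarantine): move every source to a unique temp name.
    for seq, (src, dst) in enumerate(mapping.items()):
        temp = f".find-replace.tmp.{nonce}.{seq}"
        os.rename(os.path.join(root, src), os.path.join(root, temp))
        staged.append((temp, dst))

    # Phase 2 (install): move every temp to its final destination.
    for temp, dst in staged:
        os.rename(os.path.join(root, temp), os.path.join(root, dst))


# Usage: a 3-cycle a -> b -> c -> a, which plain serial renames cannot do
# without a spare name.
with tempfile.TemporaryDirectory() as root:
    for name in ("a", "b", "c"):
        with open(os.path.join(root, name), "w") as f:
            f.write(f"contents of {name}")

    two_phase_rename(root, {"a": "b", "b": "c", "c": "a"})

    result = {}
    for name in sorted(os.listdir(root)):
        with open(os.path.join(root, name)) as f:
            result[name] = f.read()
    print(result)
```

After the call, each name holds the contents of its predecessor in the cycle and no temp files are left behind.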

Safety details

  • Use renameat2(2) with RENAME_NOREPLACE on Linux so phase 2 atomically fails if a destination unexpectedly exists (closes the TOCTOU window). Fall back to a pre-check + rename(2) on platforms without it, accepting that the gap between check and rename stays racy there.
  • fsync the parent directory after each phase so the operation is durable across power loss.
  • Pre-flight: verify no two sources collide, no two destinations collide, and any destination not in the source set doesn't already exist. Abort before phase 1 on conflict.
  • On phase 1 partial failure, roll back by renaming the temps already created to their original names (those names were vacated by phase 1, so barring concurrent writers this is always possible).
  • On phase 2 partial failure, leave the remaining files in temp names and surface the error clearly; a --resume path can finish them.
  • Optional but recommended: write a journal file (.find-replace.journal) listing (original, temp, final) triples before phase 1 so a crashed run can be resumed or rolled back.
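
A rough Python sketch of three of the items above: the pre-flight checks, the parent-directory fsync, and phase 1 with rollback. All function names (`preflight`, `fsync_dir`, `quarantine`) and the list-of-pairs input are assumptions made for illustration. The sketch uses plain `os.rename`, so it does not get the atomic no-replace guarantee of renameat2, and `fsync_dir` assumes a POSIX system where a directory can be opened and fsync'd.

```python
import os
from collections import Counter


def preflight(root, pairs):
    """Return a list of conflict messages; an empty list means safe to start."""
    problems = []
    sources = [src for src, _ in pairs]
    dests = [dst for _, dst in pairs]
    # No two sources may collide, and no two destinations may collide.
    for label, names in (("source", sources), ("destination", dests)):
        for name, count in Counter(names).items():
            if count > 1:
                problems.append(f"{label} {name!r} appears {count} times")
    # A destination that is not itself a source must not already exist.
    for dst in sorted(set(dests) - set(sources)):
        if os.path.lexists(os.path.join(root, dst)):
            problems.append(f"destination {dst!r} already exists")
    return problems


def fsync_dir(path):
    """Flush a directory's entries to disk (POSIX only)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def quarantine(root, sources, nonce):
    """Phase 1: move each source to a temp name, undoing the moves on failure."""
    done = []  # (source, temp) pairs already moved
    try:
        for seq, src in enumerate(sources):
            temp = f".find-replace.tmp.{nonce}.{seq}"
            os.rename(os.path.join(root, src), os.path.join(root, temp))
            done.append((src, temp))
    except OSError:
        # Roll back in reverse order; the source names were vacated above.
        # Plain rename would clobber a file created there concurrently.
        for src, temp in reversed(done):
            os.rename(os.path.join(root, temp), os.path.join(root, src))
        fsync_dir(root)
        raise
    fsync_dir(root)
    return done
```

A caller would run `preflight` first and abort if it returns anything, then pass the sources to `quarantine`. The returned (source, temp) pairs are also the information a journal would need to record alongside each final name.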

Why this lands first

Pure safety improvement: no new flag, no visible behavior change for rename sets that already worked, no new API surface. Establishes the rename infrastructure that regex mode will depend on.

Acceptance

  • Chain and swap rename sets complete without data loss
  • Pre-flight rejects conflicting target sets with a clear message
  • renameat2(RENAME_NOREPLACE) used on Linux; fallback documented
  • Parent directories fsync'd after each phase
  • Tests cover: chain, swap, 3-cycle, partial phase-1 failure rollback, destination-already-exists pre-flight

Labels: enhancement
