fileio: coalesce --sparse writes instead of 1-KiB dribbles (fixes #773) #1019

Open
dr-who wants to merge 1 commit into RsyncProject:master from dr-who:fix-773-sparse-write-coalesce

dr-who commented Jun 30, 2026

Problem (#773)

write_file()'s --sparse path slices each span into SPARSE_WRITE_SIZE (1024-byte) pieces, and write_sparse() issues one write() syscall per slice, bypassing the 256 KB buffering the normal path uses. Copying a large non-sparse file with --sparse therefore costs roughly one write() per kilobyte — about a million write() calls for a 1 GiB file.

The reporter measured a 1 GiB random file:

  • rsync --sparse → 1.36 MB/s (would take ~12 min)
  • rsync (no --sparse) → 391 MB/s (~2 s)

i.e. ~280× slower purely from syscall overhead. The 1024-byte chunk is also smaller than a filesystem block, so it can't even create finer holes than a plain copy.

Reproduced here on a 200 MB file: 201,438 write() syscalls (≈ size/1024).
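The "one write() per kilobyte" estimate can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: `old_path_write_estimate` is a hypothetical helper, not an rsync function, and it assumes a hole-free file so that every 1024-byte slice turns into a write().

```python
SPARSE_WRITE_SIZE = 1024  # slice size named in the description above


def old_path_write_estimate(size_bytes: int, slice_size: int = SPARSE_WRITE_SIZE) -> int:
    """Estimate write() calls for a hole-free file: one per slice (ceiling division)."""
    return -(-size_bytes // slice_size)


GIB = 1024 ** 3
print(old_path_write_estimate(GIB))   # 1048576, i.e. "about a million"
print(f"{391 / 1.36:.1f}")            # 287.5, the ratio of the two reported throughputs
```

Measured counts will not match the estimate exactly, since strace also sees writes that are not file data and the final slice of a token may be short.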

Fix

Rewrite write_sparse() to scan the whole span itself: it finds interior runs of zeros at least SPARSE_WRITE_SIZE long — the same hole granularity rsync has always used — and emits each intervening non-zero region (which may include shorter zero runs not worth a hole) with a single write(). A small flush_sparse_hole() helper defers/flushes holes between segments; do_punch_hole() advances the file offset just like the lseek() path, so the position stays correct.

Hole granularity is unchanged, so sparseness is identical — only the syscall pattern changes.
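To make the scan described above concrete, here is a minimal Python model of the segmentation step. It is not the C code from this PR: `plan_span` is a hypothetical name, it only plans segments rather than calling write(), lseek() or do_punch_hole(), and it treats zero runs at the ends of a span like interior ones, so any carrying of a hole across calls is not modelled.

```python
SPARSE_WRITE_SIZE = 1024  # hole granularity named in the description above


def plan_span(buf: bytes, min_hole: int = SPARSE_WRITE_SIZE):
    """Split buf into ("data", start, end) and ("hole", start, end) segments.

    Zero runs shorter than min_hole are left inside the surrounding data
    segment, so each data segment corresponds to a single write().
    """
    segments = []
    data_start = 0  # start of the data segment currently being accumulated
    i = 0
    n = len(buf)
    while i < n:
        if buf[i] != 0:
            i += 1
            continue
        run_start = i
        while i < n and buf[i] == 0:
            i += 1
        if i - run_start >= min_hole:
            # Zero run is long enough to be a hole: close the pending data segment.
            if run_start > data_start:
                segments.append(("data", data_start, run_start))
            segments.append(("hole", run_start, i))
            data_start = i
    if data_start < n:
        segments.append(("data", data_start, n))
    return segments


span = b"A" * 3000 + b"\0" * 100 + b"B" * 500 + b"\0" * 2048 + b"C" * 10
print(plan_span(span))
# [('data', 0, 3600), ('hole', 3600, 5648), ('data', 5648, 5658)]
```

In this example the 100-byte zero run is below the hole threshold and stays inside the first data segment, so the span needs two writes and one hole.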

Improvement

Measured on a 100 MiB random (hole-free) file:

|        | write() syscalls |
|--------|------------------|
| before | 100,730          |
| after  | 6,125 (~16×)     |

The count now tracks the data's natural chunking (≤ CHUNK_SIZE tokens) instead of its size in kilobytes. On slow/high-latency storage (the USB/RAID in the report) this is the difference between the 1.36 MB/s and ~full-speed cases.
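The figures above imply a reduction factor and an average write size, worked out below. The helper names are hypothetical, and the average is only rough because the counted write() calls may include some that are not file data.

```python
def reduction_factor(before: int, after: int) -> float:
    """How many times fewer write() calls the new path issues."""
    return before / after


def avg_write_kib(size_bytes: int, writes: int) -> float:
    """Average bytes per write(), expressed in KiB."""
    return size_bytes / writes / 1024


size = 100 * 1024 ** 2  # the 100 MiB test file
print(f"{reduction_factor(100_730, 6_125):.1f}x fewer writes")   # 16.4x fewer writes
print(f"{avg_write_kib(size, 6_125):.1f} KiB per write")         # 16.7 KiB per write
```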

Verified byte-identical and equally-sparse output across: hole-free, large-hole, small (8 KB) interior-hole (stays 516 K allocated — a naive constant bump would balloon it to 772 K), all-zeros, --inplace (the use_seek path), and --preallocate (the punch_hole path). Full testsuite passes under valgrind with no errors.
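A check of this kind can be sketched in Python as below. This is an assumption about how "byte-identical and equally sparse" could be tested, not the procedure actually used for this PR. `compare_copy` is a hypothetical helper, and it relies on `st_blocks`, which `os.stat` reports on Unix-like systems only.

```python
import filecmp
import os


def compare_copy(src: str, dst: str):
    """Return (same_bytes, same_allocation) for two files.

    same_bytes compares full contents; same_allocation compares the
    allocated block counts reported by stat (Unix only).
    """
    same_bytes = filecmp.cmp(src, dst, shallow=False)
    same_allocation = os.stat(src).st_blocks == os.stat(dst).st_blocks
    return same_bytes, same_allocation
```

To compare sparseness before and after a change, one would run the copy with each rsync build and call `compare_copy` on the two destination files. Allocation can legitimately differ between filesystems, so both copies should land on the same one.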

Test

testsuite/sparse-write-count_test.py copies a 16 MiB hole-free file under strace and asserts the write() count stays far below the old size/1024 behaviour. It fails on stock master (16,919 writes) and passes with this change (a few hundred); it skips cleanly where strace is unavailable (non-Linux / no ptrace).
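The counting idea can be illustrated with a small parser for strace output. This is not the contents of testsuite/sparse-write-count_test.py. The helper names are hypothetical, the sample trace is hand-written, and the `margin` threshold is an assumption, since the description only says the count must stay "far below" size/1024.

```python
import re

# Match write( at the start of a line, optionally preceded by a pid prefix
# such as "[pid 4242] " or "4243 "; pwrite64( and writev( do not match.
_WRITE_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?write\(")


def count_write_calls(strace_output: str) -> int:
    """Count write() syscall lines in strace text output."""
    return sum(1 for line in strace_output.splitlines() if _WRITE_RE.match(line))


def is_coalesced(write_count: int, size_bytes: int, margin: int = 4) -> bool:
    """True if the count is well below one write per KiB (margin is illustrative)."""
    return write_count < size_bytes // 1024 // margin


sample = """\
openat(AT_FDCWD, "dst", O_WRONLY|O_CREAT, 0600) = 3
write(3, "abc"..., 32768) = 32768
[pid 4242] write(3, "def"..., 32768) = 32768
4243 write(1, "x", 1) = 1
pwrite64(3, "zzz", 3, 0) = 3
writev(3, [...], 2) = 10
"""
print(count_write_calls(sample))  # 3
```

With this illustrative margin, the 16,919 writes reported for stock master on a 16 MiB file fail the check, while a count in the low hundreds passes.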

…ncProject#773)

write_file()'s sparse path sliced each span into SPARSE_WRITE_SIZE (1024-byte)
pieces and write_sparse() issued one write() syscall per slice.  Copying a
large *non-sparse* file with --sparse therefore cost roughly one write() per
kilobyte -- about a million write() calls for a 1 GiB file -- which on real
storage ran far slower than the same copy without --sparse (the bug report
measured 1.36 MB/s vs 391 MB/s, ~280x).  The 1024-byte chunk is also smaller
than a filesystem block, so it cannot even create finer holes than a plain
copy could.

Rewrite write_sparse() to scan the whole span itself: it looks for interior
runs of zeros that are at least SPARSE_WRITE_SIZE long -- the same hole
granularity rsync has always used -- and emits each intervening non-zero
region (which may include shorter zero runs not worth a hole) with a single
write().  do_punch_hole() advances the file offset just like the lseek() path,
so flushing a deferred hole between segments keeps the position correct.

The hole granularity is unchanged, so sparseness is identical; only the
syscall pattern changes.  Measured on a 100 MiB random (hole-free) file:
write() syscalls drop from 100,730 to 6,125 (~16x), now tracking the data's
natural chunking rather than its size in kilobytes.  Verified byte-identical
and equally sparse output for hole-free, large-hole, small-interior-hole,
all-zero, --inplace, and --preallocate cases.

testsuite/sparse-write-count_test.py copies a 16 MiB hole-free file under
strace and asserts the write() count stays far below the old size/1024
behaviour (it skips where strace is unavailable).