compact: free space per pack with --threshold #9801

Merged
ThomasWaldmann merged 4 commits into borgbackup:master from mr-raj12:pack-files-compact
Jun 24, 2026

Conversation

@mr-raj12 mr-raj12 commented Jun 23, 2026

Contributor

Description

Moves borg compact from deleting single objects to compacting whole packs, so it keeps working once a pack holds more than one object (N>1).

For each pack:

  • all objects unused: drop the pack file.
  • all objects used: leave it.
  • mixed: rewrite only if the unused bytes reach --threshold percent (default 10), copying the survivors into a new pack via Repository.compact_pack and dropping the old one. Below the threshold the pack is left alone, so a large pack is not rewritten just to reclaim a few bytes.
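
The three cases above can be sketched as a small decision function. This is a minimal illustration, not the code from this PR: `PackUsage`, `Action`, and `decide` are hypothetical names, and the sketch assumes "reach" means greater than or equal to the threshold.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    DROP = "drop"        # every object is unused: delete the pack file
    KEEP = "keep"        # leave the pack untouched
    REWRITE = "rewrite"  # copy survivors into a new pack, drop the old one


@dataclass
class PackUsage:
    # Hypothetical per-pack summary; field names are assumptions.
    used_count: int
    unused_count: int
    used_bytes: int
    unused_bytes: int


def decide(pack: PackUsage, threshold_percent: float = 10.0) -> Action:
    """Pick an action for one pack, following the three cases in the description."""
    if pack.unused_count == 0:
        return Action.KEEP  # all objects used
    if pack.used_count == 0:
        return Action.DROP  # all objects unused
    total = pack.used_bytes + pack.unused_bytes
    # Mixed pack: rewrite only if the unused share reaches the threshold.
    unused_percent = 100.0 * pack.unused_bytes / total if total else 0.0
    return Action.REWRITE if unused_percent >= threshold_percent else Action.KEEP
```

With the default of 10, a mixed pack with 10 unused bytes out of 100 would be rewritten, while one with 9 unused bytes out of 100 would be left alone.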

--dry-run reports the space compact would free without changing the repository. --stats reports the sum of stored object sizes before and after; the sizes come from the cached chunk index, so no extra full scan over the repository objects is needed.

The chunk index is scanned twice to keep memory bounded: first only per-pack byte counts to decide each pack's fate, then the object ids of just the packs that change. The #9748 crash-safety order is preserved: cached chunk indexes are invalidated before the first store change.
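
A rough sketch of the two-scan idea, under stated assumptions: `Entry`, `Plan`, and `plan_compaction` are hypothetical names, the index is modelled as a callable that returns a fresh iterator on each call, and "all objects unused" is approximated by a pack having zero used bytes. The real chunk index API is not shown in this conversation.

```python
from collections import defaultdict
from typing import Callable, Iterator, NamedTuple


class Entry(NamedTuple):
    # Hypothetical index entry; the real chunk index layout may differ.
    obj_id: str
    pack_id: str
    size: int
    used: bool


class Plan(NamedTuple):
    drop: set        # packs to delete entirely
    rewrite: set     # mixed packs at or above the threshold
    survivors: dict  # pack id -> ids of used objects to copy forward
    removed: dict    # pack id -> ids of unused objects that go away


def plan_compaction(scan: Callable[[], Iterator[Entry]],
                    threshold_percent: float = 10.0) -> Plan:
    # Scan 1: keep only two integers per pack, no object ids.
    used_bytes = defaultdict(int)
    unused_bytes = defaultdict(int)
    for e in scan():
        (used_bytes if e.used else unused_bytes)[e.pack_id] += e.size

    drop, rewrite = set(), set()
    for pack_id, unused in unused_bytes.items():
        used = used_bytes.get(pack_id, 0)
        if used == 0:
            drop.add(pack_id)
        elif 100.0 * unused / (used + unused) >= threshold_percent:
            rewrite.add(pack_id)

    # Scan 2: collect object ids, but only for packs that will change.
    survivors = defaultdict(list)
    removed = defaultdict(list)
    for e in scan():
        if e.pack_id in drop or e.pack_id in rewrite:
            (survivors if e.used else removed)[e.pack_id].append(e.obj_id)
    return Plan(drop, rewrite, dict(survivors), dict(removed))
```

The point of the split is that the first scan's memory grows with the number of packs, and the id lists from the second scan grow only with the packs that are dropped or rewritten.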

At N=1 every pack holds one object, so mixed packs never occur and the behavior is unchanged. The rewrite path is covered by a test that forces max_count > 1.

This recycles the approach from #9777, which can be closed.

refs #8572 #8514

Checklist

  • PR is against master
  • New code has tests
  • Tests pass
  • Commit messages are clean and reference related issues

@codecov

codecov Bot commented Jun 23, 2026

Codecov Report

❌ Patch coverage is 97.22222% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.77%. Comparing base (5237f9a) to head (4a28995).
⚠️ Report is 8 commits behind head on master.
✅ All tests successful. No failed tests found.

Files with missing lines | Patch % | Lines
src/borg/archiver/compact_cmd.py | 97.22% | 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9801      +/-   ##
==========================================
- Coverage   84.78%   84.77%   -0.01%     
==========================================
  Files          92       92              
  Lines       15251    15289      +38     
  Branches     2286     2297      +11     
==========================================
+ Hits        12930    12961      +31     
- Misses       1621     1630       +9     
+ Partials      700      698       -2     

@ThomasWaldmann ThomasWaldmann left a comment

Member

some stuff i found...

Comment thread src/borg/archiver/compact_cmd.py Outdated
Comment thread src/borg/archiver/compact_cmd.py Outdated
Comment thread src/borg/archiver/compact_cmd.py
Comment thread src/borg/testsuite/archiver/compact_cmd_test.py Outdated
ThomasWaldmann commented Jun 23, 2026

Member

Also please rebase on current master (I have merged #9800 now).

@ThomasWaldmann

Member

Have a look at that, maybe you can implement --dry-run rather easily?

#9379

ThomasWaldmann commented Jun 23, 2026

Member

I did a test run with slightly modified code (N=2) and got this. So guess the 100% display issue is not fixed yet?

borg compact --progress
Starting compaction / garbage collection...
Getting object IDs present in the repository...
Computing object IDs used by archives...

Cleaning archives directory from soft-deleted archives...
Deleting 4992 unused objects...
Compacting packs 0.0%
Compacting packs 0.1%
Compacting packs 0.2%
...
Compacting packs 99.9%

Overall statistics, considering all 0 archives in this repository:
Source data size was 0 B in 0 files.
Repository size is 0 B in 0 objects.
Compaction saved 39 MB.
Cleaning up files cache...
Removed 1 unused files cache files.
Finished compaction / garbage collection...

Also: even without -v, it is very verbose:

% borg compact                                                                    
Starting compaction / garbage collection...
Getting object IDs present in the repository...
Computing object IDs used by archives...
Analyzing archive arch2 2026-06-23 17:30:11.952749+02:00 ad56058509ce0dffa8e080f52c35a55008dd1279a38c7a6bf9f5037f279811f6 (1/2)
Analyzing archive arch2 2026-06-23 17:30:18.510445+02:00 b9ac90e796a20b53ed7759b3df0bec14489ca4cac53ca9d6eae175c154152a22 (2/2)
Cleaning archives directory from soft-deleted archives...
Deleting 1 unused objects...
Overall statistics, considering all 2 archives in this repository:
Source data size was 305 MB in 10552 files.
Repository size is 39 MB in 4992 objects.
Compaction saved 393 B.
Cleaning up files cache...
Removed 0 unused files cache files.
Finished compaction / garbage collection...

I suggest you try this manually to get a feel of it and find such issues yourself.
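
For illustration only: the cause of the 99.9% ending in the PR's code is not shown in this conversation. The sketch below shows one generic way to make a percentage display start at 0.0% and finish at 100.0%, by reporting once up front and then after each item completes. `progress_lines` is a hypothetical helper, not borg's progress API.

```python
def progress_lines(total: int, label: str = "Compacting packs") -> list[str]:
    """Return the progress lines a loop over `total` items would print."""
    if total <= 0:
        return []
    lines = [f"{label} 0.0%"]
    last = "0.0%"
    for done in range(1, total + 1):
        # ... process item number `done` here ...
        pct = f"{100.0 * done / total:.1f}%"
        if pct != last:  # skip repeats when many items map to the same tenth
            lines.append(f"{label} {pct}")
            last = pct
    return lines
```

Because the percentage is computed from the count of finished items, the last line is always 100.0% once the loop has processed everything.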

@mr-raj12 mr-raj12 marked this pull request as ready for review June 24, 2026 05:40
mr-raj12 added 2 commits June 24, 2026 11:10
Group objects by their pack and act per pack: drop fully-unused packs, and
rewrite mixed packs whose unused bytes reach --threshold (default 10%) by
copying the survivors forward with compact_pack. Two index scans keep the
memory use bounded. refs borgbackup#8572 borgbackup#8514
Rewrite a pack only when its unused bytes reach --threshold percent; --dry-run reports the space compact would free without changing the repository (borgbackup#9379).
@mr-raj12 mr-raj12 force-pushed the pack-files-compact branch from 464c91a to b1862f3 Compare June 24, 2026 05:44

@ThomasWaldmann ThomasWaldmann left a comment

Member

no manifest talk please, we do not have a manifest (in the original meaning) any more.

Comment thread src/borg/archiver/compact_cmd.py Outdated
@ThomasWaldmann

Member

About my feedback that it is verbose even without -v: my fault, I guess I forgot that I had tweaked borg's defaults via a jsonargparse settings file. So, ignore that.

@ThomasWaldmann

Member

I have seen that borg compact also has a --stats option.

Is there a difference now with and without it?

IIRC, I introduced that back then because it was much slower with stats than without.

@mr-raj12 mr-raj12 force-pushed the pack-files-compact branch from b1862f3 to 0586cc7 Compare June 24, 2026 12:44
@mr-raj12

Contributor Author

> I have seen that borg compact also has a --stats option.
>
> Is there a difference now with and without it?
>
> IIRC, I introduced that back then because it was much slower with stats than without.

Yes. --stats does a full repo scan, so it is slower but gives size numbers. Without it, borg uses the cached index, so it is faster but shows no sizes.

@mr-raj12 mr-raj12 requested a review from ThomasWaldmann June 24, 2026 14:49

@ThomasWaldmann ThomasWaldmann left a comment

Member

LGTM.

@ThomasWaldmann ThomasWaldmann merged commit e57e22d into borgbackup:master Jun 24, 2026
19 checks passed