Fix %aunique{} not disambiguating sanitized album name collisions#6582
Fix %aunique{} not disambiguating sanitized album name collisions#6582marubio04 wants to merge 2 commits intobeetbox:masterfrom
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #6582 +/- ##
==========================================
+ Coverage 71.78% 71.79% +0.01%
==========================================
Files 159 159
Lines 20526 20535 +9
Branches 3264 3266 +2
==========================================
+ Hits 14734 14743 +9
Misses 5104 5104
Partials 688 688
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
PR fix %aunique{} so it disambiguate albums that different in raw metadata but become same after path sanitization (ex: 1/1 vs 1?1). This fit beets path formatting + import/move flow where album dirs must not collide.
Changes:
- Update
%aunique{}collision set to consider path-sanitized values, and sanitize disambiguator values when picking a field. - Add regression test for sanitized album-name collision.
- Add changelog entry for bug 6462.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
beets/library/models.py |
Teach unique template logic to consider sanitized collisions and sanitized disambiguator values. |
test/test_library.py |
Add test covering sanitized album name collision case. |
docs/changelog.rst |
Document bugfix (and adds an extra 6450 bullet too). |
| disam_values = { | ||
| util.sanitize_path( | ||
| s.formatted(for_path=True).get(disambiguator, "") | ||
| ) | ||
| for s in ambigous_items |
There was a problem hiding this comment.
grug worry sanitize_path here also ignore library custom replacements. if user set replace rules, disambiguator uniqueness check might pick field that look unique pre-replace but collide on disk.
grug suggest pass replacements=self.lib.replacements into sanitize_path.
| if isinstance(db_item, Album): | ||
| all_items = self.lib.albums() | ||
|
|
There was a problem hiding this comment.
grug see perf danger. this code do self.lib.albums() full scan inside %aunique{} compute. big library => many work, maybe O(N^2) over many albums.
| all_items = self.lib.albums() | ||
|
|
||
| def path_key(item): | ||
| formatted = item.formatted(for_path=True) |
There was a problem hiding this comment.
grug also note formatted(for_path=True) default include_keys='*' trigger extra db query for keys (see FormattedMapping). inside full scan, that hurt a lot. grug suggest use item.formatted(included_keys=keys, for_path=True) (and maybe skip scan when sanitize no change).
| formatted = item.formatted(for_path=True) | |
| formatted = item.formatted( | |
| included_keys=keys, | |
| for_path=True, | |
| ) |
| - Correctly handle semicolon-delimited genre values from externally-tagged | ||
| files. :bug:`6450` |
There was a problem hiding this comment.
grug see changelog add 🐛6450 entry here, but this PR diff show no code for that bug. also same 6450 line already exist later in file, so this look like duplicate / wrong backport.
grug suggest drop this 6450 bullet (or move it only if PR actually include that fix).
| def path_key(item): | ||
| formatted = item.formatted(for_path=True) | ||
| return tuple( | ||
| util.sanitize_path(str(formatted.get(key, ""))) |
There was a problem hiding this comment.
grug worry sanitize_path here ignore library custom replacements. real path build use replacements=self._db.replacements, so collision detect may disagree with actual path when user set replace rules.
grug suggest pass replacements=self.lib.replacements (or self._db.replacements) into sanitize_path call.
| util.sanitize_path(str(formatted.get(key, ""))) | |
| util.sanitize_path( | |
| str(formatted.get(key, "")), | |
| replacements=self.lib.replacements, | |
| ) |
|
Hey @marubio04! Could you please give details on the performance impact of this change? For mid to large libraries listing and computing paths for all albums can take seconds, or tens of seconds. If I'm not mistaken with this change this will happen each time we want the path of an item (excluding cases when conflicts are dectected with the query, which should be very rare). This could dramatically slow beets down for day to day operations. |
Fixes #6462
Summary
This fixes
%aunique{}disambiguation for albums whose raw names differ but collide after path sanitization, such as1/1and1?1.Changes
Testing