LB-1852: Add file importer for Audioscrobbler spec #3452

Merged
MonkeyDo merged 28 commits into metabrainz:master from shirsakm:scrobbler-log-importer
Jan 30, 2026

Conversation

Contributor

@shirsakm shirsakm commented Dec 16, 2025

This PR introduces an SQL schema update. Don't forget to update the database and the background tasks container.

Problem

As reported in LB-1852 by user "UltimateRiff", Rockbox and other hardware media players produce a .scrobbler.log file that follows the Audioscrobbler spec. This file contains enough data to import played tracks as listens. The spec describes a TSV file with the following expected columns:

 - artist name
 - album name (optional)
 - track name
 - track position on album (optional)
 - song duration in seconds
 - rating (L if listened at least 50% or S if skipped)
 - unix timestamp when song started playing
 - MusicBrainz Track ID (optional)
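
As a rough illustration of the column layout above, here is a minimal sketch of reading one data row with Python's csv module. The names `ScrobblerRow` and `parse_row` are hypothetical and are not taken from this PR; the sample row is made up, and the sketch assumes optional columns appear as empty strings.

```python
import csv
import io
from typing import NamedTuple, Optional


class ScrobblerRow(NamedTuple):
    """One data row of a .scrobbler.log file (hypothetical container)."""
    artist: str
    album: Optional[str]
    track: str
    position: Optional[int]
    duration: int
    rating: str  # "L" = listened, "S" = skipped
    timestamp: int
    track_mbid: Optional[str]


def parse_row(fields: list[str]) -> ScrobblerRow:
    # Pad short rows so a missing trailing MBID column does not break unpacking.
    fields = fields + [""] * (8 - len(fields))
    artist, album, track, position, duration, rating, timestamp, mbid = fields[:8]
    return ScrobblerRow(
        artist=artist,
        album=album or None,
        track=track,
        position=int(position) if position else None,
        duration=int(duration),
        rating=rating,
        timestamp=int(timestamp),
        track_mbid=mbid or None,
    )


# Made-up sample row: tab-separated, with an empty MBID column at the end.
sample = "Example Artist\tExample Album\tExample Track\t1\t200\tL\t1700000000\t\n"
reader = csv.reader(io.StringIO(sample), delimiter="\t", quoting=csv.QUOTE_NONE)
row = parse_row(next(reader))
```

Using `csv.QUOTE_NONE` is a design choice in this sketch so that quote characters inside track names are treated as ordinary text.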

Solution

I have extended the BaseListensImporter class in audioscrobbler.py, similar to #3399 and #3400.

The implementation is similar to librefm.py in that it uses the csv module, but the similarity ends there. I also used this script by Lucifer as a reference.

As for the relevant differences and notes:

  • Added a _parse_header(...) function to parse the header metadata. It is not strictly necessary, as the only information we extract is the original_submission_client field, if present, but it is nice to have.
  • Added a _filtered_rows(...) function, which is the important part. Specifically, it does the following:
    • Ignores entries with the S rating. I believe this is consistent with how LB already decides whether a listen should be stored, i.e. whether 50% of the song was played, though I couldn't confirm this.
    • Ignores entries where the Artist field is "<Untagged>", because those tend to be local files like .mp4 or .mp3, though I don't have a source confirming that this is the only case.
    • More importantly, this skews the import count, because these rows are discarded before the validation and parsing function is even called. More on how I think this could be remedied, if needed, later.
  • Lastly, header aliases are ignored, mainly because the official spec doesn't mention any names for the headers. The relevant Rockbox page (which frequently returns a 403) does specify names for these aliases, but I believe supporting them isn't necessary, since even Rockbox doesn't seem to use them all the time. For example, this Reddit post from a Rockbox user contains no aliases, though the way it was posted makes it unreliable for any other inference.
  • I have also added a test in test_listen_importer.py to cover this functionality. I don't have any notes about it.
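
The header parsing and row filtering described above can be sketched roughly as follows. This is not the code from audioscrobbler.py: the function names mirror the PR's description but the bodies are assumptions, and the `#AUDIOSCROBBLER/`, `#TZ/` and `#CLIENT/` header lines reflect my understanding of the log format rather than anything stated in this PR.

```python
import csv
from typing import Iterator, Optional


def parse_header(lines: list[str]) -> Optional[str]:
    """Return the client name from a '#CLIENT/...' header line, if present."""
    prefix = "#CLIENT/"
    for line in lines:
        if line.startswith(prefix):
            return line[len(prefix):].strip() or None
    return None


def filtered_rows(text: str) -> Iterator[list[str]]:
    """Yield data rows, dropping headers, skipped tracks and untagged files."""
    data_lines = (
        line for line in text.splitlines() if line and not line.startswith("#")
    )
    for fields in csv.reader(data_lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        artist = fields[0] if fields else ""
        rating = fields[5] if len(fields) > 5 else ""
        if rating == "S":
            continue  # skipped before 50% of the track was played
        if artist == "<Untagged>":
            continue  # likely a local file without metadata
        yield fields


# Made-up sample log; the header lines are an assumption about the format.
SAMPLE_LOG = (
    "#AUDIOSCROBBLER/1.1\n"
    "#TZ/UNKNOWN\n"
    "#CLIENT/Example Player 1.0\n"
    "Artist A\tAlbum\tTrack 1\t1\t200\tL\t1700000000\t\n"
    "Artist A\tAlbum\tTrack 2\t2\t180\tS\t1700000200\t\n"
    "<Untagged>\t\tclip\t\t60\tL\t1700000400\t\n"
)

client = parse_header(SAMPLE_LOG.splitlines())
kept = list(filtered_rows(SAMPLE_LOG))
```

Note that rows dropped inside `filtered_rows` never reach any later validation step, which is exactly why they do not show up in the import count.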

P.S.: There is only one change to the frontend, which fixes a visual discrepancy that probably only I noticed in its long life of 3 weeks.

How it looked with a trailing comma and the space included in the code block formatting:
image

How it looks now that it's fixed:
image

Action

Here I mainly want to discuss how the incorrect success count can be fixed. First, the same issue exists for the Panoscrobbler and Maloja importers as well. This doesn't change the fact that it is a bug; it just means the flaw is not limited to this importer, and the base class should probably be updated to address it.

Second, as for ways to fix it: one option is to update the base class so that importers also yield the number of discarded rows, and to handle that count downstream. This is probably fine, because only about 4 files implement this base class, of which only 3 need the change, and it can be updated fairly easily. But maybe it should be a different PR?

I think it could be easier to set a "skipped" value on the row of data and yield it as well, instead of discarding it. This can be implemented on a case-by-case basis, for example for skipped S-rated songs or "<Untagged>" songs, which the user should probably know about. On the other hand, for plain date filtering (as in Maloja and PanoScrobbler), it makes sense that those rows were skipped without being attempted, so they shouldn't be counted or reported.
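
To make the second option concrete, here is a minimal sketch of yielding rows together with a skip reason instead of discarding them. All names here (`annotate_rows`, `summarise`, the reason strings and the dict keys) are hypothetical and do not exist in the base class; this only illustrates the idea.

```python
from collections import Counter
from typing import Iterable, Iterator, Optional


def annotate_rows(rows: Iterable[dict]) -> Iterator[tuple[dict, Optional[str]]]:
    """Yield every row with a skip reason, or None if it should be imported."""
    for row in rows:
        if row.get("rating") == "S":
            yield row, "skipped_rating"
        elif row.get("artist") == "<Untagged>":
            yield row, "untagged"
        else:
            yield row, None


def summarise(rows: Iterable[dict]) -> Counter:
    """Count seen, imported and skipped rows so skips can be reported."""
    counts: Counter = Counter()
    for _row, reason in annotate_rows(rows):
        counts["seen"] += 1
        if reason is None:
            counts["imported"] += 1
        else:
            counts[f"skipped:{reason}"] += 1
    return counts


# Made-up rows for illustration.
rows = [
    {"artist": "Artist A", "rating": "L"},
    {"artist": "Artist A", "rating": "S"},
    {"artist": "<Untagged>", "rating": "L"},
]
counts = summarise(rows)
```

Date-filtered rows would simply not be yielded at all in this scheme, so they stay out of the reported totals.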

In conclusion, I am leaning towards the latter, but I am not sure which would be better, or if it even is a bug for that matter. So please let me know your thoughts on what I should do.

@shirsakm shirsakm marked this pull request as ready for review December 16, 2025 20:17
Comment thread listenbrainz/background/listens_importer/audioscrobbler.py Outdated
@shirsakm
Contributor Author

The frontend tests are failing because the frontend for the "Import Listens" page is being updated, which I will implement once Monkey gives the go-ahead. I will then update the frontend tests based on the final layout.

I also seem to have added a couple of unrelated formatting changes because of my Prettier plugin formatting files on save. Let me know if I should revert those.

@shirsakm
Contributor Author

Unless I am missing something, I believe everything is done, and the PR is ready to be reviewed. Please let me know if more frontend tests / testcases need to be added.

I have updated the frontend tests to check for the accordion, and all tests seem to be passing.

I have also disabled the "Timezone" entry for all services that are not the Audioscrobbler spec.
https://github.com/shirsakm/listenbrainz-server/blob/1fa356d31ed600bd13915268287fca4667f7f65e/frontend/js/src/settings/import/ImportListens.tsx#L589
I don't know how scalable this would be if new importers are added that need to use the "Timezone" field as well, but it works for now.

I also added a test to check that it is disabled, purely because I wanted to use my newfound ability to write frontend tests for something. More tests can't hurt, I reckon.

Member

@MonkeyDo MonkeyDo left a comment

Looking good! Some final details; as far as I can tell, there is not much else left before the feature is ready.

Comment thread listenbrainz/background/listens_importer/audioscrobbler.py Outdated
Comment thread listenbrainz/background/listens_importer/audioscrobbler.py
Comment thread listenbrainz/tests/integration/test_listens_importer.py Outdated
Comment thread listenbrainz/webserver/views/settings.py Outdated
Comment thread listenbrainz/testdata/.scrobbler.log
Comment thread frontend/js/src/settings/import/ImportListens.tsx Outdated
shirsakm and others added 4 commits January 20, 2026 23:12
Added a comment to explain the `L` and `S` values of the rating field.

Co-authored-by: Monkey Do <MonkeyDo@users.noreply.github.com>
Co-authored-by: Monkey Do <MonkeyDo@users.noreply.github.com>
@shirsakm shirsakm requested a review from MonkeyDo January 22, 2026 21:29
@shirsakm
Contributor Author

@MonkeyDo I have made the required changes. I will look into adding a frontend test to verify the timezone setting, if I can.

Member

@MonkeyDo MonkeyDo left a comment

Looking great, working great!
Thanks for another good addition to the importer toolbox :)

Thinking about the filtering vs. parsing process and how that affects the attempt count has been quite useful to clarify the separate steps, and will inform future importers.
It's one thing to filter by date or (in this case) skipped listens, and another to attempt and fail to parse listens that passed the filtering step.

On the topic of failed listen parsing, it might be slightly frustrating to users to be told only "Some listens were rejected.", but it is 100% preferable to silently dropping items and pretending we got it all perfect.

Great job!

@MonkeyDo
Member

I'm going to hold off on merging this as we need to run SQL schema updates and coordinate to deploy other containers.
To be deployed at the same time as #3498

@MonkeyDo MonkeyDo merged commit 3e934e2 into metabrainz:master Jan 30, 2026
3 checks passed
@shirsakm shirsakm deleted the scrobbler-log-importer branch March 6, 2026 11:29