Bugzilla – Bug 15776
7.4.x scanner is confused by multiple versions of same album
Last modified: 2010-11-11 00:44:30 UTC
The scanner in the 7.4 series is confused by multiple versions of the same album. I had this problem with 7.4.0 when it was released and reverted to 7.3.2, as it's a show-stopper for me. I tried today's 7.4.2 release and I'm still encountering the same issue. This problem does not exist in 7.3.2.

Here is my test case: create a top-level directory with two subdirectories. One subdirectory contains 13 FLAC-encoded tracks from the stereo version of The Beatles 2009 re-release of Sgt. Pepper's, and the other contains 13 FLAC-encoded tracks from the mono version of The Beatles 2009 re-release of the same album. There are no other files or subdirectories of any kind under the top-level directory (no CUE files, no M3U files, etc.). The directory names are identical except for the word "stereo" or "mono":

The Beatles - Sgt. Pepper's Lonely Hearts Club Band (2009 mono remaster)/
The Beatles - Sgt. Pepper's Lonely Hearts Club Band (2009 stereo remaster)/

All of the FLAC files have been tagged by MusicBrainz Picard version 0.11. Obviously, both of these albums have the exact same track titles, track order, artist, and album name, since they're the same album with the only difference being how they were mastered.

Here's an example of the tags: the first section shows the tags on the stereo version of track 1, and the second section shows the tags on the mono version of track 1:

-- The Beatles - Sgt. Pepper's Lonely Hearts Club Band (2009 stereo remaster)/01 - Sgt. Pepper's Lonely Hearts Club Band.flac - FLAC, 122.89 seconds, 44100 Hz (audio/x-flac)
producer=George Martin
format=CD
releasecountry=XE
label=Apple Records
totaltracks=13
musicbrainz_albumartistid=b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d
composer=Paul McCartney
composer=John Lennon
date=2009-09-09
engineer=Geoff Emerick
comment=2009 EMI/Apple stereo remaster
asin=B000002UAU
albumartistsort=Beatles, The
language=eng
script=Latn
title=Sgt. Pepper's Lonely Hearts Club Band
musicbrainz_albumid=44b7cab1-0ce1-404e-9089-b458eb3fa530
releasestatus=official
albumartist=The Beatles
catalognumber=5099969945922
album=Sgt. Pepper's Lonely Hearts Club Band
musicbrainz_artistid=b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d
releasetype=album
performer=Paul McCartney (lead vocal)
artist=The Beatles
musicbrainz_trackid=237acd31-db9b-40e0-9263-9659a10ec98f
artistsort=Beatles, The
genre=Classic Rock
tracknumber=1

-- The Beatles - Sgt. Pepper's Lonely Hearts Club Band (2009 mono remaster)/01 - Sgt. Pepper's Lonely Hearts Club Band.flac - FLAC, 122.51 seconds, 44100 Hz (audio/x-flac)
producer=George Martin
format=CD
releasecountry=XW
label=EMI Records
totaltracks=13
musicbrainz_albumartistid=b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d
composer=Paul McCartney
composer=John Lennon
date=2009-09-09
engineer=Geoff Emerick
comment=2009 EMI/Apple mono remaster
asin=B000002UAU
albumartistsort=Beatles, The
language=eng
script=Latn
title=Sgt. Pepper's Lonely Hearts Club Band
musicbrainz_albumid=44b7cab1-0ce1-404e-9089-b458eb3fa530
releasestatus=official
albumartist=The Beatles
catalognumber=PMC 7027
album=Sgt. Pepper's Lonely Hearts Club Band
musicbrainz_artistid=b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d
releasetype=album
performer=Paul McCartney (lead vocal)
artist=The Beatles
musicbrainz_trackid=237acd31-db9b-40e0-9263-9659a10ec98f
artistsort=Beatles, The
genre=Classic Rock
tracknumber=1

As you can see, the tags are identical except for the label, release country, catalog number, and a comment that I added manually in Picard so that I could tell the tracks apart in the SqueezeCenter/SqueezeBox Server interface. Again, this is as it should be: both of these albums have the same name, tracks, artist, etc., and the fact that they're mastered differently is of no consequence.
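The claim that the two tag sets differ in only four fields can be checked mechanically. The following is a minimal Python sketch, not part of the original report: the dictionaries hold a subset of the tag values copied from the two dumps above, and `differing_tags` is a hypothetical helper name introduced only for illustration.

```python
# Subset of the tags from the two dumps in the report (values copied verbatim).
# The helper name `differing_tags` is hypothetical, used only for this sketch.
COMMON = {
    "album": "Sgt. Pepper's Lonely Hearts Club Band",
    "albumartist": "The Beatles",
    "artist": "The Beatles",
    "date": "2009-09-09",
    "asin": "B000002UAU",
    "musicbrainz_albumid": "44b7cab1-0ce1-404e-9089-b458eb3fa530",
    "musicbrainz_trackid": "237acd31-db9b-40e0-9263-9659a10ec98f",
    "tracknumber": "1",
}

stereo = dict(
    COMMON,
    releasecountry="XE",
    label="Apple Records",
    catalognumber="5099969945922",
    comment="2009 EMI/Apple stereo remaster",
)
mono = dict(
    COMMON,
    releasecountry="XW",
    label="EMI Records",
    catalognumber="PMC 7027",
    comment="2009 EMI/Apple mono remaster",
)


def differing_tags(a, b):
    """Return {tag: (value_in_a, value_in_b)} for every tag whose value differs."""
    return {
        tag: (a.get(tag), b.get(tag))
        for tag in sorted(set(a) | set(b))
        if a.get(tag) != b.get(tag)
    }


diff = differing_tags(stereo, mono)
for tag, (stereo_value, mono_value) in diff.items():
    print(f"{tag}: stereo={stereo_value!r} mono={mono_value!r}")
```

With these inputs, only catalognumber, comment, label and releasecountry are reported; the MusicBrainz album and track IDs are the same on both copies.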
Now, if you set up SqueezeBox Server 7.4.2 on a Debian system using the Debian package from the slimdevices.com repo, and set the music folder to this top-level directory, after the initial scan is finished, you will have 14 albums in your music library: the complete mono version of the album Sgt. Pepper's Lonely Hearts Club Band with 13 tracks; and 13 additional albums titled Sgt. Pepper's Lonely Hearts Club Band, each with exactly 1 unique track from the stereo version of the album. Subsequent rescans (either "look for new" or "start over") make no difference in the outcome.

If you do the same with SqueezeCenter 7.3.2, you will get 2 albums, each titled Sgt. Pepper's Lonely Hearts Club Band, each with the proper tracks. You can't tell the difference between the two in the Album browsing interface, but if you drill down and click on any of the tracks, the COMMENT tag will tell you which one it is.

When I run a scan on my full library with 7.4.2, any album for which I have multiple versions -- exact same album name, number of tracks, artist, etc., and the only difference being how the album was encoded, digitized or mastered -- will exhibit this bug. But with 7.3.2 and the same full library, each version shows up as a separate, complete album in the library. (I haven't tried later versions of 7.3.x because of other bugs: 7.3.2 is the last version of Squeeze* that has worked perfectly for me.)

Note that my 7.4.2 install is brand new: I started from scratch and did not attempt to upgrade my existing 7.3.x install. The only non-Slimdevices plugin I've configured on my 7.4.2 server is the BBCiPlayer plugin: there are no third-party multi-library or custom scanning plugins whatsoever.

I don't think it's a particularly unusual or bizarre situation to have multiple versions of the same album: chances are any Beatles fan will have several copies of at least some of their albums, for example. People with high-res vinyl rips of certain albums are another example.
I could work around this bug by adding the COMMENT field to the album name, but that's a hack: the albums should have an ALBUM tag that matches what you would find in the databases of MusicBrainz, LastFM, Pandora, MusicIP, etc. I've attached the following files to assist with the debug: my server.prefs, my plugin/state.prefs, my server.log and my scanner.log. The system is an amd64 Debian sid machine that's up-to-date as of 2010-02-24.
Created attachment 6560: server config/logs
I have several repeated albums, but do not see this issue. I found my "same album name by same artist" cases using the following query:

select contributor, titlesort, count(*)
from albums
group by contributor, titlesort
having count(*) > 1

I do not see an album per track. I don't think I was seeing this when I had 7.4.2 installed. I am running on Windows.

Have you tried this in 7.5?
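To show what that duplicate-finding query returns, here is a minimal Python sketch that runs it against a toy in-memory SQLite table. The table is an assumption made for illustration: it has only the two columns the query uses plus an id, the rows are invented sample data, and it is not the server's real schema or database engine.

```python
import sqlite3

# Toy stand-in for the server's albums table: only the columns the query
# touches, filled with made-up rows. The real schema has many more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE albums ("
    "id INTEGER PRIMARY KEY, contributor INTEGER, titlesort TEXT)"
)
sample_rows = [
    (1, "SGT PEPPERS LONELY HEARTS CLUB BAND"),  # first copy
    (1, "SGT PEPPERS LONELY HEARTS CLUB BAND"),  # second copy, same artist
    (1, "SOME OTHER ALBUM"),
    (2, "ANOTHER ARTISTS ALBUM"),
]
conn.executemany(
    "INSERT INTO albums (contributor, titlesort) VALUES (?, ?)", sample_rows
)

# Same query as in the comment above: one result row per
# (contributor, titlesort) pair that occurs more than once.
duplicates = conn.execute(
    "SELECT contributor, titlesort, COUNT(*) "
    "FROM albums "
    "GROUP BY contributor, titlesort "
    "HAVING COUNT(*) > 1"
).fetchall()

for contributor, titlesort, count in duplicates:
    print(contributor, titlesort, count)
```

In this toy data the only repeated pair has a count of 2. On a library showing the symptom from the original report (one complete album plus 13 single-track albums with the same title and artist), the same query would presumably report a count of 14 for that title.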
(In reply to comment #2)
> I have several repeated albums, but do not see this issue. I found my "same
> album name by same artist" cases using the following query:
>
> select contributor, titlesort, count(*)
> from albums
> group by contributor, titlesort
> having count(*) > 1
>
> I do not see an album per track. I don't think I was seeing this when I had
> 7.4.2 installed. I am running on Windows.
>
> Have you tried this in 7.5?

No, I haven't tried it in 7.5. Has scanning or the schema changed significantly in 7.5? In any case, I'm trying to stick with release versions, especially since 7.3.2 "just works."
I'm no expert in this field, but I think the musicbrainz_albumid (or any musicbrainz_*id) is taking a very strong role: if this is identical, then it's the same album. Remove it in one of the copies, and you might get the two albums you want. Not that it would help navigating by album title though...
(In reply to comment #4)
> I'm no expert in this field, but I think the musicbrainz_albumid (or any
> musicbrainz_*id) is taking a very strong role: if this is identical, then it's
> the same album. Remove it in one of the copies, and you might get the two
> albums you want. Not that it would help navigating by album title though...

I appreciate the reply, but I don't want to do that. I use the musicbrainz_albumid's to refresh the tags from time to time (the MB database is constantly changing and adding new metadata). I'm also a stickler for proper tags, which is why I insist on not adding version information to the album title in the first place, so removing them would be counter-productive. And if, as you suggest, it won't help browsing by album title, it's not going to be a very useful hack, anyway.

If the scanner is matching tracks to albums using the mb albumid, wouldn't I get one album with multiple copies of each track? That's not what's happening. The scanner log shows messages like this for each duplicate track:

[10-02-24 18:17:25.1444] Carp::Clan::__ANON__ (216) Warning: DBIx::Class::ResultSet::single(): Query returned more than one row. SQL that returns multiple rows is DEPRECATED for ->find and ->single at /usr/share/perl5/Slim/Schema.pm line 2538

Line 2538 in Schema.pm is tagged with the ominous comment:

# XXX: can return multiple objects

I'm tempted to hack around in there to see if I can fix it, but I'm hoping one of you devs can do that better than I :)

Could it be that this has something to do with a particular version of a Perl module in Debian sid? I was under the impression that new-ish versions of Squeeze* came with all of the packages they need in order to avoid versioning problems, but I don't know for certain. Is that the case?
> I appreciate the reply, but I don't want to do that.

Then you're probably out of luck. AFAIK it has a key role, and support for it was added to allow having an album spread across multiple folders.

But as these indeed are distinct and different albums they should have different IDs anyway. Maybe MB needs an update on these rather rare recordings?
(In reply to comment #6)
> > I appreciate the reply, but I don't want to do that.
>
> Then you're probably out of luck. AFAIK it has a key role, and support for
> it was added to allow having an album spread across multiple folders.
>
> But as these indeed are distinct and different albums they should have
> different IDs anyway. Maybe MB needs an update on these rather rare recordings?

Rare? Bands and publishers have been releasing remastered versions of albums for years. I don't mean "special editions" and the like, just remasters. The example I used in my bug report was from last year's Beatles mono and stereo remasters, which sold about 2.25 million copies in their first four days of release. It's not a stretch to imagine that many people have at least a couple of albums that are identical other than their release dates, and anyone who's tagging those tracks with MusicBrainz is going to run into this problem. MusicBrainz isn't going to change it's entire schema just to accomodate SqueezeBox Server, in any case.

I'm not convinced that the musicbrainz_albumid tag is the problem, anyway. If it were, why wouldn't all the tracks from each version appear in a single album in the library, instead of what's actually happening: one complete album followed by individual albums each containing exactly one track from the other versions? Furthermore, Philip in comment #2 says that he's seeing the desired behavior in Windows.
(In reply to comment #7)
> MusicBrainz isn't going to change it's entire schema just to accomodate
> SqueezeBox Server, in any case.

s/it's/its/
If you tag files with the same MB Album ID, they are the same album. This is not a bug.
OK actually I will leave this open as an enhancement. The 'right' solution is to check multiple various tags when determining if an album is the same. Not likely to happen unless someone provides a good patch. I suggest you talk to Moonbase about this one.
Hi Andy,

(In reply to comment #9)
> If you tag files with the same MB Album ID, they are the same album.

The MusicBrainz album ID can't be the whole story. The server must be keying on the album name tag as well, or else people who are adding "(extended version)" or "(2009 remaster)" to their titles would be complaining, too, right? If I'm correct, then the code is already using multiple criteria to decide album identity. It would be helpful to users like me if we had the option of adding a track's pathname to the mix so that it could be used to distinguish multiple versions. I'd happily trade off support for albums spread across multiple folders, if that's what caused this change.

> This is not a bug.

It is if you relied on the way this scenario behaved in versions from 6.x to 7.3.2, at least. And in any case, your statement that "same MB album ID = same album" is *not* the behavior I'm seeing. I'm seeing N+1 albums, where N is the number of tracks on the album. So that's a bug, whether or not you agree that the behavior change itself is a bug.

How about an option to enable the old behavior for users who use MusicBrainz and have multiple versions of the same album(s)? Otherwise, you're putting those users in a Catch-22 situation: the server uses MusicBrainz album IDs to group albums, but MusicBrainz gives multiple releases of the same album the same album ID. I can't satisfy both constraints with the 7.4.x behavior.
(In reply to comment #10)
> OK actually I will leave this open as an enhancement. The 'right' solution is
> to check multiple various tags when determining if an album is the same. Not
> likely to happen unless someone provides a good patch. I suggest you talk to
> Moonbase about this one.

I'd be happy to. Who's Moonbase?
(In reply to comment #12)
> (In reply to comment #10)
> > OK actually I will leave this open as an enhancement. The 'right' solution is
> > to check multiple various tags when determining if an album is the same. Not
> > likely to happen unless someone provides a good patch. I suggest you talk to
> > Moonbase about this one.
>
> I'd be happy to. Who's Moonbase?

Never mind, I see he's been cc:'ed on this. Moonbase, feel free to get in touch via email.

If you think this is in fact the expected behavior and not a bug (cf. Phil's message in comment #2, where he says he's seeing the behavior I want), then if you could point me to the places in the code where I should start looking to make a change, I'm happy to do that. The MusicBrainz releasecountry, catalognumber and/or date tags are obvious choices to discriminate between versions, but it's also fairly common for people to make high-res vinyl rips in addition to their CD rips or digital versions of the exact same album, so I think the pathname discriminator is the most straightforward one, though that might break this folder-spanning feature that Michael mentioned.
(In reply to comment #10)
> The 'right' solution is to check multiple various tags when determining if an
> album is the same.

It's not going to do any good checking tags in this case, as they will all be identical. You'd have to use the directory path. I'm not sure where SbS currently stands with permitting tracks from an album to reside in different directories, but it would be impossible with a change like this.
I meant tags such as release date, release country, catalog number, etc, that MusicBrainz provides.
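The grouping options discussed in the last few comments can be compared with a small Python sketch. This is a toy model under stated assumptions, not the scanner's actual logic in Slim/Schema.pm: `album_key`, `group_tracks`, `make_album` and `RELEASE_TAGS` are hypothetical names, the tag values come from the original report, and the file paths are shortened for illustration. It does not reproduce the one-album-per-track symptom; it only shows how each candidate key would split or merge the two versions.

```python
import posixpath
from collections import defaultdict

# Hypothetical discriminator tags, taken from the suggestions in this thread.
RELEASE_TAGS = ("releasecountry", "catalognumber", "date")


def album_key(path, tags, extra_tags=(), use_directory=False):
    """Build a grouping key for one track (illustrative only)."""
    key = [
        tags.get("musicbrainz_albumid"),
        tags.get("albumartist"),
        tags.get("album"),
    ]
    key.extend(tags.get(name) for name in extra_tags)
    if use_directory:
        # Pathname discriminator: tracks in different folders never merge.
        key.append(posixpath.dirname(path))
    return tuple(key)


def group_tracks(tracks, **options):
    """Group (path, tags) pairs into albums keyed by album_key()."""
    albums = defaultdict(list)
    for path, tags in tracks:
        albums[album_key(path, tags, **options)].append(path)
    return dict(albums)


def make_album(directory, country, catalog):
    """Create 13 toy tracks sharing the tags from the original report."""
    common = {
        "musicbrainz_albumid": "44b7cab1-0ce1-404e-9089-b458eb3fa530",
        "albumartist": "The Beatles",
        "album": "Sgt. Pepper's Lonely Hearts Club Band",
        "date": "2009-09-09",
        "releasecountry": country,
        "catalognumber": catalog,
    }
    return [
        (f"{directory}/{n:02d}.flac", dict(common, tracknumber=str(n)))
        for n in range(1, 14)
    ]


tracks = make_album("mono", "XW", "PMC 7027") + make_album(
    "stereo", "XE", "5099969945922"
)

by_id = group_tracks(tracks)                              # ID, artist, title
by_tags = group_tracks(tracks, extra_tags=RELEASE_TAGS)   # plus release tags
by_dir = group_tracks(tracks, use_directory=True)         # plus directory

print(len(by_id), len(by_tags), len(by_dir))  # prints "1 2 2"
```

Keying on the album ID, artist and title alone merges all 26 tracks into one album, while adding either the release tags or the directory yields two 13-track albums. The directory variant is the only one that would also separate copies whose tags are completely identical, at the cost of splitting an album that is stored across several folders.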
While diffing Schema.pm in 7.3.2 vs. 7.4.2, I ran across this addition in 7.4.2:

# Bug 10583 - Also check for MusicBrainz Album Id
my $albumid = $attributes->{'MUSICBRAINZ_ALBUM_ID'};

That looked suspicious. Bug 10583, filed by moonbase, looks like it was the genesis for supporting tracks from the same album in multiple folders. It also says this:

> I guess SC wants to compare Album Artist, Album Name, and the complete release
> date, constructed from both TYER+TDAT. If these dont’ agree or TDAT is
> empty/non-existent, the match will fail and as many »albums« created as there
> are tracks on the (real) album!

Getting as many albums as there are tracks is *exactly* the behavior I'm seeing now post-7.3.2.

Also, the fix for #10583 was merged for the 7.3.3 release. As I noted in my original report here, I've been sticking with 7.3.2 because of issues I've had with subsequent versions. My memory isn't clear anymore on exactly what broke for me in 7.3.3 -- I only know that I've also tried 7.4.0 and 7.4.2 and experienced the bug I've filed here -- but it makes sense that I would have skipped 7.3.3 after encountering this issue as well.

Obviously, from my perspective, the changeset for #10583 fixed one bug and introduced another. I'll keep digging and see if I can come up with a compromise fix.
i'm just curious, i don't use musicbrainz, but why does it say that the stereo beatles and the mono beatles are the same? to me, thats ludicrous. what about the 1987 versions, does it think they are the same too? i tend to agree with the earlier comment that said MB's DB is the problem.