Bugzilla – Bug 5339
Music containing special characters does not appear.
Last modified: 2009-09-08 09:12:55 UTC
In my music collection, many artist and song names contain characters with accents, tildes, etc., such as Antonín Dvořák or Béla Bartók. If I scan my music collection by pointing directly at the path to my music files, without specifying the iTunes XML file, all of my music shows up. However, if I turn on iTunes support and specify the iTunes Library XML file, all of my music shows up EXCEPT for tracks containing these special characters (for example, I see everything by Barber but nothing by Bartók). I would like to use the iTunes XML file as the source for my library in order to access iTunes playlists and volume normalization, but because of this problem I cannot. I am using the latest version of iTunes (7.3, I believe) and SlimServer 6.5.4. The problem is not new to these versions, however; I've had the issue for a long time. I run SlimServer on SuSE Linux 10.0 and map to the iTunes Library XML file through a Samba share.
Please do not preset targets for new bug reports. These are used by QA after review (not to mention that it's too late for a target of 6.5.4). More than likely, this should be marked as a dupe of bug 5332 (which, I'm fairly sure, is also a dupe of an earlier report).
*** Bug 5343 has been marked as a duplicate of this bug. ***
5332 looks similar but may not be quite the same. This case is very specific to scans based on the iTunes Library XML file. When doing a direct scan of the file folder, all music is recognized and all special characters appear correctly; however, when reading from the iTunes Library XML file, any songs whose names, artists, etc. contain special characters are not recognized at all.
I searched for other similar issues. There are quite a few past bugs reporting something similar. 3073, 5205, and 1248 are examples. 1248 looks particularly similar and even offers a solution that involves editing iTunes.pm. However, there is no iTunes.pm file that I can find under /slimserver.
I have the same problem. When I scan the files directly, all files turn up; when I use the .xml file, I am missing all files with special characters. I use iTunes 7.3 on my PC (Windows Vista) (although 7.1 and 7.2 had the same problem), and store my files on an OpenSuse 10.1 Samba file store. I run slimserver (Linux version) on the same Linux box as the file store! The reason I prefer the .xml iTunes file scan over the direct file scan is that I lose my 'Album Artist' tag when I directly scan the .m4a AAC files. With the .xml file, all Album Artist tag info shows up! Example of a problem file:

<key>7347</key>
<dict>
  <key>Track ID</key><integer>7347</integer>
  <key>Name</key><string>Un Bel Dì Vedremo (Madama Butterfly)</string>
  <key>Artist</key><string>Angela Gheorghiu</string>
  <key>Composer</key><string>Puccini, Giacomo (1858-1924)</string>
  <key>Album</key><string>Puccini</string>
  <key>Genre</key><string>Classical</string>
  <key>Kind</key><string>AAC audio file</string>
  <key>Size</key><integer>6260771</integer>
  <key>Total Time</key><integer>309786</integer>
  <key>Disc Number</key><integer>1</integer>
  <key>Disc Count</key><integer>1</integer>
  <key>Track Number</key><integer>1</integer>
  <key>Track Count</key><integer>18</integer>
  <key>Year</key><integer>2004</integer>
  <key>Date Modified</key><date>2007-03-16T18:16:22Z</date>
  <key>Date Added</key><date>2007-03-16T17:06:49Z</date>
  <key>Bit Rate</key><integer>160</integer>
  <key>Sample Rate</key><integer>44100</integer>
  <key>Persistent ID</key><string>942A1B5B974C50B5</string>
  <key>Track Type</key><string>File</string>
  <key>Location</key><string>file://localhost/C:/Users/Edwin/Music/iTunes/iTunes%20Music/Angela%20Gheorghiu/Puccini/01%20Un%20Bel%20D%C3%AC%20Vedremo%20(Madama%20Butterf.m4a</string>
  <key>File Folder Count</key><integer>4</integer>
  <key>Library Folder Count</key><integer>1</integer>
</dict>
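To see which file name a scanner has to find on disk, the Location value above can be percent-decoded. The following is a minimal Python sketch, not SlimServer code; the helper name `path_from_file_url` is made up for illustration, and it assumes the escapes are UTF-8 bytes, as they are in this entry.

```python
from urllib.parse import urlsplit, unquote


def path_from_file_url(url: str) -> str:
    """Percent-decode the path part of a file:// URL, treating escapes as UTF-8."""
    return unquote(urlsplit(url).path, encoding="utf-8")


# Location value copied from the track entry in this report.
location = (
    "file://localhost/C:/Users/Edwin/Music/iTunes/iTunes%20Music/"
    "Angela%20Gheorghiu/Puccini/"
    "01%20Un%20Bel%20D%C3%AC%20Vedremo%20(Madama%20Butterf.m4a"
)

path = path_from_file_url(location)
print(path)
```

The decoded name contains a real "ì" (from %C3%AC), so the lookup only succeeds if the file system on the server stores that character as the same bytes.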
Some example files would be welcome. Steven, could you work with Wallace to either generate or use any offered files with his test automation?
*** Bug 5332 has been marked as a duplicate of this bug. ***
What do you need? A zip file containing my complete iTunes xml file and some m4a files which do not show up?
I too could zip up an m4a file and the xml if that would help, but given that your music database won't otherwise match and the differences in pathing, it might be difficult to duplicate this way. All you should really need to do to duplicate this is to give any song in an iTunes db an Artist name having an accented character. You should then see that this song shows up when scanning directly, but not when specifying the library xml file.
There are not many users affected by this problem. Users seem to run into a set of problems when they run iTunes on one system but store their files on another, and these combine with the fact that different OSes support non-ASCII characters in different ways to cause a number of issues. It's not clear to me what the "right" solution to these bugs is at this time, but we'll keep this open so we can refer to it.
Lots of people using some kind of Linux-based NAS box will have trouble with this bug. So it is my personal belief that more people will be hit, but the problem is hard to notice: you are just missing some 'random' tracks here and there in some 'random' album. I still hope you can solve this bug.
Just a guess, but I would think this issue would affect *everyone* who runs Slimserver on a Linux box and uses iTunes to manage their music library. If this adds up to "not many users" then maybe you have a point, but I would think that this might be a fairly popular configuration; especially, as Edwin points out, when you take NAS boxes into consideration. Right now, not being able to use my iTunes playlists on my SB remains a disappointment.
Ok, at last I have solved this issue. It turns out that the solution is to ensure that the character code mappings between Windows and Linux are compatible. When I typed "locale" at a Linux command prompt, it told me that my RC_LANG setting was UTF-8. This and Samba Server need to be set to ISO-8859-1 to play nicely with iTunes running in Windows. This solution was not easy to come by, however, and it seems like there should be some way to address this issue more easily on an application level. But then again, maybe this is *the* solution. In any case, here is what I did. Remember, I am using SuSE 10.0, so this may differ for other distributions.

1. Change RC_LANG in the /etc/sysconfig/language file from “UTF-8” to “ISO-8859-1”.
2. In Yast, open Samba Server, go to the Identity tab, then Advanced Settings, then Expert Global Settings. Create the following settings:
   display charset = LOCALE
   dos charset = CP850
   unix charset = iso8859-1

After completing these settings, I rebooted the computer, then changed my server settings in SS to reference the iTunes xml library file. At this point, if all you do is a clear/rescan, existing songs/artists having special characters will *still not* be imported--just as before; however, any *new* music imported into iTunes from now on *will work*.

To fix older music, choose "Add Folder to Library..." from the iTunes File menu, and choose the iTunes\iTunes Music folder on the (Samba) shared drive. This will cause iTunes to reimport all music and rebuild the folders (note that I am assuming that you let iTunes organize your music folder; if not, your procedure may differ). After doing this you will find that all music that previously had special characters now has a duplicate in the iTunes library. One of each pair will have an exclamation point next to it in the music list in iTunes. Delete all of these, as they are basically orphaned references.

You should then exit iTunes, go into the "\iTunes Music" folder and examine each folder and subfolder. If any of these folders has garbage characters in its name, check that it is empty, then delete it. After all this, you should be good to go. iTunes playlists now work, as do other playlists that reference songs/artists with higher characters.
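The locale and Samba charset matter because the same name is a different byte sequence in each encoding, so a name written under one charset may not be found under the other. Here is a small Python illustration, not part of SlimServer, using an artist name from this report. That this mismatch is what produced the garbage folder names described above is an assumption.

```python
name = "Bart\u00f3k"  # "Bartók" with a precomposed o-acute

utf8_bytes = name.encode("utf-8")          # two bytes for the accented letter
latin1_bytes = name.encode("iso-8859-1")   # one byte for the accented letter

print(utf8_bytes, latin1_bytes)

# Interpreting UTF-8 bytes as ISO-8859-1 gives a mangled name instead of an error.
mangled = utf8_bytes.decode("iso-8859-1")
print(mangled)
```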
Assigning to unassigned for future review.
I've tracked down the error to two distinct places, but do not have sufficient Perl knowledge to do much about it.

The first is in Plugins/iTunes/Importer.pm: here the iTunes URL is correctly transformed to the UTF-8 pathname, but the -e $file test fails on the Perl side, causing the tune to be ignored. This may be a Perl bug?

2008-01-10 23:56:57.4206 iTunes: ORIGINAL URL file:///opt/McShare/iTunes/iTunes%20Music/Bjo%CC%88rk/Live%20at%20Shepherds%20Bush%20Empire/01%20Anchor%20Song.mp3
2008-01-10 23:56:57.4211 iTunes: ORIGINAL FILE /opt/McShare/iTunes/iTunes Music/Bjo?\x88rk/Live at Shepherds Bush Empire/01 Anchor Song.mp3

When I comment out or remove the file existence checking routines in Importer.pm, the tunes are correctly entered in the database (I'm using MySQL).

The second error occurs when the tune is played: the iTunes URL is now in the database, but it is incorrectly transformed into a filename:

2008-01-11 08:47:08.5986 Got /opt/McShare/iTunes/iTunes%20Music/Bjo%CC%88rk/Live%20at%20Shepherds%20Bush%20Empire/Crying.mp3 from file url file:///opt/McShare/iTunes/iTunes%20Music/Bjo%CC%88rk/Live%20at%20Shepherds%20Bush%20Empire/Crying.mp3
2008-01-11 08:47:08.6008 extracted: /opt/McShare/iTunes/iTunes Music/Björk/Live at Shepherds Bush Empire/Crying.mp3 from file:///opt/McShare/iTunes/iTunes%20Music/Bjo%CC%88rk/Live%20at%20Shepherds%20Bush%20Empire/Crying.mp3

This happens in Slim/Utils/Misc.pm.
I've got this one solved (at least on my system ;=)), in the squeezecenter 7b01 sources (backporting to 6.5 should be easy).

Problem description: My iTunes library is kept on a Mac, hence all paths in the XML file follow the decomposed UTF-8 form that is used on a Mac. The problem arises when the library and music files are moved to an NFS or SMB volume (e.g. ReadyNAS): the XML file won't change, but the actual filename format changes from decomposed UTF-8 to composed form. This is a problem for the iTunes Importer.pm. If someone helps me to put it in the beta stream, it can be tested on a wider variety of systems.

The second problem arises when the escaped file path is extracted from the database: it is then not converted correctly to UTF-8 (or the form that was used to put it in...). These changes are in Misc.pm and are applied to pathFromFileURL. This is quite a central piece, so again my changes should be tested thoroughly.

Could someone help me place the two changed files in the dev stream? Thanks, Paul
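The difference between the decomposed and composed forms can be made concrete with the "Bjo%CC%88rk" path component from the log lines earlier in this report. This is a Python sketch for illustration only; the actual changes discussed here are in the Perl sources.

```python
import unicodedata
from urllib.parse import unquote

escaped = "Bjo%CC%88rk"        # path component as written in the iTunes XML on a Mac
decomposed = unquote(escaped)  # 'o' followed by U+0308 COMBINING DIAERESIS
composed = unicodedata.normalize("NFC", decomposed)  # single U+00F6 character

# The two strings render the same but differ in length and in their UTF-8 bytes.
print(len(decomposed), decomposed.encode("utf-8"))
print(len(composed), composed.encode("utf-8"))
print(decomposed == composed)
```

Because the two strings are not equal, a file stored under the composed name will not be found when it is looked up by the decomposed name on a file system that compares names byte for byte.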
Thanks, Paul. Can you attach your patch to the bug?
Created attachment 2901 [details] Modified Util/Misc.pm
Created attachment 2902 [details] Modified Plugins/iTunes/Importer.pm
Andy, can you review this patch?
Thanks, can you post a diff -u against trunk? svn diff would be the easiest way. Your files are a bit outdated.
David: can you post the patch as Andy described? That makes it easier to review and apply so we can close this bug. Thx!
Perhaps superfluous, but I have attached the diff -u's against the current version of SC7 (17786). I run it on a Linux box and it works well with UTF-8 encoding: SqueezeCenter Version: 7.0 - 17786 - SUSE - EN - utf8
Created attachment 3026 [details] diff -u against iTunes/Importer.pm
Created attachment 3027 [details] diff -u against Util/Misc.pm
I've recently checked in a bunch of changes that deal with non-Latin characters. Could anybody confirm whether or not they fixed this issue? Please use the latest trunk builds to reproduce. Thanks!
Paul/David - is this still broken in the latest 7.0.1 nightly builds?
Yes, it is still broken. Not surprising, since nothing material has changed in the nightly version of Plugins/iTunes/Importer.pm, so each and every composed UTF-8 file name fails to match the decomposed form in the iTunes Library.xml. The net effect is that the iTunes importer will skip any file that has accented characters. I'll apply my Importer changes to the 7.0.1 source and see if that solves the issue....
With the changes to Importer.pm the scanner will find the accented songs again, but squeezecenter will still fail to play them. If I apply the changes from my Misc.pm to Utils/Misc.pm, this is resolved too. So there is nothing in the current 7.0.1 nightly that resolves this bug.
Just to be sure I understand this right:
- you're running iTunes on your mac
- on the mac you've mounted a share from Linux server using Samba
- your music is stored on that share
- your iTunes XML file is stored on that share
Is this about right?
..and SC is running on the Linux box?
Created attachment 3193 [details] recompose file names if needed Could you please give this patch a try? I've successfully tested it on a ReadyNAS.
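The idea named in the attachment title, recomposing file names when needed, can be sketched roughly as follows. This is a hypothetical Python illustration and not the contents of attachment 3193, which patches the Perl sources; `find_existing_path` and its `exists` parameter are invented for the example.

```python
import os
import unicodedata
from typing import Callable, Optional


def find_existing_path(
    path: str, exists: Callable[[str], bool] = os.path.exists
) -> Optional[str]:
    """Return the first variant of `path` that exists, or None.

    Tries the path as given, then its composed (NFC) form,
    then its decomposed (NFD) form.
    """
    candidates = [
        path,
        unicodedata.normalize("NFC", path),
        unicodedata.normalize("NFD", path),
    ]
    for candidate in candidates:
        if exists(candidate):
            return candidate
    return None


# Usage: the library XML names the decomposed form, the disk holds the composed one.
on_disk = {"/music/Bj\u00f6rk/01 Anchor Song.mp3"}
found = find_existing_path("/music/Bjo\u0308rk/01 Anchor Song.mp3", exists=on_disk.__contains__)
print(found)
```

Trying the path unchanged first keeps the common case cheap; the normalized variants are only consulted when the direct lookup fails.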
(In reply to comment #31)
> ..and SC is running on the Linux box?
>
Yes, that's my config, with one small detail: the iTunes lib resides on a Mac (HFS+) and gets synchronised over (in its entirety) to a Linux box, where it is indexed by Slimserver running on that Linux box. It is the same setup as having iTunes on a Mac and copying the iTunes library over to e.g. a ReadyNAS. Hope this helps!
I've tried your patch and can confirm it works on my config too (notice that the linux box must use a UTF-8 locale). Nice efficient patch. Thanks!!

(In reply to comment #32)
> Created an attachment (id=3193) [edit]
> recompose file names if needed
>
> Could you please give this patch a try? I've successfully tested it on a
> ReadyNAS.
>
change 18350 - thanks for the testing!
Verified Fixed Version: 7.0.1 - 19422. If anyone sees this issue appear again, please reopen this bug with added details.
This bug has recently been fixed in the latest release of SqueezeCenter 7.0.1. Please try that version; if you still see the error, then reopen this bug. To download this version, please navigate to: http://www.slimdevices.com/su_downloads.html
This bug has reappeared in 7.1.0 (stable) and the 7.2 nightlies (squeezecenter-7.2-0.1.22491.noarch.rpm). Could someone reopen the bug and fix it?
Sorry, I jumped to conclusions too early. The problem seems to be that setting --charset=utf8 behaves differently from setting LC_CTYPE=en_GB.UTF-8. Adding --charset=utf8 to the command line via squeezeserver.conf breaks the UTF-8 interpretation and causes this bug to reappear (so it is not in the 7.1/7.2 branches). Shall I open another bug for the charset failure so this one can be closed? Cheers, Paul
Does it work for you _without_ that new parameter?
(In reply to comment #40)
> Does it work for you _without_ that new parameter?
>
Yes, but only if LC_CTYPE=en_GB.UTF-8. Since the account that runs squeezecenter has its locale set to POSIX (haven't found a way around that really), I cheat and edit Unicode.pm. Here is the diff:

119c119
<
---
>
122,123c122
<
< $lc_ctype = "en_GB.UTF-8";
---
>

Hope this helps
In this case the "real" solution would be to use "--charset=en_GB.UTF-8". This parameter can be used to force any character set, not only utf8. Please open a new bug if this doesn't work. Thanks!
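Locale names follow the general form language[_territory][.codeset][@modifier], which is why a value such as en_GB.UTF-8 carries a character set while POSIX does not. A hypothetical Python helper (`codeset_of` is invented for this example, and how SqueezeCenter itself parses the value is not shown in this report) makes the difference visible:

```python
from typing import Optional


def codeset_of(locale_name: str) -> Optional[str]:
    """Return the codeset part of a locale name, or None if it has none."""
    base = locale_name.split("@", 1)[0]  # drop an optional @modifier
    if "." not in base:
        return None                      # e.g. "POSIX" or "C": no codeset given
    return base.split(".", 1)[1]


for value in ("en_GB.UTF-8", "POSIX"):
    print(value, "->", codeset_of(value))
```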
This is tracked further in Bug# 9126
This issue appears to have been partially addressed in 7.2.1, please reopen if you disagree.
*** Bug 5117 has been marked as a duplicate of this bug. ***
Reduce number of active targets for SC