Bugzilla – Bug 7547
SqueezeCenter storing incorrect path url values to track.url field on UTF8 encoded filesystems
Last modified: 2009-07-31 10:18:15 UTC
With flac files containing embedded cuesheets and with filenames including diacritic characters, as of SC7 trunk svn 1795, SqueezeCenter seems to "double encode" the value that gets stored in tracks.url. This results in subsequent failures to extract artwork and play audio from the incorrectly encoded url values.

Example:

OS: Fedora 8

locale:
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

SqueezeCenter Version: 7.0.1 - TRUNK - Red Hat - EN - utf8
Server IP address: 192.168.0.199
Perl Version: 5.8.8 i386-linux-thread-multi
MySQL Version: 5.0.45-log
Platform Architecture: i686-linux
Hostname: slim-minuet.da
Server Port Number: 9000
Total Players Recognized: 1
Cache Folder: /var/lib/squeezecenter_trunk/cache
Preferences Folder: /var/lib/squeezecenter_trunk/prefs
Plugin Folders: /usr/share/squeezecenter_trunk/server/Slim/Plugin, /usr/share/squeezecenter_trunk/server/Plugins

Actual path of audio file:
/mnt/slimtest/music/Flac_problem_artwork/c_Early_Baroque/Guédron, P/Le Consert des Consorts - Le Poème Harmonique.flac

What SC7 records to the track.url field in the db:
file:///mnt/slimtest/music/Flac_problem_artwork/c_Early_Baroque/Gu%C3%A9dron,%20P/Le%20Consert%20des%20Consorts%20-%20Le%20Po%E8me%20Harmonique.flac#0-1

What OUGHT TO (I think) get recorded to the track.url field:
file:///mnt/slimtest/music/Flac_problem_artwork/c_Early_Baroque/Gu%E9dron%2C%20P/Le%20Consert%20des%20Consorts%20-%20Le%20Po%E8me%20Harmonique.flac

Taking just a portion of that example: "Guédron" gets transformed into "Gu%C3%A9dron" rather than the correct "Gu%E9dron". This looks like a "double encoding" artifact to me, i.e. UTF8 data getting encoded to UTF8 a 2nd time.
So, SC7 can see the original file, open it, parse the embedded cuesheet, and create track records for all the tracks, but it stores the path incorrectly in track.url. Again, the practical result is tracks that can be browsed to in the web interface, but that lack artwork and, most importantly, cannot be played.
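To make the two escapings in the report concrete, here is a minimal Python sketch. It is not SqueezeCenter's Perl code, and the helper name escape_segment is hypothetical. It only shows that the same directory name percent-escapes differently depending on the byte encoding applied first, and what UTF8 data encoded to UTF8 a second time looks like.

```python
from urllib.parse import quote


def escape_segment(name: str, encoding: str) -> str:
    """Percent-escape one path segment after encoding it with `encoding`.

    Hypothetical helper for illustration only. The comma is kept literal
    so the output can be compared with the URL recorded in the report.
    """
    return quote(name, safe=",", encoding=encoding)


# The form the reporter expected (Latin-1 byte 0xE9 for "é").
latin1 = escape_segment("Guédron", "latin-1")   # 'Gu%E9dron'

# The form actually stored for the directory (UTF-8 bytes 0xC3 0xA9).
utf8 = escape_segment("Guédron", "utf-8")       # 'Gu%C3%A9dron'

# Reading UTF-8 bytes as if they were Latin-1 characters...
mojibake = "Guédron".encode("utf-8").decode("latin-1")   # 'GuÃ©dron'

# ...and encoding the result to UTF-8 again gives double-encoded bytes.
double = mojibake.encode("utf-8")   # b'Gu\xc3\x83\xc2\xa9dron'

print(latin1, utf8, mojibake, double)
```

Comparing the stored URL against both outputs shows that the directory segment matches the UTF-8 form while the filename segment ("Po%E8me") matches the Latin-1 form.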
Created attachment 3115 [details] Test files showing the problem. (zipped, utf8 encoded) These files were zipped on a UTF8 encoded ext3 file system. Suitable for unzipping in the same environment.
Created attachment 3116 [details] Test files showing the problem (zipped ansi) This zip file includes the same test files, but with ansi encoded filenames. Suitable for unzipping on a windows machine and then copying to a samba share on a UTF8 encoded filesystem.
Comment on attachment 3115 [details] Test files showing the problem. (zipped, utf8 encoded) Test files zipped on a utf8 encoded, ext3 file system. Suitable for unzipping in the same environment.
OK, my thinking is evolving here. The value in tracks.url:

file:///mnt/slimtest/music/Flac_problem_artwork/c_Early_Baroque/Gu%C3%A9dron,%20P/Le%20Consert%20des%20Consorts%20-%20Le%20Po%E8me%20Harmonique.flac#0-1

utf8 decodes to:

file:///mnt/slimtest/music/Flac_problem_artwork/c_Early_Baroque/GuÃ©dron, P/Le Consert des Consorts - Le Poème Harmonique.flac#0-1

In the above instance, the filename portion of the path, "Le%20Consert%20des%20Consorts%20-%20Le%20Po%E8me%20Harmonique.flac" decodes correctly. But the directory name portion with the diacritic, "/Gu%C3%A9dron,%20P/" decodes to "GuÃ©dron" which is actually the UTF8 encoding of "Guédron".

So: in the scanning process, it's looking like filenames are getting url-ized correctly, but directory names are not. Or perhaps directory portions of paths need to be utf8 decoded after being un-url-ized while filenames do not.

In any event, checking my scanner log, I see for this file:

[08-03-18 15:24:03.7618] Slim::Schema::Track::coverArt (299) Error: Exception when trying to call readCoverArt() for [file:///mnt/slimtest/music/Flac_problem_artwork/c_Early_Baroque/Gu%C3%A9dron,%20P/Le%20Consert%20des%20Consorts%20-%20Le%20Po%E8me%20Harmonique.flac] : [[/mnt/slimtest/music/Flac_problem_artwork/c_Early_Baroque/GuÃƒÂ©dron, P/Le Consert des Consorts - Le PoÃ¨me Harmonique.flac] does not exist or cannot be read: No such file or directory at /usr/share/squeezecenter_trunk/server/lib/Audio/FLAC/Header.pm line 67.

file:///mnt/slimtest/music/Flac_problem_artwork/c_Early_Baroque/Gu%C3%A9dron,%20P/Le%20Consert%20des%20Consorts%20-%20Le%20Po%E8me%20Harmonique.flac

will un-url-ize to:

file:///mnt/slimtest/music/Flac_problem_artwork/c_Early_Baroque/GuÃ©dron, P/Le Consert des Consorts - Le Poème Harmonique.flac

So...we have a portion of the url un-url-izing correctly (the filename that includes "Poème") and a portion that gets botched (the directory name that includes "Guédron" that gets mangled to "GuÃ©dron").
The portion of the log that says:

[/mnt/slimtest/music/Flac_problem_artwork/c_Early_Baroque/GuÃƒÂ©dron, P/Le Consert des Consorts - Le PoÃ¨me Harmonique.flac] does not exist

...seems to be VERY confused. "PoÃ¨me" utf8 decodes to "Poème" (correct) but "GuÃƒÂ©dron" utf8 decodes to "GuÃ©dron".
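The mixed encoding described above can be checked mechanically. The following is a small diagnostic sketch in Python, not part of SqueezeCenter, and the function name classify_segment is hypothetical. It percent-decodes each path segment of the stored URL to raw bytes and reports whether those bytes are plain ASCII, valid UTF-8, or neither.

```python
from urllib.parse import unquote_to_bytes, urlsplit


def classify_segment(segment: str) -> str:
    """Percent-decode one URL path segment and classify its raw bytes.

    Hypothetical helper for illustration only.
    """
    raw = unquote_to_bytes(segment)
    if raw.isascii():
        return "ascii"
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Bytes such as a lone 0xE8 are not valid UTF-8; in this report
        # they correspond to Latin-1 encoded characters.
        return "not-utf-8"


# The URL quoted in the scanner log above.
url = (
    "file:///mnt/slimtest/music/Flac_problem_artwork/c_Early_Baroque/"
    "Gu%C3%A9dron,%20P/"
    "Le%20Consert%20des%20Consorts%20-%20Le%20Po%E8me%20Harmonique.flac"
)

# urlsplit leaves percent-escapes untouched, so each segment can be
# classified on its own.
report = {
    seg: classify_segment(seg)
    for seg in urlsplit(url).path.split("/")
    if seg and classify_segment(seg) != "ascii"
}

for seg, kind in report.items():
    print(f"{kind:10} {seg}")
```

For this URL the directory segment is reported as utf-8 and the filename segment as not-utf-8, which matches the observation that a single decoding step cannot be right for both parts of the path.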
Running tests with and without my recent changes, I noticed that without them, half the files would not be found. These are the same files that, with the changes, wouldn't display artwork. Can you confirm this?
change 17926 - please give it a try...
Michael: I think you got it.

OK, updated to revision 17934. Dropped all tables from mysql squeezecentertest schema. Restarted SqueezeCenter, let it perform an automatic first time scan. At scan conclusion, scanner.log completely empty...no warnings! Checking all test files via the web interface: all files now have artwork and all are playable.

Copied new flac file to:

/mnt/slimtest/music/Flac_problem_artwork/zzz_Thís_Fòldér_Häß_ein_Sčrèwÿ_Ñâmę/Åŗŧëşť, Fôrêìğñ/Pièces de Violes - Markku Loulajan-Mikkola, Mikko Perkola, Aapo Häkkienen.flac

Performed a "scan for new music". Still no warnings in scanner.log. New, über-wacky named flac file found by the scan, artwork visible, file playable.

Your one-line fix seems to have completely resolved this bug, which has been bugging me for the past 3 years. Thanks!
I also just tested this on a windows box to make sure that this fix doesn't break anything there. Everything looks OK on windows too. My final test will be to restore all the diacritic characters in the paths of my real library...something that I won't be able to do until a month from now...and test against that. That will be something of a special case test: the library weighs in at 750 GB and resides on an NTFS formatted 1 TB disk mounted via ntfs-3g.
Verified fixed in 7.0.1 - 19597
This bug has recently been fixed in the latest release of SqueezeCenter 7.0.1. Please try that version; if you still see the error, then reopen this bug. To download this version, please navigate to: http://www.slimdevices.com/su_downloads.html
Reduce number of active targets for SC