Bug 4578 - Accented characters in playlist entries don't rescan correctly
Status: CLOSED FIXED
Product: Logitech Media Server
Classification: Unclassified
Component: Playlists
Version: 7.3.0
Hardware: PC Linux (other)
Importance: P2 normal with 8 votes
Target Milestone: 7.5.0
Assigned To: Andy Grundman
Keywords: charset_issues, perl5.10
Reported: 2006-12-09 23:11 UTC by Bob Maple
Modified: 2010-12-30 12:28 UTC


Attachments
Playlist with Unicode in filename and contents (1.55 KB, text/plain)
2008-08-11 15:45 UTC, Keith Briscoe
tgz containing scanner.log (scan.* + format.playlists=debug) + playlist file (2.22 KB, application/octet-stream)
2008-09-11 15:30 UTC, Markus Schiegl
test m3u playlist, created by SC (359 bytes, text/plain)
2008-11-16 10:13 UTC, Markus Schiegl
scanner.log, without patch - note the different encodings of ä (7.64 KB, text/plain)
2008-11-16 10:14 UTC, Markus Schiegl
scanner.log, with patch from #34 - note the identical encodings of ä (7.67 KB, text/plain)
2008-11-16 10:15 UTC, Markus Schiegl
A Playlist i feed my SC with (includes non-ASCII) (58.24 KB, audio/mpegurl)
2008-11-17 06:02 UTC, Dominique Cote
this is SC's version of the same playlist (70.59 KB, audio/mpegurl)
2008-11-17 06:04 UTC, Dominique Cote
this is one of the "offending" tracks (3.45 MB, audio/x-ms-wma)
2008-11-17 06:08 UTC, Dominique Cote
bigger change... (984 bytes, patch)
2008-12-12 07:44 UTC, Michael Herger
playlist log, umlauts (4.96 KB, text/plain)
2009-01-22 16:31 UTC, Ross Levine
Working M3U8 playlist for the test case (UTF-8+BOM) (1.86 KB, text/plain)
2009-01-31 09:35 UTC, Moonbase

Description Bob Maple 2006-12-09 23:11:21 UTC
Songs saved in playlists with pathnames containing accented characters don't rescan correctly (possibly the same as/related to bug #4276)

For instance, if you have a file in your library with accented characters somewhere in the pathname, you can add it to a new playlist, save it, and all is fine - you can browse the playlist, load it into your player, play it, etc.  However if you then simply "touch" the playlist and tell SlimServer to rescan your library (or just playlists), you will get errors:

WARNING:
	file:///shaketunes/V%E4rttin%E4/Miero/12-Eerama.ogg found in playlist:
	file:///home/slimserver/playlists/XMas%20Mix.pls doesn't exist on disk - skipping!

In my case the pathname to the song in question is:

/shaketunes/Värttinä/Miero/12-Eerama.ogg

%E4 should be the correct character code for the ä - however, upon turning on d_info, I see:

2006-12-09 23:49:46.2835 isFile(/shaketunes/Vrttin /Miero/12-Eerama.ogg) == 0

You can see that the ä's seem to have been dropped completely (sort of, not sure what the space in place of the second ä is all about) - I guess pathFromFileURL() is not decoding the string properly?
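
A minimal Perl sketch of the suspected decoding problem, under the assumption that the URL escapes are turned back into raw bytes and those bytes are then interpreted as UTF-8. The helper `unescape_url_path` is hypothetical and only stands in for whatever pathFromFileURL() does internally:

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper (not SlimServer code): turn %XX escapes into raw bytes.
sub unescape_url_path {
    my ($path) = @_;
    $path =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $path;
}

my $bytes = unescape_url_path('/shaketunes/V%E4rttin%E4/Miero/12-Eerama.ogg');

# Interpreted as ISO-8859-1, the bytes give the expected name.
my $latin1 = decode('iso-8859-1', $bytes);

# Interpreted strictly as UTF-8, decoding fails: 0xE4 announces a
# multi-byte sequence that is never completed.
my $copy    = $bytes;    # decode() with a CHECK value may modify its input
my $utf8_ok = eval { decode('UTF-8', $copy, FB_CROAK); 1 } ? 1 : 0;

binmode(STDOUT, ':encoding(UTF-8)');
print "as latin1: $latin1\n";
print "valid UTF-8: ", ($utf8_ok ? "yes" : "no"), "\n";
```

If something like this happens on a UTF-8 system, the lone 0xE4 bytes cannot be mapped to characters, which would be consistent with the ä's vanishing from the isFile() path.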
Comment 1 Chris Owens 2007-02-12 14:34:50 UTC
That character isn't stored by bugzilla either, but looking on teh Intarweb it appears to be an 'a' with an umlaut (two dots) over it.  

Ross, there are various playlist improvements in 7.0, so if you could try to repro this in 7.0 that would be great.
Comment 2 Ross Levine 2007-03-28 17:31:34 UTC
Sorry for the delay here. I'm using SuSE 10.2 with SlimServer 7.0, and there is a clear issue present. SuSE uses UTF-8, and SlimServer is using iso-8859-1. SlimServer isn't able to properly index any accented characters. I used the umlaut as well as the tilde over the n (ä & ñ, alt 0228 and alt 0241 in Windows) and both of these are returned with bizarre characters.

Should I try anything else? 
Comment 3 KDF 2007-07-02 10:45:39 UTC
How are you able to determine that SuSE is utf-8 while slimserver is iso?  If your own user is utf-8, it may be that the slimserver user is not set with the right locale.  On my setup, a default user is iso and I specifically have to add a line to the init to set the locale before starting slimserver.
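
One way to check which character set a Perl process sees under a given user is sketched below. This is an illustration only and not how SlimServer detects its charset; it uses the core POSIX and I18N::Langinfo modules and should be run as the same user, and with the same environment, as the server:

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);
use I18N::Langinfo qw(langinfo CODESET);

# Adopt the locale from the environment (LANG / LC_ALL / LC_CTYPE) of the
# user this process runs as, then ask which character set that implies.
setlocale(LC_CTYPE, '');
my $codeset = langinfo(CODESET);

print "LANG=", (defined $ENV{LANG} ? $ENV{LANG} : '(unset)'), "\n";
print "codeset: $codeset\n";
```

If an interactive login reports a UTF-8 codeset while the same script started from the init environment does not, the two users are running with different locales.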
Comment 4 Chris Owens 2007-10-22 10:09:24 UTC
This is very difficult to reproduce.

In some Linux installs we have noted that Slimserver seems to think the OS is running a different character set than it is.

So, I believe that it is a problem because I have talked to enough people that it's happening to, but if anyone could give detailed repro instructions, or a theory about how this might be happening, I'd love to hear it.

Andy, do you have any theories or guidance?
Comment 5 Chris Owens 2007-11-20 10:42:06 UTC
*** Bug 6153 has been marked as a duplicate of this bug. ***
Comment 6 Chris Owens 2007-11-20 10:45:12 UTC
Andy I'm not sure this should be assigned to you, feel free to pass it along or back to unassigned.
Comment 7 Chris Owens 2008-06-04 10:29:05 UTC
This stuff has been totally rewritten in recent builds.  Someone please let me know if you are still seeing this.
Comment 8 Chris Owens 2008-06-23 10:12:45 UTC
Michael to see if his command line option helps this behavior or not.
Comment 9 Michael Herger 2008-06-24 00:48:02 UTC
What's the locale on your Linux box? (Settings/Status)

If it's not utf8 though your box is configured to use utf8, then there's a new command line parameter in 7.1 which allows forcing a charset. Please add

--charset utf8

to your startup script and try again.
Comment 10 Michael Herger 2008-06-27 02:18:16 UTC
Anybody still seeing this issue?
Comment 11 Michael Herger 2008-06-30 10:28:58 UTC
Looking good to me. Feel free to re-open if needed.
Comment 12 Chris Owens 2008-07-30 15:29:56 UTC
This bug has now been fixed in the 7.1 release version of SqueezeCenter!  Please download the new version from http://www.slimdevices.com if you haven't already.  

If you are still experiencing this problem, feel free to reopen the bug with your new comments and we'll have another look.
Comment 13 Keith Briscoe 2008-08-08 23:23:14 UTC
I'm still seeing this on SC7.1/SuSE RPMs, with the --charset utf8 parameter.  Non-latin1 characters in the filename of the playlist itself are displayed garbled, and non-latin1 paths within the playlist are skipped.
Comment 14 Keith Briscoe 2008-08-10 21:23:59 UTC
Oh, the playlist in question is a pre-existing M3U playlist, not one created by SC.  That may be relevant.
Comment 15 Michael Herger 2008-08-10 22:37:57 UTC
How did you create/store this playlist? Are you running Windows, accessing the files over Samba? What does the file name look like from a shell?
Comment 16 Keith Briscoe 2008-08-11 08:47:20 UTC
More details: The OS is openSUSE 11.0, and I've verified in the web interface that SC is definitely using UTF8.  The file is stored on the local disk.  Within the OS, through the shell and the GUI, the filename of the playlist appears correctly, and the contents of the file do as well.  Also, when using Music Folder browsing within SC, the filenames of all of the music files (also stored locally) containing the same character display correctly.  The character in question is 0x2019 (apostrophe).  This problem seems specific to playlists.

The file was originally created using Amarok, and modified using Kate.
Comment 17 Chris Owens 2008-08-11 14:18:11 UTC
Could we get you to attach one of the affected playlist files so we could have a look?  Thanks!
Comment 18 Keith Briscoe 2008-08-11 15:45:04 UTC
Created attachment 3779
Playlist with Unicode in filename and contents

Here's the playlist.  The last track is the one that gets skipped (because of the apostrophe), and the playlist name gets mangled for the same reason.
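
To illustrate why this particular character is a useful test case, here is a small sketch showing that U+2019 takes three bytes in UTF-8 and has no ISO-8859-1 representation at all. `escape_bytes` is a hypothetical helper written for this example, not a SqueezeCenter function:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: percent-encode every byte outside printable ASCII.
sub escape_bytes {
    my ($bytes) = @_;
    $bytes =~ s/([^\x20-\x7E])/sprintf('%%%02X', ord($1))/ge;
    return $bytes;
}

my $apostrophe = "\x{2019}";    # RIGHT SINGLE QUOTATION MARK

# UTF-8 needs three bytes for this character.
my $utf8 = escape_bytes(encode('UTF-8', $apostrophe));         # %E2%80%99

# ISO-8859-1 has no slot for it; Encode substitutes '?' by default.
my $latin1 = escape_bytes(encode('iso-8859-1', $apostrophe));  # ?

print "$utf8\n$latin1\n";
```

Any step that passes the name through ISO-8859-1 therefore loses the character outright, unlike ä or ñ, which merely change their byte representation.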
Comment 19 Michael Herger 2008-08-14 04:24:10 UTC
Keith - do you know what locale you're using on that system? Would the scan work if you removed the --charset parameter from the startup script? This issue might be related to bug 9126.
Comment 20 Keith Briscoe 2008-08-14 21:28:14 UTC
I'm thinking it is related.  Here's the output of locale:

LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

If I remove the charset parameter, SqueezeCenter goes back to ISO-8859-1, mangles all of the filenames when browsing the Music Folder, and the playlist is still as mangled as always, skipping the same songs as before.  So I'm guessing if the charset parameter were working properly, it SHOULD fix both music and playlist filenames--but right now it's only fixing one.
Comment 21 Michael Herger 2008-08-15 03:01:52 UTC
Keith/QA - could you please try to reproduce this issue and get some scanner.log output with the scan.* plus format.playlists debugging enabled? I can't reproduce this. Thanks!
Comment 22 Markus Schiegl 2008-09-11 15:30:56 UTC
Created attachment 3973
tgz containing scanner.log (scan.* + format.playlists=debug) + playlist file

I'm seeing the same problem on 7.3r23146. System is linux, utf-8 and generally works without problems.

what i've done:
- removed all playlists
- wipe & scan
- created a new Playlist called "Test" which includes 2 Tracks (attached)
- wipe & scan
- duplicate entries have been created for those two tracks (different encoding used in the database for "url")

Is this information helpful?
Comment 23 ubermes 2008-09-15 14:07:24 UTC
I have the same problem. I'm Swedish, and all songs in a playlist that contain the Swedish characters åäö are skipped when re-scanning.

I'm running:
SqueezeCenter-version: 7.2 - 22900 @ Tue Aug 26 11:00:28 PDT 2008 - Red Hat - SV - utf8

I have noticed that the list is first scanned correctly and all songs are included, but it is when "Databasrensning nr 1" (database clean-up in English) is run that the songs with Swedish characters are removed.
Comment 24 Dominique Cote 2008-10-17 00:01:04 UTC
me too, guys. i posted this thread with the description: http://forums.slimdevices.com/showthread.php?t=53828

PLEASE FIX THIS! sorry for the yelling, but to me this is a *Critical* bug. my GF and me almost only use playlists to access our collection and this bug causes playback to stop and/or playlists to not scan correctly. this is spoiling our enjoyment of our squeezeboxes *big time*.

(sorry i cant contribute more, i am a linux novice.)
Comment 25 Michael Herger 2008-10-17 00:20:55 UTC
> (sorry i cant contribute more, i am a linux novice.)

Giving some information about your system doesn't require a whole lot of linux know how... at least the information on Settings/Status would be helpful, plus what kind of device you're using.
Comment 26 Dominique Cote 2008-10-17 06:02:05 UTC
hi michael, i hope the information you are looking for is in the thread i mentioned: http://forums.slimdevices.com/showthread.php?t=53828.

if there is any more information you guys need, please let me know exactly what it is and how to get it - i will be more than happy to provide anything i can!
Comment 27 Michael Herger 2008-10-17 06:08:20 UTC
Please add that information to this bug report. Threads sometimes can get pretty long and hard to find the information. Thanks.
Comment 28 Dominique Cote 2008-10-17 06:12:19 UTC
oh - sorry about that! i was actually trying to keep _this_ thread short by just linking to the thread... ;-)

here is my system:
SqueezeCenter Version: 7.2 - 22900 @ Tue Aug 26 10:59:02 PDT 2008 - Linux - EN - utf8
Server IP address: 192.168.0.102
Perl Version: 5.8.8 armv5tejl-linux-thread-multi
MySQL Version: 5.0.27 
Platform Architecture: armv5tejl-linux
Hostname: Turbonas
Server Port Number: 9001
Total Players Recognized: 3

on a qnap TS409 pro with FW 2.1.0 Build 0904T and SSOTS 3.15

if you need anything else, spam me! this bug is important to me. :-)
Comment 29 James Richardson 2008-10-17 10:30:53 UTC
Ross:  Can you have a look at this with an eye to 7.3.
Comment 30 Markus Schiegl 2008-10-26 16:14:00 UTC
this bug is annoying. i've already reduced my usage of playlists to a minimum but at least the zap playlist remains. I'd like to know if this issue is obsoleted by a possible upcoming "favorites-podcast-my-radio-playlist-all-inclusive" feature. Otherwise i'd take a look at this bug (no promise for a fix though). thanks!
Comment 31 Ross Levine 2008-11-11 12:58:01 UTC
Possibly has something to do with bug 9659. Easily reproduced with the same test bed as for that bug: with a track at /home/user/Music/Ä/artists/tracks.mp3, create a playlist (with the SqueezeCenter web interface), ensure it shows up in Browse - Playlists, then clear and rescan, and it is gone. 7.3 - 23875, Perl 5.10.0, Ubuntu 8.10. Michael, please let me know if there is anything else I can do that would be helpful.
Comment 32 Dominique Cote 2008-11-12 01:24:07 UTC
surely downgrading this bug's severity to "minor" will not impact the enthusiasm with which this bug will be attacked? ;-)
Comment 33 ubermes 2008-11-15 15:09:04 UTC
Why has this bug been reduced to minor? This bug makes the playlist functionality completely useless, therefore this is a major bug. I would also like to increase the priority of the bug. Can you please prioritize this bug!
Comment 34 Markus Schiegl 2008-11-16 10:12:19 UTC
From my point of view this is no minor bug, because it's not just an incomplete/buggy function (not all playlist entries are scanned correctly) but it even corrupts the database with (partially) wrong path and file names. Any function (plugin, selection, etc.) which relies on the db is at risk of fetching an invalid track.

As promised i took a look and this is my analysis + patch:

The file scanner (Scanner.pm) stores tracks with non-ascii characters (e.g. ä) as utf8/url encoded strings in the DB, eg: Ger%C3%A4usch, note the "%C3%A4", a typical utf8 sequence. The playlist scanner (M3U.pm) only url encodes this example string to "Ger%E4usch". Because of %C3%A4 != %E4 this creates two different track entries (but with the same meta-data) in the database.

I have replaced the conditional S::U::U::utf8encode_locale (which obviously doesn't trigger the encoding function) with an unconditional S::U::U::encode to force the correct encoding of this string:

Index: Slim/Formats/Playlists/M3U.pm
===================================================================
--- Slim/Formats/Playlists/M3U.pm	(revision 23939)
+++ Slim/Formats/Playlists/M3U.pm	(working copy)
@@ -89,7 +89,7 @@
 		next if $entry eq "";
 
 		$entry =~ s|$LF||g;
-		$entry = Slim::Utils::Unicode::utf8encode_locale($entry);	
+		$entry = Slim::Utils::Unicode::encode("utf8", $entry);
 		$entry = Slim::Utils::Misc::fixPath($entry, $baseDir);
 
 		if ($class->playlistEntryIsValid($entry, $url)) {

With this patch the playlist scanner now properly encodes ä to %C3%A4.
While this works perfectly fine on my system (linux gentoo, utf8) i won't claim that this fixes the problem (and works) on all systems, so try it out yourself.
I'll attach sample log files & a m3u playlist (created by SC).

It would be really appreciated if you (slimdevices :-) could take a look at this bug + patch and consider it (or something else which works) for 7.3.
I suspect other playlists (e.g. PLS.pm, line 66) may have the same issue...
thanks!
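
The mismatch described above can be reproduced in isolation. This is a minimal sketch, not the scanner's actual code path: `escape_bytes` is a hypothetical helper that percent-encodes non-ASCII bytes, and the two `encode` calls merely stand in for the file scanner and the playlist scanner respectively.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: percent-encode every byte outside printable ASCII.
sub escape_bytes {
    my ($bytes) = @_;
    $bytes =~ s/([^\x20-\x7E])/sprintf('%%%02X', ord($1))/ge;
    return $bytes;
}

my $name = "Ger\x{E4}usch";    # "Geräusch" as a Perl character string

# Encoded to UTF-8 before escaping (what the file scanner is said to store).
my $as_utf8 = escape_bytes(encode('UTF-8', $name));          # Ger%C3%A4usch

# Escaped from the single-byte form (what the playlist scanner produces).
my $as_latin1 = escape_bytes(encode('iso-8859-1', $name));   # Ger%E4usch

print "$as_utf8\n$as_latin1\n";
print "same URL? ", ($as_utf8 eq $as_latin1 ? "yes" : "no"), "\n";
```

Since the database is keyed on the URL string, two spellings of the same file are enough to produce two track rows.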
Comment 35 Markus Schiegl 2008-11-16 10:13:57 UTC
Created attachment 4257
test m3u playlist, created by SC
Comment 36 Markus Schiegl 2008-11-16 10:14:37 UTC
Created attachment 4258
scanner.log, without patch - note the different encodings of ä
Comment 37 Markus Schiegl 2008-11-16 10:15:43 UTC
Created attachment 4259
scanner.log, with patch from #34 - note the identical encodings of ä
Comment 38 Dominique Cote 2008-11-16 12:46:12 UTC
servus markus! (your name sounds bavarian? ;-)

i really appreciate your work - great stuff! i would love to try it, but unfortunately, i am not much of a perl expert... would you mind giving a brief explanation how to insert your patch in to M3U.pm? i dont want to break anything.
Comment 39 Michael Herger 2008-11-17 03:17:28 UTC
Are all of you running some Linux flavour? Could you please all post the system information as found in Settings/Information?
Comment 40 Michael Herger 2008-11-17 05:05:01 UTC
Markus - I'm still trying to reproduce this issue. Haven't been able so far. I'm now installing Ubuntu 8.10 to be sure.

One question: has this playlist you've uploaded been created using SC or some other application?
Comment 41 Michael Herger 2008-11-17 05:29:25 UTC
Ross - I've now installed Ubuntu with Perl 5.10 and still can't reproduce the issue :-(.

I've been testing with tons of files I've collected with umlauts, accents etc. in folder and file names, e.g.

~/Music/Guédron, P/Le Consert des Consorts - Le Poème Harmonique.flac

They all show up fine before and after scans. Ubuntu 8.04, 7.04, 8.10, OSX 10.5

Comment 42 Dominique Cote 2008-11-17 06:02:56 UTC
Created attachment 4272
A Playlist i feed my SC with (includes non-ASCII)

please note track 5 in line 5: ...café del mar... this is the kind of entry which is later missing from the SC-imported playlist.
Comment 43 Dominique Cote 2008-11-17 06:04:51 UTC
Created attachment 4273
this is SC's version of the same playlist

this is the output after i load Style Vocal.m3u in to SC, and then save it again. note the missing ...café del mar... (among many others).
Comment 44 Dominique Cote 2008-11-17 06:08:37 UTC
Created attachment 4274
this is one of the "offending" tracks

note that this track contains non-ascii also in its tags.
Comment 45 Markus Schiegl 2008-11-17 08:37:04 UTC
(In reply to comment #39)
> Are all of you running some Linux flavour? Could you please all post the
> system information as found in Settings/Information?

Gentoo 64bit, latest but stable

Version: 7.3 - TRUNK @ UNKNOWN
Hostname: <name>
Server IP Address: <ip>
Server HTTP Port Number: 9000
Operating system: Linux - EN - utf8
Platform Architecture: x86_64-linux
Perl Version: 5.8.8 - x86_64-linux-thread-multi
MySQL Version: 5.0.60-log
Total Players Recognized: 4

(In reply to comment #40)
> Markus - I'm still trying to reproduce this issue. Haven't been able so far.
> I'm now installing Ubuntu 8.10 to be sure.

Michael i appreciate your work! You have my sympathy when tracking bugs which are not reproduceable - i often have the same problem on my daily business. I'd offer you remote access to my box if you're interested and you think this helps - pm me, i've some spare time this evening (MET)

> One question: has this playlist you've uploaded been created using SC or some
> other application?

it was created by SC. I've set up a Music Library with only 3 tracks (see log), put this album on the now playing list and saved this as "test.m3u". Did another rescan and now every track is doubled. This gave me 6 tracks (of which 3 are invalid)...

@Dominique: Did you receive my mail about how to test the patch? Did it work?

thanks,
Markus
Comment 46 Michael Herger 2008-11-17 09:07:30 UTC
Ross - could you give me access to your Ubuntu 8.10 VM?
Comment 47 ubermes 2008-11-19 11:33:50 UTC
Thank you Markus Schiegl!
I have tried your patch on my machine and it works fine.
I'm running ClarkConnect Community Edition 4.3.

Server-Information:
SqueezeCenter-version: 7.2.1 - 23630 @ Mon Oct 20 19:52:55 PDT 2008 - Red Hat - SV - utf8
Serverns IP-adress: 192.168.0.2
Perl-version: 5.8.5 i386-linux-thread-multi
MySQL-version: 4.1.20
Plattformsarkitektur: i686-linux
Värdnamn: gateway.clarkconnect.lan
Serverportnummer: 9000
Totalt antal anslutna spelare: 1
Comment 48 Michael Herger 2008-11-20 09:05:59 UTC
Markus - we can't hardcode that encoding, as it will break some other platform for sure...

Could you please give the following patch a try? I think we missed to fix the logic in that condition when the function was extended beyond the simple x->utf8 encoding:

Index: /Users/mh/Documents/workspace/7.3/server/Slim/Utils/Unicode.pm
===================================================================
--- /Users/mh/Documents/workspace/7.3/server/Slim/Utils/Unicode.pm	(revision 23981)
+++ /Users/mh/Documents/workspace/7.3/server/Slim/Utils/Unicode.pm	(working copy)
@@ -450,7 +450,7 @@
 
 	# Check for doubly encoded strings - and revert back to our original
 	# string if that's the case.
-	if ($string && $] > 5.007 && encodingFromString($string) eq 'utf8') {
+	if ($string && $] > 5.007 && encodingFromString($string) ne $encoding) {
 
 		$string = $orig;
 	}
Comment 49 Markus Schiegl 2008-11-20 09:40:08 UTC
(In reply to comment #48)
> Markus - we can't hardcode that encoding, as it will break some other platform
> for sure...
i'd bet on that, too...

> 
> Could you please give the following patch a try? I think we missed to fix the
> logic in that condition when the function was extended beyond the simple
> x->utf8 encoding:

Yes this fixes the problem and looks much more thought through than a hard coded encoding switch (which is indeed no solution).

Do you think we need some more tests with non-utf8 systems?

Thanks for your great commitment while tracking this bug!
Markus
Comment 50 Michael Herger 2008-11-21 01:20:34 UTC
change 23993 - please test the next nightly build _thoroughly_. Thanks!

Tested on Gentoo, OSX 10.5 and Windows XP. If this change is to cause side-effects, I'll probably pull it out for 7.3 again. Thanks for your understanding.
Comment 51 Ross Levine 2008-11-21 13:00:54 UTC
Michael, I'm still able to reproduce the issue from comment #31 with 7.3 - 23994. I really need to get you access to this VM; I'll speak to Mr. Wise today.

Note that when I create a playlist, a playlist directory is of course specified in SC settings, but that directory remains empty even after I check and make sure I can play the playlist. 
Comment 52 Markus Schiegl 2008-11-21 14:29:50 UTC
(In reply to comment #50)
> change 23993 - please test the next nightly build _thoroughly_. Thanks!
> 
> Tested on Gentoo, OSX 10.5 and Windows XP. If this change is to cause
> side-effects, I'll probably pull it out for 7.3 again. Thanks for your
> understanding.
> 

I'm sorry (for you and me) that change 23993 has problems with the MusicIP-Import using "Phil's method".

In MusicMagic::Importer.pm, line 339 utf8encode_locale is called with a not properly(?) encoded UTF8 track name (König instead of König).
Tracing in utf8encode_locale shows that the if-clause at line 431 evaluates to true:

431         if ($string && ($encoding ne 'utf8' || !Encode::is_utf8($string))) {
432 
433                 $string = Encode::encode($encoding, $string, $FB_QUIET);

With a non empty $string and $encoding='utf8' the last condition must be responsible why line 433 is executed. This results in a double-encoded $string (König).

Omitting the call to utf8encode_locale in Importer.pm...

Index: Importer.pm
===================================================================
--- Importer.pm	(revision 24011)
+++ Importer.pm	(working copy)
@@ -336,7 +336,7 @@
 
 			if ($1 eq 'file') {
 				# need conversion to the current charset.
-				$file = Slim::Utils::Unicode::utf8encode_locale($2);
+				$file = $2;
 			}
 			elsif ($1 eq 'active') {
 				$active = $2

solves the problem (all tracks are mixable now)

I don't know what's cause and effect in this scenario. Was the call in Importer.pm a fix because of a previously broken utf8encode_locale or did we break it now? Or it's a third option: There's still an issue in Unicode.pm (Does Encode::is_utf8($string) work as designed?)

If you need "anything" from me i'll help, OTOH i'd understand it, if this one gets a bit hot for 7.3...
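
On the question of whether Encode::is_utf8() works as designed: it reports only Perl's internal utf8 flag, not whether a byte string happens to contain valid UTF-8. The following sketch shows how that leads to double encoding. It is a standalone illustration using the Encode module directly, not the code from Unicode.pm, and `hex_dump` is a hypothetical helper:

```perl
use strict;
use warnings;
use Encode qw(encode is_utf8);

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_dump {
    my ($bytes) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $bytes;
}

my $chars = "K\x{F6}nig";               # "König" as characters
my $bytes = encode('UTF-8', $chars);    # correct UTF-8 bytes

# The bytes are valid UTF-8, but the internal utf8 flag is off for them,
# so a test based only on is_utf8() treats them as "not utf8 yet".
my $flag = is_utf8($bytes) ? 1 : 0;

# Encoding again treats each byte as a Latin-1 character and re-encodes it.
my $double = encode('UTF-8', $bytes);

print "flag:   $flag\n";                   # 0
print "single: ", hex_dump($bytes), "\n";  # 4B C3 B6 6E 69 67
print "double: ", hex_dump($double), "\n"; # 4B C3 83 C2 B6 6E 69 67
```

A file name that arrives as UTF-8 bytes without the flag would pass the `!Encode::is_utf8($string)` test at line 431 and be encoded a second time, which matches the behaviour traced above.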
Comment 53 Michael Herger 2008-11-21 23:25:26 UTC
change 24015 - partially reverting previous change

I'm sorry, don't want to take the risk of an unknown number of side-effects at this point. One already is known now :-(. I'll re-target this issue for 7.3.1 with increased priority.

Markus - could you please test again? SC should now show the same behaviour as before: broken playlists, working MusicIP. Please tell me if this isn't the case. Thanks for your understanding.
Comment 54 Markus Schiegl 2008-11-22 00:55:05 UTC
three times yes...
- MusicIP Import works again without patching
- Playlist Import has the known issue
- i do understand that this is a bit like a minefield and we should be better safe than sorry...
Comment 55 Dominique Cote 2008-11-23 23:19:22 UTC
i also tested markus' patch in comment #34 and can confirm it works for me. i don't use music IP and can't see any other irregular behavior, so i am going to leave the "hack" in my system for now.
thanks a *lot* markus! :-)
Comment 56 ubermes 2008-12-01 12:33:50 UTC
I have noticed that it is not only the playlist functionality that has problems with accented characters. The same problem also applies to the search functionality. When searching for something that contains accented characters, the result is empty (search doesn't find anything), even though the song (or artist) exists in the database.

PS. The search test has been done from SqueezeCenter interface.
Comment 57 Markus Schiegl 2008-12-01 12:39:16 UTC
there was a bug (which i experienced myself) in <7.3 which has recently been fixed: bug 8637. Could you verify you still have this problem in a new 7.3 nightly? (or wait until 7.3 is released)
Comment 58 ubermes 2008-12-03 09:07:12 UTC
Now I have tried 7.3 build 24158, and yes the problem with search and accented characters is fixed. Good.
Comment 59 Michael Herger 2008-12-12 06:32:03 UTC
Markus - I'm still pretty sure my patch was ok. Going back in the history of that function I think we didn't keep up with all the changes. In the beginning it was hardcoded to encode anything to utf8. It then was weakened to encode utf8 to anything too. But some of the checks are still very strong, eg. is_utf8() only checking for the utf8 flag, but not doing the guesswork we otherwise do. Therefore I'd suggest this patch:


Index: /Users/mh/Documents/workspace/7.3/server/Slim/Utils/Unicode.pm
===================================================================
--- /Users/mh/Documents/workspace/7.3/server/Slim/Utils/Unicode.pm	(revision 24287)
+++ /Users/mh/Documents/workspace/7.3/server/Slim/Utils/Unicode.pm	(working copy)
@@ -428,7 +428,7 @@
 	# Don't try to encode a string which isn't utf8
 	# 
 	# If the incoming string already is utf8, turn off the utf8 flag.
-	if ($string && ($encoding ne 'utf8' || !Encode::is_utf8($string))) {
+	if ($string && ($encoding ne 'utf8' || encodingFromString($string) ne $encoding) {
 
 		$string = Encode::encode($encoding, $string, $FB_QUIET);
 
@@ -439,7 +439,7 @@
 
 	# Check for doubly encoded strings - and revert back to our original
 	# string if that's the case.
-	if ($string && $] > 5.007 && encodingFromString($string) eq 'utf8') {
+	if ($string && encodingFromString($string) ne $encoding) {
 
 		$string = $orig;
 	}

 	
Does it change anything for good?
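
The difference between a flag check and a content-based check can be sketched as follows. The internals of encodingFromString() are not shown in this thread, so `looks_like_utf8` and `to_utf8_once` are hypothetical stand-ins that only illustrate the idea of guessing from the bytes instead of trusting the utf8 flag:

```perl
use strict;
use warnings;
use Encode qw(encode decode is_utf8 FB_CROAK);

# Hypothetical content-based check: true if the byte string contains
# non-ASCII bytes and decodes cleanly as UTF-8.
sub looks_like_utf8 {
    my ($bytes) = @_;
    return 0 unless $bytes =~ /[\x80-\xFF]/;
    my $copy = $bytes;    # decode() with a CHECK value may modify its input
    return eval { decode('UTF-8', $copy, FB_CROAK); 1 } ? 1 : 0;
}

# Encode to UTF-8 only when the content is not already UTF-8 bytes.
sub to_utf8_once {
    my ($string) = @_;
    return $string if !is_utf8($string) && looks_like_utf8($string);
    return encode('UTF-8', $string);
}

my $already = encode('UTF-8', "K\x{F6}nig");   # UTF-8 bytes, flag off
my $latin1  = "K\xF6nig";                      # Latin-1 byte string

print to_utf8_once($already) eq $already ? "unchanged\n" : "re-encoded\n";
print to_utf8_once($latin1)  eq $already ? "converted\n" : "mismatch\n";
```

Such a guess is a heuristic: a Latin-1 string can by coincidence also be valid UTF-8, so it reduces double encoding without ruling it out entirely.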
Comment 60 Michael Herger 2008-12-12 07:43:21 UTC
Take this one...

Index: /Users/mh/Documents/workspace/7.3/server/Slim/Utils/Unicode.pm
===================================================================
--- /Users/mh/Documents/workspace/7.3/server/Slim/Utils/Unicode.pm	(revision 24287)
+++ /Users/mh/Documents/workspace/7.3/server/Slim/Utils/Unicode.pm	(working copy)
@@ -428,7 +428,7 @@
 	# Don't try to encode a string which isn't utf8
 	# 
 	# If the incoming string already is utf8, turn off the utf8 flag.
-	if ($string && ($encoding ne 'utf8' || !Encode::is_utf8($string))) {
+	if ($string && ($encoding ne 'utf8' || encodingFromString($string) ne $encoding)) {
 
 		$string = Encode::encode($encoding, $string, $FB_QUIET);
 
@@ -439,7 +439,7 @@
 
 	# Check for doubly encoded strings - and revert back to our original
 	# string if that's the case.
-	if ($string && $] > 5.007 && encodingFromString($string) eq 'utf8') {
+	if ($string && encodingFromString($string) ne $encoding) {
 
 		$string = $orig;
 	}
Comment 61 Michael Herger 2008-12-12 07:44:30 UTC
Created attachment 4417
bigger change...

if you're really adventurous, please give this patch a try and test with all you can: tracks from files, playlists, MusicIP etc. Thanks!
Comment 62 Markus Schiegl 2008-12-12 16:20:23 UTC
Michael, i think you're on the most promising track to date. This time we've got two working patches: from comment #60 & comment #61 !

Up to now both are working without any difference and any problems.

They successfully passed the following test areas with "special" files:
- multiple wipe&scans finds all files
- playlists (save, playback, rescan playlist only)
- mip (mip scan, create mip mixes, mip mix contains all types of files)
- used the webfrontend, player's ui and controller

I tried quite hard (and long) but failed to run into any encoding-related errors :-)

@ubermes,@Dominique: It'd be good if you can test one (or both) patches on your system, too (and don't forget to remove my hackish patch before :-). thanks!

Good work!
Comment 63 Michael Herger 2008-12-15 04:33:40 UTC
Thanks a lot for your extensive testing!

change 24300 - let's hope this fixes this issue (and potentially others) for good
Comment 64 Keith Briscoe 2008-12-16 18:51:29 UTC
I'm not sure what this means, but I'm using a nightly build (Version: 7.3.1 - 24324 @ Tue Dec 16 03:00:06 PST 2008) and there is no visible change in playlist behavior after a full rescan of my library.

I'm using the playlist example in attachment#3779 and the name of the playlist is mangled, and the tracks with extended characters are dropped from the playlist, as before.

Same system specs as before (SuSE 11.0):
Operating system: SuSE - EN - utf8
Platform Architecture: x86_64-linux
Perl Version: 5.10.0 - x86_64-linux-thread-multi
MySQL Version: 5.0.51a

Let me know if any more information would be helpful.
Comment 65 Michael Herger 2008-12-16 23:55:43 UTC
This sucks. I wonder whether perl 5.10 is acting differently than 5.8. Will have to install a 5.10 based system.
Comment 66 Markus Schiegl 2008-12-17 00:05:38 UTC
(In reply to comment #65)
> This sucks. I wonder whether perl 5.10 is acting differently than 5.8.

this was my concern, too - and after reading about some utf8-changes in the upcoming perl 5.8.9 (http://www.heise.de/newsticker/Letztes-grosses-Release-der-Perl-5-8-x-Serie-veroeffentlicht--/meldung/120477 - sorry to all non-german readers) i'm curious what will happen then.
Comment 67 Ross Levine 2008-12-18 14:33:48 UTC
I'm still able to reproduce the playlist issue from comment #31 with 7.3 - 24367.
Comment 68 Michael Herger 2008-12-18 14:41:34 UTC
Ross - this is also on a perl 5.10 based system?
Comment 69 Ross Levine 2008-12-18 14:51:43 UTC
(In reply to comment #68)
> Ross - this is also on a perl 5.10 based system?

Yes. Ubuntu 8.10.
Comment 70 ubermes 2008-12-22 04:03:33 UTC
I can confirm that change 24300 (comment#63) fixes this issue for me. I'm running perl 5.8.5, see comment#47.
Comment 71 Chris Owens 2008-12-22 09:31:08 UTC
So there's still a problem for users using perl 5.10
Comment 72 Oliver Dolny 2009-01-02 14:16:41 UTC
I am glad I found this thread.

My system:
Version: 7.3.2 - 24460 @ Fri Jan 2 03:12:08 PST 2009
Betriebssystem: Debian - DE - utf8
Plattformarchitektur: i686-linux
Perl-Version: 5.8.8 - i486-linux-gnu-thread-multi
MySQL-Version: 5.0.32-Debian_7etch8
Anzahl erkannte Player: 1

This is German, but I hope it's easy to understand.

I can confirm this bug still exists with the system above. In my view it got worse. For some days now (I just upgraded my system via aptitude, including SC version 7.3.2 among others) I do not only have problems with accented characters in playlists (like I had before) but I am also losing 'regular' tracks now without any special characters. I can access all files from the music library, just not in playlists.

I am still in the process of getting used to Linux. So please excuse, if I did not provide enough information. Just tell me, what else you would like to know.




Comment 73 Michael Herger 2009-01-06 02:43:54 UTC
Oliver - if you're missing regular tracks, then this is a different issue. Please open a new bug or better ask in the forums for some help first.
Comment 74 Michael Herger 2009-01-06 03:43:24 UTC
Ross - either I don't understand the remaining open issue, or it's working as expected for me:

- installed a Ubuntu 8.10 here, running perl 5.10, latest SC 7.3. 
- created a playlist with "Björk - Medùlla" and "für usszeschnigge" in the path of two tracks
- saved playlist
- wipe/rescan
-> playlist is scanned fine, all tracks (special characters or not) are found.

Could you please describe what exactly doesn't work for you. Please describe it step by step as if this was a new bug. The going back and forth between comments is making this a bit hard to follow :-)

This is with 7.3/trunk rev. 24523
Operating system: Debian - DE - utf8
Perl version: 5.10.0 - i486-linux-gnu-thread-multi
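
For anyone who wants to repeat the steps above with a hand-made file, here is a minimal sketch in Python that writes a UTF-8 playlist containing the two directory names from this comment. The track file names, the playlist name `accent-test.m3u`, and the helper `write_test_playlist` are made up for illustration; they are not taken from the server or from the test setup described here.

```python
import tempfile
from pathlib import Path


def write_test_playlist(directory: Path) -> Path:
    """Write a small UTF-8 encoded M3U file with accented path names."""
    tracks = [
        "Björk - Medùlla/01 track.mp3",    # hypothetical track file name
        "für usszeschnigge/01 track.mp3",  # hypothetical track file name
    ]
    playlist = directory / "accent-test.m3u"
    content = "#EXTM3U\n" + "\n".join(tracks) + "\n"
    # Write bytes explicitly so the on-disk encoding is unambiguous.
    playlist.write_bytes(content.encode("utf-8"))
    return playlist


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = write_test_playlist(Path(tmp))
        print(path.read_bytes())
```

The tracks themselves would still have to exist under the music folder for a rescan to find them; the sketch only produces the playlist file.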
Comment 75 Ross Levine 2009-01-06 15:51:44 UTC
I feel terrible, my test was flawed. I just figured out the problem and it was a permissions issue. 

I no longer see disappearing playlists now that SC can write to my playlist directory. So sorry Michael!!
Comment 76 Michael Herger 2009-01-06 23:30:52 UTC
Thanks Ross. This leaves us with Keith's comment #64

> I'm using the playlist example in attachment#3779 [details] and the name of the playlist
> is mangled, 

I've opened a new bug 10361 about the name mangling (before I realised this might be related to this issue here). What you're seeing is the file's 8.3 backwards-compatibility name (good old DOS...) that Windows carries around internally.

> and the tracks with extended characters are dropped from the
> playlist, as before.

Please let me know if this is still true with the latest nightly builds.

Ross - do you have a SuSE Linux you could test with?
Comment 77 Michael Herger 2009-01-21 05:33:05 UTC
Ross? I'm still trying to get some SuSE running. Failed with 11.0, trying 11.1 now. Do you have a working SuSE installation on which you can reproduce?
Comment 78 Ross Levine 2009-01-21 16:30:30 UTC
(In reply to comment #77)
> Ross? I'm still trying to get some SuSE running. Failed with 11.0, trying 11.1
> now. Do you have a working SuSE installation on which you can reproduce?

Very sorry Michael but I haven't got SqueezeCenter on SuSE running. I've tried but I can't get it to install in OpenSuSE 11, I'm stuck at:

error: Failed dependencies:
    /usr/bin/mysqld_safe is needed by squeezecenter...
Comment 79 Keith Briscoe 2009-01-21 20:31:16 UTC
You'll need to install mysql--it's not included in the default setups.  FWIW openSUSE 11.0 and 11.1 show the same behavior so you could use either for testing.
Comment 80 Ross Levine 2009-01-22 16:31:37 UTC
Created attachment 4690 [details]
playlist log, umlauts

Thank you Keith. I was trying to dig up mysqld_safe, you saved me a lot of trouble! 

Michael, this is reproducible with OpenSuSE 11, SC 7.3.3 - 24731. I also couldn't get SuSE working, running into a network adapter issue with VMware.
Comment 81 Moonbase 2009-01-31 08:36:44 UTC
Unfortunately, the ".m3u" playlist format was an "ad-hoc" creation by Nullsoft (Winamp) a long time ago, and there are no official specs. Nevertheless, the general consensus nowadays seems to be that M3U only supports the Latin-1 (ISO-8859-1) character set while the (quite new) ".m3u8" format supports UTF-8 encoding.

Many programs like Mp3tag will even silently try to convert UTF-encoded file paths to ISO-8859-1 (and fail), resulting in entries like:

#EXTINF:30,Alfred E. Moonbase - Test #08: HebRus / ????? / ???????
D:\Temp\Testdaten\MusicIP\Moonbase, Alfred E. - Test #08_ HebRus _ ????? _ ???????.mp3

from my test case, which SHOULD have been:

#EXTINF:30,Alfred E. Moonbase - Test #08: HebRus / עברית / Русский.mp3
D:\Temp\Testdaten\MusicIP\Moonbase, Alfred E. - Test #08_ HebRus _ עברית _ Русский.mp3

Programs like iTunes, Winamp, foobar2000 and the like also tend to differentiate between ".m3u" and ".m3u8" playlists and simply skip any non-Latin-1 entries in ".m3u" files.

So the question arises if SC should support ".m3u8" (and maybe XSPF, also in wide use and supporting UTF-8) in addition to ".m3u" and use a more "compliant" playlist file type for scanning/storing.
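
The "?????" entries above are what a lossy conversion to Latin-1 typically looks like. A minimal sketch in Python, assuming the tool simply replaces every character that ISO-8859-1 cannot represent (the helper name `to_latin1_lossy` is made up for this example, and whether Mp3tag works exactly this way is an assumption):

```python
def to_latin1_lossy(text: str) -> str:
    """Round-trip text through ISO-8859-1, replacing unencodable characters with '?'."""
    return text.encode("latin-1", errors="replace").decode("latin-1")


if __name__ == "__main__":
    # Hebrew and Cyrillic have no Latin-1 code points, so each letter becomes '?'.
    print(to_latin1_lossy("HebRus / עברית / Русский"))
    # Western European accents exist in Latin-1 and survive unchanged.
    print(to_latin1_lossy("Björk"))
```

Once the characters have been replaced, the original path cannot be recovered from the playlist, so no scanner can find the file.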
Comment 82 Moonbase 2009-01-31 09:35:53 UTC
Created attachment 4734 [details]
Working M3U8 playlist for the test case (UTF-8+BOM)

Here is a working ".m3u8" playlist that has relative file paths and is UTF-8 encoded (plus BOM). It works under Windows with my "odd filenames" test case:
http://www.kaufen-ist-toll.de/download/radio/Moonbase%20-%20Windows%20Unicode%20Filenames%20Test%20Suite.rar
http://www.kaufen-ist-toll.de/download/radio/Moonbase%20-%20Windows%20Unicode%20Filenames%20Test%20Suite%20FLAC.rar
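
A reader of such a file has to cope with the BOM. A minimal sketch in Python, assuming a plain line-based parser (the function `read_playlist_entries` is hypothetical and not the server's M3U code):

```python
import codecs


def read_playlist_entries(raw: bytes) -> list:
    """Decode playlist bytes as UTF-8 (BOM optional) and return the track paths."""
    # "utf-8-sig" strips a leading BOM if present and otherwise acts like "utf-8".
    text = raw.decode("utf-8-sig")
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):  # skip blanks and #EXT... directives
            entries.append(line)
    return entries


if __name__ == "__main__":
    sample = codecs.BOM_UTF8 + "#EXTM3U\n#EXTINF:30,Test\nMusik/Björk.mp3\n".encode("utf-8")
    print(read_playlist_entries(sample))
```

With a plain "utf-8" decode the first line would start with the invisible U+FEFF character, so `#EXTM3U` would no longer look like a directive and would be treated as a track path.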
Comment 83 Moonbase 2009-01-31 10:14:10 UTC
Here is an example of how to MAKE ".m3u8" playlists with Mp3tag (it doesn’t yet do this natively):
http://www.anytag.de/forums/index.php?showtopic=8925&hl=
Comment 84 Michael Herger 2009-02-17 00:22:02 UTC
Gee... I'm lost in the dozens of comments in this bug report. Please let's stick with the original issue and open new bugs for new/different issues. Thanks.

What's the remaining problem? We still see this misbehaviour on some Linux distributions, but not all of us are able to reproduce it?

Those who can: how did you create your files? If you've been using a Windows machine, how did you transfer the files to the Linux boxes?
Comment 85 Keith Briscoe 2009-02-17 10:46:33 UTC
As far as I can tell, the M3U in attachment#3779 [details] shows both problems:

1) The scanner mangles the name of the playlist (due to extended characters in the filename)
2) The scanner cannot find some tracks on the playlist (those with extended characters in their filename)

This is repeatable with openSUSE 11.0 and 11.1, using local files and a UTF8 locale.  The M3U in the attachment was hand-created to trigger the bug, but I've seen it with programmatically-created ones.

The fact that the playlist name is getting mangled means that the M3U parsing code may not even be involved in the bug--it's a filename/locale issue, at least in part.
Comment 86 Chris Owens 2009-03-16 09:34:03 UTC
We are now planning to make a 7.3.3 release.  Please review your bugs (all marked open against 7.3.3) to see if they can be fixed in the next few weeks, or if they should be retargeted for 7.4 or future.

Thanks!
Comment 87 Chris Owens 2009-03-30 17:29:29 UTC
Since there's now a planned 7.3.3 release, bugs which won't make the cut-off are being moved to the next target out.  If you feel that this bug needs to be addressed more (or less) urgently than the 7.4 release, please cc chris@slimdevices.com and leave a comment in the bug to that effect so we can review it.

Thanks.
Comment 88 Chris Owens 2009-03-31 08:51:18 UTC
For some reason Bugzilla did not change the target when I did this yesterday.  Or maybe it was me.  In either case, I'm trying it again.
Comment 89 Dominique Cote 2009-06-08 03:56:19 UTC
not sure if i am helping and/or contributing anything useful, but:

recently upgraded to SC 7.3.2. --> problem returned.
reapplied markus' patch to m3u.pm --> problem went away. happy again.
Comment 90 Michael Herger 2009-06-08 10:59:53 UTC
Andy - is this an issue which should be covered by your recent changes too?

Others - can you reproduce/confirm your issues with the latest 7.4 nightlies (_not_ 7.4/trunk!)?
Comment 91 Keith Briscoe 2009-06-08 20:35:47 UTC
Tried a 7.4 nightly (squeezecenter-7.4-0.1.26940.noarch.rpm)

After a full rescan, the playlist in attachment#3779 [details] now shows the following behavior:
- Playlist name is still mangled, as before
- Playlist appears empty in SqueezeCenter.  I verified that the contents had not changed.  I assume it can't read the file contents due to the name mangling.
Comment 92 Michael Herger 2009-07-27 08:16:14 UTC
assigning scanner related bugs to Andy
Comment 93 Andy Grundman 2009-07-29 14:58:29 UTC
Moving 7.4 bugs to 8.0.
Comment 94 Michael Herger 2009-11-30 10:10:49 UTC
Almost 100 comments - I've lost track, I'm sorry.

Does anybody on this list still have issues with the latest 7.5 and playlists?
Comment 95 Keith Briscoe 2009-11-30 14:46:31 UTC
The problem certainly exists as described in comment#91 in the 7.4 release.  Has anything changed on the 7.5 branch that might change this?  If so, I can test that file again.
Comment 96 Keith Briscoe 2009-11-30 19:46:50 UTC
Verified behavior is the same as described in comment#91 using nightly 7.5 build.

System: openSUSE 11.1 x86-64, using squeezeboxserver-7.5.0-0.1.29503.noarch.rpm
Comment 97 Dominique Cote 2009-12-07 03:41:42 UTC
according to bugzilla, P2/"normal" indicates:

"regular issue, some loss of functionality under specific circumstances"

however, this is probably only the case in ENGLISH speaking countries with little or no non-ascii character use.

for me/us (my romanian/hungarian GF) this bug is more accurately categorized as "major loss of function", since fully 2/3 of her romanian/hungarian tracks have all kinds of accents etc in them. i personally have a large number of german tracks, which frequently use "umlauts". äöü and ß.

the only way i can currently use our (three) squeezeboxes is by implementing markus' hack every time i upgrade SC.

thank goodness it doesn't seem to break anything else...

considering that logitech is a truly global company and needs to provide products that work all over the world (including my locale), i request to upgrade this bug's priority back to P1.

oh, and perhaps this bug's three year birthday (!) would be a good reason to finally fix it? ;-)
Comment 98 Michael Herger 2010-02-10 06:38:09 UTC
*** Bug 15463 has been marked as a duplicate of this bug. ***
Comment 99 SVN Bot 2010-02-12 11:11:08 UTC
 == Auto-comment from SVN commit #30147 to the  repo by agrundman ==
 == https://svn.slimdevices.com/?view=revision&revision=30147 ==

Bug 4578, support m3u8 playlist extension
Comment 100 Andy Grundman 2010-02-12 11:18:35 UTC
I think we need to mark this one as fixed.  If you want to use playlists with non-Latin characters they need to be encoded in UTF-8.  I've added support for the m3u8 extension, although there is no actual difference in the way the server treats these files.  You can use m3u files as long as they are UTF-8 encoded.  Do not use Latin1 or any other encoding for playlists.  The server can't be expected to correctly guess what encoding a given playlist is using.
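
To illustrate why guessing is unreliable, here is a minimal sketch in Python (not the server's Perl code; `decode_playlist` is a made-up helper). The same bytes decode without error under several single-byte encodings, so a decoder has no way of telling which one was meant, whereas invalid UTF-8 can at least be detected and rejected:

```python
def decode_playlist(raw: bytes) -> str:
    """Decode as UTF-8, or fail loudly instead of guessing another encoding."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("playlist is not valid UTF-8; re-save it as UTF-8") from exc


if __name__ == "__main__":
    latin1_bytes = "für".encode("latin-1")  # b'f\xfcr'
    # Both decodes succeed, but only one gives the intended text.
    print(latin1_bytes.decode("latin-1"))  # ü as intended
    print(latin1_bytes.decode("cp1251"))   # a Cyrillic letter in place of ü
    try:
        decode_playlist(latin1_bytes)
    except ValueError as err:
        print(err)
```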
Comment 101 Keith Briscoe 2010-02-14 18:27:22 UTC
I just tested the attachment from comment#91 again and it still shows the same behavior.  This attachment is UTF-8 encoded (filename and file contents) and I'm testing it on a system with a UTF-8 locale (all files are local files).

As far as I can tell from what Andy wrote, this should be working now, or there's still a bug, right?

[Tested with squeezeboxserver-7.5.0-0.1.30158.noarch.rpm]
Comment 102 Markus Schiegl 2010-02-14 22:10:03 UTC
I second this. My problem has never been mixing different charsets; it occurred in 100%-utf8 land, i.e. everything was utf8: the filenames, the filesystem, and the charset of the process that created the very playlist which caused the problem later...so it was more of an encoding/decoding issue (of the identical charset)
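
As an illustration of how the "same" character can end up as different bytes even when everything is UTF-8, here is a sketch in Python of two well-known mechanisms: Unicode normalization forms and double encoding. Which of these (if either) was behind the log output in this bug is not established in this thread, so treat it as background only; `same_name` is a made-up helper.

```python
import unicodedata

nfc = unicodedata.normalize("NFC", "\u00e4")  # ä as one code point, U+00E4
nfd = unicodedata.normalize("NFD", "\u00e4")  # 'a' followed by combining U+0308

# Double encoding: UTF-8 bytes misread as Latin-1 and then encoded again.
double = nfc.encode("utf-8").decode("latin-1").encode("utf-8")


def same_name(a: str, b: str) -> bool:
    """Compare two names after bringing both to the same normalization form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(nfc.encode("utf-8"))   # b'\xc3\xa4'
    print(nfd.encode("utf-8"))   # b'a\xcc\x88'
    print(double)                # b'\xc3\x83\xc2\xa4'
    print(nfc == nfd, same_name(nfc, nfd))
```

A byte-wise comparison of a playlist entry against a file name fails in all of these cases, although a person reading the log sees the same letter.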
Comment 103 Keith Briscoe 2010-02-19 11:49:37 UTC
I don't know the protocol here, but I suppose that there is a possibility that some aspect of this bug was in fact fixed by the recent changes, and that what I'm seeing is a different problem entirely, deserving of a new bug.

So I've cloned this bug as bug#15739.  If nothing else, the new bug is less vague and has fewer comments...for now ;)
Comment 104 Chris Owens 2010-04-06 09:13:15 UTC
So, to summarize this bug, SbS tries to guess the encoding of the text in the playlist file.  We have made so many changes in this area changing things back and forth that it's a drain on our resources.  It is a very difficult challenge to guess the encoding of a playlist.

The team's consensus today on how to approach this for now is to make sure that UTF-8 always works.  I'll update this bug with some list of playlist tools that can be used to make playlists with UTF-8 encoding.

There's already a bug to fix UTF support: bug 15739
Comment 105 Chris Owens 2010-04-08 17:25:42 UTC
This bug has been marked fixed in a released version of Squeezebox Server or the accompanying firmware or mysqueezebox.com release.

If you are still seeing this issue, please let us know!
Comment 106 Simon Finch 2010-05-13 16:29:53 UTC
(In reply to comment #105)

> If you are still seeing this issue, please let us know!

I'm still seeing this bug -- all 100% UTF-8, SBS 7.5.0 (and tried 7.5.1) on Debian Squeeze, locale = UTF-8, m3u playlists created via web interface, tried m3u8 playlists via Mp3Tag on Windows.

I recently added a lot of Françoise Hardy albums -- the cedilla chokes playlist scanning (so I cut them) -- but even some acute accents -- file path:

/Volumes/SpMusic/SpSqueeze/Rock Pop/Music/Francoise Hardy/Blues 1962-1993/06 L'amitié.mp3

in an m3u8 playlist produces:

Warning: file:///Volumes/SpMusic/SpSqueeze/Rock%20Pop/Music/Francoise%20Hardy/Blues%201962-1993/06%20L%27amiti found in playlist:
file:///Volumes/SpMusic/Config/Playlists/Francoise%20Blues.m3u8 doesn't exist on disk - skipping!

(cross-posted in bug#15739)
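
For comparison, this is what a complete percent-encoded file URL for that path looks like when the UTF-8 bytes of the é are escaped as well. A sketch in Python using only the standard library; `path_to_file_url` is a made-up helper, not the server's own conversion routine:

```python
from urllib.parse import quote, unquote


def path_to_file_url(path: str) -> str:
    """Percent-encode a POSIX path as UTF-8 and prefix it with file://."""
    return "file://" + quote(path, safe="/")


path = "/Volumes/SpMusic/SpSqueeze/Rock Pop/Music/Francoise Hardy/Blues 1962-1993/06 L'amiti\u00e9.mp3"
url = path_to_file_url(path)

if __name__ == "__main__":
    print(url)
    # Decoding the URL again gives back the original path.
    print(unquote(url[len("file://"):]) == path)
```

The é becomes `%C3%A9`, followed by `.mp3`. The URL in the warning above stops right after `amiti`, i.e. at the first non-ASCII character.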
Comment 107 Leif Johansson 2010-12-30 12:28:11 UTC
I'm seeing this issue with 7.6.0 - r30575.