Bugzilla – Bug 10199
Most transcoding doesn't work with non-ascii-characters in filename
Last modified: 2011-05-09 17:35:39 UTC
this used to work, but broke in change 23758 (because of bug 8828) when flc mp3 * * # IFB... was replaced by flc mp3 * * # FB... I'll attach two logfiles so you can compare... system: 7.3trunk linux, utf8
Created attachment 4370 [details] player.source=info, current but _not_ working convert.conf
Created attachment 4371 [details] player.source=info, old but working convert.conf
Steven: your thoughts on this?
Alan, would this be something you would look at?
As written in comment #1, I've modified my convert.conf to enable playback of those files/tracks (if transcoding is enabled). Today I ran into a problem which probably has the same root cause: I tried to seek to another position using the web interface. This works as long as the filename doesn't contain any non-ASCII characters. If it does, the following output is printed to stdout (example):
-----
Error writing mp3 output
03 - Mägde und Knechte.flac: ERROR initializing decoder
init status = FLAC__STREAM_DECODER_INIT_STATUS_ERROR_OPENING_FILE
An error occurred opening the input file; it is likely that it does not exist or is not readable.
-----
I suppose the part which is responsible for reading the file has problems when determining the correctly encoded filename...
Created attachment 4508 [details] Proposed fix
The problem occurs in TranscodingHelper::tokenizeConvertCommand2() in the fragment: foreach (keys %subs) { $command =~ s/\$$_\$/$subs{$_}/; } The Perl string being substituted contains non-ASCII characters encoded as UTF8 but it does not have the UTF8 flag set. The substitute command notices the non-ASCII bytes (the result of the UTF8 encoding) and encodes them (again) as UTF8, resulting in a double encoding. The string in question was originally created by the URI module, which failed to cause the UTF8 flag to be set when it undid the RFC3986 percent encoding.
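The double encoding described above can be reproduced outside Perl. The following is only a minimal sketch in Python, not the server code: it assumes that a byte string without the UTF8 flag is treated as Latin-1 characters and then encoded to UTF-8 a second time. The filename is the one from the earlier error output.

```python
# Illustrative only: a Python stand-in for the Perl behaviour described above.
name = "M\u00e4gde und Knechte.flac"  # "Mägde und Knechte.flac"

# Bytes as they are stored on a UTF-8 filesystem.
utf8_bytes = name.encode("utf-8")

# Assumption: without the UTF8 flag, each byte is read as one Latin-1 character ...
misread = utf8_bytes.decode("latin-1")

# ... and the resulting characters are encoded as UTF-8 again.
double_encoded = misread.encode("utf-8")

print(utf8_bytes)
print(double_encoded)
```

The second byte string no longer names a file that exists on disk, which would be consistent with the decoder's "error opening file" message.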
Markus - you aren't using perl 5.10, are you?
We may want to backport this to 7.3.3 if QA has sufficient testing resources.
Change 24594 Assigning to QA for testing (marked fixed in Changelog). This would need careful testing with transcoding of files whose names include non-ascii characters. The testing should cover a variety of different non-ASCII character sets and across platforms set up for a similar variety of non-English locales. Michael may be able to offer some advice.
I prepared a test suite of files with Windows Unicode filenames for bug #9772, you could use that as a "uniform test file set" for a start.
QA to try to repro this on Vista, Mac, and linux to judge the urgency. Retarget as appropriate.
(In reply to comment #12) > QA to try to repro this on Vista, Mac, and linux to judge the urgency. > Retarget as appropriate. James and myself have not been able to reproduce the reported issue with SC 7.3.1 on Windows Vista, Mac OS X and Ubuntu. I would like to be able to reproduce before moving on to test the proposed fix. Clear reproduction instructions are appreciated. Changing targeting to 7.3.3 for now.
Created attachment 4652 [details] player.source=info, working... (In reply to comment #8) > Markus - you aren't using perl 5.10, are you? > no. it's 5.8.8 I just returned from an extensive voyage (without internet :-) and am still trying to catch up on my various online projects. Reading Steven's response i tried to reproduce the problem after having updated SC to 7.3r24668 (+wipe&scan) without success, or the other way around: transcoding (mp3->flac) works now with all files. (see updated attached log file, line "Slim::Player::Song::open differs", though i tried to retain all other parameters, i.e. perl version, etc.) A short fly-past over hundreds of svn-commit messages didn't show anything closely related to me. Any ideas what has changed? (single "charset" change i found was change 24497). btw. seeking works again, too. kind regards, Markus
hmm, after my retest ("it works") last weekend with my transporter and bitrate limiting enabled the problem is now back with the original player (SB3).
1.) Another test with the transporter confirms the problem now - unfortunately i've no clue why it worked a few days ago, maybe my test was flawed... :-(
2.) The patch from comment #6 works!
Regarding "Clear reproduction instructions are appreciated" i do acknowledge this - although i have some problems myself with this at the moment as you see ;-)
my configuration:
Version: 7.3.3 - TRUNK @ UNKNOWN
Hostname: vserver1
Server IP Address: <IP>
Server HTTP Port Number: 9000
Operating system: Linux - EN - utf8
Platform Architecture: x86_64-linux
Perl Version: 5.8.8 - x86_64-linux-thread-multi
MySQL Version: 5.0.70-log
Total Players Recognized: 5
- no modifications to convert.conf, no custom-convert.conf
- bitrate limiting set to 128kbit/s, quality level 3
- crossfade enabled, smart mode, 10 seconds
Loglines:
[09-01-19 20:40:32.5364] Slim::Player::StreamingController::_setStreamingState (1829) new streaming state IDLE
[09-01-19 20:40:32.5369] Slim::Player::StreamingController::nextsong (761) The next song is number 1, was 0
[09-01-19 20:40:32.5420] Slim::Player::Song::new (64) index 1 -> file:///data/common/All%20Music/Roger%20Waters/%C3%87a%20Ira%20(1%20of%202)/02%20-%20Overture.flac
[09-01-19 20:40:32.5426] Slim::Player::StreamingController::_setStreamingState (1829) new streaming state TRACKWAIT
[09-01-19 20:40:32.5430] Slim::Player::Song::getNextSong (178) file:///data/common/All%20Music/Roger%20Waters/%C3%87a%20Ira%20(1%20of%202)/02%20-%20Overture.flac
[09-01-19 20:40:32.5434] Slim::Player::StreamingController::_nextTrackReady (653) 00:04:20:06:73:61: nextTrack will be index 1
[09-01-19 20:40:32.5437] Slim::Player::StreamingController::_Stream (927) 00:04:20:06:73:61: preparing to stream song index 1
[09-01-19 20:40:32.5440] Slim::Player::StreamingController::_Stream (942) Song queue is now 1
[09-01-19 20:40:32.5444] Slim::Player::Song::open (302) file:///data/common/All%20Music/Roger%20Waters/%C3%87a%20Ira%20(1%20of%202)/02%20-%20Overture.flac
[09-01-19 20:40:32.5464] Slim::Player::TranscodingHelper::getConvertCommand2 (454) Matched: flc->mp3 via: [flac] -dcs $START$ $END$ -- $FILE$ | [lame] --silent -q $QUALITY$ $RESAMPLE$ -v $BITRATE$ - -
[09-01-19 20:40:32.5475] Slim::Player::TranscodingHelper::getConvertCommand2 (454) Matched: flc->mp3 via: [flac] -dcs $START$ $END$ -- $FILE$ | [lame] --silent -q $QUALITY$ $RESAMPLE$ -v $BITRATE$ - -
[09-01-19 20:40:32.5478] Slim::Player::Song::open (323) seek=false time=0 canSeek=2
[09-01-19 20:40:32.5487] Slim::Player::TranscodingHelper::getConvertCommand2 (454) Matched: flc->mp3 via: [flac] -dcs $START$ $END$ -- $FILE$ | [lame] --silent -q $QUALITY$ $RESAMPLE$ -v $BITRATE$ - -
[09-01-19 20:40:32.5493] Slim::Player::Song::open (340) Transcoder: streamMode=F, streamformat=mp3
[09-01-19 20:40:32.5505] Slim::Player::Song::open (457) Tokenized command "/opt/SqueezeCenter_svn/7.3/trunk/server/Bin/i386-linux/flac" -dcs -- "/data/common/All Music/Roger Waters/ÃÂa Ira (1 of 2)/02 - Overture.flac" | "/usr/bin/lame" --silent -q 3 -v -B 128 - - & |
[09-01-19 20:40:32.5884] Slim::Player::StreamingController::_Stream (983) 00:04:20:06:73:61: stream
[09-01-19 20:40:32.5956] Slim::Player::StreamingController::_Stream (1012) Song queue is now 1
[09-01-19 20:40:32.5962] Slim::Player::StreamingController::_setPlayingState (1816) new playing state BUFFERING
[09-01-19 20:40:32.5965] Slim::Player::StreamingController::_setStreamingState (1829) new streaming state STREAMING
[09-01-19 20:40:33.9075] Slim::Player::Source::_readNextChunk (499) end of file or error on socket, song pos: 2218561
[09-01-19 20:40:33.9079] Slim::Player::Source::_readNextChunk (508) Didn't stream any bytes for this song, so just mark it as played
problem is the Directory called "Ça Ira"... Is anybody else able to reproduce this problem? Thanks!
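To make the log above easier to read, here is a small Python sketch (not the server's Perl code) that takes the file URL from the Song::open line, undoes the percent encoding, and compares the correct path with what a double encode would produce. The Latin-1 step is an assumption about how the mangled directory name in the tokenized command arises.

```python
from urllib.parse import unquote_to_bytes, urlsplit

# URL copied from the Slim::Player::Song::open log line.
url = ("file:///data/common/All%20Music/Roger%20Waters/"
       "%C3%87a%20Ira%20(1%20of%202)/02%20-%20Overture.flac")

# Undo the RFC 3986 percent encoding; the result is the raw on-disk bytes.
raw_path = unquote_to_bytes(urlsplit(url).path)

# Correct interpretation: the bytes are UTF-8.
good = raw_path.decode("utf-8")

# Assumed failure mode: bytes read as Latin-1, then encoded as UTF-8 again.
bad = raw_path.decode("latin-1").encode("utf-8")

print(good)
print(bad)
```

The directory name in `bad` starts with the bytes C3 83 C2 87, which matches the two-character garbage shown in front of "a Ira" in the tokenized command.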
Markus - with that patch applied, are you still able to BMF to an umlauted path? There's a forum thread where people complain about the change breaking BMF. http://forums.slimdevices.com/showthread.php?t=59183 QA to investigate.
Unfortunately this is true. On my side BMF (on SC) works only as long as there are no special characters in the path name. I can select "Die Ärzte" but no further selection is possible (e.g. "Jazz ist anders"). It's either empty or displays wrongly encoded directory names, e.g. 'Location: /data/common/All Music/Die �rzte/Geräusch (1 of 2)' :(
We are now planning to make a 7.3.3 release. Please review your bugs (all marked open against 7.3.3) to see if they can be fixed in the next few weeks, or if they should be retargeted for 7.4 or future. Thanks!
Since there's now a planned 7.3.3 release, bugs which won't make the cut-off are being moved to the next target out. If you feel that this bug needs to be addressed more (or less) urgently than the 7.4 release, please cc chris@slimdevices.com and leave a comment in the bug to that effect so we can review it. Thanks.
For some reason Bugzilla did not change the target when I did this yesterday. Or maybe it was me. In either case, I'm trying it again.
Created attachment 5087 [details] only utf8 encode if needed Markus - could you give this patch a try? Please test transcoding AND bmf. This approach stinks, as it requires one more disk-access. But I really don't know how else we could fix this without breaking BMF (see bug 11459)
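The attachment itself is not reproduced in this thread, so the following Python sketch only illustrates the general idea of "encode only if needed" and why it costs an extra disk access. The helper name `pick_existing` and the Latin-1 round trip are assumptions for illustration, not the contents of the patch.

```python
import os
import tempfile


def pick_existing(raw: bytes) -> bytes:
    """Hypothetical helper: return the byte form of a path that exists on disk."""
    if os.path.exists(raw):  # this check is the extra disk access
        return raw
    try:
        # Undo one layer of UTF-8 encoding (the reverse of a double encode).
        undone = raw.decode("utf-8").encode("latin-1")
    except (UnicodeDecodeError, UnicodeEncodeError):
        return raw
    return undone if os.path.exists(undone) else raw


with tempfile.TemporaryDirectory() as tmp:
    # Create a real file whose name contains a non-ASCII character.
    good = os.path.join(os.fsencode(tmp), "Die \u00c4rzte.flac".encode("utf-8"))
    open(good, "wb").close()

    # Simulate the double-encoded form of the same path.
    doubled = good.decode("latin-1").encode("utf-8")

    resolved = pick_existing(doubled)
    print(resolved == good)
```

Probing the filesystem like this hides the symptom rather than fixing where the encoding goes wrong, which may be why the approach is described as unsatisfying.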
Created attachment 5088 [details] player.source=info, w/ and w/o patch Hi, i applied your patch to 7.4r25893. While BMF works again it seems this breaks transcoding again (see the attached log file, it contains two distinct sections) - it won't start playing. If you need more logs or another test just ask. thanks, Markus
Michael, do you have time to continue working on this one?
*** Bug 12065 has been marked as a duplicate of this bug. ***
Unfortunately Alan's fix causes some other issues, notably the roundtrip is broken for fileURLFromPath <-> pathFromFileURL. This causes problems during rescan, because we see a completely different filename from what is in the database. Example (the album title here is Life²): my $url = 'file:///Music/Organized/Asura/Life%C2%B2/10%20-%20La%20Chanson%20De%20Carla.mp3'; my $file = Slim::Utils::Unicode::utf8on( URI->new($url)->file ); The call to utf8on() causes $file to become: "/Music/Organized/Asura/Life\xB2/10 - La Chanson De Carla.mp3"
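The broken round trip can be sketched in Python (again only as an analogy to the Perl helpers, whose internals are not shown here). The sketch assumes the mismatch arises when the decoded character U+00B2 is later written back as a single byte instead of as UTF-8.

```python
from urllib.parse import quote, unquote_to_bytes

# Path portion of the URL from the example above.
url_path = "/Music/Organized/Asura/Life%C2%B2/10%20-%20La%20Chanson%20De%20Carla.mp3"

# URL -> raw bytes -> URL round-trips cleanly when bytes are left alone.
raw = unquote_to_bytes(url_path)
round_tripped = quote(raw)

# Decoding turns the two bytes C2 B2 into the single character U+00B2.
chars = raw.decode("utf-8")

# Assumption: if that character is later emitted as one Latin-1 byte,
# the rebuilt URL no longer matches the one stored in the database.
mismatched = quote(chars.encode("latin-1"))

print(round_tripped == url_path)
print(mismatched)
```

The two URLs differ only in `%C2%B2` versus `%B2`, which is enough for a rescan to treat the track as a different file.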
Actually, nevermind, I realized the real problem was that the scanner code does not call Encode::decode('utf8', $file) as it should. It wasn't following the basic rules of: decode everything coming in, encode everything going out.
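The rule "decode everything coming in, encode everything going out" can be shown with a minimal Python sketch. The function names are hypothetical and UTF-8 is assumed as the filesystem encoding; this is not the scanner code.

```python
def decode_in(raw: bytes) -> str:
    """Hypothetical boundary helper: bytes from disk become characters."""
    return raw.decode("utf-8")


def encode_out(text: str) -> bytes:
    """Hypothetical boundary helper: characters become bytes for disk or a command line."""
    return text.encode("utf-8")


incoming = b"Life\xc2\xb2"        # directory name as read from the filesystem
name = decode_in(incoming)        # work with characters internally
outgoing = encode_out(name)       # convert back only at the boundary

print(len(incoming), len(name))
print(outgoing == incoming)
```

Keeping the conversion at the edges means every internal comparison (for example against database entries) sees the same character string.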
Is this issue still real or did Andy's recent changes fix it?
(In reply to comment #27) > Is this issue still real or did Andy's recent changes fix it? I can not test this at the moment, as i'm still running 7.4trunk and not the sqlite-branch. Maybe someone else can...
> I can not test this at the moment, as i'm still running 7.4trunk and not the > sqlite-branch. Maybe someone else can... Could you still give it a try? Even on trunk?
(In reply to comment #29) > > I can not test this at the moment, as i'm still running 7.4trunk and not the > > sqlite-branch. Maybe someone else can... > > Could you still give it a try? Even on trunk? sure. i tested 7.4trunk, r27942 as from today. status: - transcoding works just fine (it has since the day Alan committed the change from comment #6) - OTOH bmf is still broken (with special characters as described in my comments) applying your patch from comment #21 reverses the situation (again). If you want me to test anything else please ask. thanks!
*** Bug 11015 has been marked as a duplicate of this bug. ***
Chris to sort through this bug and determine the proper state and resolution
wait, what is BMF?
(In reply to comment #33) > wait, what is BMF? Browse Music Folder
this is an administrative shuffle on priority fields to help make better judgment on the top end of the priority list. P4->P5, P3->P4, and P2->P3.
*** Bug 14610 has been marked as a duplicate of this bug. ***
QA - could you please re-test this issue? With all the recent path handling stuff this might be influenced as well...
Related to Bug 14610. It's not fixed in Version: 7.4.1 - r28825 @ Tue Oct 13 04:04:48 PDT 2009
Hi! FLAC files with åäö in the path or filename don't play if Bitrate Limiting is on. The same file plays fine with no Bitrate Limiting. I had no problem with this on Ubuntu 9.04 and SC 7.3.4. Version: 7.4.1 - r28947 @ Tue Oct 20 07:59:38 PDT 2009 Hostname: vortexbox.localdomain Server IP Address: 192.168.1.10 Server HTTP Port Number: 9000 Operating system: Red Hat - SV - utf8 Platform Architecture: i686-linux Perl Version: 5.10.0 - i386-linux-thread-multi MySQL Version: 5.1.40 Total Players Recognized: 3 Kind Regards /Bernt
Bernt, could I get you to file a separate bug for that issue? It seems only slightly related to this bug (to me :) Thanks!
(In reply to comment #40) > Bernt, could I get you to file a separate bug for that issue? It seems only > slightly related to this bug (to me :) > > Thanks! I'm sorry, but my problems are also related to Bitrate Limiting, and in post 15 Markus said he also uses Bitrate Limiting. Why is it only slightly related? Could you try to explain where else transcoding is needed?
*** Bug 15555 has been marked as a duplicate of this bug. ***
Transcoding of files with non-ASCII characters in the filename is still an issue for me using version 7.4.1 of Slimserver. Adding the patch "Proposed fix" fixes this issue for me (interestingly, the file Misc.pm has the comments from the patch added, but not the actual change in the code).
(In reply to comment #43) > Transcoding of files with non-ASCII characters in the filename is still an > issue for me using version 7.4.1 of Slimserver. Adding the patch "Proposed > fix" fixes this issue for me (interestingly, the file Misc.pm has the comments > from the patch added, but not the actual change in the code). +1
-1. This patch is not working (and it is also for the wrong version). Please test every situation. For example, Browse My Files does not work correctly. Also, jumping from one file with special characters to another does not work. I think there is also a problem with files with special characters in folders with special characters.
(In reply to comment #45) > -1 this patch is not working (also for a wrong version) > > Please test every situation. For example Browse my files is not working > correct. > also jump from one file with special characters to an other will not work. > I think there is also a problem with files with special characters in folders > with special characters. You are right but I can live with that for now. At least I can play everything in my library with this patch.
(In reply to comment #45) > -1 this patch is not working (also for a wrong version) > > Please test every situation. For example Browse my files is not working > correct. > also jump from one file with special characters to an other will not work. > I think there is also a problem with files with special characters in folders > with special characters. I'm with Bernt on this one. This way I can play everything (SB Radio, SB Classic 3, Mplayer). Viewing of files from the Browse By Artist, Year, etc works, it's just browse by file folder that fails.
This is still an issue in Version: 7.4.2 - r30227. Kind Regards /Bernt
This is still an issue in Version: 7.5.0 - r30476 Kind Regards /Bernt
== Auto-comment from SVN commit #30673 to the slim repo by ayoung == == http://svn.slimdevices.com/slim?view=revision&revision=30673 == bug 10199: Most transcoding doesn't work with non-ascii-characters in filename Decode $filepath byte-string to character-string before substituting into transcoder command string. Apply same decode() fix to routine used for BMF as that already in place for main scan.
The patch above should be considered experimental. It would be helpful if someone could test it out under Windows.
(In reply to comment #51) > The patch above should be considered experimental. It would be helpful if > someone could test it out under Windows. I've backported this change to 7.5 r30611. Transcoding works like a charm, but BMF does not. The problem occurs when you have a folder/file structure such as: "/music/weihnachten/Rolf Zuckowski/Stille Nächte - Helles Licht/06 - Eine Chance Für Das Weihnachtsfest.flac" There is an umlaut in both the folder name and the file name, so this file is hidden in BMF. If no folder has an umlaut, everything is perfect.
Bernt and Icebird2000: can you confirm the character-set of the locale you are using (the locale that SbS runs under) and that of the underlying filesystem. I am sure that I used to be able to reproduce this but I no longer can.
Hello Alan, thanks for your hard work on this problem. I'll try to explain my setup and hope this helps you.
-------------------------------------------------------------------------------
SbS is installed on a Gentoo Linux system. The music folder is mounted as a cifs share from a Windows 2003 x64 server. Here is some system info:
-------------------------------------------------------------------------------
Squeezebox Server Status:
Version: 7.5.1 - r30611 @ Thu Apr 15 02:08:13 MDT 2010
Hostname: intra
Server IP Address: 192.168.222.3
Server HTTP Port Number: 9000
Operating system: Linux - EN - utf8
Platform Architecture: i686-linux
Perl Version: 5.8.8 - i686-linux-thread-multi
MySQL Version: 5.0.21-standard
Total Players Recognized: 2
-------------------------------------------------------------------------------
Init-script:
pidfile=/var/run/squeezecenter/squeezecenter.pid
logdir=/var/log/squeezecenter
cachedir=/var/cache/squeezecenter
prefsdir=${cachedir}/prefs
prefsfile=/etc/squeezecenter.prefs
scdir=/usr/local/squeezeboxserver
scuser=squeezecenter

#depend() {
# need net mysql
#}

start() {
    ebegin "Starting SqueezeCenter"
    cd /
    /usr/bin/nice --adjustment=${SC_NICENESS:-0} sudo -u ${scuser} \
        start-stop-daemon \
        --start --quiet \
        --name slimserver.pl \
        --exec ${scdir}/slimserver.pl -- \
        --quiet --daemon \
        --pidfile=${pidfile} \
        --cachedir=${cachedir} \
        --prefsfile=${prefsfile} \
        --prefsdir=${prefsdir} \
        --logdir=${logdir} \
        --audiodir=${SC_MUSIC_DIR} \
        --playlistdir=${SC_PLAYLISTS_DIR} \
        --charset utf8 \
        ${SC_OPTS}
    eend $? "Failed to start SqueezeCenter"
}
-------------------------------------------------------------------------------
cifs mount:
mount -t cifs //mainserver/music$ /music -o username=xx,password=xx,iocharset=utf8
-------------------------------------------------------------------------------
locale:
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
-------------------------------------------------------------------------------
syslog.log from start:
May 12 04:19:57 intra sudo: root : TTY=console ; PWD=/ ; USER=squeezecenter ; COMMAND=/sbin/start-stop-daemon --start --quiet --name slimserver.pl --exec /usr/local/squeezeboxserver/slimserver.pl -- --quiet --daemon --pidfile=/var/run/squeezecenter/squeezecenter.pid --cachedir=/var/cache/squeezecenter --prefs file=/etc/squeezecenter.prefs --prefsdir=/var/cache/squeezecenter/prefs --logdir=/var/log/squeezecenter --audiodir= --playlistdir= --charset utf8
May 12 04:20:07 intra [ 84.245310] slimserver.pl used greatest stack depth: 5380 bytes left
-------------------------------------------------------------------------------
I'm adding a logfile where you can see the file structure of the music folder. This is ls -la from /music/Hörbücher/Edgar Allan Poe/:
drwxr-xr-x 1 root root 0 2008-08-05 14:21 01 - Die Grube Und Das Pendel
drwxr-xr-x 1 root root 0 2008-08-05 14:21 02 - Die Schwarze Katze
drwxr-xr-x 1 root root 0 2008-08-05 14:21 03 - Der Untergang Des Hauses Usher
drwxr-xr-x 1 root root 0 2008-08-05 14:21 04 - Die Maske Des Roten Todes
drwxr-xr-x 1 root root 0 2008-08-05 14:21 05 - Der Sturz In Den Malstrom
drwxr-xr-x 1 root root 0 2008-08-05 14:21 06 - Der Goldkäfer
drwxr-xr-x 1 root root 0 2008-08-05 14:21 07 - Die Morde In Der Rue Morgue
drwxr-xr-x 1 root root 0 2008-08-05 14:21 08 - Lebendig Begraben
drwxr-xr-x 1 root root 0 2008-08-05 14:21 09 - Hopp - Frosch
drwxr-xr-x 1 root root 0 2008-08-05 14:21 10 - Das Ovale Portrait
drwxr-xr-x 1 root root 0 2008-08-05 14:21 11 - Der Entwendete Brief
drwxr-xr-x 1 root root 0 2008-08-05 14:21 12 - Eleonora
-------------------------------------------------------------------------------
Created attachment 6837 [details] Serverlog from BMF
Thanks for the info. Note, at the moment I am only trying to reproduce the problem with transcoding and I'm not looking at BMF.
As I wrote in comment 52, the transcoding works!! But BMF does not.
(In reply to comment #57) > As I wrote in comment 52 the transcoding works!! but BMF not. But is that not only with a backport of the 7.6 change? I am trying to reproduce the problem without that change.
(In reply to comment #58) > (In reply to comment #57) > > As I wrote in comment 52 the transcoding works!! but BMF not. > But is that not only with a backport of the 7.6 change? I am trying to > reproduce the problem without that change. right. this works only with the patch.
(In reply to comment #53) > Bernt and Icebird2000: can you confirm the character-set of the locale you are > using (the locale that SbS runs under) and that of the underlying filesystem. I > am sure that I used to be able to reproduce this but I no longer can. Version: 7.5.0 - r30476 @ Fri Apr 9 17:20:05 EDT 2010 Hostname: vortexbox.localdomain Server IP Address: 192.168.1.10 Server HTTP Port Number: 9000 Operating system: Red Hat - SV - utf8 Platform Architecture: i686-linux Perl Version: 5.10.0 - i386-linux-thread-multi MySQL Version: 5.1.45 Total Players Recognized: 4 LANG=en_US.UTF-8 LC_CTYPE="en_US.UTF-8" LC_NUMERIC="en_US.UTF-8" LC_TIME="en_US.UTF-8" LC_COLLATE="en_US.UTF-8" LC_MONETARY="en_US.UTF-8" LC_MESSAGES="en_US.UTF-8" LC_PAPER="en_US.UTF-8" LC_NAME="en_US.UTF-8" LC_ADDRESS="en_US.UTF-8" LC_TELEPHONE="en_US.UTF-8" LC_MEASUREMENT="en_US.UTF-8" LC_IDENTIFICATION="en_US.UTF-8" LC_ALL=
Created attachment 6858 [details] Log from Bernt with enhanced logging The double-encoding shown in the message at Slim::Player::Song::open (552) may not be significant because Song.pm uses 'use bytes' and the log message was constructed using string interpolation so the UTF-8 flag was probably lost there. It is interesting that the first setting of the utf-8 flag visible at tokenizeConvertCommand2 (528) has not resulted in C3A5 being interpreted as ISO8859-1 but rather as UTF-8.
Created attachment 6859 [details] Log from Alan with enhanced logging Version: 7.5.1 - rTRUNK @ UNKNOWN Hostname: oz Server IP Address: 192.168.1.11 Server HTTP Port Number: 9000 Operating system: Red Hat - EN - utf8 Platform Architecture: x86_64-linux Perl Version: 5.10.0 - x86_64-linux-thread-multi MySQL Version: 5.0.21-standard Total Players Recognized: 3 In this case we do not see the UTF-8 flag ever being set. I do not know what the difference is between this environment and Bernt's but, from what he posted earlier in comment #39, it should be pretty much the same.
== Auto-comment from SVN commit #31416 to the slim repo by ayoung == == http://svn.slimdevices.com/slim?view=revision&revision=31416 == Fixed bug 10199: Most transcoding doesn't work with non-ascii-characters in filename Ensure that the building of the transcoding command is done in 'use bytes' mode so that native byte-strings are used throughout and one does not get accidental promotion to a decoded UTF-8 string (see http://perldoc.perl.org/perlunicode.html). This would only happen if one of the inputs had the Perl utf8 flag set and, for some strange reason, the $quality (from the lameQuality preference) variable does have this flag (even though the value is "9"). But there could be other cases with custom convert files or unusual system configurations where some of the paths might include non-ASCII characters.
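The idea behind the commit, building the whole command from byte strings so that one flagged input cannot promote the rest, can be sketched in Python. This is an analogy only: the helper names `to_bytes` and `substitute` are hypothetical, unmatched tokens are assumed to become empty, and Python raises an error on mixed bytes/str rather than silently promoting as Perl does.

```python
import re

# Command template taken from the getConvertCommand2 log lines earlier in this bug.
TEMPLATE = (b"[flac] -dcs $START$ $END$ -- $FILE$ | "
            b"[lame] --silent -q $QUALITY$ $RESAMPLE$ -v $BITRATE$ - -")


def to_bytes(value) -> bytes:
    """Hypothetical helper: force every substitution value down to bytes."""
    return value if isinstance(value, bytes) else str(value).encode("utf-8")


def substitute(template: bytes, subs: dict) -> bytes:
    """Replace $NAME$ tokens using byte strings only; unknown tokens become empty."""
    table = {key.encode("ascii"): to_bytes(val) for key, val in subs.items()}
    return re.sub(rb"\$(\w+)\$", lambda m: table.get(m.group(1), b""), template)


command = substitute(TEMPLATE, {
    # Raw on-disk bytes of the path (UTF-8), never decoded.
    "FILE": b'"/data/common/All Music/Roger Waters/\xc3\x87a Ira (1 of 2)/02 - Overture.flac"',
    # A character string, like the preference value that carried the utf8 flag.
    "QUALITY": "3",
    "BITRATE": "-B 128",
})

print(command)
```

Because every value passes through `to_bytes` before substitution, the character-string quality value cannot change how the path bytes are interpreted.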
Ok, SBS 7.6.0 - r31450 transcoding Works how it should. But you kill the Browse My Files Function if there is an UTF8 encoded folder.
(In reply to comment #64) > Ok, SBS 7.6.0 - r31450 transcoding Works how it should. But you kill the Browse > My Files Function if there is an UTF8 encoded folder. Ok, this goes now to Bug 16180.
Able to see the contents of folders which use non-ASCII characters. Transcoding of files is working as expected.