Bugzilla – Bug 13153
Both web interface and squeezebox "hang" when one browses folders containing non-ascii chars
Last modified: 2011-05-06 14:14:26 UTC
This has been happening for at least 3 months (maybe 6). "Hang" is quoted because one can resurrect the interface by clicking "home" (for example) in the web interface, or "now playing" on the remote in the case of a Squeezebox. Quite an annoying bug.... I'm using Debian/Sid + squeezeserver/7.4, updated on a daily basis. Any extra info is available on request, and I'm more than willing to test any patches.
P.S. Probably this is important:
xxx@yyyy:~$ locale
LANG=ru_RU.KOI8-R
LC_CTYPE="ru_RU.KOI8-R"
LC_NUMERIC="ru_RU.KOI8-R"
LC_TIME="ru_RU.KOI8-R"
LC_COLLATE="ru_RU.KOI8-R"
LC_MONETARY="ru_RU.KOI8-R"
LC_MESSAGES=C
LC_PAPER="ru_RU.KOI8-R"
LC_NAME="ru_RU.KOI8-R"
LC_ADDRESS="ru_RU.KOI8-R"
LC_TELEPHONE="ru_RU.KOI8-R"
LC_MEASUREMENT="ru_RU.KOI8-R"
LC_IDENTIFICATION="ru_RU.KOI8-R"
LC_ALL=
=====
Version: 7.4 - r27933 @ Thu Jul 30 04:00:06 PDT 2009
Hostname: yyyy
Server IP Address: 1.2.3.4
Server HTTP Port Number: 9000
Operating system: Debian - EN - koi8-r
Platform Architecture: i686-linux
Perl Version: 5.10.0 - i486-linux-gnu-thread-multi
Created attachment 5544 [details] screenshot - see how location is being displayed
Created attachment 5545 [details] corresponding frame source (gzipped)
Also, the displayed location is totally screwed up (probably related); see the attachments.
Michael: any idea what's happening here?
QA - can you reproduce? What about server.log (no frame debug needed)?
Created attachment 5554 [details] error message in server.log
This is the only message that appears in server.log when one tries to access such a folder from a browser. Please let me know if you need traces with the log level increased.
I can't reproduce it because the scanner crashes. And I can't zip up the example file I made because winzip crashes! I'll look and see if there's already a bug for the scanner crash.
Hmm it now seems to be working okay after some messing about. I'll upload a sample file for reference.
Created attachment 5573 [details] mp3 file with russian filename
To make a directory name I just copied and pasted the Cyrillic character string from the filename. I don't speak Russian so I hope this random Russian string I found on the web isn't anything offensive!
Chris, it says "letter to the mother", I do not think you should worry about anything :) If all you need is dir/file samples, I've got tons of them....
Nick, since I can't reproduce the bug, could you verify that the sample I created also demonstrates the bug for you? If it does, then I need to figure out what's different between your system and mine. If it doesn't, then we need to look at what's different between your files and mine!
Chris,
Yes, I can reproduce it with your sample with a 100% hit rate. Just in case, I'm using:
Version: 7.4 - r27977 @ Sat Aug 1 04:00:31 PDT 2009
Hostname: xxxx
Server IP Address: yyy.yyy.yyy.yyy
Server HTTP Port Number: 9000
Operating system: Debian - EN - koi8-r
Platform Architecture: i686-linux
Perl Version: 5.10.0 - i486-linux-gnu-thread-multi
MySQL Version: 5.0.84-1
Total Players Recognized: 2
What could be different: are you using a UTF or a KOI8 system-wide locale? Also, I have the following in /etc/default/squeezecenter:
# locale settings
if [ -r /etc/default/locale ]; then
    . /etc/default/locale
    export LANG
fi
More: I can reproduce this even by creating an empty directory with non-ascii chars in its name. After that, the content of the 'parent' directory cannot be displayed. I'll upload a tar file with such a dir.
P.S. I'm not sure if it is clear from the short bug description, but it is the "Home > Music Folder" view that is affected.
Created attachment 5588 [details] empty directory with koi8 chars created on my box
I guess this is totally irrelevant, but I'm not using any "fancy" mount options for my data partition: /dev/sda2 on /data type reiserfs (ro,nosuid,nodev,noatime)
Hm I should try KOI8
Ross I had been following up on this but I've gotten busy again. You have a good VMware setup with some linux images. Do you have a russian language image running this KOI8 encoding? Could you set one up? Thanks!
So far unable to reproduce when switching FC10 to Russian, will try Ubuntu next week.
Ross have you found anything out about this KOI8 encoding? Is it something that needs to be configured at install time?
> Ross have you found anything out about this KOI8 encoding? Is it something that needs to be configured at install time?
Not really. It can be enabled at any time. Check 'dpkg-reconfigure locales'.
P.S. Since linux filesystems treat filenames as raw byte sequences (as opposed to Mac, for example, where all filenames are stored in unicode), you would want to make sure the locale is set before any 'localized' files are created though.
P.P.S. KOI8-R historically was the 'standard' Russian encoding used on Linux. I think nowadays unicode is the default out of the box (in fresh installs), which is probably why you never run into this.
P.P.P.S. Would it help if I provided 'guest' access to my box?
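To make the raw-bytes point concrete, here is a small Python sketch (Python rather than the server's Perl, purely for illustration). The directory name and the helper `decode_filename` are hypothetical; nothing here is taken from the Squeezebox Server code.

```python
from typing import Optional

# Illustration only: a filename on a Linux filesystem is just bytes, and the
# bytes depend on the locale encoding that was active when the file was created.

name = "письмо"  # hypothetical directory name

koi8_bytes = name.encode("koi8_r")  # one byte per Cyrillic letter
utf8_bytes = name.encode("utf-8")   # two bytes per Cyrillic letter


def decode_filename(raw: bytes, encoding: str) -> Optional[str]:
    """Decode raw filename bytes; return None if they are not valid in `encoding`."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


print(len(koi8_bytes), len(utf8_bytes))
print(decode_filename(koi8_bytes, "koi8_r"))  # decoded with the matching encoding
print(decode_filename(koi8_bytes, "utf-8"))   # decoded with the wrong encoding
```

The same visible name is stored as two different byte strings, so software that assumes UTF-8 cannot read a name that was written under a KOI8-R locale.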
Lowering severity due to reduced likelihood of KOI8 systems as explained in comment #20. I'm not able to reproduce this with Ubuntu and Fedora switching to Russian.
OK, I've lost hope of having this fixed in a reasonable time and decided to dive into the code myself. So here is what's happening:
Slim::Music::Info::fileName() at the end does
    return Slim::Utils::Unicode::utf8decode_locale($j);
which calls Encode::decode() and converts the file name from koi8 to the internal perl string form.
At the same time, Slim::Music::Info::sortFilename() does
    Slim::Utils::Unicode::utf8encode_locale( fileName($_) )
which seems reasonable, but utf8encode_locale ends up calling utf8on(), which tries to convert from UTF-8 to the internal perl representation _again_, and that's exactly where it fails.
For now I've just commented out the utf8on call in utf8encode_locale, which fixed the issue for me.
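The failure mode described in this comment can be sketched in Python (not the server's Perl; an analogy only). The function names below are hypothetical stand-ins for the roles that fileName(), utf8encode_locale() and utf8on() play in the description, and the locale is assumed to be KOI8-R.

```python
LOCALE_ENCODING = "koi8_r"  # assumption: the system locale from this report


def file_name(raw: bytes) -> str:
    """Decode filesystem bytes into a text string (done once, correctly)."""
    return raw.decode(LOCALE_ENCODING)


def encode_locale(text: str) -> bytes:
    """Encode text back into the locale's byte encoding."""
    return text.encode(LOCALE_ENCODING)


def sort_key_broken(raw: bytes) -> str:
    """Treat locale-encoded bytes as if they were UTF-8: the mistaken extra step."""
    return encode_locale(file_name(raw)).decode("utf-8")


def sort_key_fixed(raw: bytes) -> str:
    """Decode exactly once, using the encoding the bytes are really in."""
    return encode_locale(file_name(raw)).decode(LOCALE_ENCODING)


raw_name = "письмо".encode(LOCALE_ENCODING)  # hypothetical directory name

try:
    sort_key_broken(raw_name)
except UnicodeDecodeError as err:
    print("broken path fails:", err.reason)

print("fixed path:", sort_key_fixed(raw_name))
```

In Python the mistake surfaces as an exception; in this report it showed up as a browse view that stopped responding. Under a UTF-8 locale both paths would agree, which may be why the problem was hard to reproduce elsewhere.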
Apparently the fix is incomplete: when scrolling through directories containing non-ascii chars using IR remote on SB3 nothing is being displayed (SB3 display is completely black)... But at least it does not hang...
Created attachment 5883 [details] better patch for a hang
After playing with it for a while I came up with the following patch, which for one thing solves the 'hang' problem for me completely, and for another looks like a safe hack (a hack because one should not try to encode into UTF-8 twice). With this patch in place I have no problem browsing folders using either the web UI or the SB3 itself. Please note that I've changed UTF-8 -> utf8, which seems reasonable (utf8 is less restrictive, so why fail for no good reason?)
Created attachment 5884 [details] this chunks fixes garbage in 'location' field for me
I'm not so sure about this one. It definitely works for me, but there was probably a reason to use utf8decode_guess in the first place. Probably one should look at autodetection of the KOI8-R encoding instead ....
I've just posted 2 patches that certainly change things for the better for me. Can someone knowledgeable review them? It would be nice if they were committed so I do not have to apply them every time I upgrade squeezeboxserver ....
P.S. I know why one could have trouble reproducing the original problem: /etc/init.d/squeezeboxserver forces the utf8 charset, which does not work for me for obvious reasons, so I have commented it out.....
Hi Nick, We're currently working to get 7.4 ready for release in a very short time. We're not ignoring you! :) I'll get someone to review your patch probably next week.
Created attachment 5894 [details] Oops, original patch had been inverted
Chris, ok :) I hope you are men of your word :) I've just noticed that the first patch I posted is inverted. Here is the right one (just in case)
Nick - would you mind giving the latest 7.4.1 build a try? I've identified and fixed quite a few issues with browsing cyrillic and other non-western files/folders.
As of "Version: 7.5.0 - r28873 @ Fri Oct 16 02:00:29 PDT 2009" it still does not work for me (both my patches are still required). Did you try to reproduce it using non-utf locale?
oops.. didn't read this bug carefully enough. My fixes are mostly Windows only. Did you ever try adding --charset=utf8 to the slimserver.pl startup parameters?
Forcing utf8 charset with 8 bit system locale makes it much worse, see bug #9236
Moving these bugs to P4 to make room for moving P1.5 bugs to P2
Any chance of reviewing patches I've sent and possibly closing the bug instead of lowering priority?
I looked at your patch. Our Unicode handling is a mess and it is hard to say what the impact of a change such as yours would have on other aspects of the system. :( I think we need to revisit how we handle all Unicode input and output and use the simplest methods possible (such as utf8::decode and utf8::encode) and get rid of much of the Unicode module. Of course, much of the mess is a result of trying to be compatible with absolutely everything out there, including horrible Windows encodings, broken encodings, etc. So being more strict and clean is probably not possible in the real world.
Agreed on most parts, but I do believe that the first chunk (aka "better patch for a hang") should be safe and fixes the most severe aspects of the problem. P.S. Over time I've learned to code "just to make things work" as opposed to "for the sake of the art of it". I think the proposed solution is a "good enough" compromise for the issue at hand.
Administrative move of 7.5 bugs. All P2, P3, P4 being downgraded one level. Will then split P1s.
This bug needs to be fixed at some point, and hopefully we can use some of your patch, Nick. I have made sure our test suite has a test for the symptom you originally reported, as well.
Both patches I posted are still good: I apply them every time I upgrade to the next nightly. I hope you will eventually accept them ....
Nick, are you still using a non-UTF-8 locale/encoding for your filesystem? Is there some particular reason for using KOI8 instead of UTF-8?
(In reply to comment #41)
> Nick, are you still using a non-UTF-8 locale/encoding for your filesystem?
Yes, I am.
> Is there some particular reason for using KOI8 instead of UTF-8?
Besides historical reasons (KOI8 had been the de-facto standard Russian Linux locale long before UTF8 got any traction), it's faster and more stable (less of an issue nowadays since more and more software supports unicode, but still).
Ok, I think that non-UTF-8 has to be becoming pretty rare with Linux but I guess it should still work. I doubt that it will become part of our official support matrix. Are you interested in trying 7.6 with the recent changes? I'd be interested in your feedback if you are.
(In reply to comment #43)
> Are you interested in trying 7.6 with the recent changes? I'd be interested in
> your feedback if you are.
Well, there is good news and bad news. The good news: both issues I complained about are solved in 7.6. The bad news: as of now (r31605) 7.6 has too many regressions (for example, Artist sorting is totally screwed up for names containing non-ascii chars; let me know if you need a screenshot or something). So I'm back to 7.5.2 + my patches...
Nick, thanks for this. I'm not exactly sure but I suspect that the sorting issue may be related to your use of a non-UTF-8 locale. Sorting is using the following method:
    $COLLATION{perllocale} = sub { use locale; $_[0] cmp $_[1] };
where the strings from the DB that are being collated here will be UTF-8 encoded. This collation is dependent upon LC_COLLATE (env var) being set appropriately. Taking those two points together, maybe try setting LC_COLLATE="ru_RU" instead of LC_COLLATE="ru_RU.KOI8-R". I would be very interested to hear your results.
After some more experiments, I think what you might need is LC_COLLATE="ru_RU.utf8"
[awy@oz ~]$ LC_COLLATE="ru_RU.KOI8-R" perl t.pl t
abcd
efgh
äbcd
åbcd
æbcd
øbcd
[awy@oz ~]$ LC_COLLATE="ru_RU.utf8" perl t.pl t
abcd
åbcd
äbcd
æbcd
efgh
øbcd
[awy@oz ~]$ LC_COLLATE=no_NO.utf8 perl t.pl t
abcd
efgh
æbcd
äbcd
øbcd
åbcd
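The effect of LC_COLLATE on sort order can also be tried with a short Python sketch (Python instead of the Perl t.pl used above, whose contents are not shown in this bug). `sort_with_collation` is a hypothetical helper; which locale names are installed depends on the machine, so the helper returns None when a locale is unavailable.

```python
import locale
from typing import List, Optional


def sort_with_collation(words: List[str], locale_name: str) -> Optional[List[str]]:
    """Sort words using the collation rules of locale_name, or None if unavailable."""
    previous = locale.setlocale(locale.LC_COLLATE)  # remember the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None  # locale not installed on this system
    try:
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)  # restore the old setting


words = ["efgh", "abcd", "äbcd", "åbcd", "æbcd", "øbcd"]
for name in ("ru_RU.UTF-8", "no_NO.UTF-8"):
    print(name, sort_with_collation(words, name))
```

The locale names in the loop are assumptions about what the system provides; `locale -a` lists the names that are actually installed.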
== Auto-comment from SVN commit #31609 to the slim repo by ayoung ==
== http://svn.slimdevices.com/slim?view=revision&revision=31609 ==
bug 16683: Non-ASCII characters in file and directory names
Fixed Bug 15805 - Artists sorting incorrectly on 7.5-embedded SQLite
Fixed Bug 14800 - Sorting not done correctly with Swedish characters
Bug 13153 - Both web interface and squeezebox "hang" when one browses folders containing non-ascii chars
Set LC_COLLATE so that SQLite database sorting using perlcollate works for different languages.
7.6.0 32390 - Used the Cyrillic mp3 and folder: was able to browse and play. Set the language to Russian (on Touch): still works.