Bug 13600 - Non-western characters incorrectly sorted
Status: CLOSED FIXED
Product: Logitech Media Server
Classification: Unclassified
Component: Database
Version: 7.4.0
Hardware: PC Windows Home Server
Importance: P2 major with 2 votes
Target Milestone: 7.3.3
Assigned To: Alan Young
Depends on:
Blocks: 13811
Reported: 2009-08-22 09:40 UTC by Themis
Modified: 2011-05-08 00:42 UTC
CC List: 7 users

See Also:
Category: ---


Attachments
Greek characters (17.57 MB, application/octet-stream)
2009-08-24 05:02 UTC, Themis
View wrong sort order in Squeezecenter (10.42 KB, image/pjpeg)
2009-09-23 01:16 UTC, Dennis Mutsaers
Change character in alphabet (1.08 KB, image/pjpeg)
2009-09-23 01:17 UTC, Dennis Mutsaers
scanner.log (592.44 KB, application/octet-stream)
2009-09-23 22:17 UTC, Dennis Mutsaers
part 1 of 2 part scanner.log (1.00 MB, application/octet-stream)
2009-09-24 09:11 UTC, Dennis Mutsaers
part 2 of 2 part scanner.log (868.50 KB, application/octet-stream)
2009-09-24 09:11 UTC, Dennis Mutsaers
wrong sort order (9.90 KB, image/pjpeg)
2009-09-27 22:32 UTC, Dennis Mutsaers
É after Z (12.12 KB, image/pjpeg)
2010-02-05 23:49 UTC, Dennis Mutsaers
á after Z (11.24 KB, image/jpeg)
2011-05-05 22:41 UTC, Dennis Mutsaers

Description Themis 2009-08-22 09:40:49 UTC
For the last couple of versions (I'm on build 7.4.28243 right now; this worked last month), Greek artists have been grouped between the Western letters I and J. A few weeks ago they were placed after the letter Z, which is more logical.

There have been no changes to either the artist list or my configuration.
Comment 1 James Richardson 2009-08-22 09:48:45 UTC
Michael: does this sound like a dupe of (or related to) another bug, or is it new behavior?
Comment 2 Michael Herger 2009-08-24 00:25:46 UTC
Themis - what build were you using before you saw this happen? Would you mind uploading one of those files so we can test?

James - surely related to _some_ other bug report, but I doubt the one you linked. Character set handling is one big issue...
Comment 3 Themis 2009-08-24 04:33:02 UTC
Oh well... hard to say which build it was.

WHS logs show that the last builds downloaded were 28228, 28235, 28243 and the current 28244.
If you could provide me with a link to the 28228 build (of 7.4) for WHS I could test to see whether the problem existed.

What is for sure is that I didn't have the problem last month (July). I applied the latest build on Aug 15th (28243 at that time, I think), and then this malfunction arose. At first I thought non-Western characters were not being scanned at all (that had been true somewhere back in 6.something), but then I discovered that the artists were simply between the letters I and J...
Comment 4 Themis 2009-08-24 05:02:08 UTC
Created attachment 5670 [details]
Greek characters
Comment 5 James Richardson 2009-09-08 15:30:22 UTC
Themis: Please try with this version, let me know if you still see the issue

http://downloads.slimdevices.com/nightly/?ver=7.4

Tested with Version: 7.4 - r28469 but I was unable to replicate the error as stated in this bug.
Comment 6 Themis 2009-09-11 13:56:45 UTC
I retried, still the same problem. Tried to install the .exe instead of the .msi on my WHS: still the same.

I even installed 7.4 r28494 on my laptop (which never had SC installed) and copied 3 FLACs into a directory:

Alice in Chains - 1 song
The file that I sent you - 1 song
Ravel - Dafnis & Chloe - 1 song

The Greek artist is between Alice in Chains and Ravel, although it should be (and it used to be) at the end of the Latin-character list (after the letter Z), that is: after Ravel.
Comment 7 James Richardson 2009-09-11 14:27:23 UTC
Thanks for all the extra testing, assigning back to Michael for comment
Comment 8 Michael Herger 2009-09-14 05:51:13 UTC
Assigning to Andy...
Comment 9 Michael Herger 2009-09-14 06:59:02 UTC
*** Bug 13810 has been marked as a duplicate of this bug. ***
Comment 10 Michael Herger 2009-09-14 07:02:17 UTC
*** Bug 13296 has been marked as a duplicate of this bug. ***
Comment 11 SVN Bot 2009-09-14 07:51:17 UTC
 == Auto-comment from SVN commit #28507 to the slim repo by andy ==
 == https://svn.slimdevices.com/slim?view=revision&revision=28507 ==

Fixed bug 13600, change sort and search columns to be BLOB instead of TEXT so UTF-8 data can be stored correctly in them
Comment 12 Dennis Mutsaers 2009-09-15 09:31:02 UTC
It sorts differently, but is it correct? É(dith Piaf) and È(tienne de Crécy) are sorted after Z. I would expect them to be sorted under E.
Comment 13 Andy Grundman 2009-09-15 09:48:24 UTC
What language do you have selected?
Comment 14 Dennis Mutsaers 2009-09-15 09:50:21 UTC
Dutch
Comment 15 Andy Grundman 2009-09-15 09:55:01 UTC
OK, let me look at this some more.  Collation used for Dutch is utf8_general_ci which from what I can tell should sort É next to E:

http://www.collation-charts.org/mysql60/mysql604.utf8_general_ci.european.html
Comment 16 Themis 2009-09-18 01:12:00 UTC
Ok, works fine now for me since 28547 (there were various other problems in versions 28507 till 28538).

Thank you for having fixed this.

Regards,
Themis
Comment 17 Michael Herger 2009-09-18 01:22:26 UTC
Great news! Thanks for the feedback.
Comment 18 Andy Grundman 2009-09-18 05:37:50 UTC
Wait, is this really fixed?  According to Dennis it may not be.
Comment 19 Themis 2009-09-18 06:04:31 UTC
No, I think it's still not fixed for accented characters.
Comment 20 Andy Grundman 2009-09-18 06:12:13 UTC
Reopening...
Comment 21 Jim McAtee 2009-09-18 20:00:36 UTC
Previously, weren't characters with accents 'normalized' to some extent into their plain character equivalents when stored in the sort and search columns?  I don't see that now in the database.  That would explain why é and e are no longer sorted as the same.

See bug 13811.  This was also the case for the search columns, whose behavior has changed since 7.3.  For instance, searching for 'torme' would find 'Mel Tormé', but now it does not.  This was brought up and discussed recently in the beta forum before the change to BLOB columns, so I'm guessing it might be a change made to the SQLite branch that should not have been merged into the trunk.

http://forums.slimdevices.com/showthread.php?t=67321
Comment 22 Andy Grundman 2009-09-18 20:11:46 UTC
Hmm, as far as I could tell nothing was ever normalized (I think you really mean transliterated); it was just stored in a TEXT column, which completely garbled the data.  I don't think searching that way ever worked, at least for me.  I recall trying to search for "Budi" (looking for Büdi) and it didn't work.  I need to take a closer look at this.
Comment 23 Jim McAtee 2009-09-18 20:40:47 UTC
I just fired up the 7.3.4 server and you're right about the storage.  The characters were not transliterated.  One difference that I do see, though, is that in the 7.3 database those accented characters aren't capitalized in the sort column, while they are in the 7.4 database - TORMé MEL vs. TORMÉ MEL.
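
One way a difference like TORMé MEL vs. TORMÉ MEL can arise is uppercasing that only touches ASCII letters versus uppercasing that is Unicode-aware. The Python sketch below illustrates that distinction only. It is an assumption for illustration, not the server's actual code, and the variable names are hypothetical.

```python
name = "tormé mel"

# Byte-oriented uppercasing: bytes.upper() only changes ASCII letters,
# so the two UTF-8 bytes of "é" pass through untouched.
ascii_only = name.encode("utf-8").upper().decode("utf-8")

# Unicode-aware uppercasing maps "é" to "É" as well.
unicode_aware = name.upper()

print(ascii_only)     # TORMé MEL
print(unicode_aware)  # TORMÉ MEL
```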

If storing the sort text as a BLOB, I'm not sure how you could get É and E to sort the same or even adjacent to one another.

From http://dev.mysql.com/doc/refman/5.0/en/blob.html :

"BLOB columns are treated as binary strings (byte strings). TEXT columns are treated as nonbinary strings (character strings). BLOB columns have no character set, and sorting and comparison are based on the numeric values of the bytes in column values. TEXT columns have a character set, and values are sorted and compared based on the collation of the character set."
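
The effect described in that quote can be reproduced outside the database. The Python sketch below is only an approximation under stated assumptions: sorting by raw UTF-8 bytes stands in for a BLOB column, and the hypothetical `fold` helper (strip accents, ignore case) stands in for an accent-insensitive collation such as utf8_general_ci. "ZZ Top" is added purely as a Z-entry for the comparison.

```python
import unicodedata

artists = ["Eric Clapton", "Édith Piaf", "Edgar Meyer", "ZZ Top"]


def fold(name: str) -> str:
    """Strip combining accents and ignore case (rough stand-in for a collation)."""
    decomposed = unicodedata.normalize("NFD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


# BLOB-like ordering: compare the numeric values of the UTF-8 bytes.
# "É" starts with byte 0xC3, which is larger than "Z" (0x5A).
byte_order = sorted(artists, key=lambda s: s.encode("utf-8"))

# TEXT-like ordering with an accent-insensitive comparison.
collated_order = sorted(artists, key=fold)

print(byte_order)      # Édith Piaf ends up after ZZ Top
print(collated_order)  # Édith Piaf ends up between Edgar Meyer and Eric Clapton
```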
Comment 24 Andy Grundman 2009-09-18 21:05:33 UTC
Yeah you're right, I hope I didn't screw this up even more by changing the column types.  I did a bit of research the other day before I changed this about why the characters looked wrong in the database, and there may be a bug in the version of the server we are using that's causing it. :(
Comment 25 SVN Bot 2009-09-21 07:42:04 UTC
 == Auto-comment from SVN commit #28582 to the slim repo by andy ==
 == https://svn.slimdevices.com/slim?view=revision&revision=28582 ==

Fixed bug 13600, revert previous fix, the real problem was SET NAMES UTF8 had been accidentally removed from the on-connect SQL statements.  I tested this with Ètienne and was able to search for it using Etienne because MySQL performs the removal of diacritics automatically.  It also appears to sort correctly now.  You will need to do a full wipe and rescan because I have removed the schema_11 file.
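
To illustrate why a missing SET NAMES UTF8 could make stored characters look wrong: if the client sends UTF-8 bytes but the receiving side treats the connection as a single-byte character set, each multi-byte character is read as several unrelated characters. The Python sketch below assumes the fallback was Latin-1, which this thread does not confirm, and it does not talk to MySQL at all.

```python
name = "Ètienne"

# A UTF-8 client sends "È" as the two bytes 195 and 136.
utf8_bytes = name.encode("utf-8")
first_two = list(utf8_bytes[:2])

# Assumed mismatch: the receiver decodes the same bytes as Latin-1,
# so one character turns into two and the text is garbled.
misread = utf8_bytes.decode("latin-1")

# Declaring the right character set round-trips the name intact.
correct = utf8_bytes.decode("utf-8")

print(first_two)                 # [195, 136]
print(len(name), len(misread))   # 7 8
print(correct == name)           # True
```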
Comment 26 Dennis Mutsaers 2009-09-23 01:12:59 UTC
Well, Édith Piaf and Étienne de Crécy are still sorted after Z...
7.4.0 - r28603 @ Tue Sep 22
Comment 27 Dennis Mutsaers 2009-09-23 01:16:51 UTC
Created attachment 5896 [details]
View wrong sort order in Squeezecenter
Comment 28 Dennis Mutsaers 2009-09-23 01:17:56 UTC
Created attachment 5897 [details]
Change character in alphabet
Comment 29 Andy Grundman 2009-09-23 04:52:11 UTC
You did a full wipe and rescan?
Comment 30 Dennis Mutsaers 2009-09-23 08:48:22 UTC
Yes, I did a full wipe & rescan. This is scheduled daily and I triggered it manually to be sure.
Comment 31 James Richardson 2009-09-23 14:14:45 UTC
Please attach your scanner log to the bug
Comment 32 Dennis Mutsaers 2009-09-23 22:17:19 UTC
Created attachment 5912 [details]
scanner.log

As requested: scanner.log
Comment 33 Dennis Mutsaers 2009-09-24 09:11:10 UTC
Created attachment 5917 [details]
part 1 of 2 part scanner.log

With debugging enabled.
Comment 34 Dennis Mutsaers 2009-09-24 09:11:50 UTC
Created attachment 5918 [details]
part 2 of 2 part scanner.log

With debugging enabled.
Comment 35 James Richardson 2009-09-24 11:55:50 UTC
Andy: do the attached logs help?
Comment 36 Andy Grundman 2009-09-24 12:17:41 UTC
I don't think a scanner log will help with this one.  I thought I saw correct sort when I was testing.  QA can you reproduce?
Comment 37 Dennis Mutsaers 2009-09-25 09:24:56 UTC
@Andy
Can another log help?
Comment 38 Andy Grundman 2009-09-25 10:38:55 UTC
What we need for this is a set of 2 test files that should sort one way but sort the opposite way.  Can someone provide that?
Comment 39 Dennis Mutsaers 2009-09-25 10:44:39 UTC
ftp://squeezeboxserver.kicks-ass.net/%C9dith%20Piaf/ 

Don't use IE, or disable FTP passive mode in IE.
Comment 40 Andy Grundman 2009-09-25 10:51:42 UTC
OK, but those are all from 1 artist right?  So not the best set for testing sorting I think?
Comment 41 Dennis Mutsaers 2009-09-25 11:00:52 UTC
And another one...

ftp://squeezeboxserver.kicks-ass.net/Etienne%20De%20Crecy/
Comment 42 Dennis Mutsaers 2009-09-25 11:03:57 UTC
ftp://squeezeboxserver.kicks-ass.net/Music
Comment 43 Andy Grundman 2009-09-25 11:10:30 UTC
OK I really don't think you should be posting that here. :)  Can someone please just attach 2 test files to this bug that don't sort properly?  They can be very short, all we need are the tags.
Comment 44 Dennis Mutsaers 2009-09-25 11:21:10 UTC
Not a very wise thing to do, indeed.
Can you tell me a simple way to 'shorten' some files?
Comment 45 Jim McAtee 2009-09-25 12:35:04 UTC
I downloaded a couple of the Édith Piaf tracks, scanned my test library with the latest SbS 7.4 and on my system the name sorts correctly among the E's.

Edgar Meyer
Édith Piaf
Eric Clapton
Comment 46 Andy Grundman 2009-09-25 12:44:52 UTC
Yeah I also saw the correct sort when I tested...
Comment 47 Dennis Mutsaers 2009-09-25 13:58:43 UTC
So what's happening on my system?
Comment 48 Dennis Mutsaers 2009-09-25 15:45:21 UTC
As I said before, I've scheduled a daily wipe & rescan. I also triggered several manual wipe & rescan actions. It didn't solve my sort problem. Now I've completely uninstalled Squeezebox Server, deleted "Program Files" & "ProgramData" files. It looks like sorting does work as expected now.
Comment 49 Dennis Mutsaers 2009-09-27 22:32:34 UTC
Created attachment 5941 [details]
wrong sort order

It gets weirder and weirder. "E" disappeared and "É" has been added after "Z" (7.4.0 - r28660). The language is set to "Dutch". I believe it sorted correctly with the language set to "English". I will try that today.
Comment 50 Dennis Mutsaers 2009-09-28 00:37:42 UTC
If I set the interface language to "English" it sorts as expected. If the interface language has been set to "Dutch" the sort order is wrong.
Comment 51 vagskal 2009-10-02 00:16:38 UTC
I upgraded from 7.3.4 to 7.4.1 - r28693 yesterday and in the artist list ÅÄÖåäö are not sorted correctly (last), but with their "clean" equivalents (AOao).

Also the sorting of those characters in browse music folder is broken. ÅÄÖ are here sorted first for some reason.

Sorting was correct with 7.3.4.

I am on Swedish Win XP with SC (or is it SS now?) and the SBC set to Swedish.
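
The two orderings described in this comment can be sketched in Python. This is a simplified illustration, not the server's collation logic: the artist names are made up, `fold` approximates an accent-insensitive collation, and `swedish_key` is a hypothetical key that ranks å, ä, ö after z, as the Swedish alphabet does.

```python
import unicodedata

# Hypothetical artist names, used only to show the two orderings.
artists = ["Östen", "Zebra", "Åsa", "ABBA", "Änglagård"]

SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
RANK = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}


def fold(name: str) -> str:
    """Accent-insensitive key: Å/Ä sort with A, Ö sorts with O."""
    decomposed = unicodedata.normalize("NFD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


def swedish_key(name: str) -> list:
    """Swedish-style key: å, ä, ö are separate letters placed after z."""
    # Characters outside the alphabet (spaces, digits) rank first.
    return [RANK.get(ch, -1) for ch in name.casefold()]


folded_order = sorted(artists, key=fold)
swedish_order = sorted(artists, key=swedish_key)

print(folded_order)   # ÅÄÖ mixed in with A and O
print(swedish_order)  # ÅÄÖ last, after Z
```

Which of the two is "correct" depends on the language, which is why one reporter's desired behavior is the other reporter's bug.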
Comment 52 Dennis Mutsaers 2009-10-02 01:27:06 UTC
(In reply to comment #51)
You're experiencing the behaviour I want, I'm experiencing the behavior you want...
Comment 53 Dennis Mutsaers 2009-10-03 02:42:24 UTC
Please remove this from the 'fixed' list in the 7.4.0 release notes. It isn't.
Comment 54 Michael Herger 2009-10-05 23:50:57 UTC
*** Bug 14555 has been marked as a duplicate of this bug. ***
Comment 55 Dennis Mutsaers 2009-11-08 07:52:22 UTC
I could not reproduce this in 7.5.0 - r29192
Comment 56 vagskal 2009-11-11 23:13:18 UTC
(In reply to comment #51)
With 7.4.2 r29220 a change of language in SBS between Swedish and English and back seems to have finally cured this issue, compare https://bugs-archive.lyrion.org/show_bug.cgi?id=10114#c8.
Comment 57 vagskal 2009-11-12 09:02:51 UTC
(In reply to comment #56)
Correction: The reported issue when browsing the music folder still persists.
Comment 58 Dennis Mutsaers 2010-02-05 23:48:36 UTC
The sorting order is wrong again. 7.5.0 - r30028
Comment 59 Dennis Mutsaers 2010-02-05 23:49:39 UTC
Created attachment 6493 [details]
É after Z
Comment 60 Andy Grundman 2010-02-06 04:40:34 UTC
If you're running embedded with SQLite, then yes it's a known issue that I'm working on.
Comment 61 Dennis Mutsaers 2010-02-06 05:18:56 UTC
No, I'm not running the SQLite version, I'm running the MySQL version. The problem was fixed, but it's back now.
Comment 62 Dennis Mutsaers 2010-03-07 00:08:26 UTC
Still happening on 7.5.0 - r30326 (É after Z)
Comment 63 Chris Owens 2010-03-15 18:06:50 UTC
7.4.x milestone is in the past
Comment 64 Dennis Mutsaers 2010-03-15 23:27:31 UTC
I don't see the É after Z problem in 7.5.0 - r30373, AFTER completely uninstalling/installing Squeezebox Server.
Comment 65 Dennis Mutsaers 2010-04-20 00:44:40 UTC
It's back again, but with different behavior on the controller (7.6.0 r8716) and the web interface (7.6.0 - r30667).

On the controller, artists starting with É are sorted after A, but before B.

On the web interface, artists starting with É (and also artists starting with E) are sorted after Z.
Comment 66 Dennis Mutsaers 2010-06-05 23:58:11 UTC
I've removed all 'international' characters from the FILEname, not the tag. Sorting is now as expected. (But SBS should handle international characters correctly)
Comment 67 Dennis Mutsaers 2010-06-08 13:15:14 UTC
Please disregard my last comment. The problem still exists, even when all filenames don't have extended/special/international characters.
Comment 68 Dennis Mutsaers 2010-08-17 11:45:33 UTC
Also present in 7.6.0 - r31215
Comment 69 Alan Young 2010-12-10 02:41:52 UTC
Filename sorting is different to artist, album or title sorting. For filename sorting please see bug 14906. The original version of this bug was fixed in 7.3.3. The new behaviour is covered by bug 14800.
Comment 70 Alan Young 2010-12-10 02:43:51 UTC
Hmm, I may have got the bit about this one ever having been fixed wrong, but I would still like to use bug 14906 and bug 14800 to track these issues further.
Comment 71 Dennis Mutsaers 2011-05-05 22:39:43 UTC
It's back again in the 7.6 trunk...
Comment 72 Dennis Mutsaers 2011-05-05 22:41:10 UTC
Created attachment 7261 [details]
á after Z
Comment 73 Michael Herger 2011-05-06 00:26:04 UTC
Dennis - please open a new bug which clearly states that this is about the page bar in the web UI only. This bug started as something different. Thanks!
Comment 74 Dennis Mutsaers 2011-05-08 00:42:38 UTC
Created new bug report:
https://bugs-archive.lyrion.org/show_bug.cgi?id=17205