Bugzilla – Bug 13811
Search doesn't ignore umlauts anymore
Last modified: 2009-10-05 14:34:54 UTC
In the previous version of the server, umlauts were dropped when performing a search. Now when I search for "rückertlieder" I have to enter ü; previously the search worked with "ruckertlieder". This is quite annoying with names containing e, è and é if you don't remember the correct hat...

Kari

Version: 7.4 - r28396 @ Tue Sep 1 04:02:17 PDT 2009
Hostname: trane
Server IP Address: 192.168.1.3
Server HTTP Port Number: 9000
Operating system: SuSE - EN - utf8
Platform Architecture: x86_64-linux
Perl Version: 5.8.8 - x86_64-linux-thread-multi
MySQL Version: 5.0.26
I agree that stripping umlauts (and accented characters) can come in handy when searching. For some European countries, at least. Alas, imagine stripping down to some "minor character" in, say, Mandarin, Hangeul, Russian... As you can see, "reducing" characters in a world with so many languages is not easy.

An additional thing to consider is that people living in country A might have set their interface language to country A's language but still have songs in the languages of countries B, C and D (and want to search for them). Things like the controller would probably only "wheel" through the characters of the selected language (mine wheels only through the English characters, although I live in Germany, and thus "äöüÄÖÜß" are missing).

I propose using more general "wildcard" characters instead, like "?" (for one unknown character) and "*" (for multiple unknown characters); this is what we have long used with filesystems. So, in your case, one could look for Mahler's "R?ckert-Lieder" instead. Or maybe even "r*ert*lieder" if one didn't know whether it was written with "ck" or only "k", or whether it was written with or without a hyphen ("*" standing for zero to n unknown characters in this case).

This type of searching should of course be the same in all Web UI, software and hardware player interfaces. And it would be easy to implement, too, since Perl supports regular expressions and both MySQL and SQLite understand wildcard searches.

For ease of use, I would also propose that devices that use a scrollable ("wheelable") character set use the uppercase set of characters of the selected interface language, plus numbers and punctuation symbols, followed by the lowercase set of the selected interface language, probably followed by some "agreed-upon" base character set (i.e., US-ASCII). Thus, Russian, for example, would present Russian characters and numbers in the Controller's interface, followed by the Latin A-Z.
This would make searching in the user's language easier (assuming most of their titles are in the local language), and it would also allow searching for "foreign" ASCII-labelled titles. Using wildcards like "?" and "*" in ANY language set would make finding things with foreign characters much easier (like, say, French accents).
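To make the wildcard proposal concrete, here is a minimal sketch of how "?" and "*" could be translated into an SQL LIKE pattern or a regular expression. It is written in Python purely for illustration (the server itself is Perl), and the function names `wildcard_to_like` and `wildcard_to_regex` are hypothetical, not part of any existing code.

```python
import re


def wildcard_to_like(pattern, escape="\\"):
    """Translate a '?'/'*' pattern into an SQL LIKE pattern.

    '*' becomes '%' (zero or more characters) and '?' becomes '_'
    (exactly one character). Literal '%', '_' and the escape character
    are escaped, so the result is meant to be used with an ESCAPE clause.
    """
    out = []
    for ch in pattern:
        if ch == "*":
            out.append("%")
        elif ch == "?":
            out.append("_")
        elif ch in ("%", "_", escape):
            out.append(escape + ch)
        else:
            out.append(ch)
    return "".join(out)


def wildcard_to_regex(pattern):
    """Translate the same pattern into a case-insensitive regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


if __name__ == "__main__":
    print(wildcard_to_like("R?ckert-Lieder"))
    print(wildcard_to_like("r*ert*lieder"))
    print(bool(wildcard_to_regex("r*ert*lieder").fullmatch("Rückert-Lieder")))
```

The LIKE form would be passed to the database as a bound parameter; the regex form would suit in-memory filtering. Whether "?" should match exactly one character or also zero characters is a design choice this sketch does not settle.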
Thinking about it, I'd probably scrap the lowercase characters altogether for selection-based devices (wheel interface, touch screen, virtual keyboard). The search itself will usually be performed case-insensitive anyway, so why clutter the interface with lots of characters that aren't actually needed?
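As a rough illustration of the character ordering suggested for wheel-style input, the sketch below builds the list as: uppercase letters of the interface language, then digits, then punctuation, then the Latin A-Z fallback, with lowercase omitted. The function name `wheel_characters` and the sample alphabets are assumptions made up for this example, and it is in Python rather than the device's own code.

```python
import string


def wheel_characters(local_uppercase):
    """Return the scroll order for a wheel-style character picker.

    local_uppercase: the uppercase alphabet of the selected interface
    language, supplied by the caller (hypothetical input).
    """
    ordered = (
        list(local_uppercase)
        + list(string.digits)
        + list(string.punctuation)
        + list(string.ascii_uppercase)  # agreed-upon base set: Latin A-Z
    )
    seen = set()
    result = []
    for ch in ordered:
        # Skip duplicates, e.g. when the local alphabet already contains A-Z.
        if ch not in seen:
            seen.add(ch)
            result.append(ch)
    return result


if __name__ == "__main__":
    german = wheel_characters(string.ascii_uppercase + "ÄÖÜ")
    print("".join(german))
```

For a German interface the Latin letters appear only once, because the fallback set is already covered by the local alphabet; for a Cyrillic alphabet the Latin A-Z would follow the punctuation.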
Good suggestions. As I said, the basic accent dropping used to work. The last build in which I definitely saw it working from the web UI was 26229.
*** Bug 13809 has been marked as a duplicate of this bug. ***
Ignoring umlauts would be just fine for me, but I can still use them, right? Ignoring accents is very important, unless you want to spend half an hour googling for the correct spelling. Those of us who don't speak any Latin languages have no clue, but we still have Latin music on our drives. We cannot expect non-geeks to understand wildcard search, especially the "?" variant. But I think it should be included as well, for those who can use it. That would make everybody happy.
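For reference, accent-insensitive matching of the kind requested here is commonly done by folding both the query and the stored title before comparing, which also means typing the umlaut still works. The following is a minimal Python sketch of that general technique, not the server's actual implementation; `fold_accents` and `matches` are hypothetical names.

```python
import unicodedata


def fold_accents(text):
    """Drop combining marks (accents, umlauts) and fold case.

    The text is decomposed (NFD) so that 'ü' becomes 'u' plus a combining
    diaeresis; the combining marks are then removed. Letters without a
    canonical decomposition, such as 'ø', are left unchanged.
    """
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def matches(query, title):
    """Substring search that ignores accents and case on both sides."""
    return fold_accents(query) in fold_accents(title)


if __name__ == "__main__":
    print(fold_accents("Rückertlieder"))
    print(matches("ruckert", "Rückert-Lieder"))
    print(matches("rückert", "Rückert-Lieder"))
```

The same folding also changes letters in other scripts (Cyrillic "й" decomposes to "и" plus a breve, for instance), which is the concern raised earlier in this thread about reducing characters across languages.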
I did some digging this evening and found the following:
- This works in 7.3.4
- It worked in 7.4 Trunk up until the merge (<= r27972)
- It's broken after the merge, with search columns of datatype TEXT
- This does not work with search columns of datatype BLOB

Assuming TEXT columns: Something bad is happening to the accented characters when they're being stored in the database. When I view the namesort field for contributor Mel Tormé in my database client (SQLyog), prior to the merge I see
TORMé MEL
After the merge I see
TORMÉ MEL
If I change the columns to BLOB, these characters look correct, but 'E' does not match 'é' in a binary column.
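The last point can be illustrated outside the database. A binary column compares raw bytes, and in UTF-8 'é' is a two-byte sequence that can never equal the single byte for 'E'; a text comparison that folds accents and case treats them as the same. The Python sketch below is only an analogy for what a database collation does, and the helper names `binary_equal` and `folded_equal` are made up for this example.

```python
import unicodedata


def binary_equal(a, b):
    """Compare the UTF-8 bytes, as a binary (BLOB-style) comparison would."""
    return a.encode("utf-8") == b.encode("utf-8")


def folded_equal(a, b):
    """Compare after removing combining marks and folding case."""
    def fold(text):
        decomposed = unicodedata.normalize("NFD", text)
        return "".join(
            ch for ch in decomposed if not unicodedata.combining(ch)
        ).casefold()
    return fold(a) == fold(b)


if __name__ == "__main__":
    stored = "TORMé MEL"
    query = "TORME MEL"
    print("é".encode("utf-8"), "E".encode("utf-8"))
    print(binary_equal(query, stored))
    print(folded_equal(query, stored))
```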
The change #28582 on bug 13600 seems to have fixed the searching...
Yes this is fixed now.
This bug has been marked as fixed in the 7.4.0 release version of SqueezeBox Server!
* SqueezeCenter: 28672
* Squeezebox 2 and 3: 130
* Transporter: 80
* Receiver: 65
* Boom: 50
* Controller: 7790
* Radio: 7790

Please see the Release Notes for all the details: http://wiki.slimdevices.com/index.php/Release_Notes
If you haven't already, please download and install the new version from http://www.logitechsqueezebox.com/support/download-squeezebox-server.html
If you are still experiencing this problem, feel free to reopen the bug with your new comments and we'll have another look.