Bug 4477 - RandomPlay uses very slow queries causing music to stutter
: RandomPlay uses very slow queries causing music to stutter
Status: RESOLVED FIXED
Product: Logitech Media Server
Classification: Unclassified
Component: Plugin
: 6.5.0
: All Linux (other)
: P2 normal with 1 vote (vote)
: ---
Assigned To: Dan Sully
http://forums.slimdevices.com/showthr...
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2006-11-06 10:42 UTC by Nicola Fankhauser
Modified: 2008-12-18 11:11 UTC (History)
2 users (show)

See Also:
Category: ---


Attachments
RandomPlay speedup (1.26 KB, patch)
2007-01-04 22:18 UTC, Dan Sully
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Nicola Fankhauser 2006-11-06 10:42:41 UTC
Summary:
========

1. too many columns selected for no reason
2. unneeded JOIN on table 'genres' for the default (all genres)
3. non-performing way to randomize results

the problem is that when the SQL is started, the buffer of the squeezebox has just filled a bit (shows 0%), started to play for one second or so, but receives no further data because the SQL query is taking so long. very audibly stuttering music (two gaps of ~1s) is the consequence.

applying points 1-3 together, performance can be improved by 99% (1.44s vs. 11s). no more stuttering music!

see also thread http://forums.slimdevices.com/showthread.php?t=20494

Detailed explanation:
=====================

The Plugin RandomPlay produces SQL queries like this:

SELECT me.id, me.url, me.content_type, me.title, me.titlesort, me.titlesearch, me.album, me.tracknum, me.timestamp, me.filesize, me.disc, me.thumb, me.remote, me.audio, me.audio_size, me.audio_offset, me.year, me.secs, me.cover, me.vbr_scale, me.bitrate, me.samplerate, me.samplesize, me.channels, me.block_alignment, me.endian, me.bpm, me.tagversion, me.drm, me.moodlogic_id, me.moodlogic_mixable, me.musicmagic_mixable, me.musicbrainz_id, me.playcount, me.lastplayed, me.lossless, me.lyrics, me.rating, me.replay_gain, me.replay_peak FROM tracks me LEFT JOIN genre_track genreTracks ON ( genreTracks.track = me.id ) WHERE ( audio = '1' AND genreTracks.genre IN ( '33', '32', '21', '7', '26', '17', '2', '1', '18', '30', '16', '44', '55', '27', '25', '28', '40', '20', '14', '49', '24', '10', '31', '35', '11', '53', '48', '42', '22', '46', '13', '23', '29', '6', '50','39', '36', '3', '51', '9', '41', '12', '47', '15', '52', '38', '8', '4', '34', '45', '37', '43', '19', '54', '5' ) AND ( ( lastPlayed IS NULL ) OR ( lastPlayed < '1162836263' ) ) ) ORDER BY RAND() LIMIT 1;

this takes 11s on a NSLU2. 

with the exception of me.id, the selected fields are not needed at all but seem to cause huge overhead:

SELECT me.id FROM tracks me LEFT JOIN genre_track genreTracks ON ( genreTracks.track = me.id ) WHERE ( audio = '1' AND genreTracks.genre IN ( '33', '32', '21', '7', '26', '17', '2', '1', '18', '30', '16', '44', '55', '27', '25', '28', '40', '20', '14', '49', '24', '10', '31', '35', '11', '53', '48', '42', '22', '46', '13', '23', '29', '6', '50','39', '36', '3', '51', '9', '41', '12', '47', '15', '52', '38', '8', '4', '34', '45', '37', '43', '19', '54', '5' ) AND ( ( lastPlayed IS NULL ) OR ( lastPlayed < '1162836263' ) ) ) ORDER BY RAND() LIMIT 1;

this small improvement results in a query that takes only 3s!

furthermore the RandomPlay plugin does not take into account when having selected all genres (default) no JOIN is necessary, optimising this leads to:

SELECT id FROM tracks me WHERE ( audio = '1' AND ( ( lastPlayed IS NULL ) OR ( lastPlayed < '1162836263' ) ) ) ORDER BY RAND() LIMIT 1;

this takes 1.4s on a NSLU2.

using RAND() in the order clause is not very fast, the following query performs better:

SELECT RAND() as rnd_id, id FROM tracks me WHERE ( audio = '1' AND ( ( lastPlayed IS NULL ) OR ( lastPlayed < '1162836263' ) ) ) ORDER BY rnd_id LIMIT 1;


Diff for improving the generated SQL query:
===========================================

this diff removes the unneeded columns and uses RAND() in a better fashion, however does not take into account the default case with all genres (no JOIN needed in that case).

spatz:/opt# diff -uN SlimServer_v6.5.0/Plugins/RandomPlay/Plugin.pm slimserver/Plugins/RandomPlay/Plugin.pm
--- SlimServer_v6.5.0/Plugins/RandomPlay/Plugin.pm      2006-09-16 10:09:08.000000000 +0200
+++ slimserver/Plugins/RandomPlay/Plugin.pm     2006-11-06 19:38:33.000000000 +0100
@@ -64,7 +64,8 @@
        # RAND() function to get back a random list. Restrict by the genre's we've selected.
        my @results = Slim::Schema->rs($type)->search($find, {

-               'order_by' => \'RAND()',
+               'order_by' => \'rnd_id',
+               'select' => [\'RAND() as rnd_id', 'id'],
                'join'     => \@joins,

        })->slice(0, ($limit-1));
Comment 1 Mick 2006-11-06 13:59:32 UTC
When you return the track ID from your fitst query, you then probably need to go back and get the rest of the data from the original query using the previously determined track ID.

SELECT me.id FROM tracks me LEFT JOIN genre_track genreTracks ON (
genreTracks.track = me.id ) WHERE ( audio = '1' AND genreTracks.genre IN (
'33', '32', '21', '7', '26', '17', '2', '1', '18', '30', '16', '44', '55',
'27', '25', '28', '40', '20', '14', '49', '24', '10', '31', '35', '11', '53',
'48', '42', '22', '46', '13', '23', '29', '6', '50','39', '36', '3', '51', '9',
'41', '12', '47', '15', '52', '38', '8', '4', '34', '45', '37', '43', '19',
'54', '5' ) AND ( ( lastPlayed IS NULL ) OR ( lastPlayed < '1162836263' ) ) )
ORDER BY RAND() LIMIT 1;

store me.id in a var then ...
<psuedo code> randtrackid = me.id </psuedocode>

SELECT me.id, me.url, me.content_type, me.title, me.titlesort, me.titlesearch,
me.album, me.tracknum, me.timestamp, me.filesize, me.disc, me.thumb, me.remote,
me.audio, me.audio_size, me.audio_offset, me.year, me.secs, me.cover,
me.vbr_scale, me.bitrate, me.samplerate, me.samplesize, me.channels,
me.block_alignment, me.endian, me.bpm, me.tagversion, me.drm, me.moodlogic_id,
me.moodlogic_mixable, me.musicmagic_mixable, me.musicbrainz_id, me.playcount,
me.lastplayed, me.lossless, me.lyrics, me.rating, me.replay_gain,
me.replay_peak FROM tracks me LEFT JOIN genre_track genreTracks ON (
genreTracks.track = me.id ) WHERE (me.id= randtrackid);

Should still be pretty quick.


Also Slim don't seem to use Stored procedures within MySQL. I'm not sure how efficient MySQL's query plan cache is, but I'd imagine that a stored procedure would make a difference.

Time to do some reading...

http://dev.mysql.com/doc/refman/5.1/en/query-cache.html
Comment 2 Nicola Fankhauser 2006-11-07 01:59:25 UTC
(In reply to comment #1)
> When you return the track ID from your fitst query, you then probably need to
> go back and get the rest of the data from the original query using the
> previously determined track ID.

my proposed patch does indeed not work very well. though as you proposed not all data seems to be necessary, the 'url' column and 'id' are sufficient (see new patch below). my SS is now running with the new patch and without the genre-JOINs, however I still experience stuttering music. I think that the squeezebox should not start to play the next track before this certain SQL-query has finished; this needs some more analysis.

spatz:/opt# diff -uN SlimServer_v6.5.0/Plugins/RandomPlay/Plugin.pm
slimserver/Plugins/RandomPlay/Plugin.pm
--- SlimServer_v6.5.0/Plugins/RandomPlay/Plugin.pm      2006-09-16
10:09:08.000000000 +0200
+++ slimserver/Plugins/RandomPlay/Plugin.pm     2006-11-07 10:56:14.000000000
+0100
@@ -64,7 +64,8 @@
        # RAND() function to get back a random list. Restrict by the genre's
we've selected.
        my @results = Slim::Schema->rs($type)->search($find, {

-               'order_by' => \'RAND()',
+               'order_by' => \'rnd_id',
+               'select' => [\'RAND() as rnd_id', 'id', 'url'],
                'join'     => \@joins,

        })->slice(0, ($limit-1));
Comment 3 Nicola Fankhauser 2006-11-07 02:13:32 UTC
a further performance improvement can be made when adding a combined SQL index to the table 'tracks' for the columns 'audio' and 'lastPlayed', this reduces scanned rows by 50% in my case:

ALTER TABLE `tracks` ADD INDEX `trackAudioLastPlayedIndex` (`audio`,`lastPlayed`);
Comment 4 Chris Owens 2006-11-13 12:50:03 UTC
cc'ing Dan
Comment 5 Dan Sully 2007-01-02 16:32:26 UTC
The problem comes from two parts:

1) Using RAND() for order by is slow. Very slow. It needs to do a full scan, and use an external temporary file.

2) I don't see another way to do it *AND* apply all the join conditions that need to happen in order to return an accurate query.
Comment 6 Nicola Fankhauser 2007-01-02 23:58:12 UTC
(In reply to comment #5)
> 1) Using RAND() for order by is slow. Very slow. It needs to do a full scan,
> and use an external temporary file.

foremost, the time is not spent finding out the correct track, but to return all the demanded columns. I observed a great performance boost when only returning the track id. since the block of new random tracks to add to the playlist, subsequent queries to the database delivering the missing information could be a solution.

furthermore, I'd propose to make some performance comparisons between RAND() as virtual column (with subsequent ordering in the "order by" clause) and RAND() directly in the "order by" clause, to see if they differ.

> 2) I don't see another way to do it *AND* apply all the join conditions that
> need to happen in order to return an accurate query.

I think that most people (as I do) in 95% of cases leave all genres ticked and thus the joins are superfluous - which the DBMS however does not seem to notice, and is slow. the 5% that need the JOIN can be implemented just like the existing implementation.

even if I repeat myself (see first post), this boils down to these three points:

1. too many columns selected for no reason
2. unneeded JOIN on table 'genres' for the default (all genres)
3. non-performing way to randomize results
Comment 7 Dan Sully 2007-01-04 22:18:38 UTC
Created attachment 1757 [details]
RandomPlay speedup

Please try this patch to speed things up.
Comment 8 Nicola Fankhauser 2007-01-05 01:40:55 UTC
hi dan

(In reply to comment #7)
> Please try this patch to speed things up.
 
thanks, I tried it out. on the upside, compared to my previous amateurish hack it now displays track info correctly in the playlist. :)

seriously, this fixes issue 1 (too many columns selected for no reason) and reduces the query time on my system (500mhz p2) from 3-4s to <=2s.

however, issue 2 (unneeded JOIN on table 'genres' for the default ("all genres")) could be handled separately. when starting the two different queries manually (one with the JOIN and all genres, the other without the JOIN), this results in the following:

1. with JOIN: 14,056 total result rows, Query took 0.9826 sec
2. without JOIN: 14,126 total result rows, Query took 0.0005 sec

query 1: SELECT me.id FROM tracks me LEFT JOIN genre_track genreTracks ON ( genreTracks.track = me.id ) WHERE ( audio = '1' AND genreTracks.genre IN ( '127', '90', '118', '102', '119', '
99', '162', '125', '84', '161', '95', '108', '115', '163', '109', '92', '103', '151', '89', '148', '113', '152', '142', '91', '167', '107', '87', '93', '106', '157', '133', '149
', '123', '138', '97', '114', '153', '137', '81', '101', '86', '165', '139', '129', '166', '88', '116', '136', '141', '144', '100', '82', '110', '147', '128', '120', '156', '83'
, '134', '168', '135', '112', '145', '140', '150', '104', '124', '155', '130', '131', '122', '121', '143', '158', '154', '105', '96', '126', '159', '85', '94', '160', '146', '11
1', '98', '169', '164', '132', '117' ) AND ( ( lastPlayed IS NULL ) OR ( lastPlayed < '1167988705' ) ) ) GROUP BY me.id;
query 2: SELECT me.id FROM tracks me WHERE audio = '1' AND (lastPlayed IS NULL OR lastPlayed < '1167988705');

I think this could be worthwile to implement, since differences are enormous and in >95% of cases people leave all genres checked.

regards
nicola
Comment 9 Dan Sully 2007-01-05 09:54:33 UTC
I'm not seeing the issue with genres.. if all or none are selected, no join is done.
Comment 10 Dan Sully 2007-01-08 14:38:58 UTC
Ok - I've patched this in change 11162, which should be much faster.

Please open another bug for the genres issue, as I'm not able to reproduce it.

Thanks
Comment 11 Nicola Fankhauser 2007-01-08 23:50:58 UTC
(In reply to comment #10)
> Ok - I've patched this in change 11162, which should be much faster.

nice!

> Please open another bug for the genres issue, as I'm not able to reproduce it.

I will as soon I have tried out the trunk - I'm still running 6.5.0, which might be the problem.

thanks & regars
nicola
Comment 12 Mick 2007-01-09 12:20:23 UTC
The change to 6.5.1 has 1 small bug resulting in the plugin not initialising with the following error.

2007-01-09 20:10:06.4063 Requiring Plugins::RandomPlay::Plugin plugin.
2007-01-09 20:10:06.4238 Can't load Plugins::RandomPlay::Plugin for Plugins menu: Undefined subroutine &Plugins::RandomPlay::Plugin::getDisplayName called at /PerlApp/Slim/Utils/PluginManager.pm line 257.




... the problem is on line 425 where ); was changed to ));