Bugzilla – Bug 2336
XPL enabled causes "Insane number of timers" crash
Last modified: 2008-09-15 14:37:04 UTC
When XPL support is enabled, I've noticed that I eventually get an "Insane number of timers: 501" error which causes slimserver to die. I added the following debug code to the end of sub setTimer: my $bryanb_timediff = $when - time; msg ("T: [$client] [$subptr] (@args) $bryanb_timediff $numtimers\n") unless ($bryanb_lasttimer == $numtimers); $bryanb_lasttimer = $numtimers; This produced the following output: 2005-10-22 07:23:41.3115 T: [] [CODE(0x8d4c0dc)] () 300.310846090317 141 2005-10-22 07:28:41.3161 T: [] [CODE(0x8d4c0dc)] () 300.315644025803 142 2005-10-22 07:33:41.3211 T: [] [CODE(0x8d4c0dc)] () 300.320498943329 143 2005-10-22 07:38:41.3266 T: [] [CODE(0x8d4c0dc)] () 300.326057910919 144 2005-10-22 07:43:41.3327 T: [] [CODE(0x8d4c0dc)] () 300.332314968109 145 2005-10-22 07:48:41.3380 T: [] [CODE(0x8d4c0dc)] () 300.337065935135 146 Notice that there is no client id (the first set of brackets is empty). And notice the last number is $numtimers. $numtimers normally decrements after the timer expires for all timers except the one shown above. This causes a non-monotonically increasing $numtimers which eventually invokes the "Insane number of timers" error. I searched the source for "300" and this eventually led nowhere. :-) I then decided to search for " 5" (5 minutes, 300 secods) and found this sub: # Sends an xPL heartbeat from all clients that are currently connected sub sendxplhbeat { my @clients = Slim::Player::Client::clients(); foreach my $client (@clients) { sendXplHBeatMsg($client); } Slim::Utils::Timers::setTimer("", Time::HiRes::time() + ($xpl_interval*60), \&sendxplhbeat); } $xpl_interval was previously set to 5. This is the source of the 300 second timer (verified by disabling XPL support). From what I understand looking at the other timer subs, calling setTimer without a client id ("" as above) will guarantee that the timer is never removed from the timer stack (or whatever it is called) and this will eventually trigger the error. I recall reading somewhere that XPL support is experimental. I am not actually using it (I turned it on once to monitor the packets it sent), but thought someone might like to know about this. My wife is relieved that I figured out why "the gadget" stopped crashing and that's really all I need to accomplish. :-)
Bryan, I don't think a timer with no client should cause your problem. It should be removed from the timer list when it fires (see Slim::Utils::Timers::checkTimers). Hence there should only be one instance of this timer on the list at once assuming xPL::init is only called once. There are several other timers in the code with no client set (most set this to 0 rather than "" though) It does sound like you have a timer leak, could it be with another timer? What happens when you turn on --d_time? Does this suggest the same code is continually being added to the timer stack? Does it say timers are always blocked?
Ah. You are correct that the timer I thought was causing the problem is itself working correctly. My debugging approach was flawed. I replaced my original debugging code with this to dump the timer array each time a timer is added: my $bryanb_timediff = $when - time; msg ("T: New timer: [$client] [$subptr] (@args) $bryanb_timediff $numtimers\n"); for (my $i = 0; $i <= $#normalTimers; $i++) { $bryanb_timediff = $normalTimers[$i]->{when} - time; msg (" T: [$i] ($normalTimers[$i]->{client}) ($normalTimers[$i]->{subptr}) $bryanb_timediff (@{$normalTimers[$i]->{args}})\n"); } That code results in this output after about 30 minutes of runtime (with XPL enabled and a playlist being played): 2005-10-23 07:00:07.6870 T: New timer: [Slim::Player::SLIMP3=ARRAY(0x9f9852c)] [CODE(0x9fbfaac)] () 0.736871004104614 19 2005-10-23 07:00:07.6873 T: [0] (Slim::Player::SLIMP3=ARRAY(0x9f9852c)) (CODE(0x9193308)) 0.71684193611145 () 2005-10-23 07:00:07.6875 T: [1] (Slim::Player::SLIMP3=ARRAY(0x9f9852c)) (CODE(0x9fbfaac)) 0.736871004104614 () 2005-10-23 07:00:07.6878 T: [2] (Slim::Player::SLIMP3=ARRAY(0x9f9852c)) (CODE(0x8ef0934)) 1.28964781761169 (Slim::Player::SLIMP3=ARRAY(0x9f9852c)) 2005-10-23 07:00:07.6882 T: [3] (Slim::DataStores::DBI::DBIStore=HASH(0xa079b10)) (CODE(0x89e4f80)) 1.30803203582764 () 2005-10-23 07:00:07.6887 T: [5] (Slim::DataStores::DBI::DBIStore=HASH(0xa076990)) (CODE(0x89e4f80)) 8.90686202049255 () 2005-10-23 07:00:07.6889 T: [6] (Slim::DataStores::DBI::DBIStore=HASH(0xa075bec)) (CODE(0x89e4f80)) 16.6173961162567 () 2005-10-23 07:00:07.6892 T: [7] (Slim::DataStores::DBI::DBIStore=HASH(0xa035d78)) (CODE(0x89e4f80)) 20.7026860713959 () 2005-10-23 07:00:07.6893 T: [8] (Slim::DataStores::DBI::DBIStore=HASH(0xa075bf8)) (CODE(0x89e4f80)) 20.7172288894653 () 2005-10-23 07:00:07.6895 T: [9] (Slim::DataStores::DBI::DBIStore=HASH(0xa03366c)) (CODE(0x89e4f80)) 20.7911450862885 () 2005-10-23 07:00:07.6899 T: [10] (Slim::DataStores::DBI::DBIStore=HASH(0xa073444)) (CODE(0x89e4f80)) 20.8605389595032 () 2005-10-23 07:00:07.6903 T: [11] (Slim::DataStores::DBI::DBIStore=HASH(0x9bdfcc0)) (CODE(0x89e4f80)) 21.2709050178528 () 2005-10-23 07:00:07.6905 T: [12] (Slim::DataStores::DBI::DBIStore=HASH(0x976be78)) (CODE(0x89e4f80)) 21.7198550701141 () 2005-10-23 07:00:07.6906 T: [13] (Slim::DataStores::DBI::DBIStore=HASH(0xa079b94)) (CODE(0x89e4f80)) 22.7797019481659 () 2005-10-23 07:00:07.6908 T: [14] (Slim::DataStores::DBI::DBIStore=HASH(0xa0759b8)) (CODE(0x89e4f80)) 24.7895228862762 () 2005-10-23 07:00:07.6911 T: [15] (0) (CODE(0x9bcc320)) 53.2884349822998 () 2005-10-23 07:00:07.6913 T: [16] (0) (CODE(0x9055064)) 53.5564730167389 () 2005-10-23 07:00:07.6914 T: [17] () (CODE(0x8d4dbc8)) 200.705127000809 () 2005-10-23 07:00:07.6916 T: [18] (0) (CODE(0x991b29c)) 29706.0501899719 () Timer 17 is the one with client id of "" which proves I was wrong... I thought those were somehow being accumulated. Without uploading the full log of the debug output suffice it to say that the timers that are accumlating are the ones that look like this: 2005-10-23 07:00:07.6884 T: [4] (Slim::Player::SLIMP3=ARRAY(0x9f9852c)) (CODE(0x8de8620)) 1.58605909347534 () And by "accumulating" I mean it looks like the total number of active timers like the one shown above increases by 1 each time the 300 second timer quoted in my original post fires. So, I still feel pretty strongly that the 300 second timer in xPL.pm is causing something bad to happen even if indirectly. Also, I did enable the --d_time option as suggested and this never resulted in a "blocked timer" message. I didn't mention this before, but to sensitize the problem a playlist must be playing. (I am not saying this is the only way to trigger the problem, though, but SlimServer doesn't exhibit symptoms that I've noticed while in the "off" state.) If you can suggest the next debugging step, I'll be glad to try it.
Bryan, There's a sub in Slim/Utils/Misc called deparseCodeRef. I would suggest using a cut down version of this (which does not return the body of the coderef) to turn the coderef into a readable name... Alternatively if you add bt(); into setTimer it will tell you who is calling setTimer for the problem case.
I added bt to my debug code as suggested. I found this interesting sequence occurring right before the addition of the 300 second timer: frame 0: Slim::Utils::Timers::setTimer (/usr/local/slimserver/Slim/DataStores/DBI/DBIStore.pm line 1033) frame 1: Slim::DataStores::DBI::DBIStore::_commitDBTimer (/usr/local/slimserver/Slim/DataStores/DBI/DBIStore.pm line 92) frame 2: Slim::DataStores::DBI::DBIStore::new (/usr/local/slimserver/Slim/Control/xPL.pm line 265) frame 3: Slim::Control::xPL::sendXplHBeatMsg (/usr/local/slimserver/Slim/Control/xPL.pm line 310) frame 4: Slim::Control::xPL::sendxplhbeat (/usr/local/slimserver/Slim/Utils/Timers.pm line 100) frame 5: Slim::Utils::Timers::checkTimers (/usr/local/slimserver/slimserver.pl line 641) frame 6: main::idle (/usr/local/slimserver/slimserver.pl line 579) frame 7: main::main (/usr/local/slimserver/slimserver.pl line 1182) sendXplHBeatMsg has this section of code: if ($playmode eq 'play') { $playmode = "playing"; my $currentDB = Slim::DataStores::DBI::DBIStore->new(); my $url = Slim::Player::Playlist::song($client); my $track = ref $url ? $url : $currentDB->objectForUrl($url, 1, 1); if (defined($track->album())) { if (defined($track->album()->title())) { $album = $track->album()->title(); } } Notice that this calls the new method of Slim::DataStores::DBI::DBIStore. Slim::DataStores::DBI::DBIStore->new() has this code: $self->_commitDBTimer(); and _commitDBTimer() has this code: Slim::Utils::Timers::setTimer($self, Time::HiRes::time() + DB_SAVE_INTERVAL, \&_commitDBTimer); Note the reference that this timer has back to itself making this a self-renewing timer. So here's what I believe is happening: Every 300 seconds the XPL heartbeat code calls sendXplHBeatMsg. This calls Slim::DataStores::DBI::DBIStore->new() which sets up a recurring timer via the call to $self->_commitDBTimer(); This explains my observation that every 300 seconds that the net number of timers grows by one when xPL is enabled and the player is playing--it's the accumulation of calls to Slim::DataStores::DBI::DBIStore->new() every 300 seconds that's causing the problem. Agree?
Good catch - definately looks the problem with the code you are running. This was resolved by a change to xPL.pm at version 2343: http://svn.slimdevices.com/trunk/server/Slim/Control/xPL.pm? rev=4243&r1=3315&r2=4243 (and explains why I didn't see it as I tend to have quite a recent version running) Could you try a 6.2 nightly to check this is fixed?
Bryan, I'm going to mark this as Fixed. Please reopen if it's still an issue. Thanks.
There are 536 bugs in the database with targets of '---' that were fixed prior to new year 2006. I am setting them to targets of 6.2.1 to keep them from showing up in my queries.