Bug 1446 – MusicMagic plugin eats memory and cpu

Bug 1446 - MusicMagic plugin eats memory and cpu

Summary:

MusicMagic plugin eats memory and cpu

Status:	CLOSED FIXED

Product:	Logitech Media Server
Classification:	Unclassified
Component:	MusicIP
Version:	6.1.0
Platform:	PC All

Importance:	P2 normal (vote)
Target Milestone:	---
Assigned To:	KDF

URL:
Keywords:

Depends on:
Blocks:
	Show dependency tree / graph

Reported:	2005-04-26 15:00 UTC by michael
Modified:	2008-08-18 10:54 UTC (History)
CC List:	2 users (show)

See Also:
Category:	---

Attachments
strace of slim using mmm and eating memory/cpu (2.25 MB, text/plain) 2005-04-27 13:16 UTC, michael	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description michael 2005-04-26 15:00:52 UTC

using a recent trunk checkout, if I select "use musicmagic" in the server
settings, within a day the server will start exibiting drop outs and very slow
brower response.  Looking at the server I see that slimserver.pl is using
something like 98% of the cpu and 75% or more of the memory (and it's not doing
a rescan or anything). These numbers don't drop if I leave the server alone for
awhile, in fact the memory usage will keep going up.  Restarting the server
brings everything back to normal levels, but the numbers will start clibming
again.  If I switch to "don't use musicmagic", then slimserver hovers below 7%
cpu while playing (and not rescanning) and memory stays below 5%.

I tried this on another machine, and though the numbers differed slightly with
the machine specs, the general effect was the same.
It also doesn't seem to matter if MMM itself is on the same box or a remote one.

I can attach the output from a --d_musicmagic, but that seems to be more about
the data being echanged between the two than anything obviously relevent here.

I'll try to add more useful info when I've had a chance to poke at it a bit more.

Comment 1 Dan Sully 2005-04-26 15:06:42 UTC

Michael - can you get an strace of the process?

Comment 2 michael 2005-04-27 13:00:33 UTC

some additional observations and ponderings...

On further observation, I have seen the cpu usage (of slimserver.pl) drop to
normal levels for a short while and then pop back up pegging it's cpu.  

This didn't seem to be a problem when I first started playing with the plugin,
but the dropouts are what made be notice it, and the SB2 seems immune to the
dropouts (yay for buffer), so I can't really say whether it's been happening all
along or not. 
However... when I first launched the mmm it quickly went through and identified
everything it could (and marked them green) and I could start using those songs
in slim mixes right away.  and then it goes back to verify it's assessment of
those songs, which is a long process (it's been running for a couple weeks now).
 There seems to be two parts to this process, fingerprinting and analysis, and
the analysis part takes much longer of the two.  for about a week, the
fingerprint part seemed essentially paused while analysis would grind away at
it's cue.  a few days ago analysis caught up and fingerprinting started grinding
through it's queue again. (I only half grok what mmm is doing under the covers,
so forgive me if I'm butchering this discription.  This is all based on my
understanding of the x/y/z analysis line at the bottom of their window.) 
Now, I'm not totally sure about how slimserver gets updates from mmm, but if mmm
decided that each time it finished a verification it constituted an update
(perhaps regardless of whether it's opinion of the track changed) and if
slimservers processing that update was more cpu intensive than one would hope
for...  then perhaps the mmm switching over to fingerprinting mode and churning
through tracks faster is why the slim is chewing up cpu more noticibly than it
was before?  And when it occationally goes back to analysis for things it's just
fingerprinted might explain why slim's cpu usage occationally drops to "normal".

Just a guess, and it doesn't address the memory leak, but I figured it was worth
a mention.

anyway...
on to the strace...

Comment 3 michael 2005-04-27 13:16:58 UTC

Created attachment 465 [details]
strace of slim using mmm and eating memory/cpu

anyway.... 
I attached strace to the running slim process (let me know if you'd prefer a
start to stop version for some reason). The trace was growing fairly quickly,
so I didn't let it run too long.  Let me know if you need a bigger chunk. This
was on a secondary server that only talks to software players, but it's the
server that mmm is actually running on, so I can say it's running unmodified
code (had to tweak the code on the "real" server to talk to a remote mmm, but
the cpu/mem behaviour is similar to what's seen here, even though the cpu is
beefier).  This is running with revision 3062.	No players are actually playing
at the time this was taken.  There are multiple cpus on this box, so mmm's
analysis eating up it's own cpu isn't interfering with silm and what it's doing
with it's cpu.	This process has been running for a couple days I think,
there's enough memory on this box that It takes longer to eat through it, but
the effect is still pretty obvious.

top - 12:15:55 up 52 days, 19:57,  8 users,  load average: 2.20, 2.30, 3.54
Tasks: 101 total,   3 running,	98 sleeping,   0 stopped,   0 zombie
Cpu(s): 49.3% us,  0.9% sy,  0.0% ni, 49.0% id,  0.6% wa,  0.0% hi,  0.2% si
Mem:   4057572k total,	4039080k used,	  18492k free,	 103920k buffers
Swap:  2097136k total,	 117492k used,	1979644k free,	1416724k cached

  PID USER	PR  NI	VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND	    
28074 michael	25   0 61732  50m 4524 R 99.9  1.3   0:32.22 MusicMagicServi
 5844 root	17   0 1926m 1.9g 5948 R 99.0 48.5   1862:33 slimserver.pl  
  359 root	 5 -10	   0	0    0 S  0.3  0.0   3:26.41 reiserfs/0     
28172 root	15   0	   0	0    0 S  0.3  0.0   1:14.03 pdflush	    
21392 root	16   0	2064 1084 1848 R  0.3  0.0   0:05.58 top	    
    1 root	16   0	1500  480 1344 S  0.0  0.0   0:05.53 init

Comment 4 michael 2005-04-27 15:54:33 UTC

A thread on the mailing list seems to confirm my suspicion above about what's
going on with the cpu.  Just to verify, I set the musicmagic reload interval to
3600. Sure enough when the cpu did settle down, it stayed normal for about an
hour. Then it shot back up for 20 mins, and then back down again.
That doesn't solve the problem, but at least we know what's going on there.  
Let me know if you want to split this out from the memory leak as sepperate issues.

Comment 5 KDF 2005-04-27 18:26:46 UTC

clearly something is convincing the server that the musicmagic db is changing. 
howver, you can now set the musicmagic reload interval to 0 to disable the
automatic reload completely if you like.

Comment 6 KDF 2005-04-28 23:42:21 UTC

do we have any d_musicmagic output available?  It might be good to know where
its looping or just to be certain that this is just the result of the rescannign
process.  I haven't seen this myself, running with 6.1, mmm 1.1.5 on win2k while
at work.

Comment 7 KDF 2005-04-29 18:41:22 UTC

some new debug options checked into trunk at change 3107
i'd be interested in seeing d_musicmagic logs from this from startup of both MMM
and slimserver.  plus, a second log after shutting both down and starting again.

Comment 8 michael 2005-05-04 12:08:58 UTC

sorry, I've been busy lately and haven't had a chance to poke at this.
I do have an output capture of --d_musicmagic that was taken (coincidently)
using revision 3108, but I let it run a bit too long, so I'll need to trim it
down in size before attaching it.

in the meantime, mmm has finally finished analysing my files, which saves me
from the cpu problem (at least until I add more music).  I'll have to watch it
for awhile to see what happens to the memory usage in this state.

Comment 9 KDF 2005-05-04 12:21:24 UTC

np, every bit helps.  I'm definitely seeing the rescan on start, and know the
cause (not sure how to fix without more info from MusicMagic).  What I'm not
seeing is the eating of cpu.  Both on windows (activeperl) and linux with 1.4G
(ish) pc's the scan seems to run without especially overloading the server.

Comment 10 KDF 2005-05-19 13:10:52 UTC

wendell and I have tracked down potential bottlenecks and some changes are
included in the new musicmagic 1.1.5.1 beta.  Slimserver can check on the MTime
for the contents only so that things like ratings and playback count will no
longer trigger a rescan.  There is also a new getStatus API command that
slimserver can now use to only trigger a rescan when musicmagic is idle,
avoiding a usage during analysis.

hoping to get a patch in for this and bug 1180 tonight for 6.1 users to test
(with latest beta of musicmagic)

Comment 11 KDF 2005-05-19 19:32:14 UTC

committed a possible fix to 6.1 at change 3272.

we now only scan if musicmagic returns idle from getStatus
use cacheid?contents for fileMTime to avoid a rescan from simple internal status
changes.

please test and confirm if this helps relieve cpu issues, as I was never able to
reproduce that problem.  You will need the very latest MusicMagic 1.1.5.1 from
here: http://music.predixis.com/beta.html

will close if there is confirmation

Comment 12 Blackketter Dean 2005-06-07 16:07:10 UTC

Michael: Is this fixed for you?

Comment 13 Blackketter Dean 2005-06-13 15:51:41 UTC

Michael?

Comment 14 michael 2005-06-14 03:13:58 UTC

sorry, I'm not near the server running mmm at the moment.  The server in
question has been generating some strange messages from slimserver lately, but I
haven't had the time or net connection to verify if it's related to mmm or not.
I'll be near that machine again after the 17th, and I'll be able to add more
useful comments then.  sorry for the delay.

Comment 15 Vidur Apparao 2005-06-30 15:00:36 UTC

Pinging Michael again.

Comment 16 michael 2005-06-30 18:48:57 UTC

testing it now

Comment 17 michael 2005-06-30 21:43:54 UTC

This appears to be fixed now, Thanks!  
Sorry for taking so long to verify.

Comment 18 Chris Owens 2008-03-11 11:28:11 UTC

This bug was marked resolved in Slimserver 6.1, which is several versions ago.  If you're still seeing this bug, please re-open it.  Thanks!