Bug 7343 - 'jive' process ran out of memory, puts Jive into unusable state
: 'jive' process ran out of memory, puts Jive into unusable state
Status: CLOSED FIXED
Product: SB Controller
Classification: Unclassified
Component: UI
: unspecified
: All Other
: P1 critical (vote)
: 7.0.1
Assigned To: Matt Wise
:
Depends on: 11192
Blocks:
  Show dependency treegraph
 
Reported: 2008-02-27 10:30 UTC by Matt Wise
Modified: 2009-09-08 09:19 UTC (History)
5 users (show)

See Also:
Category: ---


Attachments
look for "dstein" notes (66.85 KB, text/plain)
2008-02-28 13:19 UTC, Stein David
Details
SD Log from Jive (312.98 KB, application/octet-stream)
2008-02-29 08:30 UTC, James Richardson
Details
James' Jive Memory error (51.06 KB, text/plain)
2008-02-29 08:30 UTC, Stein David
Details
Patch for heap debug (7.74 KB, patch)
2008-03-02 19:54 UTC, Richard Titmuss
Details | Diff
Fix artwork memory leak with now playing screen (2.09 KB, patch)
2008-03-02 19:56 UTC, Richard Titmuss
Details | Diff
Richard's patch + a critical line uncommented (2.09 KB, patch)
2008-03-03 13:51 UTC, Ben Klaas
Details | Diff
Another memory leak fix (1.12 KB, patch)
2008-03-07 09:46 UTC, Richard Titmuss
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Matt Wise 2008-02-27 10:30:35 UTC
Yesterday morning (@10am?) I updated my Jive to the 2013 build. I left it in its cradle all day and night acting as a 'screen' for what my SB3 was playing. It is connected to a Ray via Bridged mode, and controlling an ethernet-connected SB3 running the latest SC7 build (updated nightly). This morning it was still displaying the current information and I could control the music (forward, back, vol up, vol down, etc), but the 'Wifi' signal indicator was Red and most of the menu options under 'Advanced' failed. I could not read the About screen or Enable SSH.

Further investigation showed that the Jive process was using all of the available memory on the unit and once the process was restarted, it began behaving again.
Comment 1 Blackketter Dean 2008-02-27 10:33:54 UTC
The unfortunate thing here was that there were errors in the log, but jive was just behaving badly.  

I propose that in the case that we run out of memory badly, we should just reboot.
Comment 2 Matt Wise 2008-02-27 10:37:42 UTC
Kind of like an iPhone when an application crashes? Just make it look like it MEANT to do that.. :) 


Comment 3 Matt Wise 2008-02-28 05:21:18 UTC
I've reproduced this bug exactly this morning. Exact same symptoms, red wifi icon, unresponsive menu items, but still able to control SC 7. Looks like this is an issue! 
Comment 4 Richard Titmuss 2008-02-28 05:40:51 UTC
Matt, what were you playing local music or internet radio? Could you try editing /usr/share/jive/jive/slim/ArtworkCache.lua and changing the ARTWORK_LIMIT to 5 * 1024 * 1024 (for a 5 Mbyte cache). Can you still recreate this then?

I think the two possibilities are a memory leak in the lua tables, or the artwork cache is too big. I think the first unlikely as I've had a jive on the now playing screen saver running ok for a few days, however it is only a small playlist.
Comment 5 Matt Wise 2008-02-28 06:56:51 UTC
Richard,
  We've grabbed the logs off of my Jive and I've made your change... unfortunately, every time I connect to my SB3 now it asks if I want to do the Software Update. By holding the Left arrow, it disconnects completely from the SB3 and leaves me hanging. If I choose my SB3 again, it forces me to try to start the update.

  For testing purposes, can I disable the auto-update feature? 
Comment 6 Stein David 2008-02-28 07:03:08 UTC
Jive "top" showed Jive running 113% memory
It first showed errors when running About.
It crashed with running the keypad test.

Here is part of a crash...

[<c0072d50>] (do_page_cache_readahead+0x0/0x64) from [<c006e648>] (filemap_nopage+0x150/0x314)
 r7:c395fbe0 r6:c395fc24 r5:00000061 r4:00000000
[<c006e4f8>] (filemap_nopage+0x0/0x314) from [<c0078530>] (__handle_mm_fault+0x140/0x750)
[<c00783f0>] (__handle_mm_fault+0x0/0x750) from [<c0036b98>] (do_page_fault+0xe8/0x224)
[<c0036ab0>] (do_page_fault+0x0/0x224) from [<c0036d74>] (do_translation_fault+0x1c/0x84)
[<c0036d58>] (do_translation_fault+0x0/0x84) from [<c002f1d0>] (do_PrefetchAbort+0x18/0x1c)
 r4:ffffffff
[<c002f1b8>] (do_PrefetchAbort+0x0/0x1c) from [<c002fe28>] (ret_from_exception+0x0/0x10)
Exception stack(0xc3bd7fb0 to 0xc3bd7ff8)
7fa0:                                     00000a20 00096918 00001fff 00000001
7fc0: 00000015 00096918 00000015 4001da50 00000006 0000000a 00000000 00000000
7fe0: 4001da50 beaf5d20 000695ec 000695ec 00000010 ffffffff
Mem-info:
DMA per-cpu:
CPU    0: Hot: hi:   18, btch:   3 usd:   7   Cold: hi:    6, btch:   1 usd:   0
Active:13611 inactive:107 dirty:0 writeback:0 unstable:0
 free:248 slab:1002 mapped:2 pagetables:65 bounce:0
DMA free:992kB min:1016kB low:1268kB high:1524kB active:54444kB inactive:428kB present:65024kB pages_scanned:82180 all_unreclaimable? yes
lowmem_reserve[]: 0 0
DMA: 0*4kB 0*8kB 0*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 992kB
Free swap:            0kB
16384 pages of RAM
317 free pages
954 reserved pages
1002 slab pages
51 pages shared
0 pages swap cached
Out of memory: kill process 30039 (jive) score 650 or a child
Killed process 30039 (jive)
s3c2410-wdt: Unexpected close, not stopping watchdog!


U-Boot 1.0.0 (Sep 24 2007 - 10:32:53)

U-Boot code: 33E00000 -> 33E1D388  BSS: -> 33E20FC4
IRQ Stack: 33e41fc0
FIQ Stack: 33e42fc0
DRAM Configuration:
Bank #0: 30000000 64 MB
NAND:Nand Flash Probe...
Nand Flash Configuration:
Page Size  :512 Bytes
Oob Size   :16 Bytes
Block Size :16 KB
In:    serial
Out:   serial
Comment 7 Stein David 2008-02-28 13:19:44 UTC
Created attachment 3018 [details]
look for "dstein" notes
Comment 8 Felix Mueller 2008-02-28 13:34:07 UTC
Not sure if its related, but I just saw this on the SBC running MusicMix:

# free
              total         used         free       shared      buffers
  Mem:        62280        48360        13920            0         6336
 Swap:            0            0            0
Total:        62280        48360        13920
# dma2: IRQ with no loaded buffer?
# free
              total         used         free       shared      buffers
  Mem:        62280        50664        11616            0         6336
 Swap:            0            0            0
Total:        62280        50664        11616
Comment 9 Felix Mueller 2008-02-28 14:52:17 UTC
My SBC used up about 4 MB within 4 hours while on Now Playing screen running Random Mix.

From this at about 7:30 pm:

# free
              total         used         free       shared      buffers
  Mem:        62280        48372        13908            0         6336
 Swap:            0            0            0
Total:        62280        48372        13908

To this at about 11:30 pm:

# free
              total         used         free       shared      buffers
  Mem:        62280        52572         9708            0         6336
 Swap:            0            0            0
Total:        62280        52572         9708
Comment 10 Felix Mueller 2008-02-28 16:30:37 UTC
Ok, I did sped up it a bit by pressing next a few times. Free memory got down to about 1.4 MB, then went up again to about 1.6 MB, but at the same time buffers dropped a bit. This pattern then repeated over and over again until buffers was down to 0 and free memory down to 1.2MB:

# free
              total         used         free       shared      buffers
  Mem:        62280        61016         1264            0            0
 Swap:            0            0            0
Total:        62280        61016         1264

From then on the free memory and buffers were jumping around and SBC would be unresponsive at times:

# free
              total         used         free       shared      buffers
  Mem:        62280        60784         1496            0         2700
 Swap:            0            0            0
Total:        62280        60784         1496
# free
              total         used         free       shared      buffers
  Mem:        62280        61220         1060            0          456
 Swap:            0            0            0
Total:        62280        61220         1060

Also, the response to a 'free' command now takes up to 20 seconds to show up on the console.

I'll let it run now and report back in the morning.
Comment 11 Felix Mueller 2008-02-29 00:02:32 UTC
SBR eventually died this night and rebooted.
Comment 12 Matt Wise 2008-02-29 07:34:13 UTC
My SBR at home died overnight.... it didn't reboot, but it ran into the same symptoms (red wifi, unusable menu, etc). 
Comment 13 James Richardson 2008-02-29 08:30:41 UTC
Created attachment 3021 [details]
SD Log from Jive

I left 2 Jive units running over night, 1 on the charger, 1 off the charger.  The unit on the Charger got to 99% memory (red wifi icon) where the one OFF the charger did not.

Attached are my logs as recorded on the SD card.  Dave also did some logging on the unit, and was able to crash the Jive when it was in this low memory state.  He will also be attaching his crash log.
Comment 14 Stein David 2008-02-29 08:30:44 UTC
Created attachment 3022 [details]
James' Jive Memory error
Comment 15 James Richardson 2008-02-29 08:39:34 UTC
(In reply to comment #14)
> Created an attachment (id=3022) [edit]
> James' Jive Memory error
> 

When we got the Jive to crash, we were doing the Factory Tests over and over.  It was running over 120%, we believe that the Interrupt Handler pushed the unit over 130% and then locked up the jive.
Comment 16 Richard Titmuss 2008-03-02 19:54:23 UTC
Created attachment 3028 [details]
Patch for heap debug

This patch adds a logging option that scans the heap for total/new/freed objects every 60 seconds and outputs debug to the log.
Comment 17 Richard Titmuss 2008-03-02 19:56:45 UTC
Created attachment 3029 [details]
Fix artwork memory leak with now playing screen

This patch fixes Window:replace() to send the correct events. This fixes a memory leak seen with the now playing screensaver.
Comment 18 Ben Klaas 2008-03-03 13:51:56 UTC
Created attachment 3033 [details]
Richard's patch + a critical line uncommented

This patch is the same as Richard's prior attachment, but uncomments a line that actually makes the fix work :)

I have verified that the fix works as advertised.
Comment 19 Ben Klaas 2008-03-03 14:10:33 UTC
FYI-- new 7.0 build kicked off 2:10pm PST
Comment 20 Richard Titmuss 2008-03-07 09:46:58 UTC
Created attachment 3042 [details]
Another memory leak fix
Comment 21 Richard Titmuss 2008-03-07 14:34:05 UTC
*** Bug 6293 has been marked as a duplicate of this bug. ***
Comment 22 Richard Titmuss 2008-03-11 09:22:29 UTC
Fixed in r2097.
Comment 23 Matt Wise 2008-03-22 13:20:53 UTC
In r2114 I can reproduce this bug in just 2-3 hours of use... it's happened three nights in a row now, and its the exact same symptoms as previously reported. Took about 2 hours last night of scrolling around and creating playlists. 
Comment 24 Ben Klaas 2008-03-27 07:35:59 UTC
along with #7376 I think this is the most important fix needed for 7.0.1. Ramping severity accordingly.

Matt, were you able to reproduce this when screensaver was set to clock?

Richard, let me know if there's anything I can do to help get debug for this. The heap XML parser I wrote could probably be improved in one way or another to better identify what's going on.

Comment 25 Matt Wise 2008-03-27 08:52:10 UTC
Last night my Jive was stuck in an I-want-to-upgrade loop so I didn't get a chance to try it. I need a way to block software updates -- is there a simple way to do that? 
Comment 26 James Richardson 2008-04-02 10:02:32 UTC
*** Bug 7522 has been marked as a duplicate of this bug. ***
Comment 27 James Richardson 2008-04-09 09:54:33 UTC
We would very much like details on how to reproduce this. what action's cause the jive to get into this state?
Comment 28 Matt Wise 2008-04-09 17:42:43 UTC
Right now my jive is running r2097 and its very noticibly going up in memory usage as it cycles through the now playing screen. If I remember correctly, newer builds don't do this as badly, so I need to update it to a newer build by hand I think. 
Comment 29 Richard Titmuss 2008-04-10 00:46:26 UTC
It will noticeably go up for a while, until the artwork cache is filled. Then it should plateau.
Comment 30 Chris Owens 2008-04-14 09:16:57 UTC
So what are you actually doing in your typical Jive use, Matt?
Comment 31 Spies Steven 2008-04-15 08:58:27 UTC
I left my jive and ray playing a random mix overnight without issues so far.  Top reports jive memory usage of 78%.  This is with SC 7.0.1 18702 and jive 7.0.1 r2223.  I will leave it playing and see if the memory usage goes up.

Matt reports it may take days of playing nonstop for the issue to manifest itself.

Is that correct Matt?   
Comment 32 Spies Steven 2008-04-16 08:46:08 UTC
My jive has now been running for about 2 days now and it is still successfully playing random mix.  According to top the memory use has increased a little to 85% from 78% yesterday and it seems to be holding steady.

Matt, can you still reproduce this?
Comment 33 Chris Owens 2008-04-16 09:46:10 UTC
Were you just playing one album off your hard drive?  Or Rhapsody?  Or a multi-album playlist?  Or what?  Thanks.
Comment 34 Matt Wise 2008-04-17 11:08:44 UTC
After 3 days of playing a random mix of 3000+ songs, my r2224 Jive build is using 95% of the available memory. 

  809     1 root     S    59096  95%  17% /usr/bin/jive 

Comment 35 Ben Klaas 2008-04-18 09:38:04 UTC
I think we need to know memory use over time. 95% memory usage could be stable or unstable, impossible to tell. Matt, is this still running? If so, can you post another data point?
Comment 36 Matt Wise 2008-04-18 09:51:47 UTC
As of this morning, its still holding steady @ 95%.... running just fine. 
Comment 37 Richard Titmuss 2008-04-18 12:37:27 UTC
So Matt are you happy if this bug is closed for now?
Comment 38 Matt Wise 2008-04-19 10:52:00 UTC
Yeah, I think we're in pretty good shape now. I can't seem to replicate this on the latest builds. Monday will be the real test though because my Jive @ work will have been running for about 7 days by then. 

This is about 5 days in:

Mem: 56140K used, 6168K free, 0K shrd, 6528K buff, 12892K cached
CPU:  18% usr   1% sys   0% nice  79% idle   0% io   0% irq   0% softirq
Load average: 0.30 0.24 0.25
  PID  PPID USER     STAT   VSZ %MEM %CPU COMMAND
  809     1 root     R    59096  95%  17% /usr/bin/jive 
15573 15569 root     R     2980   5%   3% top 
  787     1 root     S     3052   5%   0% /usr/sbin/inetd 

It's finally holding steady @ 95%... 
Comment 39 James Richardson 2008-05-08 09:13:24 UTC
(In reply to comment #38)
> Yeah, I think we're in pretty good shape now. I can't seem to replicate this on
> the latest builds. Monday will be the real test though because my Jive @ work
> will have been running for about 7 days by then. 
> 
> This is about 5 days in:
> 
> Mem: 56140K used, 6168K free, 0K shrd, 6528K buff, 12892K cached
> CPU:  18% usr   1% sys   0% nice  79% idle   0% io   0% irq   0% softirq
> Load average: 0.30 0.24 0.25
>   PID  PPID USER     STAT   VSZ %MEM %CPU COMMAND
>   809     1 root     R    59096  95%  17% /usr/bin/jive 
> 15573 15569 root     R     2980   5%   3% top 
>   787     1 root     S     3052   5%   0% /usr/sbin/inetd 
> 
> It's finally holding steady @ 95%... 
> 

I also have not been able to reproduce this error with the latest build of r2409

Marking as Verified until someone reopens the bug
Comment 40 James Richardson 2008-05-15 12:28:32 UTC
This bug has recently been fixed in the latest release of SqueezeCenter 7.0.1

Please try that version, if you still see the error, then reopen this bug.

To download this version, please navigate to: http://www.slimdevices.com/su_downloads.html