Bug 15915 - Out of memory error and crash after running ~15 hours with 4+ players
Status: RESOLVED WORKSFORME
Product: SB Touch
Classification: Unclassified
Component: TinySC
Version: 7.5.0
Hardware: PC Windows XP
Importance: P1 normal
Target Milestone: 7.5.0
Assigned To: Unassigned bug - please assign me!
Depends on:
Blocks:
Reported: 2010-03-18 12:29 UTC by Mickey Gee
Modified: 2010-03-30 14:12 UTC

See Also:
Category: ---


Attachments
Crash log (121.28 KB, text/plain)
2010-03-18 12:31 UTC, Mickey Gee
Messages log file -- doesn't appear to have info, but just in case .... (200.08 KB, text/plain)
2010-03-18 12:33 UTC, Mickey Gee
crashlog - after about 10h idling with the ImageViewer (184.05 KB, text/plain)
2010-03-22 11:25 UTC, Michael Herger

Description Mickey Gee 2010-03-18 12:29:28 UTC
Have Fab4/TinySC running with 4 players streaming MP3 files from attached WD 1.5TB USB self-powered drive. (2 Boom, 2 SB3, Fab4 is PB3)

Started all 4 players streaming MP3 (different Harry Potter Audio CDs), with Fab4 itself continuously playing another Harry Potter MP3 story. Started 6:30 pm, and appeared to crash with "System Error" screen at around 11:00 pm.

Have crashlog file, which says it's out of memory. File is attached.

This is second time in 2 days this has happened. For some reason, I ignored the first failure. Guess I shouldn't have!

Was able to play this configuration continuously last weekend, but not this week. Test was stopped when building experienced power outage last Monday March 15 and I then updated firmware. Now on r8656. Not sure what firmware version was running last weekend. Maybe 863x? Now I can't get it to stay up for 24 hours.

Please review crashlog and suggest additional debug switches to set.
Comment 1 Mickey Gee 2010-03-18 12:31:13 UTC
Created attachment 6665
Crash log
Comment 2 Mickey Gee 2010-03-18 12:33:54 UTC
Created attachment 6666
Messages log file -- doesn't appear to have info, but just in case ....
Comment 3 Alan Young 2010-03-19 02:01:32 UTC
Two observations: not sure if either is significant.
1. ntfs-3g was being used.
2. Both TinySC (slimserver.pl) and SP (jive) were selected to be killed.
Comment 4 Mickey Gee 2010-03-19 10:18:01 UTC
Good point. One thing different in this configuration versus last weekend is the use of the 1.5TB drive in NTFS format. Last weekend it was using a different drive (IDE 120GB 4200 RPM with PATA-USB converter) but it was also an NTFS drive.

Not sure why a different drive would make a difference, but I'll try it and see.
Comment 5 Mickey Gee 2010-03-20 09:06:03 UTC
Swapped out with different drive -- Hitachi 250GB USB Simple Drive Mini. Still running same config after 16.5 hours. No obvious memory issues from free or cat /proc/meminfo commands.
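
For a long soak test like this one, a periodic log of /proc/meminfo is easier to review afterwards than occasional manual checks. The following is only a minimal sketch: it assumes a Python interpreter is available wherever the file is read (the device firmware may not ship one), and the function names, the 60-second interval and the "MemFree + Buffers + Cached" estimate are illustrative choices, not anything taken from the firmware.

```python
import time


def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to their integer values."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def available_kb(fields):
    """Rough estimate of usable memory in kB: MemFree + Buffers + Cached."""
    return sum(fields.get(key, 0) for key in ("MemFree", "Buffers", "Cached"))


def log_memory(samples, interval_s=60, path="/proc/meminfo"):
    """Print a timestamped estimate `samples` times, sleeping between reads."""
    for _ in range(samples):
        with open(path) as f:
            fields = parse_meminfo(f.read())
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(stamp, available_kb(fields), "kB", flush=True)
        time.sleep(interval_s)
```

Calling log_memory(samples=1440) would cover 24 hours at the default interval; a value that falls steadily across the run would point to a leak rather than a one-off allocation spike.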
Comment 6 Mickey Gee 2010-03-21 16:15:39 UTC
Still running after 72 hours with Hitachi 250GB drive. Running firmware r8660.
Comment 7 Mickey Gee 2010-03-21 16:31:24 UTC
This Hitachi drive is also preformatted in FAT32, not NTFS. Ran the test for days on the PATA drive in NTFS v3.0 format (formatted using Windows 2000), so possibly an issue with NTFS on 1.5TB WD drive? Maybe WD drive in NTFS v3.1 format?
Comment 8 Michael Herger 2010-03-21 23:04:08 UTC
Mickey - have you been playing music on the fab4 itself too? If not: what screensaver are you using?

I noticed SBS on my fab4 quitting too. But only on one of my two devices. Both are running SBS accessing a 2.5" disk with a few thousand tracks. The biggest difference between the two is one uses the ImageViewer applet to show local image files as a screensaver. Checking memory this morning, jive on that device was using 45-50MB, while the other one (running the clock saver) was at around 30MB. I wonder whether ImageViewer is leaking memory. You aren't using it as your screensaver by chance?
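
A per-process figure like the jive numbers above can be read from the VmRSS line of /proc/<pid>/status. The sketch below is a hypothetical helper, not part of SqueezePlay or SBS: it assumes Python is available and that the process name in the "Name:" field is "jive", as the kernel log in this bug suggests.

```python
import os


def parse_status(text):
    """Extract (process name, resident set size in kB) from /proc/<pid>/status text.

    The size is None when there is no VmRSS line, as for kernel threads.
    """
    name, rss_kb = None, None
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key == "Name":
            name = value.strip()
        elif key == "VmRSS":
            rss_kb = int(value.split()[0])
    return name, rss_kb


def rss_by_name(target, proc_root="/proc"):
    """Return {pid: rss_kb} for every process whose name equals `target`."""
    results = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                name, rss_kb = parse_status(f.read())
        except OSError:
            continue  # process exited between listdir() and open()
        if name == target and rss_kb is not None:
            results[int(entry)] = rss_kb
    return results
```

Running rss_by_name("jive") on both devices at the same point in the day would give comparable numbers for the ImageViewer and clock screensaver cases.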
Comment 9 Alan Young 2010-03-22 00:45:23 UTC
Mickey, what means are you using to build the playlists and how big are they?

I ran a 48hr test with 4 players: fab4 + 3 x baby. Each had a single-album MP3 playlist. TinySC was using an SD-card for the library. No leak. Will now run the test with ip3k-display players attached.
Comment 10 Chris Owens 2010-03-22 09:37:31 UTC
In the bug meeting today, speculation is that it's some combination of NTFS and a large drive.  

If we push this to 7.5.x, it should be P1 there.
Comment 11 Vahid Fereydouny 2010-03-22 10:33:10 UTC
The following link describes how the kernel decides which application to kill:
http://linux-mm.org/OOM_Killer
Comment 12 Michael Herger 2010-03-22 11:25:00 UTC
Created attachment 6681
crashlog - after about 10h idling with the ImageViewer

I didn't see a steady growth of memory usage, but rather an up and down (with larger or smaller images). For about 7-8h I thought it was quite stable. But when SBS got killed, jive had grown to about 50MB. There obviously is something wrong with it.

It eventually died when it was supposed to resize a 6MP image. Famous last words (more to be found in the attached log file):

Mar 22 19:11:35 squeezeplay: DEBUG  applet.ImageViewer - ImageViewerApplet.lua:550 image rendering
Mar 22 19:11:35 squeezeplay: INFO   applet.ImageViewer - ImageSourceLocalStorage.lua:141 Next image in queue: /media/sda1/images/gegenlicht.jpg
Mar 22 19:11:39 kernel: jive invoked oom-killer: gfp_mask=0x1201d2, order=0, oomkilladj=0
Mar 22 19:11:39 kernel: [<c02f6bfc>]
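
When comparing several crash logs, it helps to pull the fields of the "invoked oom-killer" line out mechanically. The sketch below assumes the line format shown above (the oomkilladj field is specific to kernels of this era); the function name and the derived pages_requested key are illustrative. An order of 0 means the failed allocation was for a single page (2^order pages).

```python
import re

# Matches the kernel line format seen in this bug's crash logs.
OOM_RE = re.compile(
    r"(?P<process>\S+) invoked oom-killer: "
    r"gfp_mask=(?P<gfp_mask>0x[0-9a-fA-F]+), "
    r"order=(?P<order>\d+), "
    r"oomkilladj=(?P<oomkilladj>-?\d+)"
)


def parse_oom_line(line):
    """Return the fields of an oom-killer log line as a dict, or None if no match."""
    m = OOM_RE.search(line)
    if m is None:
        return None
    order = int(m.group("order"))
    return {
        "process": m.group("process"),
        "gfp_mask": int(m.group("gfp_mask"), 16),
        "order": order,
        "pages_requested": 1 << order,  # the allocation size is 2**order pages
        "oomkilladj": int(m.group("oomkilladj")),
    }
```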
Comment 13 Vahid Fereydouny 2010-03-22 14:57:40 UTC
Here is the path to the watchdog source code:
7.5/trunk/squeezeos/poky/meta-squeezeos/packages/watchdog
which gets installed to:
7.5/trunk/squeezeos/poky/build/tmp-fab4/work/armv6-none-linux-gnueabi/watchdog-5.6-r5/watchdog-5.6
There is a watchdog configuration file under /etc that indicates the behavior of the watchdog at run-time.
Here is the path to the watchdog config file in the source tree.
7.5/trunk/squeezeos/poky/meta-squeezeos/packages/base-files/files/watchdog.conf
Comment 14 Vahid Fereydouny 2010-03-22 15:24:12 UTC
What are these errors in the logs:

Mar 18 10:11:54 squeezeplay: audio_thread_execute:908 xrun (snd_pcm_mmap_commit) err=-32
Mar 18 10:11:54 squeezeplay: audio_thread_execute:798 xrun (snd_pcm_wait)
Mar 18 10:11:54 squeezeplay: audio_thread_execute:800 PCM wait failed: Text file busy
Mar 18 10:11:54 squeezeplay: audio_thread_execute:752 underrun!!! (at least 10.478 ms long)
Mar 18 10:11:54 squeezeplay: audio_thread_execute:798 xrun (snd_pcm_wait)
Mar 18 10:11:54 squeezeplay: audio_thread_execute:752 underrun!!! (at least 6812718.075 ms long)
Comment 15 Alan Young 2010-03-22 23:54:38 UTC
They are related to the use of ALSA. I do not think that they are significant when they occur at the start of playback after a period of not playing.
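
For reference, err=-32 is -EPIPE on Linux, which is the code ALSA uses to report an xrun, and the 6812718.075 ms figure works out to roughly 1.9 hours, which fits an idle gap before playback resumed. To separate such idle-gap entries from short underruns during playback, the durations can be extracted from the log. This is a sketch only: the regular expression follows the squeezeplay lines quoted in comment 14, and the 1000 ms threshold is an arbitrary assumption.

```python
import re

# Matches e.g. "underrun!!! (at least 10.478 ms long)" from the squeezeplay log.
UNDERRUN_RE = re.compile(r"underrun!!! \(at least (\d+(?:\.\d+)?) ms long\)")


def underrun_durations_ms(log_text):
    """Return every reported underrun duration in milliseconds, in log order."""
    return [float(m.group(1)) for m in UNDERRUN_RE.finditer(log_text)]


def split_by_threshold(durations_ms, threshold_ms=1000.0):
    """Split durations into (short, long) lists around an assumed threshold."""
    short = [d for d in durations_ms if d < threshold_ms]
    long_ = [d for d in durations_ms if d >= threshold_ms]
    return short, long_
```

Short underruns that recur in the middle of playback would be more worth investigating than a single very long one at the start.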
Comment 16 Mickey Gee 2010-03-24 10:21:13 UTC
Have 3 Fab4/TinySC combinations running with r8661. Each has 4 IP3K players.

- Hitachi Simpledrive Mini 250GB drive FAT32
- Seagate Expansion 2TB NTFS
- Hitachi XL10000 1TB NTFS

All are running fine after 43 hours. Original bug happened with WD Elements 1.5 TB drive. 

Will stop test on 250GB and attach WD Elements drive for retesting.
Comment 17 Chris Owens 2010-03-25 09:27:30 UTC
Mickey can no longer repro. Please reopen if anyone is still seeing an issue!
Comment 18 Vahid Fereydouny 2010-03-30 14:12:49 UTC
We should review our policy for handling out-of-memory situations. I am not sure that having the kernel kill an application is the best approach.