Bugzilla – Bug 18036
Cometd client manager fails to purge dead clients (e.g. iPad/Android apps) causing memory leakage
Last modified: 2013-05-02 13:50:52 UTC
Created attachment 7698: Proposed patch
More discussion on forum: http://forums.slimdevices.com/showthread.php?94559-LMS-7-7-1-leaky/page6
Using the iPad or Android apps causes LMS to leak memory, because these clients cannot be guaranteed to disconnect from the server gracefully. The server continues to accumulate events for these inactive, or simply dead, clients, resulting in ever-increasing memory usage that can only be reclaimed by restarting the server. The apps in question use 'long-polling'. The attached patch seeks to resolve this problem by implementing an 'autokill' timer that forcibly disconnects clients that have not polled the server within a 'reasonable' period. The autokill period has been set to 3 minutes. This is intended to be long enough not to trigger on short network outages or disruptions, but not so long that an unnecessarily large queue of messages builds up. Queued messages will likely be mostly historical and of little interest. This parameter may require tuning to reflect the real needs of the iPad and Android apps.
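For readers who have not opened the attachment, the 'autokill' idea can be sketched roughly as below. This is not the patch itself (LMS is written in Perl); it is a minimal Python illustration that assumes a periodic sweep rather than whatever timer mechanism the patch actually uses. The names `ClientManager`, `poll`, `queue_event` and `reap` are hypothetical; only the 3 minute period comes from the description above.

```python
import time
from collections import deque

AUTOKILL_SECONDS = 180  # 3 minutes, the period quoted in the report


class ClientManager:
    """Hypothetical stand-in for a Cometd client manager (not the LMS code)."""

    def __init__(self, timeout=AUTOKILL_SECONDS, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock      # injectable so the sweep can be tested
        self.last_poll = {}     # client id -> time of most recent poll
        self.pending = {}       # client id -> events queued for delivery

    def poll(self, client_id):
        """Record a poll, then return and clear the client's queued events."""
        self.last_poll[client_id] = self.clock()
        queue = self.pending.setdefault(client_id, deque())
        events = list(queue)
        queue.clear()
        return events

    def queue_event(self, client_id, event):
        """Queue an event, but only for clients that are still registered."""
        if client_id in self.pending:
            self.pending[client_id].append(event)

    def reap(self):
        """Forget every client that has not polled within the timeout."""
        now = self.clock()
        dead = [cid for cid, seen in self.last_poll.items()
                if now - seen > self.timeout]
        for cid in dead:
            del self.last_poll[cid]
            self.pending.pop(cid, None)  # frees the accumulated events
        return dead
```

Calling `reap()` from a recurring timer bounds the queue of a dead client to at most one timeout period's worth of events, which is the trade-off the 3 minute figure is meant to balance.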
Thanks a lot! This issue has been bugging me for years... Patch applied to 7.7+
(In reply to comment #1) Glad I could help. I'm keeping my fingers crossed that this is, indeed, 'it'. I've noticed a 'leakage' problem for a couple of years or so, but I can't remember when I first started to use the Android app...
It's probably not _the_ solution to all memory issues, but it definitely is a big improvement (if I can say so after 3-4 days of running that code). How did you figure out the details of the problem, btw?
Serendipity played its part. I made a number of dissect-and-isolate attempts, following some of the leads in the thread, e.g. starting up the gdresized daemon to isolate image resizing, cutting off artwork requests with a 404 return, disabling gzip, etc., leaving a number of hours between each one. Got nowhere, decided to give up. Then I noticed 2MB of events flying over to the iPad when looking at network traffic in connection with another project. So I decided to follow it up, despite having officially given up. The debug output for Cometd is too voluminous to show what needed to be seen, but a few trace lines eventually showed me that every iPad socket teardown was 'ignored', which led to the answer. Big improvement? Oh yes! I'm fully expecting the system to remain up for some weeks before requiring attention, if not months. The more I think about it, the more convinced I become that the leakage started when I began using the Android app occasionally. I have no swap file set up on my Sheevaplug, so I know very quickly when memory has filled up. Actually, I reboot the server from a cron job at 4 in the morning if memory usage is excessive. I was going to make a simple plugin to do it, but perhaps it won't be needed any more. (Something like ps -o vsz $$ from Perl would do.) Touch wood!
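The memory check mentioned in passing could look something like the sketch below. It is shown in Python rather than Perl purely for illustration, and it is an assumption about how such a watchdog might be written, not an existing plugin. The helper names and the idea of a configurable limit are hypothetical; the `ps -o vsz` query is the one named above, which reports virtual size in 1024-byte units.

```python
import os
import subprocess


def parse_vsz(ps_output):
    """Return the VSZ figure (KiB) from `ps -o vsz` output, header or not."""
    numbers = [tok for tok in ps_output.split() if tok.isdigit()]
    if not numbers:
        raise ValueError("no numeric VSZ value in ps output")
    return int(numbers[0])


def process_vsz_kib(pid=None):
    """Ask ps for the virtual size of a process (default: this one)."""
    pid = os.getpid() if pid is None else pid
    result = subprocess.run(
        ["ps", "-o", "vsz=", "-p", str(pid)],  # "vsz=" suppresses the header
        capture_output=True, text=True, check=True,
    )
    return parse_vsz(result.stdout)


def should_restart(vsz_kib, limit_kib):
    """True when the measured size exceeds the (hypothetical) limit."""
    return vsz_kib > limit_kib
```

A cron job or plugin would call `process_vsz_kib()` with the server's PID and restart only when `should_restart()` returns true; a suitable limit depends on the machine and is left to the reader.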