Bug 15522 - Firmware upgrade process sometimes hangs at 99% installation
: Firmware upgrade process sometimes hangs at 99% installation
Status: CLOSED FIXED
Product: SqueezePlay
Classification: Unclassified
Component: --
: 7.5.x
: PC Other
: P1 normal with 1 vote (vote)
: 7.5.0
Assigned To: Michael Herger
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2010-01-23 14:52 UTC by Sébastien Phélep
Modified: 2010-04-08 17:26 UTC (History)
3 users (show)

See Also:
Category: ---


Attachments
/var/log/messages (159.11 KB, text/plain)
2010-01-25 11:24 UTC, Erland Isaksson
Details
use kill() instead of shelling out to run /etc/init.d/squeezecenter stop (3.08 KB, patch)
2010-01-27 01:41 UTC, Michael Herger
Details | Diff
new log after recent changes (50.27 KB, text/plain)
2010-01-31 23:38 UTC, Michael Herger
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Sébastien Phélep 2010-01-23 14:52:22 UTC
Sometimes, when upgrading firmware on the Touch or the Radio, the installation process hangs at 99%, and a manual power cycle is needed.

It looks like many beta testers had the problem with today's firmware release (7.5.0 r8394), although some had the problem with both their Touch and Radio, whereas others (like me) only encountered the issue with the Touch, and not with the Radio.
Comment 1 Jim McAtee 2010-01-23 18:44:41 UTC
I've only seen this on the most recent two Touch fw updates to r8391 and r8395, but never prior to that.  I have a USB drive attached and TinySC is running.  The update then works after rebooting the Touch via the reset button.  Other beta testers have been reporting the same problem for many months.
Comment 2 Sébastien Phélep 2010-01-24 00:48:17 UTC
This morning update is stuck at 99% DOWNLOADING.
I also seen that previously, and I realize that's what Ian was really complaining about.

So, there looks to be something wrong in both the download and installation "progress loops".

The Touch is still accessible with SSH, but I don't know where/what to look for...
Comment 3 Michael Herger 2010-01-24 22:45:00 UTC
Yep, I've seen the stuck at 99% downloading too. This might be related to bug 15245.

Next time this happens please log in using SSH (if possible) and

- check whether TinySC is still running (slimserver.pl)
- check whether Samba is still running (if you started it: smbd, nmbd)
- grab a copy of the /var/log/messages file and attach it to this bug

Thanks!
Comment 4 Erland Isaksson 2010-01-25 11:24:16 UTC
Created attachment 6461 [details]
/var/log/messages

Added /var/log/messages file when stuck at 99%

Looked at processed when it had started and:
- No slimserver.pl process running
- Two /usr/sbin/smbd processes
- One /usr/sbin/nmbd process

Also looked at memory usage with "free" command:
- Around 10% progress: 30MB free memory
- Around 90% progress: 27MB free memory
- Around 99% progress: 30MB free memory

Also looked at CPU usage with "top" and the jive process used between 70-80% CPU all the time. Is this really normal when nothing is playing ?

The memory usage with "top", the VSZ column showed 77928 for the jive process, which also is the process that takes up most memory. I'm using my Album Flow screen saver when playing music, so it probably fills the in memory artwork cache to its maximum size, this is probably one explanation to the memory usage.

It's easy to see when it's about to get stuck because the firmware upgrade runs a lot slower than when it succeeds.
Comment 5 Michael Herger 2010-01-26 11:35:53 UTC
Thanks erland - seeing exactly the same error filling the log.
Comment 6 Michael Herger 2010-01-27 00:07:37 UTC
Looking at the logs and the code it seems that the firmware downloader is trying to write to some uninitialized file handle. We still have to figure out why it's not open. I've added the following lines to my fab4. Let's see tonight whether this will give me some hint:

Index: src/squeezeplay_squeezeos/share/applets/SetupFirmwareUpgrade/UpgradeUBI.lua
===================================================================
--- src/squeezeplay_squeezeos/share/applets/SetupFirmwareUpgrade/UpgradeUBI.lua	(revision 8408)
+++ src/squeezeplay_squeezeos/share/applets/SetupFirmwareUpgrade/UpgradeUBI.lua	(working copy)
@@ -303,6 +303,11 @@
 			self.downloadBytes = self.downloadBytes + #chunk
 
 			if _action == "store" then
+				if not _fhsink then
+					log:error("_fhsink not defined")
+					return nil, "Something went wrong... _fhsink not defined"
+				end
+			
 				-- write content to fhsink
 				local t, err = _fhsink(chunk)
 				if not t then


Other than that I wonder why the services aren't closed. Could it be fab4 is running out of memory when trying to shell out to stop the services?
Comment 7 Michael Herger 2010-01-27 01:41:16 UTC
Created attachment 6470 [details]
use kill() instead of shelling out to run /etc/init.d/squeezecenter stop

Alan was kind enough to add the kill() function to squeezeos. If memory was an issue, not allowing to shell out using os.execute() we can try harder to kill the SBS server to free the memory
Comment 8 Michael Herger 2010-01-27 01:42:05 UTC
Please note that this patch is untested, as kill() isn't in the latest firmware yet
Comment 9 Michael Herger 2010-01-27 02:32:01 UTC
Comment on attachment 6470 [details]
use kill() instead of shelling out to run /etc/init.d/squeezecenter stop

tested and checked in as change 8410 to fix bug 15245
Comment 10 SVN Bot 2010-01-28 09:42:16 UTC
 == Auto-comment from SVN commit #8413 to the jive repo by michael ==
 == https://svn.slimdevices.com/jive?view=revision&revision=8413 ==

Bug: 15522
Description: don't pretend to be downloading when download has failed miserably. Return with an error message.
Comment 11 Michael Herger 2010-01-31 23:38:51 UTC
Created attachment 6476 [details]
new log after recent changes

New behaviour now: the unsuccessful download now progresses in normal speed, but still fails. There's evidence in the log that something fails lack of memory. The change I applied makes this return quicker than before, but still doesn't make it fail immediately and return. Room for improvement there.

I'll try to add some timeout to wait for SBS to be stopped before the update begins. It looks as if it was told to stop, but still running, only closing during the fake download process.
Comment 12 Michael Herger 2010-01-31 23:39:58 UTC
Ben - please let me know if you have an idea on how to back out immediately in case the "_fshandle not open" error occurs.
Comment 13 SVN Bot 2010-02-01 01:44:53 UTC
 == Auto-comment from SVN commit #8433 to the jive repo by michael ==
 == https://svn.slimdevices.com/jive?view=revision&revision=8433 ==

Bug: 15522
Description: wait up to 10 seconds for SBS to be stopped before continuing the firmware download. Always use kill() instead of os.execute() to stop SBS.

We still have to better handle out of memory issues in other cases (as seen on Radio too)
Comment 14 SVN Bot 2010-02-01 03:30:47 UTC
 == Auto-comment from SVN commit #8434 to the jive repo by michael ==
 == https://svn.slimdevices.com/jive?view=revision&revision=8434 ==

Bug: 15522
Description: use squeezeos.kill() instead of os.execute() to stop Samba.
Comment 15 SVN Bot 2010-02-01 06:19:06 UTC
 == Auto-comment from SVN commit #8438 to the jive repo by michael ==
 == https://svn.slimdevices.com/jive?view=revision&revision=8438 ==

Bug: 15522
Description: return immediately to "Try again" page if error occurs.

Ben - should we have a "restart SB and try again" option in this case? Repeated trial won't help if we run out of memory.
Comment 16 Michael Herger 2010-02-04 04:53:08 UTC
Ok, second successful firmware update in a row. Please re-open if you still see issues updating _from_ firmware 8438 or later. Thanks!
Comment 17 Chris Owens 2010-04-08 17:26:38 UTC
This bug has been marked fixed in a released version of Squeezebox Server or the accompanying firmware or mysqueezebox.com release.

If you are still seeing this issue, please let us know!