Bug 11555 - Inconsistent error messages when network fails during SN setup
Status: CLOSED FIXED
Product: SB Touch
Classification: Unclassified
Component: Setup
Version: unspecified
Hardware: PC Other
Importance: -- normal
Target Milestone: CAT
Assigned To: Weldon Matt
Depends on:
Blocks:
Reported: 2009-03-30 12:34 UTC by Dan Evans
Modified: 2009-10-06 09:22 UTC (History)

See Also:
Category: ---


Attachments
Log file of several failed attempts to recover from lost connection (861.79 KB, text/plain)
2009-04-06 11:23 UTC, Dan Evans
Another log, this one of "System Error" message (938.17 KB, text/plain)
2009-04-07 11:11 UTC, Dan Evans

Description Dan Evans 2009-03-30 12:34:43 UTC
I was testing weak WiFi signal cases.  I took a Fab4 to the very edge of a wireless network so it was successfully connecting only 50% of the time.

Then during the SN account setup process, I kept getting error messages about the connection, but... these errors were not consistent.  More importantly, there was one error that is not in Matt Weldon's Setup Map.

The errors I got were:

 1. "Can't Connect -- It appears your internet connection is not working. Please check your network to make sure it is connected and running properly. Try connecting again > " 

 2. "Can't Connect -- A firewall or other mechanism seems to be blocking your connection. Please contact your router manufacturer or network admin for further assistance, or visit www.mysqueezebox.com/support.  Try connecting again > "

These correspond to errors 10K and 10L in the setup map.  But I also got:

 3. "Can't Connect -- An system error has happened. Please try again later.  Try connecting again >" 

This last one not only isn't in our setup map, but it's very, very vague.  What does it mean?  And/or what failure catch is triggering it?

FYI, I counted the frequency of the above errors.  Out of 20 tries the errors appeared with this spread:

Error 1: 7
Error 2: 2
Error 3: 11

So, whatever that error is, it's coming up the majority of the time.
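
For reference, the shares implied by these counts can be checked with a short sketch. Python is used only for illustration; the variable names are made up and nothing here comes from the Fab4 code.

```python
# Counts copied from the report above (20 attempts in total).
counts = {"Error 1": 7, "Error 2": 2, "Error 3": 11}

total = sum(counts.values())

# Share of each error as a percentage of all attempts.
shares = {name: 100 * n / total for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct:.0f}%")

# The most frequent error is the one missing from the setup map.
most_common = max(shares, key=shares.get)
print("Most frequent:", most_common)
```

Error 3 accounts for 11 of the 20 attempts, which is just over half.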

We need this error code examined and corrected, and/or we need that third error rewritten to describe the failure.
Comment 1 Richard Titmuss 2009-03-30 15:10:45 UTC
That extra error is displayed when SN sends unexpected data; you should never see this. Logs would be useful to understand what failed.

This error should be added to the Setup Map. There should be no other errors that are not on the map; that's the only extra one I had to add.
Comment 2 Blackketter Dean 2009-03-30 17:01:03 UTC
Andy: Know what this is?

Or is this for you, Richard?
Comment 3 Dan Evans 2009-03-30 17:05:42 UTC
How do we capture logs on Fab4?  Last time I asked I heard it wasn't quite possible yet.  Any update?
Comment 4 Andy Grundman 2009-03-30 17:18:15 UTC
Don't think this is mine.
Comment 5 Richard Titmuss 2009-03-31 03:46:52 UTC
You can capture logs the same way you do on jive (it can sometimes cause fab4 to reboot while logging; this is a known bug).

The third error happens when we don't have a firmware update url or registration status from SN. I don't think the error should be rewritten (the user can take no corrective action, other than trying again), but I need to understand why this error is happening in the condition you describe and see if the code can be modified to return one of the first two errors.
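
A minimal sketch of that check, in Python for illustration only. The field names `firmware_url` and `registration_status` are hypothetical (the actual SN reply format is not shown in this bug), and treating either missing field as a failure is an assumption about how the condition above is meant.

```python
from typing import List, Mapping

# Hypothetical field names; the real SN reply format is not shown in this bug.
REQUIRED_FIELDS = ("firmware_url", "registration_status")


def missing_fields(reply: Mapping[str, object]) -> List[str]:
    """Return the required fields that are absent or None in the reply."""
    return [field for field in REQUIRED_FIELDS if reply.get(field) is None]


def is_system_error(reply: Mapping[str, object]) -> bool:
    """True when the reply would lead to the third ("system error") screen."""
    return bool(missing_fields(reply))


# A reply that arrived but lacks the registration status.
partial_reply = {"firmware_url": "http://example.invalid/fw.bin"}
print(missing_fields(partial_reply))
print(is_system_error(partial_reply))
```

Reporting which field was missing, rather than only that something was missing, would make the logs for this case easier to read.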

This is a difficult situation to accurately report, as the method of failure will keep changing if the connectivity is poor. Do you have examples of what other products do in this situation?
Comment 6 Blackketter Dean 2009-03-31 17:57:22 UTC
Dan: Can you capture some logs and add them to the bug?
Comment 7 Dan Evans 2009-04-06 11:23:14 UTC
Created attachment 5055 [details]
Log file of several failed attempts to recover from lost connection

What brought me to this log was slightly different than before...

I went through a fresh install with ethernet connected, but then pulled the ethernet cable when Fab4 restarted after its fw update.  After restarting, it correctly detected that it'd lost connectivity and I got the error, "Your internet connection appears to be not working".  I reconnected the ethernet cable and pressed "Try Again".

At this point, I repeatedly got the error "A Firewall appears to be blocking the connection", which is wrong.  I pressed "Try Again" 3 times but got the firewall error every time.

I power cycled Fab4 and it connected immediately.
Comment 8 Dan Evans 2009-04-07 11:11:22 UTC
Created attachment 5066 [details]
Another log, this one of "System Error" message

This log was gathered while trying to recreate the conditions of my original post above.  I had trouble recreating those conditions, but I think I got close.

I connected to a distant WiFi-- the connection was weak: 6-8 SNR.  I went through setup and when it first tried to connect to mySqueezebox.com I'd get 1 of 3 error situations...

Out of 10 tries:
 1. 7 times ... "Can't Connect -- It appears your internet connection is not working.  Please..."  In these cases when I checked Diagnostics the WiFi was listed as "not connected".

 2. 2 times ... "Can't Connect -- It appears your internet connection is not working.  Please..."  In these cases when I checked Diagnostics the WiFi was listed as "Connected" but the SqueezeNetwork field said, "DNS Failed".

 3. 1 time  ... "Can't Connect -- An system error has happened. Please..."  In this case when I checked Diagnostics the WiFi was listed as "Connected" but the SqueezeNetwork field was blank.

Log attached.
Comment 9 Richard Titmuss 2009-04-08 05:56:12 UTC
(In reply to comment #7)
> I went through a fresh install with ethernet connected, but then pulled the
> ethernet cable when Fab4 restarted after its fw update.  After restarting, it
> correctly detected that it'd lost connectivity and I got the error, "Your
> internet connection appears to be not working".  I reconnected the ethernet
> cable and pressed "Try Again".

I can't recreate this, but the relevant log entry is:

Jan  5 02:05:43 SqueezeboxController user.info jive: (Comet.lua:622) - Comet {mysqueezebox.com}: _handshake error: fab4.squeezenetwork.com Try again

This is an indication that the DNS lookup failed. If you can recreate this it would be useful to know what information is displayed on the Diagnostics screen. This may be related to bug 11455, where DHCP may take a long time to recover after the ethernet cable is inserted.
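
To pull entries like this out of the attached logs, a small sketch in Python follows. The regular expression is an assumption derived from the single log line quoted above, so other Comet entries may not match it.

```python
import re

# The log entry quoted above, split across lines only for readability.
LOG_LINE = (
    "Jan  5 02:05:43 SqueezeboxController user.info jive: "
    "(Comet.lua:622) - Comet {mysqueezebox.com}: "
    "_handshake error: fab4.squeezenetwork.com Try again"
)

# Pattern inferred from that one line; the field names are illustrative.
HANDSHAKE_ERROR = re.compile(
    r"\((?P<source>[\w.]+):(?P<line>\d+)\) - "
    r"Comet \{(?P<service>[^}]+)\}: "
    r"_handshake error: (?P<host>\S+) (?P<reason>.+)$"
)


def parse_handshake_error(line):
    """Return the parsed fields as a dict, or None if the line does not match."""
    match = HANDSHAKE_ERROR.search(line)
    return match.groupdict() if match else None


parsed = parse_handshake_error(LOG_LINE)
print(parsed["host"], "->", parsed["reason"])
```

Counting matches per `reason` across a whole log would show how often the handshake failed at the lookup step compared with other steps.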
Comment 10 Richard Titmuss 2009-04-08 08:28:57 UTC
(In reply to comment #8)
> I connected to a distant WiFi-- the connection was weak: 6-8 SNR.  I went
> through setup and when it first tried to connect to mySqueezebox.com I'd get 1
> of 3 error situations...

The problem here is that at the edge of the wireless network, different connection problems trigger different error screens (DNS failure, connection failure, failure to communicate after connection).

My best suggestion would be to add a new error screen that is shown if the SNR is low.

Dean, Dan, Matt, comments?
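
Purely to illustrate the suggestion, a sketch in Python follows. The threshold value and the screen label are hypothetical; no cutoff is proposed in this bug, and the only data point is the 6-8 SNR reported in comment 8.

```python
# Hypothetical cutoff; this bug does not propose an actual value.
LOW_SNR_THRESHOLD = 10


def is_low_snr(snr, threshold=LOW_SNR_THRESHOLD):
    """True when the measured SNR is below the (assumed) cutoff."""
    return snr < threshold


def choose_screen(snr, default_screen):
    """Prefer a weak-signal screen over whichever screen the failure would show."""
    if is_low_snr(snr):
        return "weak wireless signal"  # hypothetical new screen
    return default_screen


# With the 6-8 SNR from comment 8, the weak-signal screen would win.
print(choose_screen(6, "system error"))
print(choose_screen(30, "system error"))
```

Choosing the cutoff is the hard part: too high and users with a usable link see the wrong screen, too low and the new screen never appears.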
Comment 11 Blackketter Dean 2009-04-08 08:32:59 UTC
Are we _sure_ we're putting up the right message at all times? i.e. If the WLAN is disconnected or has been disconnected, we should always put up the right error message.

I'm nervous about adding a new screen for a new class of failure modes based on signal strength without a significant amount of testing...
Comment 12 Richard Titmuss 2009-04-08 08:37:12 UTC
Yes, I am sure. At this stage in setup specific error screens are used:

Can't resolve fab4.squeezenetwork.com: "Can't Connect -- It appears your internet connection is not working.  Please..."

Can't connect to fab4.squeezenetwork.com: "Can't Connect -- It appears your internet connection is not working.  Please..."

Connected but comet exchange failed: "Can't Connect -- An system error has happened. Please..."

The problem is these tests don't include a link-level test, and if you're at the edge of the wireless range some things will work sometimes. So other than using the SNR to verify the wireless, we really won't be able to distinguish this error case.
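
The mapping above can be sketched as follows, in Python for illustration only. The stage names and short screen labels are made up for the example; the real logic lives in the jive setup code and is not shown in this bug.

```python
from enum import Enum


class Stage(Enum):
    """Connection stages in the order they are attempted (names are illustrative)."""
    RESOLVE = "resolve fab4.squeezenetwork.com"
    CONNECT = "connect to fab4.squeezenetwork.com"
    COMET = "comet exchange"


# Short labels standing in for the full screen text quoted above.
SCREEN_FOR_FAILED_STAGE = {
    Stage.RESOLVE: "internet connection is not working",
    Stage.CONNECT: "internet connection is not working",
    Stage.COMET: "system error",
}


def screen_for(outcomes):
    """Return the screen for the first failed stage, or None if all succeeded.

    `outcomes` maps every Stage to True (succeeded) or False (failed).
    """
    for stage in Stage:  # Enum iteration follows definition order
        if not outcomes[stage]:
            return SCREEN_FOR_FAILED_STAGE[stage]
    return None


# On a marginal link the first stage to fail varies from try to try.
dns_failed = {Stage.RESOLVE: False, Stage.CONNECT: False, Stage.COMET: False}
comet_failed = {Stage.RESOLVE: True, Stage.CONNECT: True, Stage.COMET: False}
print(screen_for(dns_failed))
print(screen_for(comet_failed))
```

Because the screen depends only on which stage fails first, the same weak link can produce either screen on consecutive attempts.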
Comment 13 Anoop Mehta 2009-04-09 13:45:25 UTC
So right now with firmware r5265 I entered my wireless encryption, the FAB4 went to the "Connecting to mysqueezebox.com" screen, then the FAB4 went to "Can't Connect - An System error has happened. Please try again later".


1. Shouldn't this read "A system error has happened" instead of "AN system error has happened"?

2. I agree that this error screen is way too bland. If a customer were to run into this error, they would be completely lost. IMO there must be more detail in this error message.
Comment 14 Blackketter Dean 2009-04-09 13:49:41 UTC
At a minimum, the text should be improved.  But should the comet error happen this frequently?
Comment 15 Ben Klaas 2009-04-13 11:18:48 UTC
r5292 fixes the typo.

What else am I supposed to be doing with this bug? Richard's last comment states fairly unequivocally that what we are delivering is the best resolution we can provide for the error.

If we can't split the error into different ones, does this bug become Weldon's (followed by SLT) for better text there?
Comment 16 Ben Klaas 2009-04-13 11:20:09 UTC
And on reflection, this bug is about problems during SN account setup. Is this only about that? If so, then we can probably push this off the MP bug list.
Comment 17 Ben Klaas 2009-04-13 12:43:39 UTC
assigning to Dean for re-assign and possible re-target
Comment 18 Blackketter Dean 2009-04-14 09:00:20 UTC
If this bug happens after the update then it's a post-MP bug.
Comment 19 Weldon Matt 2009-04-14 12:42:25 UTC
Unless I'm missing something, the typo has been fixed and we're showing the best/most appropriate copy possible.
Comment 20 James Richardson 2009-10-06 09:22:29 UTC
This bug has been fixed in the latest release of MySqueezebox.com (formerly
known as SqueezeNetwork)!

If you are still experiencing this problem, feel free to reopen the bug with
your new comments and we'll have another look.