Bug 277 - Resume from pause broken
: Resume from pause broken
Status: CLOSED FIXED
Product: SB 1
Classification: Unclassified
Component: Firmware
: 21
: PC Windows XP
: P2 major (vote)
: Future
Assigned To: Sean Adams
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2004-04-27 02:10 UTC by Adrian Hall
Modified: 2009-09-08 09:24 UTC (History)
3 users (show)

See Also:
Category: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Adrian Hall 2004-04-27 02:10:40 UTC
For me resuming from a pause of over about 25 seconds works for a further 4-5 
seconds then stutters, jumps back to a couple of seconds before pause started 
then stops (audio & counter).  Track can be restarted from the beginning by 
pressing play or rewind.

Wires-only DB, firmware 21, server 5.1.4 with >192kbit stereo mp3, not 
tried any other file format (no interest).

Server 2GHz Celery, 1GB RAM, XP Pro SP1.  Server and SB cabled to the same 
router (Buffalo G54 Wireless).
Comment 1 Adrian Hall 2004-04-27 03:42:48 UTC
*** Bug 278 has been marked as a duplicate of this bug. ***
Comment 2 wd 2004-04-30 15:19:23 UTC
Occured under following combinations (others not tested): Version
5.1.1 with firmware 8, Version 5.1.2 (nightly builds 26.3., 2.4.) with firmware 
10 and 20, 5.1.3 with firmware 21, 5.1.5 with firmware 21

WinXP + SP1, Athlon XP 2600+ 512MB; Epia 6000 + 512MB

mp3 / 56kbit Stereo / Sampling @32kHz
Pause for 30 secs is ok, but pausing for 45 secs not

mp3 / 32kbit Stereo / 32kHz: 55 secs is ok, 60 secs not, buffer empty 
within 58 secs.

see comment for bug #32
Comment 3 Blackketter Dean 2004-05-06 15:12:16 UTC
Confirmed by Rich Siegel, this is still happening in firmware 23.
Comment 4 Blackketter Dean 2004-06-07 12:26:17 UTC
*** Bug 335 has been marked as a duplicate of this bug. ***
Comment 5 Bill Fenner 2004-06-15 22:20:23 UTC
I did some debugging on this, since I was able to replicate it occasionally.  One potential problem is 
that the slim TCP stack is not replying to window probes with ACKs, causing the server TCP stack to 
think that the slim device is dead:

Last packets flowing:
22:09:17.755113 squeezebox.34470 > slimserver.9000: . ack 998741 win 1416
22:09:22.753548 slimserver.9000 > squeezebox.34470: . 998741:1000157(1416) ack 301 win 58400 
(DF)
22:09:22.859484 squeezebox.34470 > slimserver.9000: . ack 1000157 win 0

Window probes start:
22:09:27.856150 slimserver.9000 > squeezebox.34470: . 1000157:1000158(1) ack 301 win 58400 
(DF)
22:09:32.858744 slimserver.9000 > squeezebox.34470: . 1000157:1000158(1) ack 301 win 58400 
(DF)
22:09:37.861319 slimserver.9000 > squeezebox.34470: . 1000157:1000158(1) ack 301 win 58400 
(DF)
22:09:42.863885 slimserver.9000 > squeezebox.34470: . 1000157:1000158(1) ack 301 win 58400 
(DF)

note no ACKs from the squeezebox, so the dead timers on the slimserver's tcp stack are running.  
Finally, the server gives up - first packet is a window probe, second is a RST:

22:17:16.894214 slimserver.9000 > squeezebox.34470: . 1000157:1000158(1) ack 301 win 58400 
(DF)
22:17:55.513794 slimserver.9000 > squeezebox.34470: R 1000158:1000158(0) ack 301 win 58400 
(DF)
22:17:55.617780 squeezebox.34470 > slimserver.9000: . ack 1000157 win 0
22:17:55.617831 slimserver.9000 > squeezebox.34470: R 98469381:98469381(0) win 0

When I unpause, the squeezebox now re-establishes the connection; however, when I was able to 
replicate this problem the squeezebox did not re-establish the connection.

I wasn't able to determine why the squeezebox did not re-establish the connection when the resume 
from pause failed; however, if the squeezebox responded to window probes with the proper ACK, the 
connection wouldn't need to die in the first place, possibly avoiding this problem altogether.
Comment 6 Bill Fenner 2004-06-18 14:37:04 UTC
BTW, I see that I forgot to mention the environment under which I was replicating the pause problem.  
FreeBSD with slimserver from CVS, and I have tcpdump's of it happening with firmware 23; with 
firmware 24 I see the reset occur but then the slim successfully re-establishes the connection.  I can't 
be sure that there was any true difference in the behavior because of the firmware version, since I am 
not able to reliably replicate this, just occasionally.  Neither 23 nor 24 acks window probes, though, so 
they are definitely misbehaving.
Comment 7 Bill Fenner 2004-06-22 07:38:43 UTC
I still see this with firmware 26. The squeezebox window goes to zero, the slimserver starts sending 
window probes and getting no ACKs so resets the connection, and resuming from pause fails after the 
connection is reset.

Player Firmware Version: 26

The IP address for this player is: 135.197.17.55:27173

I hit pause around here, so the squeeze window goes to zero:

07:32:50.453345 slimserver.9000 > squeezebox.27397: . 1194927:1196387(1460) ack 301 win 
58400 (DF)
07:32:50.475230 squeezebox.27397 > slimserver.9000: . ack 1196387 win 566
07:32:50.495370 squeezebox.27397 > slimserver.9000: . ack 1196387 win 932
07:32:55.474482 slimserver.9000 > squeezebox.27397: . 1196387:1197319(932) ack 301 win 58400 
(DF)
07:32:55.498611 squeezebox.27397 > slimserver.9000: . ack 1197319 win 0

server starts sending probes and getting no responses:

07:33:00.494364 slimserver.9000 > squeezebox.27397: . 1197319:1197320(1) ack 301 win 58400 
(DF)
07:33:05.494267 slimserver.9000 > squeezebox.27397: . 1197319:1197320(1) ack 301 win 58400 
(DF)
...
07:41:12.644951 slimserver.9000 > squeezebox.27397: . 1197319:1197320(1) ack 301 win 58400 
(DF)
07:41:21.804769 slimserver.9000 > squeezebox.27397: . 1197319:1197320(1) ack 301 win 58400 
(DF)
07:41:30.964609 slimserver.9000 > squeezebox.27397: R 1197320:1197320(0) ack 301 win 58400 
(DF)
07:41:30.986705 squeezebox.27397 > slimserver.9000: . ack 1197319 win 0
07:41:30.986750 slimserver.9000 > squeezebox.27397: R 2464605313:2464605313(0) win 0

server gives up; unpause results in buffer emptying and no more data flowing.

I'd suggest looking at the other side of this bug too - namely, why doesn't the squeeze reconnect to the 
stream URL after the connection is reset?  If you fix either half, then you should be able to ship without 
the pause bug.  (Of course, it'd be ideal to fix both bugs, in case there are other reasons that the 
connection gets closed unexpectedly!)
Comment 8 Sean Adams 2004-06-25 14:36:14 UTC
Bill, thanks very much for the packet traces. Very helpful.

I confirmed the behavior on Windows XP, and found a bug in our TCP stack in how 
it handles out-of-window data (in this case one byte of data past the right 
edge). Fixed it to do the correct calculations, and now it reaches the proper 
(existing) case which handles it as a window probe (sends ack immediately). 

Unpause is working fine on XP for me now.
Comment 9 Vidur Apparao 2004-06-25 15:44:15 UTC
Sean, was this fixed in firmware 27?
Comment 10 Robert Wallace 2004-06-25 20:51:53 UTC
bug still present for me on WinXP in 5.2 with firmware 27.
Comment 11 Blackketter Dean 2004-06-25 21:26:10 UTC
Sorry, the fix went into version 28, which will be available in tonight's nightly, it didn't make it into 5.2.  
Comment 12 Bill Fenner 2004-06-26 11:14:04 UTC
Well done - the ACKs come now, and resuming from a long pause works peachy.

10:42:18.830547 10.42.77.100.9000 > 10.42.77.80.37635: . 1002271:1002272(1) ack 301 win 58400 
(DF)
10:42:18.856549 10.42.77.80.37635 > 10.42.77.100.9000: . ack 1002271 win 0
10:42:23.830462 10.42.77.100.9000 > 10.42.77.80.37635: . 1002271:1002272(1) ack 301 win 58400 
(DF)
10:42:23.856299 10.42.77.80.37635 > 10.42.77.100.9000: . ack 1002271 win 0
10:42:28.830378 10.42.77.100.9000 > 10.42.77.80.37635: . 1002271:1002272(1) ack 301 win 58400 
(DF)
10:42:28.856041 10.42.77.80.37635 > 10.42.77.100.9000: . ack 1002271 win 0
10:42:33.830284 10.42.77.100.9000 > 10.42.77.80.37635: . 1002271:1002272(1) ack 301 win 58400 
(DF)
10:42:33.861561 10.42.77.80.37635 > 10.42.77.100.9000: . ack 1002271 win 0

I do think that there's still a second bug here (that the squeezebox doesn't establish a new connection 
to the server when the old one went away), and actually this fix makes the other one harder to invoke 
but certain network problems (e.g., having the wireless network go away for just the right amount of 
time) would invoke it.  However, it's a low enough probability that it's worth just keeping in mind when 
diagnosing network problems but may not be worth trying to track down without a specific trigger 
mechanism.
Comment 13 Adrian Hall 2004-06-27 00:50:28 UTC
Indications are on XP with 5.2 and firmware 28 the pause is fixed.  Volume 
up/down seems a little more erratic but otherwise ok.
Comment 14 Blackketter Dean 2004-06-27 09:30:31 UTC
resume from paused fixed.  
Bill:  please file another bug for the reconnect issue you described.
Comment 15 James Richardson 2008-12-15 13:05:09 UTC
This bug appears to have been fixed in the latest release!

If you are still experiencing this problem, feel free to reopen the bug with your new comments and we'll have another look.

Make sure to include the version number of the software you are seeing the error with.