Bug 8783 - SB3 appears to open new connection to SC every time button is pressed on IR remote
: SB3 appears to open new connection to SC every time button is pressed on IR r...
Status: RESOLVED INVALID
Product: SB 2/3
Classification: Unclassified
Component: Slimproto
: unspecified
: PC Other
: -- minor (vote)
: ---
Assigned To: Squeezebox QA Team email alias
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2008-07-17 18:46 UTC by Nick Orlov
Modified: 2009-09-08 09:16 UTC (History)
3 users (show)

See Also:
Category: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Nick Orlov 2008-07-17 18:46:15 UTC
Which does cause the problem when firewall is set in a way so it limits max number of new connections per min for a given src-ip/dst-ip/dst-port triplet.

Once I cranked this limit up to 120/min problem went away. I think back then SB3 had firmware ~88. Can retest with latest firmware if you would want me too.

I'd say SB3 should keep connection to SC alive for 20 sec of so (at least).

Decided to log it in BZ since other people can be affected by this.
Comment 1 Nick Orlov 2008-07-17 19:15:41 UTC
Let me clarify - this is especially noticeable when one uses FFWD/REW buttons to position precisely withing a long track or jump to prev/next track. So might be related to SB3 reading next chunk of mp3/flac data...
Comment 2 Chris Owens 2008-07-21 09:50:39 UTC
We're interested in what kind of firewall this is.  I'll target this for future and see if others complain about it.  We haven't seen this.

Andy notes he thinks the re-opening of the connection during skipping is normal.  Is it really a new connection for every button?
Comment 3 Nick Orlov 2008-07-21 17:13:56 UTC
> We're interested in what kind of firewall this is. 

standard Linux functionality. Here is corresponding iptables entry:

-A external_tcp_known -m hashlimit --hashlimit 5/min --hashlimit-mode srcip,dsti
p,dstport --hashlimit-name incoming_tcp -j ACCEPT
-A external_tcp_known -j external_tcp_flood

-A external_tcp_flood -j ULOG --ulog-prefix "Dropping TCP (limit):"
-A external_tcp_flood -j DROP

> Is it really a new connection for every button?

At least for FFWD/REW. That's how I noticed it.
If you are interested I can sniff the traffic ...

Here is the relevant example from firewall log:


............
Jul  9 21:38:53 nickolas Dropping TCP (limit): IN=ath0 OUT= MAC= SRC=192.168.2.4 DST=192.168.2.2 LEN=44 TOS=00 PREC=0x00 TTL=64 ID=31744 PROTO=TCP SPT=44318 DPT=3483 SEQ=2639310290 ACK=0 WINDOW=3000 SYN URGP=0 
Jul  9 21:38:54 nickolas Dropping TCP (limit): IN=ath0 OUT= MAC= SRC=192.168.2.4 DST=192.168.2.2 LEN=40 TOS=00 PREC=0x00 TTL=64 ID=31746 PROTO=TCP SPT=44318 DPT=3483 SEQ=2639310291 ACK=0 WINDOW=3000 RST URGP=0 
Jul  9 21:38:54 nickolas Dropping TCP (limit): IN=ath0 OUT= MAC= SRC=192.168.2.4 DST=192.168.2.2 LEN=40 TOS=00 PREC=0x00 TTL=64 ID=31747 PROTO=TCP SPT=44318 DPT=3483 SEQ=2639310291 ACK=0 WINDOW=3000 RST URGP=0 
Jul  9 21:38:54 nickolas Dropping TCP (limit): IN=ath0 OUT= MAC= SRC=192.168.2.4 DST=192.168.2.2 LEN=40 TOS=00 PREC=0x00 TTL=64 ID=31748 PROTO=TCP SPT=44318 DPT=3483 SEQ=2639310291 ACK=0 WINDOW=3000 RST URGP=0 
Jul  9 21:38:54 nickolas Dropping TCP (limit): IN=ath0 OUT= MAC= SRC=192.168.2.4 DST=192.168.2.2 LEN=44 TOS=00 PREC=0x00 TTL=64 ID=31749 PROTO=TCP SPT=44319 DPT=3483 SEQ=2639632491 ACK=0 WINDOW=4096 SYN URGP=0 
Jul  9 21:38:57 nickolas Dropping TCP (limit): IN=ath0 OUT= MAC= SRC=192.168.2.4 DST=192.168.2.2 LEN=44 TOS=00 PREC=0x00 TTL=64 ID=31755 PROTO=TCP SPT=44319 DPT=3483 SEQ=2639632491 ACK=0 WINDOW=3000 SYN URGP=0
............

The reason this rule is in place - this PC used to be exposed to internet directly (w/o using hardware firewall/router) and people manage to DOS it couple of times.... So now I'm little bit paranoid about these things.
Comment 4 Nick Orlov 2008-07-21 17:25:08 UTC
P.S. Funny thing (unrelated), once I put this rule in place and before I moved ssh to non-standard port my Belkin router had been defeated (hanged and required hard reset) couple of times by bots opening enormous amount of connections during brute force "pick the password" attacks, but PC newer suffered.
Comment 5 Andy Grundman 2008-07-21 17:28:34 UTC
The player does not open up many new connections.  It does open up a new streaming connection on port 9000 for each new song, and maintains a single slimproto connection on port 3483.

In your log it looks like it's blocking 4 packets from source port 44318, which would probably cause the player to think its connection got closed and so it tries to open a new connection from the next source port (44319).

Marking invalid.  You don't need to firewall your internal network.
Comment 6 Nick Orlov 2008-07-21 18:12:17 UTC
> The player does not open up many new connections.

In fact it does: otherwise my firewall would accept these packets right away:

-A external -m state --state RELATED,ESTABLISHED -j ACCEPT
-A external -p tcp -j external_tcp

Do you really want me to sniff the traffic?

Note: I never said in opens many connections _simultaneously_. But I do believe opening and closing TCP connections right away _does_ include unnecessary overhead which otherwise can be eliminated.

> Marking invalid.

Feel free to do so - my job was to notify about potential issue. It's up to you how to use this info.

> You don't need to firewall your internal network.

As long as wireless is involved - I would do _everything_ to protect myself.

P.S.

XXX@nickolas:~$ fgrep SRC=192.168.2.4  /var/log/ulog/netfilter.log.2 | wc -l  
61

It's not 4 packets. It's much more. Plus logger have it's own threshold...
And it _does_ look bad from consumer perspective - SB3 simply hangs for a couple of seconds. But whatever - I do have a workaround ....
Comment 7 Andy Grundman 2008-07-21 19:26:37 UTC
A packet capture that shows the problem, with your firewall disabled, might be useful.
Comment 8 Nick Orlov 2008-07-21 20:59:41 UTC
Ok, fair enough. I'll post tcpdump logs tomorrow (too tired and sleepy a the moment).
Comment 9 Nick Orlov 2008-07-22 19:44:27 UTC
Ok, as promised I

a) made sure problem is still there by reverting to more conservative firewall settings

b) captured bunch of TCP sessions to demonstrate the issue.

Below is the example with firewall _completely_ disabled.

Note 1: this time it's connection to port 9000 that is problematic
Note 2: SC IP .2.2, SB3 IP .2.4

===========
1216778730.696580 IP 192.168.2.4.28254 > 192.168.2.2.9000: . ack 734381 win 1111
1216778730.786521 IP 192.168.2.4.28254 > 192.168.2.2.9000: . ack 734381 win 3618
1216778730.786618 IP 192.168.2.2.9000 > 192.168.2.4.28254: . 734381:735841(1460) ack 0 win 5840
1216778730.786677 IP 192.168.2.2.9000 > 192.168.2.4.28254: . 735841:737301(1460) ack 0 win 5840
1216778730.789687 IP 192.168.2.4.28254 > 192.168.2.2.9000: . ack 737301 win 698
1216778730.899620 IP 192.168.2.4.28254 > 192.168.2.2.9000: . ack 737301 win 3206
1216778730.899713 IP 192.168.2.2.9000 > 192.168.2.4.28254: . 737301:738761(1460) ack 0 win 5840
1216778730.899773 IP 192.168.2.2.9000 > 192.168.2.4.28254: P 738761:740221(1460) ack 0 win 5840
1216778730.901326 IP 192.168.2.4.28254 > 192.168.2.2.9000: . ack 740221 win 286
1216778731.000340 IP 192.168.2.4.28254 > 192.168.2.2.9000: . ack 740221 win 2794
1216778731.000411 IP 192.168.2.2.9000 > 192.168.2.4.28254: . 740221:741681(1460) ack 0 win 5840
1216778731.011491 IP 192.168.2.4.28254 > 192.168.2.2.9000: . ack 741681 win 1334
1216778731.069746 IP 192.168.2.4.28254 > 192.168.2.2.9000: R 2167940888:2167940888(0) win 1334
1216778731.073690 IP 192.168.2.4.28255 > 192.168.2.2.9000: S 2168080944:2168080944(0) win 4096 <mss 1460>
1216778731.073824 IP 192.168.2.2.9000 > 192.168.2.4.28255: S 1727290414:1727290414(0) ack 2168080945 win 5840 <mss 1460>
===========

Here is what I think is wrong with the example above

a) why in the world SB3 would change win size back and forward in a less than 100ms ?

b) I cannot find rational explanation for SB3 behavior starting 1216778731.011491

c) total number of RST packets during in a less than 8 seconds period (not shown here): 44. All originated on SB3 side.

Reopening bug since it's clear that SB3 closes connection to reopen it in a less than 10(!!!) ms.

Full log for this session is available per request.
It's ~27 Mb long, so I'd rather upload it somewhere.




Comment 10 Andy Grundman 2008-07-22 19:50:13 UTC
Sean, Richard, can you please comment?

The window size is changed often to help optimize the player's buffer.  The window size is set to the amount of data the player can accept at any given time.
Comment 11 Nick Orlov 2008-07-23 06:04:24 UTC
Andy,

I see now what's happening. I do believe that the way you do traffic modulation is suboptimal. Look at the numbers:

1216778730.899620 IP 192.168.2.4.28254 > 192.168.2.2.9000: . ack 737301 win
3206 <== I'm ready to accept 3206 bytes
1216778730.899713 IP 192.168.2.2.9000 > 192.168.2.4.28254: .
737301:738761(1460) ack 0 win 5840 <== 1460 sent
1216778730.899773 IP 192.168.2.2.9000 > 192.168.2.4.28254: P
738761:740221(1460) ack 0 win 5840 <== 1460 sent
1216778730.901326 IP 192.168.2.4.28254 > 192.168.2.2.9000: . ack 740221 win 286 <== ACK with 2 ms delay for 2 * 1460 bytes (286 == 3206 - 2 * 1460)
1216778731.000340 IP 192.168.2.4.28254 > 192.168.2.2.9000: . ack 740221 win
2794 <== some data is taken out from a recv buffer, ready to accept 2794 bytes

ACK that is sent at 1216778730.901326 is almost useless. It does not contain any new info for a sender besides "I've got these packets, don't retransmit". But sender wouldn't retransmit anyway since less than 2ms passed since the previous ACK. Right thing to do would be to set a timer and ACK with a delay of 100ms or so (to prevent retransmits) or after you have processed some data from a recv buffer. _Then_ it would carry useful data indeed. Plus don't forget that sender sees these ACK's with some delay (true - negligible in the most of the cases) but by saying "I have very few bytes left in a recv buffer" you are shooting yourself in the foot (by the time sender sees this you might have some space freed)....


As far as original issue (new connections) is concerned - I think what you are doing is the easiest, most portable and reliable way to say "I do not need audio data that you might already have queued for sending, please start sending from a new point". So now I see the very good reason for new connections to exist and think there is nothing to fix....

Comment 12 Blackketter Dean 2008-07-23 07:49:51 UTC
Closing this as the original report is now understood.

Nick: If you want to open a new bug for optimizing the TCP acks, please do.  Not sure how it would benefit folks in general, but interested to hear your thoughts.
Comment 13 Nick Orlov 2008-07-23 20:22:38 UTC
Dean,

In my opinion - there are areas where improvements are much more welcomed.
For example I'd rather see bug #8782 fixed.
 
Suboptimal ACKs is just something to keep in mind. In can _theoretically_ affect people with crappy networking hardware, high-latency low-throughput links, less sophisticated TCP/IP stack implementation on a server (SC) side. I have none of that, so I'm fine :)

Out of curiosity you might want to check if, for example, it causes SC running under Win to send flood of packets with a small payload (less then mss). That would move the issue from "theoretical" to "real" category. But before that, or even better - before anyone else complains - not worth the efforts.