Bugzilla – Bug 462
problems with DHCP that do not meet specs from RFC 2131
Last modified: 2008-10-03 02:49:26 UTC
Reported by Harald Alvestrand, including a packet trace.
---
It appears that for some reason the DHCP server (which I think is running on a Cisco 677i-DIR DSL modem) is not answering the renewal requests. You can see renewal requests from a PC (which got answered) in the trace too. First renewal request at 554.188 in the trace.

And at the end of the trace, the client goes into "lost contact" mode - when what it's actually lost is its own IP address; misleading error message...

I captured this with the filter "ether host <squeezebox> and not tcp port 9000 and not tcp port 3483 or port bootpc or port bootps" - I did not want to look at the music packets!

Note: I *suspect* that the DHCP server is expecting the renewal request to come in unicast; RFC 2131 section 4.4.5 says:

    At time T1 the client moves to RENEWING state and sends (via unicast) a DHCPREQUEST message to the server to extend its lease.

And it's got another oddity:

    The client MUST NOT include a 'server identifier' in the DHCPREQUEST message.

The Squeezebox's DHCP message does seem to include a server identifier.

The broadcast message that the Squeezebox does send seems in conflict with the same point in this text from RFC 2131 section 4.4.5:

    If no DHCPACK arrives before time T2, the client moves to REBINDING state and sends (via broadcast) a DHCPREQUEST message to extend its lease. The client sets the 'ciaddr' field in the DHCPREQUEST to its current network address. The client MUST NOT include a 'server identifier' in the DHCPREQUEST message.

It includes a "server identifier". You never know what the server implementation is going to test for.....

Good luck in figuring this one out .... anything I can do to help...

Harald
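The two RFC 2131 section 4.4.5 requirements quoted in the report ('ciaddr' set, no 'server identifier') can be checked mechanically against a captured DHCPREQUEST. The following is only a rough sketch, not the Squeezebox firmware's or any server's actual logic: it assumes the input is the raw UDP payload of a renewal or rebinding DHCPREQUEST, and the names `check_renewal_request` and `build_request` are made up for illustration.

```python
import socket
import struct

MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of the DHCP options area
OPT_MESSAGE_TYPE = 53
OPT_SERVER_ID = 54
DHCPREQUEST = 3


def parse_options(data):
    """Return a dict mapping DHCP option code -> raw value bytes."""
    options = {}
    i = 0
    while i < len(data):
        code = data[i]
        if code == 255:  # end option
            break
        if code == 0:  # pad option, no length byte
            i += 1
            continue
        if i + 1 >= len(data):  # truncated option, stop parsing
            break
        length = data[i + 1]
        options[code] = data[i + 2 : i + 2 + length]
        i += 2 + length
    return options


def check_renewal_request(payload):
    """List violations of the two quoted rules in a renewal DHCPREQUEST."""
    if len(payload) < 240 or payload[236:240] != MAGIC_COOKIE:
        raise ValueError("not a DHCP message")
    options = parse_options(payload[240:])
    if options.get(OPT_MESSAGE_TYPE) != bytes([DHCPREQUEST]):
        raise ValueError("not a DHCPREQUEST")
    problems = []
    if payload[12:16] == bytes(4):  # 'ciaddr' occupies bytes 12..15
        problems.append("'ciaddr' is not set to the client's current address")
    if OPT_SERVER_ID in options:
        problems.append("'server identifier' (option 54) is present")
    return problems


def build_request(ciaddr, with_server_id):
    """Hypothetical helper: build a minimal DHCPREQUEST payload for testing."""
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1, 1, 6, 0,            # op=BOOTREQUEST, htype=Ethernet, hlen, hops
        0x1234, 0, 0,          # xid, secs, flags
        socket.inet_aton(ciaddr), bytes(4), bytes(4), bytes(4),
        bytes(16), bytes(64), bytes(128),  # chaddr, sname, file
    )
    options = bytes([OPT_MESSAGE_TYPE, 1, DHCPREQUEST])
    if with_server_id:
        options += bytes([OPT_SERVER_ID, 4]) + socket.inet_aton("10.1.0.1")
    return header + MAGIC_COOKIE + options + bytes([255])


if __name__ == "__main__":
    for problem in check_renewal_request(build_request("10.1.0.199", True)):
        print(problem)
```

Whether the request was sent unicast or broadcast is a property of the enclosing IP/Ethernet headers, so it has to be checked separately from the payload.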
Created attachment 79: packet trace showing the behavior
I can confirm this behaviour with the Cisco 678 DSL modem acting as a DHCP server. I was unable to capture a trace, but found that setting the SB to a static IP resolved the connectivity problems.
Heard from another gentleman today who is experiencing this problem with his Linksys router's DHCP server.
This problem also observed in my home network with my Solaris 10 DHCP server. The Squeezebox is clearly violating the specification in its attempts to renew. Can work around with a permanent address assignment, but this really shouldn't be necessary.
I observed a similar problem with DHCP yesterday. Setup is 2 x SB2 on Wireless LAN using D-Link DI-524 router (WPA-PSK mode). This is connected to wired LAN with Linux box (FC4) running a DHCP server & Slimserver.

I had the DI-524 CTS/RTS threshold set very low (256 bytes) to try and minimise wireless collisions and while this works when they are already running, it seems to cause the SB2 devices to have a problem on reboot when they try and get DHCP addresses. In this configuration the SB devices are able to connect to the WLAN but unable to retrieve IP addresses. This is despite the fact I see repeated DHCPOFFER messages in the Linux log (note the 2 MAC addresses of the SB devices):

Jul 11 20:20:45 lled dhcpd: DHCPDISCOVER from 00:04:20:05:cb:5e via eth1
Jul 11 20:20:46 lled dhcpd: DHCPOFFER on 10.1.0.200 to 00:04:20:05:cb:5e via eth1
Jul 11 20:20:50 lled dhcpd: DHCPDISCOVER from 00:04:20:05:cb:5a via eth1
Jul 11 20:20:51 lled dhcpd: DHCPOFFER on 10.1.0.199 to 00:04:20:05:cb:5a via eth1
Jul 11 20:21:49 lled dhcpd: DHCPDISCOVER from 00:04:20:05:cb:5e via eth1
Jul 11 20:21:50 lled dhcpd: DHCPOFFER on 10.1.0.200 to 00:04:20:05:cb:5e via eth1
Jul 11 20:21:54 lled dhcpd: DHCPDISCOVER from 00:04:20:05:cb:5a via eth1
Jul 11 20:21:55 lled dhcpd: DHCPOFFER on 10.1.0.199 to 00:04:20:05:cb:5a via eth1
Jul 11 20:22:58 lled dhcpd: DHCPDISCOVER from 00:04:20:05:cb:5a via eth1
Jul 11 20:22:59 lled dhcpd: DHCPOFFER on 10.1.0.199 to 00:04:20:05:cb:5a via eth1
Jul 11 20:24:02 lled dhcpd: DHCPDISCOVER from 00:04:20:05:cb:5a via eth1
Jul 11 20:24:03 lled dhcpd: DHCPOFFER on 10.1.0.199 to 00:04:20:05:cb:5a via eth1

At this time, a Dell laptop I have was able to connect and get an IP over DHCP so don't think we can blame the router or Linux box.
When I changed the DI-524 RTS/CTS setting back to the default (2436 or something - effectively 'off') the SB2s were able to get leases, here's one doing so:

Jul 11 21:00:22 lled dhcpd: DHCPDISCOVER from 00:04:20:05:cb:5e via eth1
Jul 11 21:00:23 lled dhcpd: DHCPOFFER on 10.1.0.199 to 00:04:20:05:cb:5e via eth1
Jul 11 21:00:24 lled dhcpd: DHCPREQUEST for 10.1.0.199 (10.1.0.1) from 00:04:20:05:cb:5e via eth1
Jul 11 21:00:24 lled dhcpd: DHCPACK on 10.1.0.199 to 00:04:20:05:cb:5e via eth1
Jul 11 21:01:38 lled dhcpd: DHCPDISCOVER from 00:04:20:05:cb:5e via eth1
Jul 11 21:01:39 lled dhcpd: DHCPOFFER on 10.1.0.199 to 00:04:20:05:cb:5e via eth1
Jul 11 21:01:40 lled dhcpd: DHCPREQUEST for 10.1.0.199 (10.1.0.1) from 00:04:20:05:cb:5e via eth1
Jul 11 21:01:40 lled dhcpd: DHCPACK on 10.1.0.199 to 00:04:20:05:cb:5e via eth1

Sadly the Dell laptop does not play nice with a high RTS/CTS so it's catch 22... I will experiment more later.
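The difference between the failing and the working logs is whether each DHCPOFFER is followed by a DHCPREQUEST and DHCPACK for the same MAC address. A minimal sketch for spotting that in a dhcpd syslog excerpt follows; it assumes the line format shown in these comments, and the function names are made up for illustration.

```python
import re
from collections import defaultdict

# Matches e.g. "dhcpd: DHCPOFFER on 10.1.0.200 to 00:04:20:05:cb:5e via eth1"
LINE_RE = re.compile(
    r"dhcpd: (DHCP[A-Z]+) .*?\b((?:[0-9a-f]{2}:){5}[0-9a-f]{2})\b",
    re.IGNORECASE,
)


def summarize(log_lines):
    """Map each client MAC to the ordered list of DHCP message types logged."""
    seen = defaultdict(list)
    for line in log_lines:
        match = LINE_RE.search(line)
        if match:
            seen[match.group(2).lower()].append(match.group(1).upper())
    return dict(seen)


def stalled_clients(log_lines):
    """Return MACs that were offered a lease but never received a DHCPACK."""
    return sorted(
        mac
        for mac, messages in summarize(log_lines).items()
        if "DHCPOFFER" in messages and "DHCPACK" not in messages
    )


if __name__ == "__main__":
    sample = [
        "Jul 11 20:20:45 lled dhcpd: DHCPDISCOVER from 00:04:20:05:cb:5e via eth1",
        "Jul 11 20:20:46 lled dhcpd: DHCPOFFER on 10.1.0.200 to 00:04:20:05:cb:5e via eth1",
        "Jul 11 20:20:50 lled dhcpd: DHCPDISCOVER from 00:04:20:05:cb:5a via eth1",
        "Jul 11 20:20:51 lled dhcpd: DHCPOFFER on 10.1.0.199 to 00:04:20:05:cb:5a via eth1",
    ]
    print(stalled_clients(sample))
```

A client listed here saw the server send an offer but never completed the handshake, which is consistent with the offer (or the client's follow-up request) being lost on the wireless side rather than the server failing to respond.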
Robin, what model of Squeezebox and firmware are you currently using?
Both players are SB2. They are on firmware 62. Server is:

SlimServer Version: 6.5b1 - 9519 - Linux - EN - utf8
Perl Version: 5.8.6 i386-linux-thread-multi
MySQL Version: 4.1.16

I haven't seen this problem for a while as I have turned the AP back to default settings. Basically, I've given up trying to get the SBs to sync (see bug 259) so the network traffic was a lot lower.
cc'ing Richard. This isn't related to bug 3851, is it, Richard?
OK, I am slightly confused. Robin's bug looks different to the others; was this bug re-opened? Chris, could you try setting any access point to have a short RTS/CTS threshold, and see if DHCP works with a Linux DHCP server? We need to isolate whether it is a Squeezebox2/3 or router issue.
What DHCP server software are you using, Robin? Is there a regular dhcpd that FC4 uses? Or something else? Thanks for any info. I spent some time trying to reproduce this on my more-familiar Debian system with no luck. I'll try it on FC4.
Reassigning Squeezebox firmware bugs to Felix.
If anyone is still seeing this, please re-open.
Is this bug changed to RESOLVED because no more reports came in? I deduce that from the resolution listed ("WONTFIX") and the last comment...

Non-compliance with RFC 2131 means that the product will always have DHCP issues with (some) DHCP servers that -do- comply, and because the RFC is authoritative on these issues, it means the SB implementation is broken, not the other way around; i.e. by not complying you do not "make" or "define" the standard, you just don't comply.

I'm just a user of your product, but before retiring I was also involved with defining the standards as written in RFCs. Also, I would like to see your products become rock-stable on the networking part, and closing bugs without resolving them won't lead to that target...

Nick.
Created attachment 4099: Current Wireshark capture

Just for completeness I've attached a current Wireshark capture containing the initial DHCP Discover, Offer, Request, Ack plus two DHCP Requests and Acks. Client was a SBB fw 33 and DHCP server was DNSMasq v2.41 with a lease time of 2 minutes. The renewal DHCP Request is sent unicast and does _not_ contain a 'server identifier', as RFC 2131 requires.
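For reading the timing in such a capture: RFC 2131 section 4.4.5 gives default renewal (T1) and rebinding (T2) times of 0.5 and 0.875 of the lease duration. The sketch below only computes those defaults; it assumes the server did not override them with explicit T1/T2 options (whether it did in this capture is not stated), and the function name is made up for illustration.

```python
def default_timers(lease_seconds):
    """Return the RFC 2131 default (T1, T2) in seconds for a given lease."""
    t1 = 0.5 * lease_seconds    # client enters RENEWING, unicasts DHCPREQUEST
    t2 = 0.875 * lease_seconds  # client enters REBINDING, broadcasts DHCPREQUEST
    return t1, t2


if __name__ == "__main__":
    # The 2-minute lease mentioned above.
    print(default_timers(120))
```

With a 2-minute lease the defaults put the unicast renewal about 60 seconds after each ACK, and a broadcast rebinding attempt would only appear if no ACK had arrived by about 105 seconds.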