Bug 4067 - HTTP server: Address already in use
Status: RESOLVED WORKSFORME
Product: Logitech Media Server
Classification: Unclassified
Component: Misc
Version: 6.5b1
Hardware: Sun Other
Importance: P2 normal
Target Milestone: ---
Assigned To: Chris Owens
Reported: 2006-09-08 08:50 UTC by Dan Newman
Modified: 2011-03-16 04:34 UTC (History)

Description Dan Newman 2006-09-08 08:50:48 UTC
Solaris SPARC 9 U5
SlimServer 6.5 2006-09-08
Perl v5.8.7

With the latest nightly build, the slimserver startup fails
with the message

can't setup the listening port 6998 for the HTTP server: Address already in use
at /usr/local/src/SlimServer_6.5_v2006-09-08/server/Slim/Web/HTTP.pm line 167.

Now, the interesting thing is that the error is generated regardless of which TCP
port I tell it to use, even when it is running as user root.
Additionally, "netstat -a | grep 6998" shows nothing bound to that port, listening
or otherwise.  I've tried this with about eight different ports, all above
4096, with no luck.  After switching back to the 2006-09-03 build, all is
well (i.e., it can bind to all of those ports just fine, and repeatedly too).
Comment 1 Dan Newman 2006-09-08 10:14:17 UTC
BTW, looking at Slim/Web/HTTP.pm, I don't see any changes which would seem
relevant.  That would then point a finger at my Perl's HTTP::Daemon.  However,
that has not been changed by me in over a month.  But I notice that the
directory structure for the SlimServer install has changed a bit.  When I unpacked
the nightly, I had

    SlimServer_6.5_v2006-09-08/server/CPAN

and after I ran

   SlimServer_6.5_v2006-09-08/server/Bin/build-perl-modules.pl

I then had the new directory

   SlimServer_6.5_v2006-09-08/CPAN/

So, I have two CPAN directories now.  Moreover, when I tried to start slimserver.pl,
it couldn't find YAML's dump.al anywhere.  I dealt with that.  However, that all now
leads me to wonder if something is cockeyed with how SlimServer is building and/or
looking for its CPAN dependencies.  And, if so, it could mean that SlimServer is now
seeing the wrong version of HTTP::Daemon.

Thoughts?
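
With two CPAN directories in play, one quick check is to list every copy of HTTP/Daemon.pm under the directories involved. The following is only a rough sketch in Python, chosen for illustration; find_module_copies is a hypothetical helper name, and the directories to search are whatever you pass on the command line. It only lists copies: which one Perl actually loads depends on the order of @INC, which this sketch does not model.

```python
import os
import sys


def find_module_copies(roots, relpath="HTTP/Daemon.pm"):
    """Return sorted paths of every copy of relpath found under the given roots."""
    parts = relpath.split("/")
    hits = []
    for root in roots:
        # Treat every directory in the tree as a potential library root.
        for dirpath, _dirnames, _filenames in os.walk(root):
            candidate = os.path.join(dirpath, *parts)
            if os.path.isfile(candidate):
                hits.append(candidate)
    return sorted(hits)


if __name__ == "__main__":
    # Pass the directories to search, e.g. the unpacked SlimServer tree.
    for path in find_module_copies(sys.argv[1:]):
        print(path)
```

More than one hit under the SlimServer tree would at least be consistent with the "wrong version" theory; a single hit would argue against it.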
Comment 2 Dan Newman 2006-09-08 10:40:03 UTC
P.S. When I reran build-perl-modules, I gave it the path

   SlimServer_6.5_v2006-09-08

rather than

   SlimServer_6.5_v2006-09-08/server

as the path to my "SlimServer directory".  I've since rerun it giving it

   SlimServer_6.5_v2006-09-08/server

and now I don't get the extra CPAN directory.  However, I still have the same
underlying issue: being told that it cannot bind to a port that is most
definitely not in use according to netstat.
Comment 3 Chris Owens 2006-09-08 11:58:12 UTC
Our local Solaris gurus suggest that netstat is lying and recommend running 'lsof -i :6998'.
Comment 4 Dan Newman 2006-09-08 13:25:21 UTC
I'm inclined to agree.  The more I've looked at things this morning, the
more I'm convinced that the problem isn't with SlimServer or Perl but rather
with the TCP/IP stack.  Although C code I wrote to bind to the port succeeded
when I ran it, I simply cannot see why the Perl code is failing.  Net-net,
I suspect the TCP/IP stack, especially since I'm running it with multipathing
and virtual interfaces, two more layers of potential trouble.
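
The C test program mentioned above is not included in this report. A comparable standalone probe can be sketched as follows, here in Python purely for illustration. probe_listen is a hypothetical name, and setting SO_REUSEADDR before bind() is an assumption about what a typical server does; this report does not say which socket options SlimServer or HTTP::Daemon actually use.

```python
import errno
import socket


def probe_listen(port, host="", reuse_addr=True):
    """Bind and listen on (host, port); return None on success, else the errno name."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if reuse_addr:
            # Assumption: mimic a server that sets SO_REUSEADDR before bind().
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))  # an empty host means the wildcard address
        sock.listen(5)
        return None
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()


if __name__ == "__main__":
    for port in (6998, 8998):  # ports mentioned in this report
        result = probe_listen(port)
        print(port, "ok" if result is None else "failed: " + result)
```

Running the probe with reuse_addr both on and off, and with a specific interface address instead of the wildcard, is one way to narrow down why one program can bind while another cannot.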

I suppose I could peruse the bug databases (I work for Sun as a software engineer);
however, I'll take the more expedient path of

 1. Pulling a copy of lsof (not shipped with Solaris) and seeing if it sheds any light, and
 2. If I don't get satisfaction from 1, scheduling a reboot for tonight....

I suppose I could also update my patches for the system too...
Comment 5 Dan Newman 2006-09-08 13:35:36 UTC
FWIW, lsof turns up nothing on those ports....  For example,

# lsof -i :6998
#

But lsof is working, as shown by running it while SlimServer 6.5 2006-09-03 is running
and bound to port 8998:

# lsof -i :8998
COMMAND   PID     USER   FD   TYPE        DEVICE SIZE/OFF NODE NAME
perl    23149 slimbeta   10u  IPv4 0x300044e1b70      0t0  TCP *:8998 (LISTEN)
perl    23149 slimbeta   15u  IPv4 0x30609e8e1f0 0t618836  TCP mtbaldy.us:8998->dhcp-10.30.0.4.mtbaldy.us:49569 (ESTABLISHED)
#

And, yes, I tried running SlimServer 6.5 2006-09-08 on 8998 also: as with the other ports I tried,
it couldn't bind.  (No other SlimServer was running at that time.)  Beats me why this 2006-09-08
build cannot bind to any ports and the 2006-09-03 can.  At this point I'll probably try a reboot
this evening.
Comment 6 Dan Newman 2006-09-08 14:46:16 UTC
In light of

http://sunsolve.sun.com/search/document.do?assetkey=1-26-101834-1&searchclause=101834

I'll be installing Solaris patch 118305-05 tonight.  Since that patch needs to be followed by a reboot,
it won't be obvious what the "real" fix was for this issue with SlimServer as I will have changed two
variables at once: installed a patch AND rebooted the system.  I'll update this bug after the reboot.
Comment 7 Dan Newman 2006-09-08 15:39:00 UTC
Closing as INVALID.  This is indeed a known Solaris 9 (and 10) bug.  I spoke with
an engineer responsible for it at Sun, stopped and restarted some interfaces, and
all works now (though the system will get into a bad state again until I apply patches
112233-12 and 118305-08).  My situation was apparently exacerbated by the use of virtual
interfaces and IP multipathing, so hopefully others are less likely to see this.
Comment 8 Dan Newman 2006-09-12 09:52:03 UTC
I wanted to add one more bit of debugging info.  If a machine is running the
network backup system Legato Networker / EMC Networker / Sun EBS, then the
backup system may be usurping TCP ports 7937-9936 such that processes may
not be able to bind to them, even if the ports don't show up as in use in lsof or netstat.
After I installed the Sun patches mentioned previously in this report, I still had
difficulty with some ports in that range.  Use nsrports -S to resolve any
issues with that port range.
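
Since ports held this way may not appear in lsof or netstat, attempting the bind is the most direct way to find them. The sweep below is a minimal sketch in Python, for illustration only; unbindable_ports is a hypothetical name, and the range 7937-9936 is the one quoted above. It binds without SO_REUSEADDR, so ports held by ordinary listeners are reported as well.

```python
import errno
import socket


def unbindable_ports(first, last, host="0.0.0.0"):
    """Return {port: errno name} for each port in [first, last] that cannot be bound."""
    failures = {}
    for port in range(first, last + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError as exc:
            failures[port] = errno.errorcode.get(exc.errno, str(exc.errno))
        finally:
            sock.close()
    return failures


if __name__ == "__main__":
    # 7937-9936 is the range quoted in the comment above.
    for port, reason in sorted(unbindable_ports(7937, 9936).items()):
        print(port, reason)
```

Comparing the ports it reports against lsof output would show which ones are blocked without a visible owner.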
Comment 9 Chris Owens 2006-09-12 10:57:07 UTC
Thanks for the additional info, Dan!