We have a Windows 2003 server running Exchange 2003 that is being monitored by NAGIOS. After applying some Exchange updates, we get the “connection refused” message in NAGIOS. The following is found in the event log:
Source: NSClient
Event ID: 2
Windows socket error: only one usage of each socket address (protocol/network address/port) is normally permitted (10048), on API ‘bind’
I’ve already tried reinstalling the client and rebooting the server, without success. Has anyone had a similar problem?
No idea, sorry. It looks like several programs are trying to use the same socket, or one program is trying to bind the same address and port more than once at the same time. Could that be it?
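For anyone wondering what that error means in practice, here is a minimal Python sketch that reproduces it. It is only an illustration: the function name is made up, and it binds a throwaway port chosen by the OS rather than the port NSClient uses. On Windows the resulting error is reported as 10048 (WSAEADDRINUSE), the same code shown in the event log above.

```python
import socket


def second_bind_errno(host="127.0.0.1"):
    """Bind two TCP sockets to the same address/port and return the errno
    raised by the second bind (None if it unexpectedly succeeds).

    Hypothetical helper for illustration only.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind((host, 0))      # port 0: let the OS pick a free port
        first.listen(1)
        port = first.getsockname()[1]
        try:
            second.bind((host, port))  # same address/port: should fail
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()
```

If this returns an error number, the second bind was refused because the first socket still owns the port, which is the same situation the agent hits when another process already holds its port.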
Try restarting your server.
Then try to connect with telnet:
First locally: telnet 127.0.0.1 num_port_nrpe (5666 by default)
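If telnet is not handy, the same connectivity test can be sketched in Python. This is just an assumption-laden example: `port_is_open` is a made-up helper name, and the port you pass in should be whatever your agent is actually configured to listen on.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False.

    Hypothetical helper; roughly equivalent to 'telnet host port'.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False
```

Run it first against 127.0.0.1 on the monitored server itself, then from the Nagios host against the server's address, to see whether the problem is the agent or something in between.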
We are having the same problem on a Win2k/Exch2k3 box. Sometimes the NAGIOS agent starts fine, sometimes not. If we try to stop/restart the NAGIOS service, the service fails to stop.
nsclient runs fine on about 50 servers, with the socket error on 2. One is a domain controller and the other is a Citrix server. We have many DCs and Citrix boxes that work fine with the same setup. [Edited Wed Nov 02 2005, 05:30PM]
I’ve heard you say that a few times in various threads now, Jakkedup. What do you have against NRPE? I’m really curious.
Of course simply replacing the used software could fix the problem the TS has, but it’s not -the- solution is it? When your car won’t start you don’t simply change the whole engine either, do you?
If my car has square tires on it, I could either shave the tires to make them round, or I could fix the whole problem by replacing them with the correct/best tires I can find.
The reason NRPE and other active check solutions are not good is simply that they are ACTIVE checks, plus the security factor. With NRPE-type solutions, the daemon runs on the remote system. Nagios has to log in with the password, then tell it which check to make. It then waits for the check results and has to process them.
With a passive solution like nsca, everything is done for nagios without its intervention. Periodically nagios reads the external command file, when it has the time to do so, and like magic, the complete output of a service check is already there. All nagios has to do is process that output so you can see it.
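To make the passive side a bit more concrete, here is a small Python sketch of what a passive service check result looks like once it lands in the external command file. It uses the PROCESS_SERVICE_CHECK_RESULT external command format; the function name and the host and service names in the usage note are made up for illustration.

```python
import time


def format_passive_result(host, service, return_code, output, timestamp=None):
    """Build one external-command line for a passive service check result.

    Hypothetical helper. return_code follows the usual plugin convention:
    0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
    """
    if timestamp is None:
        timestamp = int(time.time())
    return "[{0}] PROCESS_SERVICE_CHECK_RESULT;{1};{2};{3};{4}".format(
        timestamp, host, service, return_code, output
    )
```

For example, `format_passive_result("exch01", "SMTP", 0, "OK", timestamp=1130952600)` gives `[1130952600] PROCESS_SERVICE_CHECK_RESULT;exch01;SMTP;0;OK`. Nagios only has to read and process lines like this; it never has to go out and run the check itself.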
In my case the problem is that mad.exe (an Exchange process) has taken the port that nagios uses and is holding it in a CLOSE_WAIT state. To fix this, simply restart the Microsoft Exchange System Attendant service and restart the server. I have this problem every once in a while, and doing this has fixed it every time.
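To check whether you are in the same situation, you can look at the output of `netstat -ano` and see which PID holds the agent's port. The Python sketch below only parses text you hand it; `owners_of_port` is a made-up helper, it assumes the usual five-column TCP rows (protocol, local address, foreign address, state, PID), and the port number is whatever your agent is configured to use.

```python
def owners_of_port(netstat_output, port):
    """Return (local_address, state, pid) for each TCP row in 'netstat -ano'
    style output whose local address ends with the given port.

    Hypothetical helper for illustration; it does not run netstat itself.
    """
    suffix = ":" + str(port)
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows have five columns: proto, local, foreign, state, pid.
        if len(fields) == 5 and fields[0] == "TCP" and fields[1].endswith(suffix):
            matches.append((fields[1], fields[3], int(fields[4])))
    return matches
```

If a row for the port shows a PID that is not the monitoring agent (you can match the PID to a process name in Task Manager), that process is the one causing the bind error.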