NSClient - Connection refused?

My service checks fail with a “connection refused” error. The NSClient service starts normally and no errors are reported in the Event Viewer; the only events that show up are from the startup of NSClient. When I telnet to the server and port, a la “telnet 192.168.1.101 1248”, it connects almost immediately, both from the server locally and from a remote box.
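For anyone who wants to repeat the telnet test from a script, here is a minimal Python sketch. It only attempts a TCP connect and reports whether the port was open, refused, or timed out. The function name `probe_tcp` is made up for illustration and is not part of NSClient or Nagios.

```python
import socket

def probe_tcp(host, port, timeout=5.0):
    """Try a plain TCP connect and classify the result.

    Returns "open", "refused", "timeout", or "error: <details>".
    Hypothetical helper for illustration only.
    """
    try:
        # create_connection performs the same handshake telnet would
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # the target actively rejected the connection (nothing listening)
        return "refused"
    except socket.timeout:
        # no answer at all, which often points at filtering rather than a closed port
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
```

Calling `probe_tcp("192.168.1.101", 1248)` from the Nagios box and from another machine shows whether both see the same result. Note that "open" only proves something is listening on the port, not that it is NSClient.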

The server is a Windows 2000 SQL Server and unfortunately is unable to be restarted at this point in time (it’s been up for about 330 days).

This is just a little confusing, and I wanted to know if anyone else had run into this. Thanks!

From a remote box or from the Nagios box? A remote box connects OK, but what about a telnet from the Nagios box itself? If the Nagios box telnets OK, then perhaps it’s your settings in the .cfg for the NSClient connection. Check for typos in your configs.

I can telnet from the Nagios box, a different Windows server, and the local Windows box; all of them connect. My check-host-alive works just fine, but that’s a function of the ping command. I don’t think the problem is in my .cfg’s, as I have 29 other hosts and 157 other services that check just fine. Or is there a way to make adjustments in NSClient itself?

Also, I checked my .cfg’s (because you never know) and they all look fine. They were pretty much a copy-and-paste job from other working hosts and services; I just changed the host_name field.
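Since a bare telnet connect succeeds, one further test is to send the same kind of request the check sends and see whether anything answers. The sketch below assumes, from memory of the check_nt plugin source, that a CLIENTVERSION query is the plain text `password&1` and that the default password is `None`. Both are assumptions to verify against your plugin version, and the names `build_request` and `query_nsclient` are made up for illustration.

```python
import socket

def build_request(password="None", command_id=1):
    """Build the request string.

    ASSUMPTION: the "<password>&<command id>" layout and command id 1
    (client version) are recalled from check_nt, not taken from this thread.
    """
    return f"{password}&{command_id}"

def query_nsclient(host, port=1248, password="None", command_id=1, timeout=10.0):
    """Send one request and return whatever the listener answers, as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(password, command_id).encode("ascii"))
        # a single recv is enough for a short one-line reply
        reply = sock.recv(1024)
    return reply.decode("ascii", errors="replace").strip()
```

If `query_nsclient("192.168.1.101")` returns an empty string or something that does not look like an NSClient version, that would suggest the process holding port 1248 is not NSClient, or that NSClient has hung.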

Possibly a wrong “Windows service” name?
Check that you don’t have a localized version of Windows instead of an English one… that gave me some headaches when installing NSClient.

Luca

When I get a connection refused error, it’s normally caused by another program pinching port 1248, which NSClient uses. Exchange does this a lot, and SQL might be similar.

If the config worked and then all of a sudden you got connection refused, this is normally the cause.

Possible solutions:

  • Reboot the server (not always an option).
  • Run netstat -a (or whatever allows you to see the PID) and match the PID listening on TCP port 1248 with the PID of NSClient. If they match, NSClient may have crashed; if they don’t, another program has pinched the port.
  • Stop NSClient in services.msc (it will probably fail). Kill the pNSClient.exe process, wait about 5 minutes, and start it again; this should work. If not, stop the other service/program using port 1248, start NSClient, and then start the other service/program again.
  • You could also try uninstalling and re-installing NSClient too (stop it first, though!).
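The PID-matching step can be sketched in Python as below. It assumes a netstat that prints a PID column (as far as I know `netstat -ano` does this on XP/2003 and later; the stock Windows 2000 netstat may not, in which case a third-party port lister is needed). The sample text and PIDs are invented for illustration, and `listening_pids` is a hypothetical helper.

```python
def listening_pids(netstat_output, port):
    """Return the set of PIDs shown as LISTENING on the given TCP port."""
    pids = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # expected row: Proto, Local Address, Foreign Address, State, PID
        if len(fields) != 5 or fields[0].upper() != "TCP":
            continue
        local, state, pid = fields[1], fields[3], fields[4]
        if state.upper() != "LISTENING":
            continue
        # take the text after the last colon so forms like [::]:1248 also work
        if local.rpartition(":")[2] == str(port) and pid.isdigit():
            pids.add(int(pid))
    return pids

# Invented sample output, NOT taken from the server in this thread
SAMPLE = """\
  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       884
  TCP    0.0.0.0:1248           0.0.0.0:0              LISTENING       1520
  TCP    192.168.1.101:1248     192.168.1.50:40122     ESTABLISHED     1520
"""
print(listening_pids(SAMPLE, 1248))  # prints {1520}
```

Compare the result with the PID shown for the NSClient process in Task Manager. If they differ, another program owns the port.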

I’m monitoring about 85 hosts on one of our Nagios boxes, and this happens to a host at least once a week.

Hope that’s of some help.

Thanks for the help here. When I do a netstat -a, I see the server listening on port 1248, but I do not see any program associated with it later in the list. I stopped the service, uninstalled it, and then re-installed it, but got the same results. I’m probably going to try a different service (NRPE, perhaps?) and see if I can get that to work.

As far as the localized version goes vs. the English one, this is the same version of 2000 that my other 2000 servers are running. Really at this point it’s probably anyone’s guess.

Hmm, if that doesn’t look like it is working, then unfortunately the only way is to reboot the machine. I know it’s not much help, but when I get this problem and all else fails, it’s the only way I’ve been able to make it work, I’m afraid.

I assume there is no firewall running on that server? A software one, that is, such as ISA or the Windows firewall. Is the port open?

Also, do you run RAS on it? It uses the same range of ports that the client uses.

No software firewall. Don’t use RAS on it. I’m going to schedule downtime for it soon. I’ll let you know if that works/doesn’t work.