Connection refused by host

I am experiencing a problem with Nagios running some checks on our Citrix servers.

For whatever reason, all of the checks will come up with the response: “Connection refused by host”.

We have checked the server at the time of occurrence and everything seems fine: no memory issues, no CPU issues, etc.

Then within 5-10 minutes the problem goes away.

Any suggestions/ideas would be appreciated.

I am running Nagios 1.2.
NRPE ver. 0.7f/2.0
NSclient ver. 2.01

What service check are you doing? Paste it here. If the problem goes away on its own, then there should be some entry in the log on the Citrix server such as “connection refused, too many connections” or something along those lines.
For example, an ftp check might be refused due to the ftp server having a 20-client connection limit.

I am doing checks for disk space, cpu usage, service checks, additions to the local admin groups, etc.

All of the checks come back with the “connection refused by host” and then they are fine 5 minutes later.

Does nrpe keep a log of connections made? If so, then perhaps the answer is there. I use nsca, so I can't say for sure, sorry.

From what I can tell from the “manual” for the nrpe client, it doesn’t do any logging of connections.

netstat -pta
Does that show it’s running?
If it works one minute and not the next, I can only think of CPU overload. I don't know for sure; maybe turn off all the other checks, leave just this one check running, and see if it helps.

It may just be that you are swapping memory, and we all know just how slow that can become, so the connection attempt is timing out.
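
One way to narrow this down is to probe the agent's TCP port directly and see whether the attempt is actively refused or simply times out. A refusal normally means nothing accepted the connection on that port at that moment, while an overloaded or swapping host more often shows up as a timeout. Below is a minimal sketch, not a tested diagnostic tool. The host name `citrix01` is hypothetical, and the ports mentioned (5666 for NRPE, 1248 for NSClient) are only the usual defaults, so check your own configuration.

```python
import socket


def classify_connect(host, port, timeout=5.0):
    """Make one TCP connect attempt and describe how it ended.

    Returns "open", "refused", "timeout", or "error: <details>".
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        # The remote end answered, but nothing accepted the connection.
        return "refused"
    except socket.timeout:
        # No answer within the timeout (host too busy, packet dropped, ...).
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. no route to host or name resolution failure.
        return "error: %s" % exc
    finally:
        sock.close()
```

Calling something like `classify_connect("citrix01", 5666)` once a minute from the Nagios machine and writing the result to a file with a timestamp would show whether the 5-10 minute window is really "refused" the whole time or a mix of refusals and timeouts.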

Personally, I don't like the idea of the nagios machine initiating a connection to another box, asking it to perform a check, and then having to retrieve and display the results. That is far too much work when you have over 1000 service checks running every 5 minutes.
Using nsca, you have the remote host making its own checks, making a connection to the Nagios machine, and passing the results to nagios, and nagios simply displays those results. It is far easier for the nagios machine to handle passive checks (results from nsca) than to perform active checks (nrpe initiated).
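
To make the passive side concrete: the `send_nsca` client reads one result per line on standard input, as tab-separated host name, service description, return code, and plugin output. Here is a rough sketch of building such a line on the remote host. The function name `nsca_line` and the example host and service names are made up for illustration; the return codes are the standard plugin ones (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
def nsca_line(host, service, return_code, output):
    """Build one tab-separated service check result line for send_nsca.

    Hypothetical helper: it only formats the text, it does not send anything.
    """
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return code must be 0, 1, 2 or 3")
    fields = [host, service, str(return_code), output]
    for field in fields:
        # Tabs and newlines are the delimiters, so they cannot appear in a field.
        if "\t" in field or "\n" in field:
            raise ValueError("fields must not contain tabs or newlines")
    return "\t".join(fields) + "\n"
```

For example, `nsca_line("citrix01", "Disk C", 1, "WARNING - 12% free")` gives a line that the remote host could pipe into `send_nsca`, and the Nagios machine only has to accept and display it.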

I’d suggest ditching nrpe and going with nsca.
[ Edited Sun Apr 17 2005, 05:45PM ]