I have installed and configured NRPE on both the local host and the remote server. The “check_nrpe” service check in Nagios returns an error (socket timeout after 10 seconds). I have configured the firewalls on both ends, opening port 5666 and forwarding any NRPE packets to the local Nagios server. I still get a timeout error on both the local and remote machines when running "/usr/local/nagios/libexec/check_nrpe -H ". Here is my config on the remote Linux server:
[code]# default: on
description: NRPE (Nagios Remote Plugin Executor)
service nrpe
{
flags = REUSE
socket_type = stream
port = 5666
wait = no
user = nagios
group = nagios
server = /usr/local/nagios/bin/nrpe
server_args = -c /usr/local/nagios/etc/nrpe.cfg --inetd
log_on_failure += USERID
disable = no
only_from = 127.0.0.1
}[/code]
The server running Nagios does not need NRPE set up on it; it just runs check_nrpe. check_nrpe initiates the connection, so no firewall settings should be needed there either.
On the remote server:
-Try temporarily >removing< the only_from line to make sure you’ve got that correct. As posted, it only allows connections from 127.0.0.1. Restart xinetd afterwards.
-Do a netstat -pantu | grep 5666 and make sure xinetd is actually listening on that port. If it’s not, make sure the path to nrpe is correct, the path to nrpe.cfg is correct, and the nagios user exists.
-tail /var/log/messages after restarting the inet daemon to make sure it isn’t erroring on something.
-Run “tail -f /var/log/secure” on your remote server (or the similar file if you’re not on Red Hat) while doing a query. See if it’s even seeing the connection attempt, and whether it’s blocking it for some reason.
-Turn off iptables or the firewall temporarily, if you’re running one, to see if that’s the problem.
-Is there an upstream firewall blocking traffic?
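The listening check from the list above could be scripted roughly like this. It is only a sketch: the helper name port_is_listening is made up for illustration, and the pattern assumes Linux net-tools style netstat output, so adjust it if your netstat prints differently.

```shell
#!/bin/sh
# Sketch of the "is anything listening on 5666?" check on the remote server.
# port_is_listening is a hypothetical helper, not part of NRPE or Nagios.
NRPE_PORT=5666   # port from the xinetd config posted above

# Read netstat-style output on stdin; return 0 if a TCP socket is
# in LISTEN state on the port given as $1.
port_is_listening() {
    grep -Eq "^tcp.*[:.]$1[[:space:]].*LISTEN"
}

if netstat -pantu 2>/dev/null | port_is_listening "$NRPE_PORT"; then
    echo "port $NRPE_PORT: something is listening"
else
    echo "port $NRPE_PORT: nothing listening; check xinetd, the nrpe paths and the nagios user"
fi
```

Run it as root on the remote box after restarting xinetd. If it reports nothing listening, the problem is on the xinetd side rather than in a firewall.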
I’d be doing a
check_nrpe -H -c check_whatever
instead. Replace check_whatever with a command you have defined in nrpe.cfg on the remote box.
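If you want the result spelled out, a small wrapper along these lines might help. It is a sketch only: run_check and status_name are hypothetical names, the host and command are passed in as arguments rather than assumed, and the mapping is the usual Nagios plugin exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/sh
# Hypothetical wrapper: run one NRPE command and name its exit status.
CHECK_NRPE=/usr/local/nagios/libexec/check_nrpe   # path from the original post

# Map a Nagios plugin exit code to its conventional name.
status_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo "UNEXPECTED($1)" ;;
    esac
}

# Usage: run_check <host> <command-defined-in-nrpe.cfg>
run_check() {
    "$CHECK_NRPE" -H "$1" -c "$2"
    rc=$?
    echo "exit status: $rc ($(status_name "$rc"))"
    return "$rc"
}
```

Source the file on the Nagios server and call run_check with your remote host and a command name from nrpe.cfg.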
If I were you, I’d switch to the nagios user on the remote box (su - nagios) and run one of the plugin commands (something like /usr/local/nagios/libexec/check_load -w 8,3,1 -c 10,5,2) to make sure the nagios user can actually run it. This is irrelevant to your current problem, since a permissions problem won’t cause a timeout, but it may save you trouble later.
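That permissions check could look something like the following, run as root on the remote box. Treat it as a sketch: plugin_is_executable is a made-up helper, and it assumes the nagios account has a usable login shell for su.

```shell
#!/bin/sh
# Sketch: confirm the nagios user can run a plugin on the remote box.
PLUGIN=/usr/local/nagios/libexec/check_load   # plugin path from the post above

# Return 0 if the file exists and is executable by the current user.
plugin_is_executable() {
    [ -x "$1" ]
}

if ! id nagios >/dev/null 2>&1; then
    echo "no nagios user on this box"
elif ! plugin_is_executable "$PLUGIN"; then
    echo "$PLUGIN is missing or not executable"
else
    # Thresholds are the example values from the post above.
    su - nagios -c "$PLUGIN -w 8,3,1 -c 10,5,2"
    echo "plugin exit status: $?"
fi
```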