NRPE is driving me MAD


#1

This is driving me crazy.

I have Nagios installed (in the USA) monitoring more than 40 machines, and all is working well except for one machine (which is in Spain).

I also have another machine in Spain that’s working perfectly.

The error I’m getting is the famous one, of course:

[blockquote][root@ag001-es ~]# /usr/lib/nagios/plugins/check_nrpe -H 127.0.0.1
CHECK_NRPE: Error - Could not complete SSL handshake.[/blockquote]

I’m getting this error whether I run it locally or from the Nagios machine. I’ve read almost every thread available and couldn’t fix the problem. The machine is identical to another machine that’s working fine, and this is what’s driving me crazy.

When I disable SSL I get the following:
[blockquote]
CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages
[/blockquote]
There is nothing relevant in /var/log/messages, though.

I would appreciate any help. Please tell me what could be wrong here.


#2

You will typically get “Could not complete SSL handshake” when pointing check_nrpe at localhost unless 127.0.0.1 is listed in allowed_hosts in nrpe.cfg, so that isn’t a valid test.

Use a command that you have defined in nrpe.cfg on your remote server (spain) that you know works when run locally, and run nrpe from your nagios server to do your testing.
E.g. on spain, in /etc/nrpe.cfg, let’s say you have
command[check_root]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /

and when you run the command itself (everything after the “=”) on spain, it works fine. Then on nagios (USA)
$ /usr/lib/nagios/plugins/check_nrpe -H spain -c check_root

Use that method to do any testing. Still getting the SSL handshake error? Run tail -f /var/log/secure on spain. Make sure OpenSSL is the same version on spain as on spain2 (which is working). Try again without SSL.

How are you running the nrpe daemon on the computer in Spain: via xinetd, or manually as a daemon? Are you running nrpe as a different user (e.g. the nagios user)? If so, can you su - to the nagios user on spain and run the check_disk command?
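
The testing method above can be sketched roughly as follows. `nrpe_command_line` is a hypothetical helper (not part of NRPE) that prints the command line defined for a given name in nrpe.cfg, so that exactly the same string can be run by hand on spain before testing it through check_nrpe from the Nagios server:

```shell
# Hypothetical helper, not part of NRPE: print the command line that
# nrpe.cfg defines for a given command name (e.g. check_root).
nrpe_command_line() {
    cfg="$1"    # path to nrpe.cfg
    name="$2"   # command name, without the command[...] wrapper
    # Match lines of the form command[name]=... and print what follows the "=".
    sed -n "s/^command\[$name\]=//p" "$cfg"
}
```

On spain, something like `su - nagios -c "$(nrpe_command_line /etc/nrpe.cfg check_root)"` would then run the check as the nagios user, assuming that account has a login shell.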


#3

Thanks A LOT for your help

I ran this command on the nagios server

/usr/lib/nagios/plugins/check_http -I spain -p 8008 -t 2400 -u google.com/
HTTP OK - HTTP/1.0 302 Found - 1.120 second response time |time=1.119548s;;;0.000000 size=579B;;;0

So it worked fine, and I tried others as well, like check_load and check_users, and they work fine on Spain and on USA. When I run this, though, I get this error:

[blockquote] /usr/lib/nagios/plugins/check_nrpe -H spain -t 360
CHECK_NRPE: Error - Could not complete SSL handshake.[/blockquote]

And with no SSL I get this:

[blockquote]CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages[/blockquote]

I found this error in the log:

[blockquote]Cannot remove pidfile ‘/home/tarek/nagios/nrpe.pid’ - check your privileges.[/blockquote]

I did give this file nagios/nagios ownership, but when I restarted NRPE it was back to root/root.
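
One possible reading of that log line: deleting a file needs write permission on the directory that contains it, so changing the owner of nrpe.pid alone would not help, and the file coming back as root/root suggests it is recreated as root at startup. A minimal sketch for checking the directory, assuming the `pid_file` directive in nrpe.cfg points at the path shown in the log; `pid_dir_writable` is a hypothetical helper, not part of NRPE:

```shell
# Hypothetical helper, not part of NRPE: succeed if the directory that
# holds the configured pid file is writable by the current user.
pid_dir_writable() {
    cfg="$1"   # path to nrpe.cfg
    # Take the value of the last pid_file= line in the config.
    pid_file=$(sed -n 's/^pid_file=//p' "$cfg" | tail -n 1)
    # No pid_file directive found: nothing to check, report failure.
    [ -n "$pid_file" ] || return 1
    # Removing a file requires write permission on its directory.
    [ -w "$(dirname "$pid_file")" ]
}
```

It only says something useful when run as the user the daemon runs as (nagios here), for example `pid_dir_writable /etc/nrpe.cfg && echo writable`.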


#4

I’m running nrpe as a daemon and the user is nagios.

Regarding the SSL version, it’s

OpenSSL 0.9.8b 04 May 2006

on the Nagios machine and the Spain machine; however, it’s

OpenSSL 0.9.8g 04 May 2006 on the Spain2 machine.
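
The comparison can also be scripted. A small sketch, assuming the `openssl version` output from each machine is passed in as a string; `openssl_release` and `same_openssl_release` are hypothetical helper names:

```shell
# Hypothetical helper: pull the release field (e.g. 0.9.8b) out of a line
# of `openssl version` output such as "OpenSSL 0.9.8b 04 May 2006".
openssl_release() {
    # The release is the second whitespace-separated field.
    printf '%s\n' "$1" | awk '{ print $2 }'
}

# Succeed only if both version lines carry the same release string.
same_openssl_release() {
    [ "$(openssl_release "$1")" = "$(openssl_release "$2")" ]
}
```

For example, `same_openssl_release "$(openssl version)" "OpenSSL 0.9.8g 04 May 2006" || echo "versions differ"` would flag a mismatch between the local machine and the string pasted from another one.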

I appreciate your help a lot


#5

Hi!

Since your openssl versions are a bit different, you should re-compile nrpe for your server…

OR run BOTH sides of the nrpe checks without ssl.
If your nrpe is not too old, launch the nrpe daemon on the client (in Spain) with the option “-n”;
then make the call with “-n” too:
/usr/lib/nagios/plugins/check_nrpe -H spain -t 360 -n
(meaning that you won’t have SSL on this test … so I hope the channel is secure (VPN) :))
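
As a rough sketch of the no-SSL test described above: the `check_nrpe_plain` wrapper and the `CHECK_NRPE` variable are names made up for illustration, and the daemon command in the comment assumes the config lives at /etc/nrpe.cfg.

```shell
# Path to the client plugin as used in this thread; override if yours differs.
CHECK_NRPE=${CHECK_NRPE:-/usr/lib/nagios/plugins/check_nrpe}

# Hypothetical wrapper: query a host with SSL disabled (-n).
# This can only work if the daemon on that host was also started with -n,
# e.g. (assumed config path):  nrpe -n -c /etc/nrpe.cfg -d
check_nrpe_plain() {
    host="$1"
    "$CHECK_NRPE" -H "$host" -t 360 -n
}
```

Calling `check_nrpe_plain spain` from the Nagios server is then the same as the command shown above; if only one side uses “-n”, the check is expected to fail.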

hope this helps


#6

Hi,

I did try to run both sides with no SSL, but I got the following error in Spain:

[root@ag001-es ~]# /usr/lib/nagios/plugins/check_nrpe -H ag001-es -t 360 -n
Connection refused by host


#7

Well, if nothing works and you’re really stuck, you might try to re-compile both nrpe binaries (check_nrpe and nrpe) with the option “--disable-ssl”.

(that’s what I did after a day of trying to make them work on new intel+solaris servers :))


#8

Thanks for the help, it seems this is the only choice I’ve got. Thanks again :)


#9

This finally worked after installing Nagios 3.0.3 and the new plugins/NRPE, and it’s working like a charm.