New Twist on an Old Problem

I know I am not the only one with this problem. I am getting "CHECK_NRPE: Error - Could not complete SSL handshake." I will explain why this one confuses me.

I have a Linux machine running as a Nagios server. It checks several other Windows and Linux servers for different services and statistics. I use NRPE to communicate the results between the client and server. My current problem stems from the 2nd Linux server I am trying to configure. The 1st server is set up using NRPE and checks the same services as the 2nd server. I ran the same installation and setup steps for both servers, but the 1st box works and the 2nd box comes back with the SSL handshake error.

I have ruled out the firewall and the NRPE versions. Has anyone seen one setup work while another does not? I can post my configs if requested, but they seem to be the same on both clients.

Is the new client running nrpe through xinetd or as a daemon in a tcp wrapper? What happens when you telnet to the new client from the server on port 5666 - does it connect and then close immediately?

Both clients are running NRPE through xinetd. When I telnet to port 5666, it opens a connection and closes it immediately.

Interesting, I did the same telnet test to the working client and it did not close immediately.
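
For anyone who wants to repeat that telnet comparison from a script, here is a minimal sketch in Python. The function name probe_port and the result labels are made up for illustration; it only mimics what telnet shows (does the far end hold the connection or drop it) and does not attempt an SSL handshake.

```python
import socket


def probe_port(host, port, wait=2.0):
    """Connect like `telnet host port` and report what the remote side does.

    Returns one of (labels are illustrative, not NRPE terminology):
      "unreachable"        - the connection could not be established
      "closed_immediately" - accepted, then dropped without sending data
      "stayed_open"        - accepted and held open for `wait` seconds
      "sent_data"          - the remote side sent bytes before we sent any
    """
    try:
        sock = socket.create_connection((host, port), timeout=wait)
    except OSError:
        return "unreachable"
    with sock:
        sock.settimeout(wait)
        try:
            data = sock.recv(1)  # returns b"" when the peer closes cleanly
        except socket.timeout:
            return "stayed_open"
        except OSError:
            # e.g. the connection was reset by the remote side
            return "closed_immediately"
    return "sent_data" if data else "closed_immediately"
```

Run from the Nagios server against each client on port 5666, the behaviour described in this thread would presumably show up as "stayed_open" for the working client and "closed_immediately" for the broken one.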

hmmm - could be a bunch of stuff…
Have you checked file permissions on /etc/xinetd.d/nrpe, nrpe.cfg, /dev/random, and /dev/urandom? All should be readable by your nagios user…
Any typo in the “only_from = www.xxx.yyy.zzz” line in /etc/xinetd.d/nrpe?
Are there any configuration anomalies or differences between the 2 clients with respect to the contents of their /etc/hosts.allow and /etc/hosts.deny? (if indeed these files are used on your systems…)

I have checked everything you mentioned. My Nagios server has two NICs, and in the working client’s config I listed both IP addresses, while the non-working one had only one. I made the adjustment and restarted the xinetd service, but it doesn’t seem to make a difference.

Both machines are running a firewall but I have inserted the proper statements into iptables to allow connections. Both iptables statements are identical.

It is just frustrating that I use the same config and it works on one server and doesn’t on another. The only difference I know of between the two servers is that the working one runs RHEL 5 and the non-working one runs RHEL 4.
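
Since two configs that "look the same" can still differ in whitespace or line endings, a rough way to check is to diff them with every line shown through repr(). This is only a sketch; config_diff is a made-up helper name, and the idea is to feed it the contents of the same file (for example /etc/xinetd.d/nrpe or nrpe.cfg) copied from each client.

```python
import difflib


def config_diff(working_text, broken_text):
    """Return unified-diff lines between two config texts.

    Each line is passed through repr() first, so trailing spaces, tabs,
    and carriage returns show up instead of being invisible.
    """
    a = [repr(line) for line in working_text.splitlines(keepends=True)]
    b = [repr(line) for line in broken_text.splitlines(keepends=True)]
    return list(difflib.unified_diff(a, b, "working", "broken", lineterm=""))


if __name__ == "__main__":
    # Hypothetical one-line samples; the address is a placeholder.
    working = "only_from = 10.0.0.1\n"
    broken = "only_from = 10.0.0.1 \n"  # note the trailing space
    for line in config_diff(working, broken):
        print(line)
```

An empty result means the two texts are identical character for character. When reading the real files, opening them with open(path, newline="") keeps Python from silently normalising the line endings before the comparison.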

Well, the only other thing that you should need is the /etc/services entry for nrpe…
Can you run an lsof|grep nrpe to confirm that nrpe is listening and under the control of xinetd?
Also, what version of nrpe are you running on your clients, and are there any differences in the version of openssl?

Rechecked the /etc/services entries on the working and broken host - both are identical:
nrpe 5666/tcp #NRPE

lsof | grep nrpe on both hosts returns similar results:
xinetd random# root 5u IPv4 random# TCP *:nrpe (LISTEN)

Both NRPE clients are version 2.12.

OpenSSL is different. On my server and working host I am running OpenSSL v0.9.8b. On my non-working client I have 0.9.7a. The NRPE version info states that it needs OpenSSL 0.9.6 or higher, which 0.9.7a does meet. Do you think upgrading to v0.9.8b would be a good next step?

Yeah, I don’t think that would necessarily be a bad idea, although this is supposedly tested as working with 0.9.7a according to the information in README.SSL. Probably then worth re-installing NRPE and keeping an eye out during ./configure to make sure it finds your SSL install alright.

FIXED! I turned on debugging in the NRPE config file, then checked /var/log/messages and saw that my nrpe.cfg was throwing an error at a particular line. I had been copying configs from one machine to the other because I was using the same setup. Somewhere in the copy-paste process a carriage return was inserted into one line, but it looked just like word wrapping. I fixed the line, and nrpe.cfg then loaded correctly, which fixed all my problems.
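
In case it helps someone else hitting the same thing, here is a small sketch of a check for this kind of copy-paste damage. It assumes the usual nrpe.cfg layout of name=value lines, # comments, and blank lines, and flags anything else, which is only a heuristic (a broken-off fragment that happens to contain "=" would slip through). find_suspect_lines and the sample fragment are invented for illustration, and the plugin path in the sample is a placeholder.

```python
def find_suspect_lines(text):
    """Return (line_number, reason, repr_of_line) for lines that are not
    'name=value', a '#' comment, or blank, or that contain a carriage return."""
    suspects = []
    for number, raw in enumerate(text.split("\n"), start=1):
        if "\r" in raw:
            suspects.append((number, "contains a carriage return", repr(raw)))
            continue
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, _value = line.partition("=")
        if not sep or not name.strip():
            suspects.append((number, "not in name=value form", repr(raw)))
    return suspects


# Hypothetical fragment: the last command was split in two by a stray line break.
SAMPLE = (
    "# sample fragment, not the real file from this thread\n"
    "allowed_hosts=127.0.0.1\n"
    "command[check_users]=/path/to/plugins/check_users -w 5\n"
    " -c 10\n"
)

if __name__ == "__main__":
    for number, reason, shown in find_suspect_lines(SAMPLE):
        print(f"line {number}: {reason}: {shown}")
```

For a real file, reading it with open(path, newline="").read() keeps any carriage returns intact so the first check can see them.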

Thanks Strides for taking time out to just talk through different ways of debugging. Of course it had to be a simple problem making things so complex.

heh, glad it’s sorted, was running out of ideas !!!