NRPE: can't get check_ping to work through the web UI



I’m using Nagios 3.2.1 with NRPE 2.12.

I’m trying to monitor some services on a remote host, and I got 3 of the default ones working (Current users, root partition, current processes), but PING still isn’t working. I’m pretty sure it’s just a bad argument somewhere, but I can’t seem to find it. Note that the other commands ARE working, so it’s not a general command line argument thing, it’s just specific to PING.

First off, when I run the check from the server via check_nrpe on the command line, it works:

/usr/local/nagios/libexec/check_nrpe -H X.X.X.X -c check_ping -a X.X.X.X 3000.0,80% 5000.0,100% 5
PING OK - Packet loss = 0%, RTA = 0.01 ms|rta=0.014000ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0

On my server, the PING service definition looks like this:

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       XXXXX
        service_description             PING
        is_volatile                     0
        check_period                    24x7
        max_check_attempts              4
        normal_check_interval           5
        retry_check_interval            1
        contact_groups                  admins
        notification_options            w,u,c,r
        notification_interval           960
        notification_period             24x7
        check_command                   check_nrpe!check_ping!100.0,20%!500.0,60%
        }

My check_nrpe command looks like this:

# 'check_nrpe' command definition
define command{
        command_name    check_nrpe
        command_line    /usr/local/nagios/libexec/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a $ARG2$ $ARG3$ $ARG4$
        }

Finally, on the client, the check_ping command looks like this:

command[check_ping]=/usr/local/nagios/libexec/check_ping -H $ARG1$ -w $ARG2$ -c $ARG3$ -p 5

So, as you can see, the service should be passing check_ping, 100.0,20% and 500.0,60% to the command, the command should be passing all those, along with the host address to the client, and the client should be handling all those correctly.

Can anyone tell me if I’m doing something wrong here, or if there’s something I’m not understanding?


I haven’t used NRPE in a long time, so this is just a guess.
Doesn’t the log on the executing server show which command is actually being executed?


There is, but it doesn’t give me any more information than the web UI does.

Here’s the relevant part of /var/log/messages on the server:

Mar 24 00:00:00 nagios: CURRENT SERVICE STATE: ;Current Users;OK;HARD;1;USERS OK - 1 users currently logged in
Mar 24 00:00:00 nagios: CURRENT SERVICE STATE: ;PING;UNKNOWN;HARD;4;check_ping: %s: Warning threshold must be integer or percentage!
Mar 24 00:00:00 nagios: CURRENT SERVICE STATE: ;Root Partition;OK;HARD;1;DISK OK - free space: / 497548 MB (95% inode=99%):
Mar 24 00:00:00 nagios: CURRENT SERVICE STATE: ;Total Processes;OK;HARD;1;PROCS OK: 166 processes


Looks like the args are getting messed up somehow.

I get that exact error when I pass a string instead of an integer to check_ping in the -w or -c field:

jake:~# /usr/local/nagios/libexec/check_ping -H -w d
check_ping: %s: Warning threshold must be integer or percentage!
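
One way to see how the args could get shifted is to trace the macro substitution by hand. The sketch below is only a simplified model, not Nagios or NRPE code: it assumes Nagios splits check_command on "!" into $ARG1$, $ARG2$, ..., and that NRPE hands the words after -a to the client as the client's own $ARG1$, $ARG2$, .... The expand helper and the variable names are made up for illustration; the three definitions are copied from the posts above.

```python
import re

def expand(template, args):
    """Replace $ARG1$..$ARGn$ in template; a missing argument becomes an empty string."""
    def substitute(match):
        index = int(match.group(1)) - 1
        return args[index] if index < len(args) else ""
    return re.sub(r"\$ARG(\d+)\$", substitute, template)

# Definitions copied from the thread; X.X.X.X stands in for the real address.
check_command = "check_nrpe!check_ping!100.0,20%!500.0,60%"
server_template = ("/usr/local/nagios/libexec/check_nrpe -H $HOSTADDRESS$ "
                   "-c $ARG1$ -a $ARG2$ $ARG3$ $ARG4$")
client_template = ("/usr/local/nagios/libexec/check_ping "
                   "-H $ARG1$ -w $ARG2$ -c $ARG3$ -p 5")

# Server side (assumed): everything after the first "!" becomes $ARG1$, $ARG2$, ...
_, *server_args = check_command.split("!")
server_cmd = expand(server_template.replace("$HOSTADDRESS$", "X.X.X.X"), server_args)

# Client side (assumed): the words after -a become the client's own $ARG1$, $ARG2$, ...
words = server_cmd.split()
client_args = words[words.index("-a") + 1:]
client_cmd = " ".join(expand(client_template, client_args).split())

print(server_cmd.strip())
print(client_cmd)
```

In this model the first threshold ends up in -H and nothing is left for -c, whereas the manual check_nrpe run earlier in the thread put X.X.X.X first after -a. Whether that is exactly what happens on this setup is a guess; it only illustrates how the values can shift by one position.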


But I don’t understand how that can be the case. Looking at my PING service definition, the check command is:

check_command check_nrpe!check_ping!100.0,20%!500.0,60%

The values are right there.

Is there any way I can debug this further? The logs don’t seem to be helping, and it works from the command line…
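
One possible way to debug this further is to log what the client actually receives. The script below is a minimal sketch, assuming the check_ping entry in the client's nrpe.cfg is temporarily pointed at this wrapper instead of the real plugin. The log path and the wrapper itself are hypothetical and not part of Nagios or NRPE; the plugin path is the one used in the thread.

```python
#!/usr/bin/env python3
import os
import shlex
import sys
import time

# The plugin path is taken from the thread; the log path is an assumption.
REAL_PLUGIN = "/usr/local/nagios/libexec/check_ping"
LOG_PATH = "/tmp/check_ping_args.log"

def format_log_line(argv):
    """Render the arguments shell-quoted so empty or odd values stay visible."""
    return " ".join(shlex.quote(arg) for arg in argv)

def main():
    # Record a timestamp and the exact arguments this wrapper received.
    stamp = time.strftime("%Y-%m-%d %H:%M:%S ")
    with open(LOG_PATH, "a") as log:
        log.write(stamp + format_log_line(sys.argv[1:]) + "\n")
    # Replace this process with the real plugin so its output and exit code pass through.
    os.execv(REAL_PLUGIN, [REAL_PLUGIN] + sys.argv[1:])

# Only act when called with plugin arguments, as NRPE would call it.
if __name__ == "__main__" and sys.argv[1:]:
    main()
```

Comparing a log line written during a scheduled check with one written during the working command-line run should show whether the two invocations really hand check_ping the same values.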