Greetings,
I’ve searched for a solution to my problem with no success. Unfortunately, the search terms are so common, even in combination, that I’m having a hard time finding answers. Here is my problem:
I have a Nagios Core 3.3.1 install on a RHEL 6.2 server that can do standard check_host_alive tests all day long. In fact, I’ve got it monitoring 137 hosts that way so far. But I know that the real power of Nagios lies within NRPE, and that is where the problem starts. I’ve only been experimenting with one remote host so far, another RHEL 6.2 system. I used a few online tutorials to install and configure NRPE on the remote host with xinetd. I created several custom commands in the nrpe.cfg file as follows:
command[check_root]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /dev/mapper/system-root
command[check_internal1]=/usr/local/nagios/libexec/check_disk -w 5% -c 10% -p /dev/mapper/system-internal1
command[check_tmp]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /dev/mapper/system-tmp
command[check_usr]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /dev/mapper/system-usr
command[check_system-local]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /dev/mapper/system-local
command[check_var]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /dev/mapper/system-var
I created the check_nrpe command on the Nagios server with the following definition:
define command {
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
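For reference, an object that uses this command passes the NRPE command name after the "!" separator, and Nagios substitutes it for $ARG1$. A minimal sketch of a service definition, assuming the stock generic-service template and a host object named plprdapp (neither is shown in this post, and the service description is only an illustrative label):

```
define service {
    use                     generic-service        ; assumed stock template
    host_name               plprdapp               ; assumed host object name
    service_description     Disk /var              ; illustrative label
    check_command           check_nrpe!check_var   ; "check_var" becomes $ARG1$
}
```

Whatever follows the "!" is handed to NRPE verbatim, so any stray character there (whitespace, punctuation) ends up as part of the command name NRPE looks up in nrpe.cfg.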
I can run the following from the command line on the Nagios server and get the appropriate output:
bash-4.1$ /usr/lib64/nagios/plugins/check_nrpe -H plprdapp -c check_var
DISK OK - free space: /var 7226 MB (94% inode=99%);| /var=427MB;7256;7659;0;8063
I can also run the underlying command (as defined in the nrpe.cfg file above) from the command line on the remote host and get the appropriate output:
[root@plprdapp etc]# /usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /dev/mapper/system-var
DISK OK - free space: /var 7226 MB (94% inode=99%);| /var=427MB;7256;7659;0;8063
The problem is that the host is reported as being down in the Nagios web interface despite all 7 services being OK.
The error reported on the host state information screen reads, “NRPE: Command ‘check_var,’ not defined”
How can this be? check_var works locally and remotely. Can someone point me in the right direction?