Hi all,
I am having an absolute nightmare trying to monitor filesystems on a production AIX server running the NRPE daemon.
Here is the problem:
I have configured my NRPE.cfg like so:
command[check_aix_disks]=/usr/local/nagios/libexec/check_disk -e -w 10% -c 5% -x /proc -x /tsmdatapool -x /tsmlog -x /tsmdb -x /tsmarchivepool -x /nim/software
When I run the same check_disk command directly on the server to be monitored:
check_disk -e -w 10% -c 5% -x /proc -x /tsmdatapool -x /tsmlog -x /tsmdb -x /tsmarchivepool -x /nim/software
the output is:
DISK CRITICAL - free space: /oradata 2 MB (0% inode=50%);| /=52MB;115;121;0;128 /usr=1590MB;115;121;0;2304 /var=218MB;115;121;0;512 /tmp=405MB;115;121;0;2048 /home=210MB;115;121;0;256 /opt=80MB;115;121;0;128 /export/installios=673MB;115;121;0;1024 /nim/aix/53/CDs=6288MB;115;121;0;7168 /nim/aix/53/lppsource=1587MB;115;121;0;2080 /nim/aix/53/spot=409MB;115;121;0;2048 /nim/aix/53/mksysbs=6220MB;115;121;0;7168 /sysadmin=460MB;115;121;0;640 /app/oracle=12415MB;115;121;0;15744 /app/oracle/10g=3848MB;115;121;0;5984 /app/oracle/OEM=14331MB;115;121;0;17152 /oraarch/OEM=14263MB;115;121;0;24896 /oraarch/RMAN=3603MB;115;121;0;5440 /oradata=8925MB;115;121;0;8928 /oradata/OEM=7998MB;115;121;0;10592 /oradata/RMAN=4155MB;115;121;0;7392
The problem is that /oradata/RMAN and /oradata/OEM cannot be monitored correctly from the Nagios server:
[root@prodnag01 libexec]# ./check_nrpe -n -H salmo -c check_aix_disks
DISK CRITICAL - free space: /oradata 2 MB (0% inode=50%); /oradata/OEM 2 MB (0% inode=50%); /oradata/RMAN 2 MB (0% inode=50%);| /=52MB;115;121;0;128 /usr=1590MB;115;121;0;2304 /var=218MB;115;121;0;512 /tmp=405MB;115;121;0;2048 /home=210MB;115;121;0;256 /opt=80MB;115;121;0;128 /export/installios=673MB;115;121;0;1024 /nim/aix/53/CDs=6288MB;115;121;0;7168 /nim/aix/53/lppsource=1587MB;115;121;0;2080 /nim/aix/53/spot=409MB;115;121;0;2048 /nim/aix/53/mksysbs=6220MB;115;121;0;7168 /sysadmin=460MB;115;121;0;640 /app/oracle=12415MB;115;121;0;15744 /app/oracle/10g=3848MB;115;121;0;5984 /app/oracle/OEM=14331MB;115;121;0;17152 /oraarch/OEM=14263MB;115;121;0;24896 /oraarch/RMAN=3603MB;115;121;0;5440** /oradata=8925MB;115;121;0;8928 /oradata/OEM=8925MB;115;121;0;8928 /oradata/RMAN=8925MB;115;121;0;8928**
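You will notice that the highlighted entries above (/oradata, /oradata/OEM and /oradata/RMAN) all show the same values, which is incorrect: over NRPE, both sub-filesystems report the figures for /oradata instead of their own.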
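To make the comparison less error-prone than reading those long lines by eye, here is a rough Python sketch that pulls the performance data (everything after the "|") out of a check_disk result and lists mount points reporting identical used/total values. It is only an illustration and not part of my setup: the function names parse_perfdata and duplicated_readings are made up, the sample string is a shortened copy of the NRPE output above with the ** highlight markers removed, and it assumes mount point names contain no spaces.

```python
from collections import defaultdict


def parse_perfdata(plugin_output):
    """Return {mount_point: (used, total)} from the text after the '|'."""
    _, _, perf = plugin_output.partition("|")
    metrics = {}
    for token in perf.split():
        label, sep, values = token.partition("=")
        if not sep:
            continue  # not a label=value token
        fields = values.split(";")
        # check_disk perfdata layout: used;warn;crit;min;max
        metrics[label] = (fields[0], fields[-1])
    return metrics


def duplicated_readings(metrics):
    """Group mount points that report exactly the same (used, total) pair."""
    groups = defaultdict(list)
    for mount, reading in metrics.items():
        groups[reading].append(mount)
    return [sorted(mounts) for mounts in groups.values() if len(mounts) > 1]


# Shortened copy of the output returned through check_nrpe.
sample = (
    "DISK CRITICAL - free space: /oradata 2 MB (0% inode=50%);| "
    "/oraarch/RMAN=3603MB;115;121;0;5440 "
    "/oradata=8925MB;115;121;0;8928 "
    "/oradata/OEM=8925MB;115;121;0;8928 "
    "/oradata/RMAN=8925MB;115;121;0;8928"
)

print(duplicated_readings(parse_perfdata(sample)))
# prints [['/oradata', '/oradata/OEM', '/oradata/RMAN']]
```

Run against the local check_disk output instead, the same sketch returns an empty list, because /oradata/OEM and /oradata/RMAN report their own values there.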
I ran a few tests and configured the NRPE daemon to run the following check:
command[check_aix_disks]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /oradata/OEM -p /oradata/RMAN -p /oradata
When I run the command from the server to be monitored:
–> ./check_disk -w 10% -c 5% -p /oradata/OEM -p /oradata/RMAN -p /oradata
DISK OK - free space: /oradata/OEM 2593 MB (24% inode=99%); /oradata/RMAN 3236 MB (43% inode=99%); /oradata 1814 MB (20% inode=99%);| /oradata/OEM=7998MB;9532;10062;0;10592 /oradata/RMAN=4155MB;6652;7022;0;7392 /oradata=7113MB;6652;7022;0;8928
But when I run the NRPE check from the Nagios server, I get:
[root@prodnag01 libexec]# ./check_nrpe -n -H salmo -c check_aix_disks
DISK CRITICAL - /oradata/OEM does not exist
I get the same result when I check only -p /oradata/RMAN, but the check succeeds when I check only -p /oradata.
Please can someone advise, as this is driving me mad. It only happens on 2 of our 55 monitored servers, and those 2 are identically configured.
Any help will be much appreciated and will stop me crying on my keyboard.
Thanks
Chris