check_disk plugin not reporting on mounted filesystem


#1

Hi all,

I am having an absolute nightmare trying to monitor filesystems on a production AIX server running the NRPE daemon.

Here is the problem:

I have configured my NRPE.cfg like so:

command[check_aix_disks]=/usr/local/nagios/libexec/check_disk -e -w 10% -c 5% -x /proc -x /tsmdatapool -x /tsmlog -x /tsmdb -x /tsmarchivepool -x /nim/software

I run the below check_disk command from the server to be monitored:

check_disk -e -w 10% -c 5% -x /proc -x /tsmdatapool -x /tsmlog -x /tsmdb -x /tsmarchivepool -x /nim/software

and the output is:

DISK CRITICAL - free space: /oradata 2 MB (0% inode=50%);| /=52MB;115;121;0;128 /usr=1590MB;115;121;0;2304 /var=218MB;115;121;0;512 /tmp=405MB;115;121;0;2048 /home=210MB;115;121;0;256 /opt=80MB;115;121;0;128 /export/installios=673MB;115;121;0;1024 /nim/aix/53/CDs=6288MB;115;121;0;7168 /nim/aix/53/lppsource=1587MB;115;121;0;2080 /nim/aix/53/spot=409MB;115;121;0;2048 /nim/aix/53/mksysbs=6220MB;115;121;0;7168 /sysadmin=460MB;115;121;0;640 /app/oracle=12415MB;115;121;0;15744 /app/oracle/10g=3848MB;115;121;0;5984 /app/oracle/OEM=14331MB;115;121;0;17152 /oraarch/OEM=14263MB;115;121;0;24896 /oraarch/RMAN=3603MB;115;121;0;5440 /oradata=8925MB;115;121;0;8928 /oradata/OEM=7998MB;115;121;0;10592 /oradata/RMAN=4155MB;115;121;0;7392

The problem is that /oradata/RMAN and /oradata/OEM are not reported correctly when the check is run from the Nagios server:

[root@prodnag01 libexec]# ./check_nrpe -n -H salmo -c check_aix_disks
DISK CRITICAL - free space: /oradata 2 MB (0% inode=50%); /oradata/OEM 2 MB (0% inode=50%); /oradata/RMAN 2 MB (0% inode=50%);| /=52MB;115;121;0;128 /usr=1590MB;115;121;0;2304 /var=218MB;115;121;0;512 /tmp=405MB;115;121;0;2048 /home=210MB;115;121;0;256 /opt=80MB;115;121;0;128 /export/installios=673MB;115;121;0;1024 /nim/aix/53/CDs=6288MB;115;121;0;7168 /nim/aix/53/lppsource=1587MB;115;121;0;2080 /nim/aix/53/spot=409MB;115;121;0;2048 /nim/aix/53/mksysbs=6220MB;115;121;0;7168 /sysadmin=460MB;115;121;0;640 /app/oracle=12415MB;115;121;0;15744 /app/oracle/10g=3848MB;115;121;0;5984 /app/oracle/OEM=14331MB;115;121;0;17152 /oraarch/OEM=14263MB;115;121;0;24896 /oraarch/RMAN=3603MB;115;121;0;5440 **/oradata=8925MB;115;121;0;8928 /oradata/OEM=8925MB;115;121;0;8928 /oradata/RMAN=8925MB;115;121;0;8928**

You will notice that the highlighted entries above (/oradata, /oradata/OEM and /oradata/RMAN) all show the same values, which is incorrect.

I ran a few tests and configured the NRPE daemon to run the following check:

command[check_aix_disks]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /oradata/OEM -p /oradata/RMAN -p /oradata

When I run the command from the server to be monitored:

–> ./check_disk -w 10% -c 5% -p /oradata/OEM -p /oradata/RMAN -p /oradata
DISK OK - free space: /oradata/OEM 2593 MB (24% inode=99%); /oradata/RMAN 3236 MB (43% inode=99%); /oradata 1814 MB (20% inode=99%);| /oradata/OEM=7998MB;9532;10062;0;10592 /oradata/RMAN=4155MB;6652;7022;0;7392 /oradata=7113MB;6652;7022;0;8928

but when I run the NRPE check from the nagios server I get:

[root@prodnag01 libexec]# ./check_nrpe -n -H salmo -c check_aix_disks
DISK CRITICAL - /oradata/OEM does not exist

I get the same result when I just run -p /oradata/RMAN

but the check is successful when I run -p /oradata

Can someone please advise? This is driving me mad. It only happens on 2 of the 55 monitored servers, and those 2 are configured identically.

Any help would be much appreciated and would stop me crying on my keyboard :)

Thanks

Chris


#2

I’m not really familiar with AIX, but I’ll take a stab at it. To troubleshoot, I would run df as the user that NRPE runs as and see what you get. Maybe special permissions are needed to see certain partitions/mounts. Hope it helps; I don’t have an AIX machine to test on.
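
A rough sketch of that kind of check, untested on AIX: the helper below walks a path one directory at a time and stops at the first component the current user cannot see or cannot search. The function name check_path is made up for illustration and is not part of NRPE or the Nagios plugins.

```shell
#!/bin/sh
# check_path is a hypothetical helper, not part of NRPE or the Nagios plugins.
# It walks a path from / downward and reports the first directory that the
# current user cannot see or cannot search (traverse).
check_path() {
    target=$1
    current=""
    old_ifs=$IFS
    IFS=/
    set -f               # no globbing while the path is split
    set -- $target       # split on "/" into positional parameters
    set +f
    IFS=$old_ifs
    for part in "$@"; do
        [ -z "$part" ] && continue      # skip the empty field before the leading /
        current="$current/$part"
        if [ ! -e "$current" ]; then
            # Not visible to this user (or it really does not exist).
            echo "MISSING $current"
            return 1
        elif [ -d "$current" ] && [ ! -x "$current" ]; then
            # Visible, but this user lacks search permission on it.
            echo "DENIED  $current"
            return 1
        fi
        echo "OK      $current"
    done
    return 0
}
```

Running something like `check_path /oradata/OEM` once as root and once as the NRPE account should show where the two differ. The account is assumed here to be `nagios` (the real one is whatever `nrpe_user` is set to in nrpe.cfg), so the plain df test would look like `su - nagios -c 'df -k /oradata/OEM'`.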


#3

genius!

That is the one. Thanks a million, my friend. I investigated this for about 3 hours and now have little hair left :)