Nagios gets it wrong... or more likely me!

Hi all,

I’ve been beavering away with my Nagios install, and it’s all going pretty well. So I thought I’d move on to writing my own (very simple!) custom plugin.

I wrote a simple script to check whether a tape drive door is open or not. The script works fine when I run it from the command-line.

However, when I call it via check_nrpe, it always reports that a tape is inserted - even if it’s not!

So I figure I must have something wrong with how I am calling the script. Hours of web searching later I’m none the wiser, so I thought I’d ask and hope some kind soul takes pity on me! :slight_smile:

Here are the nuts and bolts of my script -

if [ "$(/bin/mt -tf /dev/st0 status | sed -n 6p)" = " DR_OPEN IM_REP_EN" ] ; then
    echo "WARNING - no tape is inserted"
    exit 1
else
    echo "OK - tape is inserted"
    exit 0
fi

and here is the command I’ve added to nrpe.cfg -

command[check_tape]=/usr/local/gml/bin/nagios_check_tape.sh

If I run the script from the command-line on a server with no tape in the drive, I get this -

./nagios_check_tape.sh

WARNING - no tape is inserted

But when I call it via check_nrpe (from the local server), I get this instead -

/usr/local/nagios/libexec/check_nrpe -H 127.0.0.1 -c check_tape

OK - tape is inserted

What am I missing?!

Thanks for any help.

Put this in a simple script:

/bin/mt -tf /dev/st0 status | sed -n 6p

Run it from crontab, with no user logged in, and see what the output is.

It’s probably some user initialization which isn’t done when running as a daemon, or otherwise without the user being logged in :slight_smile:
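One way to try this is a small wrapper that records which user the job runs as and everything the command prints, including errors. This is only a sketch: the function name `log_run` and the log path in the usage note are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: append the caller's identity and a command's output
# (stdout and stderr) to a log file, then return the command's exit status.
log_run() {
    local logfile=$1
    shift
    local status=0
    {
        echo "--- $(date)"
        # id prints the user and groups the command really runs with
        echo "identity: $(id)"
        "$@" 2>&1 || status=$?
        echo "exit status: $status"
    } >> "$logfile"
    return "$status"
}
```

Called as `log_run /tmp/tape_debug.log /bin/mt -tf /dev/st0 status` from a script run by cron, this should show whether mt fails and under which user and groups it ran.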

Thanks Luca.

The script in crontab worked when running as root, but produced no output at all when running as nagios… so I logged in as nagios, ran the script locally, and found I had no permissions on /dev/st0.

So I added the nagios user to the disk group (which owns the /dev/st0 device), and following that, running the script as the nagios user gave me the correct result.
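A plugin can surface this kind of problem itself by testing whether the current user can read the device before calling mt. The following is a minimal sketch, assuming bash; `check_device_access` is a hypothetical name, and read access alone may not be enough for every mt operation.

```shell
#!/bin/bash
# Hypothetical helper: report whether the current user can read a device node.
# Prints a Nagios-style line and returns 0 (readable) or 3 (UNKNOWN).
check_device_access() {
    local dev=$1
    if [ ! -e "$dev" ]; then
        echo "UNKNOWN - $dev does not exist"
        return 3
    fi
    if [ ! -r "$dev" ]; then
        # Include the output of id so the effective user and groups are visible
        echo "UNKNOWN - $(id) cannot read $dev"
        return 3
    fi
    echo "OK - $dev is readable"
    return 0
}
```

Running `check_device_access /dev/st0` both from a shell and via check_nrpe would show whether the two environments run with the same user and groups.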

However… check_nrpe still told me the tape was in the drive!

Eventually I got round this by changing /etc/xinetd.d/nrpe so the nrpe agent runs under the disk group, rather than the nagios group.

So it works, but I’m not convinced this is the best way to run it!

Any thoughts on how I could improve on this?

Thanks again!

Some permission issue for sure… WHICH permission issue remains open for debate :slight_smile:
You should consider changing your script to evaluate the error condition explicitly. With a plain if/then/else, OK is the fallback for anything that isn’t the warning condition, including a failed mt call, which isn’t the best possible choice.
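As a rough illustration of that advice, the sketch below treats a failed mt call as UNKNOWN and only reports OK when the output says so. It is an assumption-laden example, not the poster's script: `check_tape` and the `MT_CMD` override are hypothetical names, and it matches the `DR_OPEN` and `ONLINE` flags anywhere in the output rather than comparing line 6 exactly.

```shell
#!/bin/bash
# Sketch: every outcome maps to an explicit Nagios state; nothing falls through to OK.
# MT_CMD is a hypothetical override so the logic can be tried without a tape drive.
MT_CMD=${MT_CMD:-"/bin/mt -tf /dev/st0 status"}

check_tape() {
    local output
    # A non-zero exit from mt (for example, permission denied) is UNKNOWN
    if ! output=$($MT_CMD 2>&1); then
        echo "UNKNOWN - mt failed: $output"
        return 3
    fi
    case "$output" in
        *DR_OPEN*) echo "WARNING - no tape is inserted"; return 1 ;;
        *ONLINE*)  echo "OK - tape is inserted"; return 0 ;;
        *)         echo "UNKNOWN - unexpected mt output"; return 3 ;;
    esac
}
```

A real plugin would finish by calling `check_tape` and then `exit $?`, so that the function's return value becomes the state NRPE reports.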

I had to set the permissions of /dev/st0 to 664, and then it worked fine running under xinetd user/group nagios.

For completeness, here’s the final script -

#!/bin/bash

# init vars

DoorOpen=" DR_OPEN IM_REP_EN"
DoorClosed=" BOT ONLINE IM_REP_EN"

# perform check for $DoorOpen and return appropriate exit value

if [ "$(/bin/mt -tf /dev/st0 status | /bin/sed -n 6p)" = "$DoorOpen" ] ; then
    echo "WARNING - no tape is inserted"
    exit 1
fi

# perform check for $DoorClosed and return appropriate exit value

if [ "$(/bin/mt -tf /dev/st0 status | /bin/sed -n 6p)" = "$DoorClosed" ] ; then
    echo "OK - tape is inserted"
    exit 0
fi

# If we have not exited the script by now then report an unknown status

echo "Unknown status"
exit 3