I posted about this to the developer of the plugin, but it seems to be more of a Nagios issue. The plugin's output text says OK, both on the command line and in Nagios, but the service shows up in yellow as WARNING on the services page.
Command line:
./check_oracle_health -v --connect=oracle_user/oracle_passwd@mySID --mode=connection-time --warning=3 --critical=8
OK - 0.14 seconds to connect as oracle_user | connection_time=0.1437;3;8
Nagios log:
SERVICE ALERT: oracle_host;Oracle mySID Connect;WARNING;HARD;3;OK - 0.14 seconds to connect as oracle_user
Nagios service state information:
Current Status: WARNING (for 0d 1h 15m 37s)
Status Information: OK - 0.16 seconds to connect as oracle_user
Performance Data: connection_time=0.1550;3;8
Current Attempt: 3/3 (HARD state)
Last Check Time: 06-22-2010 15:31:44
Check Type: ACTIVE
Yeah, I thought about that too. I hacked the plugin to add the exit code to the output, and now it looks like this:
./check_oracle_health -v --connect=oracle_user/oracle_passwd@mySID --mode=connection-time --warning=3 --critical=8
OK(0) - 0.14 seconds to connect as oracle_user | connection_time=0.1437;3;8
And from the nagios.log:
CURRENT SERVICE STATE: oracle_host;Oracle Connect;WARNING;HARD;3;OK(0) - 0.18 seconds to connect as oracle_user
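Since Nagios takes the service state from the plugin's exit status rather than from the text it prints, it may be worth checking the status the parent process actually receives, not just the number the plugin prints about itself. Below is a minimal sketch of such a wrapper. The helper name run_plugin and the script name are made up for illustration; the 0/1/2/3 mapping is the standard Nagios plugin convention.

```python
import subprocess
import sys

# Standard Nagios plugin return codes.
NAGIOS_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_plugin(argv, env=None):
    """Run a plugin and return (exit_code, state_name, first_output_line).

    run_plugin is a hypothetical helper, not part of Nagios or the plugin.
    """
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        env=env,
    )
    lines = proc.stdout.splitlines()
    first_line = lines[0] if lines else ""
    # Anything outside 0-3 is only labelled here, not interpreted.
    state = NAGIOS_STATES.get(proc.returncode, "OUT OF RANGE")
    return proc.returncode, state, first_line


if __name__ == "__main__":
    if len(sys.argv) > 1:
        code, state, text = run_plugin(sys.argv[1:])
        print("exit=%d state=%s text=%r" % (code, state, text))
```

Saved as, say, plugin_exit.py, it could be run as the nagios user with the plugin command from above as its arguments. If it reports state=WARNING while the text starts with OK, the mismatch is in the plugin's exit path and not in Nagios.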
The one thing I noticed is that the service state information in Nagios always shows:
Current Attempt: 3/3 (HARD state)
It looks almost like it’s getting rechecked twice. Interestingly, if I change the config to do a max of 4 checks, the same page shows:
Current Attempt: 4/4 (HARD state)
So it seems Nagios is treating this plugin differently from other plugins. I’ve fired off the script from the command line a bunch of times and never got a non-OK result, so I don’t know what could be causing it to get rechecked the maximum number of times.
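On the attempt counter: as far as I understand the retry logic, a non-OK result is retried until max_check_attempts is reached, at which point the state goes HARD and the counter stays at the maximum. The sketch below is a deliberately simplified model of just that non-OK path (the function name attempt_display is made up, and recoveries are not modelled), but it reproduces both the 3/3 and the 4/4 lines above.

```python
def attempt_display(consecutive_non_ok, max_check_attempts):
    """Simplified model of the 'Current Attempt' line for a non-OK service.

    Assumption: each consecutive non-OK result bumps the attempt counter
    until it reaches max_check_attempts, where the state becomes HARD and
    the counter stops increasing. Recovery handling is left out.
    """
    attempt = min(consecutive_non_ok, max_check_attempts)
    state_type = "HARD" if consecutive_non_ok >= max_check_attempts else "SOFT"
    return "%d/%d (%s state)" % (attempt, max_check_attempts, state_type)


if __name__ == "__main__":
    # Every check returning non-OK, with max_check_attempts of 3 and then 4.
    print(attempt_display(10, 3))
    print(attempt_display(10, 4))
```

If that model holds, 3/3 and 4/4 are simply what a service looks like when every single check comes back non-OK, which would fit with Nagios seeing a WARNING exit status each time rather than an occasional failure.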