check_oracle_health shows warning when OK

tsm · June 22, 2010, 7:37pm

I posted about this to the developer of the plugin, but it seems to be more of a Nagios issue. The plugin reports OK, both on the command line and in Nagios, but it appears in yellow as a Warning on the services page.

Commandline:

./check_oracle_health -v --connect=oracle_user/oracle_passwd@mySID --mode=connection-time --warning=3 --critical=8
OK - 0.14 seconds to connect as oracle_user | connection_time=0.1437;3;8

Nagios log:

SERVICE ALERT: oracle_host;Oracle mySID Connect;WARNING;HARD;3;OK - 0.14 seconds to connect as oracle_user

Nagios PerfData:
Current Status: WARNING (for 0d 1h 15m 37s)
Status Information: OK - 0.16 seconds to connect as oracle_user
Performance Data: connection_time=0.1550;3;8
Current Attempt: 3/3 (HARD state)
Last Check Time: 06-22-2010 15:31:44
Check Type: ACTIVE

Any idea how I get Nagios to report it correctly?

luca · June 23, 2010, 11:10am

Looks like the plugin is returning the wrong exit code.

Right after running the plugin from the command line, execute:

echo $?

It should print 0 (0 is OK, 1 is WARNING, 2 is CRITICAL).
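
If it helps, here is a minimal sketch that turns the exit code into the state name Nagios would show. `state_name` is a hypothetical helper made up for this post, not something shipped with Nagios or the plugin. The 0/1/2 mapping is the one above, and 3 is UNKNOWN in the usual plugin convention.

```shell
# Map a plugin exit code to the Nagios state name.
# state_name is a hypothetical helper, not part of Nagios or check_oracle_health.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "INVALID($1)" ;;   # anything else is outside the convention
    esac
}
```

Run the plugin and then `state_name $?` on the very next line, since `$?` only holds the exit code of the last command.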

tsm · June 23, 2010, 1:28pm

Yeah, I thought about that too. I hacked the plugin to add the exit code to the output, and now it looks like:

./check_oracle_health -v --connect=oracle_user/oracle_passwd@mySID --mode=connection-time --warning=3 --critical=8
OK(0) - 0.14 seconds to connect as oracle_user | connection_time=0.1437;3;8

And from the nagios.log:

CURRENT SERVICE STATE: oracle_host;Oracle Connect;WARNING;HARD;3;OK(0) - 0.18 seconds to connect as oracle_user

The one thing I noticed is that the performance data in Nagios always shows:

Current Attempt: 3/3 (HARD state)

Looks almost like it’s getting rechecked twice. Interestingly, if I change the config to do a max of 4 checks, the perf data shows:

Current Attempt: 4/4 (HARD state)

So, it seems Nagios is using this plugin differently than other plugins. I’ve fired off the script from the command line a bunch of times and never got a failure, so I don’t know what could be causing it to get rechecked the max number of times.
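
One way to compare what an interactive shell sees with what Nagios sees would be to run the plugin through a small logging wrapper in both places. The sketch below is only an assumption about how that could be done: `run_and_log` is a hypothetical helper, the log path is whatever you pass in, and it merges stderr into the logged output.

```shell
# run_and_log LOGFILE COMMAND [ARGS...]
# Hypothetical debugging helper: runs the command, appends the user,
# exit code and output to LOGFILE, then passes output and exit code through.
run_and_log() {
    log_file="$1"
    shift
    rc=0
    output=$("$@" 2>&1) || rc=$?          # capture stdout+stderr and the exit code
    printf '%s user=%s rc=%s output=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$(id -un)" "$rc" "$output" >> "$log_file"
    printf '%s\n' "$output"               # hand the original output back
    return "$rc"                          # and the original exit code
}
```

If the logged `rc` differs between a run from the command line and a run started by Nagios, the difference is likely in the environment or user the check runs under rather than in the plugin logic itself.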