Nagios: Service Detail => Status



I have created a script to check disk status on several remote hosts. The script returns its status to Nagios, and Nagios picks up the return code for all of the hosts except one. For that host, the status stays green even when the script returns a nonzero exit code. If I run the command by hand from the monitoring (Nagios) server against the remote host, the return code is as expected.
To debug, I have forced the script to return exit code 2 (critical). The output and return code come back for all the remote hosts, like this:

mon@svg022 > ssh svg010 /net/svg022/export/monitor/nagios/libexec/raidctlchk
Raid Status: OK, Disk Status: OK, Mirror Status: Error
mon@svg022 > echo $?

All the hosts turn red after a while, except for that one!?
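
For context, Nagios derives the service state from the plugin's exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN), not from the status text. Below is a minimal sketch of how a check like this might map its status line to an exit code. The internals of raidctlchk are not shown in this thread, so the function name status_to_exit and the rule that the word "Error" means critical are assumptions for illustration only, and a POSIX shell is assumed.

```shell
# Hypothetical helper (not the real raidctlchk logic): map a status line
# to a Nagios plugin exit code.
# Nagios convention: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.
status_to_exit() {
    line=$1
    case $line in
        "")       return 3 ;;  # no output at all: report UNKNOWN
        *Error*)  return 2 ;;  # any field reporting Error: CRITICAL (assumed rule)
        *)        return 0 ;;  # everything else: OK
    esac
}

# Example use at the end of a plugin (commented out so the sketch has no side effects):
#   echo "$status_line"
#   status_to_exit "$status_line"
#   exit $?
```

The point of the sketch is only that the text and the exit code are independent: a plugin can print "Mirror Status: Error" and still be shown as green if the exit code that reaches Nagios is 0.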

Any ideas?



Are you saying that on the problem host, Nagios displays the status text that was returned but does not process the exit code? Have you tried wrapping the ssh command in parentheses, as in (ssh host command)? Are you sure the status message is from the current check, that is, does the ‘Last Check’ column show a fresh date and time for that particular service?
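
One way to see which exit code actually reaches Nagios is to point the check command at a small wrapper that records what it got before passing it on. This is only a sketch under assumptions: the function name run_check, the CHECK_DEBUG_LOG variable, and the default log path are made up for illustration, and it needs a POSIX shell (on Solaris, /bin/sh is the older Bourne shell, so something like ksh or /usr/xpg4/bin/sh would be required).

```shell
# Hypothetical debug wrapper: run a check command, log what came back,
# then re-emit the same output and exit code to the caller.
run_check() {
    rc=0
    # Capture stdout and stderr together; keep the command's exit status.
    output=$("$@" 2>&1) || rc=$?
    # Append a record to a debug log (path is an assumption, override via CHECK_DEBUG_LOG).
    echo "$(date) rc=$rc output=$output" >> "${CHECK_DEBUG_LOG:-/tmp/check_debug.log}"
    # Pass the plugin text and exit code through unchanged.
    echo "$output"
    return "$rc"
}

# Example use in a wrapper script (commented out so the sketch does not run ssh):
#   run_check ssh svg010 /net/svg022/export/monitor/nagios/libexec/raidctlchk
#   exit $?
```

If the log shows rc=2 for the problem host while Nagios still displays OK, the exit code is being lost after the check returns. If the log shows rc=0, it is being lost on the remote side or in ssh.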




We can see that the status text and the timestamp are updated, but the exit code is processed incorrectly on the “problem” host. That means the status stays OK (green) and will not switch to Critical (red).
We have tried putting the ssh command in parentheses, but it did not help.
We can also see that some other plugins, such as the disk utilization check, are working.
There is one other difference: the “problem” host runs Solaris 9, while the others run Solaris 8.