Host going to hard state before max attempts

derekbrewer · October 11, 2010, 8:49pm

I am using Nagios 3.2.1 on RHEL 5.2 and noticed today that when a host check goes to a HARD CRITICAL state, the web interface and even the status.dat file show the current attempt as 1, even though in my case max attempts is set to 10. Looking through the log, I do see the 10 SOFT CRITICAL entries, but once it’s through that cycle, the attempt count seems to get reset to 1. Am I going crazy here, or is this truly a problem? I don’t think it affects much other than confusing me (and leaving me unable to work out from the status.dat file whether a host is in a hard state).
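
On the question of telling a hard state from status.dat, here is a minimal sketch rather than a definitive parser. It assumes each hoststatus block carries a state_type field (0 for SOFT, 1 for HARD) next to current_attempt and max_attempts, which is how Nagios 3.x status files are usually laid out. The host name web01 and the sample values are made up for illustration.

```python
def parse_hoststatus(text):
    """Return {host_name: fields} for every hoststatus block in status.dat text."""
    hosts = {}
    block = None  # fields of the hoststatus block being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("hoststatus") and line.endswith("{"):
            block = {}
        elif line == "}" and block is not None:
            hosts[block.get("host_name", "")] = block
            block = None
        elif block is not None and "=" in line:
            key, _, value = line.partition("=")
            block[key] = value
    return hosts


def is_hard(fields):
    # Assumption: state_type=1 means HARD, state_type=0 means SOFT.
    return fields.get("state_type") == "1"


# Hypothetical sample data, not taken from a real status.dat.
sample = """\
hoststatus {
    host_name=web01
    current_state=1
    current_attempt=1
    max_attempts=10
    state_type=1
    }
servicestatus {
    host_name=web01
    state_type=0
    }
"""

hosts = parse_hoststatus(sample)
for name, fields in hosts.items():
    print(name, "HARD" if is_hard(fields) else "SOFT",
          fields.get("current_attempt"), "of", fields.get("max_attempts"))
```

Reading state_type this way avoids having to infer the state type by comparing current_attempt against max_attempts.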

luca · October 12, 2010, 11:02am

Possibly you have more than one Nagios instance running.
Try stopping Nagios, check for surviving processes (ps -ef | grep nagios), kill them as required, then start Nagios and try again.
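
The process check above can also be scripted. This is a rough sketch, assuming a Linux ps that accepts `-eo pid,args` and that the daemon binary is simply named nagios; the install paths in the sample are hypothetical.

```python
import subprocess


def find_nagios_processes(ps_output):
    """Return (pid, command line) pairs whose executable is named 'nagios'."""
    matches = []
    for line in ps_output.splitlines()[1:]:  # first line is the ps header
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue
        pid, args = parts
        executable = args.split()[0]
        # Compare the basename only, so 'grep nagios' itself is not matched.
        if executable.rsplit("/", 1)[-1] == "nagios":
            matches.append((int(pid), args))
    return matches


def list_running():
    """Run ps and return the nagios processes currently alive."""
    result = subprocess.run(["ps", "-eo", "pid,args"],
                            capture_output=True, text=True, check=True)
    return find_nagios_processes(result.stdout)


# Hypothetical ps output showing two daemons from separate install prefixes.
sample_ps = """\
  PID COMMAND
 1201 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
 1544 /opt/nagios2/bin/nagios -d /opt/nagios2/etc/nagios.cfg
 1780 grep nagios
"""

found = find_nagios_processes(sample_ps)
```

Calling list_running() after stopping Nagios should return an empty list; anything left over is a candidate for kill.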

derekbrewer · October 12, 2010, 2:45pm

Thank you for the tip. I am running multiple instances of Nagios on the system (on purpose), and when I shut down the 2nd instance, it did report 10 of 10 attempts, which is correct. Any idea why or how my 2nd instance is conflicting with this one? As far as I can tell, everything else is completely separate. Again, I’m not going to lose sleep over this since it doesn’t affect much.

luca · October 12, 2010, 7:01pm

Looks like the two instances aren’t really separated…

If you want to purposely run two instances, you need to make sure ALL the paths listed in ‘./configure --help’ are different between the two instances. It looks like some info is still shared in your setup; don’t ask me what exactly.
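
One way to hunt for a shared path is to compare the two main config files. This is only a sketch: it assumes each instance has its own nagios.cfg and that the overlap shows up there as an identical absolute path (for example in status_file, lock_file or check_result_path). It does not cover paths fixed at compile time, and the sample values are invented, so it says nothing about which path is shared in this particular setup.

```python
def parse_main_config(text):
    """Parse key=value lines from nagios.cfg text into {key: [values]}."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Some directives (cfg_file, cfg_dir) may repeat, so keep a list.
        settings.setdefault(key.strip(), []).append(value.strip())
    return settings


def shared_paths(cfg_a, cfg_b):
    """Return (directive, path) pairs that are identical in both configs."""
    a, b = parse_main_config(cfg_a), parse_main_config(cfg_b)
    shared = []
    for key in sorted(a.keys() & b.keys()):
        for value in a[key]:
            # Only absolute paths are of interest here.
            if value.startswith("/") and value in b[key]:
                shared.append((key, value))
    return shared


# Hypothetical configs for two instances; all paths are made up.
instance_a = """\
# first instance
status_file=/usr/local/nagios/var/status.dat
lock_file=/usr/local/nagios/var/nagios.lock
check_result_path=/usr/local/nagios/var/spool/checkresults
"""
instance_b = """\
# second instance
status_file=/opt/nagios2/var/status.dat
lock_file=/opt/nagios2/var/nagios.lock
check_result_path=/usr/local/nagios/var/spool/checkresults
"""

overlap = shared_paths(instance_a, instance_b)
for directive, path in overlap:
    print("shared:", directive, path)
```

An empty result only rules out overlap in the main config files; paths baked in by ./configure still have to be checked separately.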