I’m curious if anybody else has this problem…
The Nagios install is running quite fine together with nagiostat. The problem is that sometimes Nagios seems to simply stop executing checks. Nagios IS in fact running (and possibly even performing the checks), but the last check times were all two hours back (and the nagiostat graphs stopped at the same time).
root@submariner(sun58):/# ps -ef | grep nagios
nagios 23770 23769 0 0:00
nagios 5389 1 0 02:01:02 ? 8:52 /export/nagios2/bin/nagios -d /export/nagios2/etc/nagios.cfg
nagios 23769 5389 0 07:27:10 ? 0:00 /export/nagios2/bin/nagios -d /export/nagios2/etc/nagios.cfg
Killing the 23769 process brought Nagios back to life… and all last check times jumped to somewhere near the current time… I’m going to check some HTTP logs to see whether Nagios checked the sites in the meantime, or what else happened…
At the moment, to limit this, I have a “nagiosreload” script running each night, so at most it loses a day (but that’s BAD anyway).
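Something more targeted than a blind nightly reload could be a small watchdog run from cron. Below is a rough sketch only, not something I have running: it assumes the status file contains lines of the form `last_check=<epoch seconds>`, and the function names and the 900 second threshold are made up for illustration.

```python
import re
import time

# Assumption: the status file records check times as lines such as
# "last_check=1136100000" (epoch seconds), one per host/service block.
LAST_CHECK_RE = re.compile(r"^\s*last_check=(\d+)\s*$", re.MULTILINE)


def newest_last_check(status_text):
    """Return the most recent last_check timestamp in the text, or None."""
    stamps = [int(value) for value in LAST_CHECK_RE.findall(status_text)]
    return max(stamps) if stamps else None


def checks_stalled(status_text, now=None, max_age=900):
    """Return True if no check was recorded within the last max_age seconds.

    max_age=900 is an arbitrary illustrative threshold; it should be longer
    than the longest normal check interval.
    """
    if now is None:
        now = time.time()
    newest = newest_last_check(status_text)
    if newest is None:
        # No check data at all: treat this as stalled.
        return True
    return (now - newest) > max_age
```

The idea would be to read the file named by the status_file setting in nagios.cfg, pass its contents to `checks_stalled`, and only trigger the reload when it returns True, so a hang is caught within minutes instead of up to a day later.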
Any ideas from those running 2.03b? BTW I have had this with all nagios2 releases.
Thank you in advance, Luca
PS: Running on Solaris 8 together with an MRTG instance.
EDIT: An update on this: killing the above process did wake Nagios up, but only for a really short time. It looks like it did a complete check of all hosts/services and then hung again…