Nagios 2.9. I currently have 39 hosts and 67 services monitored. When I go to the Host Detail page, the “Last Check” column is anywhere from several days to several months old for most hosts. Two show N/A and durations of 567d 2h 9m 16s. For every one of those hosts, I can look at View Status Detail For This Host and see that checks have been run in the past few minutes.
Where does Nagios get this info from, and how can I reset it so it’s accurate?
Smells to me like a case of two nagios processes running at once. You might like to stop nagios, then do a killall -9 nagios to make sure everything is stopped, before restarting.
OK, I moved retention.dat and started nagios; everything went to Pending and then OK. Already, though, the Last Check times are stringing out through the day. Each individual service shows a time that looks reasonable, but on the Host Detail page a few hosts show the last few minutes, and the others show various times going back to this morning.
You are not alone. I didn’t try moving retention.dat and status.dat, but I have the exact same problem with nagios and the reporting of checks. My symptoms go deeper, though: I also run PNP and I lose my performance data, and forced active checks do not run when scheduled.
I have cron jobs that restart nagios, nsca (it dies under xinetd and daemon for me), and now I think I need to add ndo2db to the periodic restart. I’m at wit’s end.
The Last Check on the Host Detail page is the last check-host-alive check for that host. It has no connection with the regular service checks. The host check (check-host-alive) is executed only when all services fail (that is, are in a Critical state) and when there is no Last Check information (as in your case, where you moved the retention.dat file and every host was in the Pending state).
So don’t be confused by the Last Check info for hosts: host checks are executed only under special conditions, not on a regular basis. Services are checked on a regular basis.
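To see the difference for yourself, here is a minimal Python sketch that compares each host's last_check with the newest last_check of its services. It assumes the status file is made of `type { key=value ... }` blocks named `host` and `service` with `host_name` and `last_check` fields; those names and the sample data are my assumptions, so check them against your own status.dat and pass different block names if yours differ.

```python
def parse_status_blocks(text):
    """Yield (block_type, fields) for each 'type { key=value ... }' block."""
    block_type, fields = None, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("{"):
            # Start of a new block, e.g. "host {"
            block_type, fields = line[:-1].strip(), {}
        elif line == "}":
            if block_type is not None:
                yield block_type, fields
            block_type = None
        elif block_type is not None and "=" in line:
            key, _, value = line.partition("=")
            fields[key] = value


def last_checks(text, host_block="host", service_block="service"):
    """Return ({host: host last_check}, {host: newest service last_check})."""
    hosts, services = {}, {}
    for block_type, fields in parse_status_blocks(text):
        if "last_check" not in fields or "host_name" not in fields:
            continue
        name, ts = fields["host_name"], int(fields["last_check"])
        if block_type == host_block:
            hosts[name] = ts
        elif block_type == service_block:
            services[name] = max(services.get(name, 0), ts)
    return hosts, services


# Hypothetical sample data, not taken from a real installation.
SAMPLE = """
host {
    host_name=web1
    last_check=1200000000
    }
service {
    host_name=web1
    service_description=PING
    last_check=1200086400
    }
service {
    host_name=web1
    service_description=HTTP
    last_check=1200086700
    }
"""

hosts, services = last_checks(SAMPLE)
for name, host_ts in hosts.items():
    lag = services.get(name, host_ts) - host_ts
    print(f"{name}: host check is {lag} seconds older than newest service check")
```

A large lag here is expected when host checks only run on demand; it does not by itself mean anything is broken.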
This is cut and pasted from the official 2.0 documentation, under Host definition (nagios.sourceforge.net/docs/2_0/ … .html#host), for the check_interval directive: “NOTE: Do NOT enable regularly scheduled checks of a host unless you absolutely need to! Host checks are already performed on-demand when necessary, so there are few times when regularly scheduled checks would be needed. Regularly scheduled host checks can negatively impact performance - see the performance tuning tips for more information. This directive is used to define the number of ‘time units’ between regularly scheduled checks of the host. Unless you’ve changed the interval_length directive from the default value of 60, this number will mean minutes. More information on this value can be found in the check scheduling documentation.”
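The “time units” arithmetic in that quote can be sketched in a couple of lines of Python. The function name is just for illustration; the only facts taken from the documentation are that check_interval counts time units and that interval_length defaults to 60 seconds.

```python
def seconds_between_checks(check_interval, interval_length=60):
    """Convert a check_interval in 'time units' to seconds."""
    return check_interval * interval_length


# With the default interval_length of 60, a check_interval of 5 means 5 minutes.
print(seconds_between_checks(5))                      # 300
# If interval_length were changed to 1, the same value would mean 5 seconds.
print(seconds_between_checks(5, interval_length=1))   # 5
```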