Host Check retry interval is 70s in Nagios 3.0.6, compared to 3s in Nagios 2.12.
Any ideas as to what is causing this behavior?
Is there a parameter we can set that will affect the rate of Host Check retries?
Thanks in advance for any help you can provide.
Details:
When a Host Check goes critical, our installation performs subsequent Host Checks once every 70 seconds:
[07-28-2009 14:49:39] HOST ALERT: broker105;DOWN;HARD;10;CRITICAL - Host Unreachable (10.22.4.105)
[07-28-2009 14:48:29] HOST ALERT: broker105;DOWN;SOFT;9;CRITICAL - Host Unreachable (10.22.4.105)
[07-28-2009 14:47:19] HOST ALERT: broker105;DOWN;SOFT;8;CRITICAL - Host Unreachable (10.22.4.105)
[07-28-2009 14:46:09] HOST ALERT: broker105;DOWN;SOFT;7;CRITICAL - Host Unreachable (10.22.4.105)
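For anyone who wants to check the spacing on their own logs, here is a rough Python sketch that computes the gap between consecutive alerts. The function name `retry_intervals` is just something I made up for illustration, and it assumes each line carries a `[MM-DD-YYYY HH:MM:SS]` timestamp like the excerpt above.

```python
from datetime import datetime

# Log lines copied from the excerpt above (newest first, as Nagios shows them).
LOG_LINES = [
    "[07-28-2009 14:49:39] HOST ALERT: broker105;DOWN;HARD;10;CRITICAL - Host Unreachable (10.22.4.105)",
    "[07-28-2009 14:48:29] HOST ALERT: broker105;DOWN;SOFT;9;CRITICAL - Host Unreachable (10.22.4.105)",
    "[07-28-2009 14:47:19] HOST ALERT: broker105;DOWN;SOFT;8;CRITICAL - Host Unreachable (10.22.4.105)",
    "[07-28-2009 14:46:09] HOST ALERT: broker105;DOWN;SOFT;7;CRITICAL - Host Unreachable (10.22.4.105)",
]


def retry_intervals(lines):
    """Return the seconds between consecutive alerts, oldest first."""
    stamps = []
    for line in lines:
        # The timestamp sits between the first '[' and the first ']'.
        start = line.index("[") + 1
        end = line.index("]")
        stamps.append(datetime.strptime(line[start:end], "%m-%d-%Y %H:%M:%S"))
    stamps.sort()
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


print(retry_intervals(LOG_LINES))  # prints [70, 70, 70]
```

Every gap comes out at exactly 70 seconds, so the delay looks like a fixed scheduling interval and not random load-related lag.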
We have about 2500 service checks across 500 hosts.
max_concurrent_checks = 0
service_reaper_frequency = 10
normal_check_interval and retry_check_interval are set only for Service Checks, so they should not affect Host Check retries (in any case, both are in units of 60s).
The Nagios 2 documentation states that, “If the first host check returns a non-OK state, Nagios will keep pounding out checks …” The Nagios 3 documentation does not have the Service Check Scheduling page completed, but one retry every 70 seconds is definitely not “pounding out”.