I would like my Nagios installation to check all of my services as quickly as possible, so that Nagios shows an accurate state of our network and servers and so that our reporting is accurate.
I have shortened the max_service_check_spread value and turned off passive checks, as we don't use them. Is there any other way to get the quickest check times for all services and to ensure quick recovery detection?
I asked for averages and from the above I can’t tell.
But anyway, no matter if the above are the max or averages, either one is too big.
Fix how long your checks take first. Most likely you have some poor timeout settings or something similar. Or perhaps your check_ping plugin is using something like -p 10 or maybe -p 5, when it should be -p 1.
See the Nagios docs on how to trim things up: nagios.sourceforge.net/docs/2_0/tuning.html
This is my problem: I have gone through all of the tuning documentation, but it still seems too slow to complete all checks. My check_ping uses “-p 1”, as it should.
No, only servers. We just have a busy network with lots of activity.
With regard to tuning Nagios, I have set the value
max_concurrent_checks=0
under the impression that this allows Nagios to perform the checks as fast as the server allows. Is this correct, or should I calculate this value manually?
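For anyone who does want to work out a manual value, here is a rough sketch in Python. It assumes the rule of thumb I recall from the Nagios check scheduling docs: take the larger of the reaper frequency and the average check execution time, divide by the inter-check delay, and round up. The function name and the sample numbers are made up for illustration, so plug in your own figures and treat the result as a starting point only.

```python
import math


def estimate_max_concurrent_checks(reaper_frequency, avg_execution_time, inter_check_delay):
    """Rough starting value for max_concurrent_checks (all inputs in seconds).

    Assumed rule of thumb, not an official implementation:
    ceil(max(reaper frequency, average execution time) / inter-check delay)
    """
    if inter_check_delay <= 0:
        raise ValueError("inter_check_delay must be greater than zero")
    return math.ceil(max(reaper_frequency, avg_execution_time) / inter_check_delay)


# Hypothetical figures: results reaped every 10 s, checks average 2.5 s,
# and 0.25 s between scheduled checks.
estimate = estimate_max_concurrent_checks(10, 2.5, 0.25)
print(estimate)
```

A value of 0 simply puts no limit on concurrent checks, so a calculated value only matters if you want to cap the load on the Nagios server.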
Active Service Latency: 0.000 / 24.049 / 0.574
That tells me that Nagios is not slow. If the average latency of a check is 0.574 seconds, then on average checks are starting within 0.574 seconds of the time they were scheduled to run. So why do you say Nagios isn’t completing all the checks, when the above info suggests it is?
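If it helps to keep an eye on these numbers over time, here is a small Python sketch that splits a stats line like the one quoted above into its three values. It assumes the order is min / max / average, which is how I read the Nagios stats output; the function name is just for illustration.

```python
import re


def parse_latency(line):
    """Split a 'label: min / max / avg' stats line into floats (seconds)."""
    match = re.search(r":\s*([\d.]+)\s*/\s*([\d.]+)\s*/\s*([\d.]+)", line)
    if match is None:
        raise ValueError(f"unrecognised stats line: {line!r}")
    # Assumed field order: minimum, maximum, average.
    minimum, maximum, average = (float(value) for value in match.groups())
    return {"min": minimum, "max": maximum, "avg": average}


stats = parse_latency("Active Service Latency: 0.000 / 24.049 / 0.574")
print(stats)
```

Read that way, the 24.049 is the worst single case and the 0.574 is the average, so the average is the figure to watch when judging whether Nagios is keeping up.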