I am running Nagios 2.10 which monitors approximately 3000 hosts (1 service per host).
I get huge check latency and have tried a bunch of things, none of which seem to work.
Here is the output of the Nagios preflight check:
Nagios 2.10
Copyright © 1999-2007 Ethan Galstad (nagios.org)
Last Modified: 10-21-2007
License: GPL
Projected scheduling information for host and service
checks is listed below. This information assumes that
you are going to start running Nagios with your current
config files.
Total hosts: 3345
Total scheduled hosts: 0
Host inter-check delay method: SMART
Average host check interval: 0.00 sec
Host inter-check delay: 0.00 sec
Max host check spread: 30 min
First scheduled check: N/A
Last scheduled check: N/A
Total services: 3345
Total scheduled services: 3345
Service inter-check delay method: SMART
Average service check interval: 300.00 sec
Inter-check delay: 0.09 sec
Interleave factor method: SMART
Average services per host: 1.00
Service interleave factor: 1
Max service check spread: 30 min
First scheduled check: Mon Aug 18 10:38:13 2008
Last scheduled check: Mon Aug 18 10:43:12 2008
Service check reaper interval: 10 sec
Max concurrent service checks: 400
I have no suggestions - things look okay.
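As a rough sanity check on the scheduling figures above, the inter-check delay can be reproduced from the totals. This is only a sketch: it assumes the SMART method spreads all service checks evenly over one average check interval, and the variable names are mine, not Nagios settings.

```python
# Figures copied from the preflight output above.
total_services = 3345
avg_check_interval_sec = 300.0

# Assumption: SMART spreads every service check evenly across one
# average check interval, so delay = interval / number of services.
inter_check_delay_sec = avg_check_interval_sec / total_services

# The same figure expressed as a rate: how many checks must start
# each second for the schedule to keep up.
checks_per_sec = total_services / avg_check_interval_sec

print(f"inter-check delay: {inter_check_delay_sec:.2f} sec")
print(f"required start rate: {checks_per_sec:.2f} checks/sec")
```

The computed delay matches the 0.09 sec reported above, so Nagios has to start roughly 11 service checks every second to stay on schedule.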
Here is the tactical overview:
Service Check Execution Time: 0.24 / 15.46 / 2.463 sec
Service Check Latency: 634.00 / 683.67 / 658.990 sec
Host Check Execution Time: 0.03 / 7.22 / 0.676 sec
Host Check Latency: 0.00 / 0.00 / 0.000 sec
Active Host / Service Checks: 3345 / 3345
Passive Host / Service Checks: 0 / 0
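To put the numbers above in perspective, here is a back-of-the-envelope estimate of how many service checks would need to be in flight at once. It is a simplified model, not a description of how Nagios schedules internally: it assumes checks start at a steady rate and applies Little's law (concurrency = start rate x execution time). It also assumes the three values in each row are min / max / average.

```python
# Figures copied from the preflight output and tactical overview.
total_services = 3345
check_interval_sec = 300.0
avg_exec_time_sec = 2.463    # assumed to be the average column
max_exec_time_sec = 15.46    # assumed to be the max column
avg_latency_sec = 658.990    # assumed to be the average column

# Steady-state rate at which checks must start to stay on schedule.
start_rate = total_services / check_interval_sec

# Little's law: average number in flight = start rate * time in system.
avg_concurrency = start_rate * avg_exec_time_sec
# Pessimistic bound: pretend every check takes as long as the slowest one.
worst_case_concurrency = start_rate * max_exec_time_sec

# How far behind schedule the checks are, measured in check intervals.
intervals_behind = avg_latency_sec / check_interval_sec

print(f"average concurrency needed: {avg_concurrency:.1f}")
print(f"worst-case concurrency:     {worst_case_concurrency:.1f}")
print(f"latency in check intervals: {intervals_behind:.2f}")
```

Under these assumptions only about 27 checks need to run concurrently on average (about 172 in the pessimistic case), while the checks are running more than two full intervals late.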
As you can see above, I have forced max concurrent service checks to 400 instead of the default “0” (unlimited). The service check latency is in the hundreds of seconds (and at times reaches the thousands).
I have kept the ping check as lean as possible:
$USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 1 -t 5
Also, I use “pexec” (an alternative to rsh) to check hosts.
Kindly let me know if you have any suggestions.
Thank you!