Nagios scalability


#1

Hey guys, does anyone have experience using Nagios to monitor a large number of nodes?

I want to use Nagios to monitor link packet loss, and I have about 2,500 links that need to be monitored. Do you think Nagios can cope with that? I would be using the check_icmp plugin.


#2

You won’t have a problem. Once you hit 5000 or so hosts you might have to multithread or find some other solution.
A number of large organizations use Nagios. It’s hard to get scalability numbers because performance varies with the ratio of hosts to service checks, how often your service checks run, whether host checking is turned on, and so on.

To give you an idea, we’ve got ~200 hosts and ~4500 passive service checks running at 1 min and 5 min intervals, with results received from various distributed Nagios sites by a dual-core P4 3 GHz box, and its load average is ~0.25. This is Nagios 2.

I’ve heard of comparable hardware with a few more cores doing ~4000 hosts though.

check out
nagios.sourceforge.net/docs/3_0/tuning.html

If you have the option, set up your checks as services rather than defining 2500 hosts, and Nagios will have absolutely no problem. You’ll be fine either way though.
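
A rough sketch of what "services rather than hosts" could look like: a small Python script that writes one Nagios service definition per link, all attached to a single host. Everything named here is an assumption for illustration: the host `wan-core`, the link names and addresses, the thresholds, the `generic-service` template, and a command called `check_link_loss` that you would define yourself as something like `$USER1$/check_icmp -H $ARG1$ -w $ARG2$ -c $ARG3$`.

```python
def service_block(host_name, link_name, address,
                  warn="200.0,5%", crit="500.0,20%"):
    """Return one Nagios service definition for a single link.

    warn/crit use check_icmp's "rta,pl%" threshold format.
    check_link_loss is a hypothetical command wrapping check_icmp.
    """
    lines = [
        "define service {",
        "    use                  generic-service",
        f"    host_name            {host_name}",
        f"    service_description  Link loss {link_name}",
        f"    check_command        check_link_loss!{address}!{warn}!{crit}",
        "}",
    ]
    return "\n".join(lines)


def render(host_name, links):
    """Join the service definitions for every (name, address) pair."""
    return "\n\n".join(
        service_block(host_name, name, addr) for name, addr in links
    )


# Hypothetical links; in practice these would come from an inventory file.
links = [("branch-001", "192.0.2.1"), ("branch-002", "192.0.2.2")]
config = render("wan-core", links)
print(config)
```

The host `wan-core` still has to exist as a normal host definition, and with 2,500 links the list would be read from a file rather than typed in.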


#3

Cool. Thanks for that. I understand many people use Nagios; however, I thought that around 200-300 hosts would be the upper limit for most normal installations.

Here we have it monitoring 50 servers, which results in about 500 service checks. But I am trying to make a case for it to monitor our WAN as well, which consists of approximately 2,500 routers.

I didn’t think many companies would have or want to monitor 2,500 hosts (which would result in about 10,000 service checks). That’s why I thought I’d ask :)
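
For a rough sense of scale, here is a back-of-the-envelope calculation of the average check rate that 10,000 service checks would imply. The 1 min and 5 min intervals are assumptions borrowed from the reply above, not settled values, and the result is only an average that assumes checks are spread evenly over the interval.

```python
def checks_per_second(num_checks, interval_seconds):
    """Average number of checks the scheduler must run each second."""
    return num_checks / interval_seconds


# 10,000 service checks at two assumed check intervals.
for interval_min in (1, 5):
    rate = checks_per_second(10_000, interval_min * 60)
    print(f"{interval_min} min interval: {rate:.1f} checks/sec")
```

At a 5 min interval that works out to roughly 33 checks per second, and at 1 min roughly 167, which is why the check interval matters as much as the raw host count.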