Big network needs to be monitored

  • I’ll just tell you what I want. I just need a hint on how to do it.
  • I have a big network, mainly Windows stations. There are 4 classes of IPs and I need to monitor … the presence of, let’s say, at least one IP per class to know that a given segment of the network is OK. Is it possible to use the ping, fping or arping plugins to scan all 254 IPs on a network segment and report the findings to Nagios? …
  • I also need to know if the IP definitions of groups can be given in the format … a group of IPs to be monitored as a hostgroup.

Waiting for an answer. Thanks in advance.

Yes to most of your questions, and no to the format.
You would have to list every IP address in the hosts.cfg file, along with the name you wish to give it… You can then define a hostgroup named 10subnet in the hostgroups.cfg file and put each hostname in that group. Now, instead of defining a ping check for each and every host, you define a single ping service that uses hostgroup_name instead. Nagios will then ping every host in that hostgroup.
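Nobody wants to type 254 host entries by hand, so here is a minimal sketch of a script that writes them for you. Treat it as an illustration, not a drop-in config: the 10.0.0.0/24 range, the host-10-0-0-N naming scheme, and the generic-host / generic-service templates (which would have to supply the remaining required settings such as contacts and check periods) are all assumptions on my part. The check_ping thresholds are just the usual sample values.

```python
import ipaddress


def build_config(network, group_name,
                 host_template="generic-host",
                 service_template="generic-service"):
    """Return Nagios object definitions for every usable address in `network`.

    Template names and the host naming scheme are assumptions; adjust them
    to whatever your own configuration uses.
    """
    net = ipaddress.ip_network(network)
    names = []
    blocks = []

    # One host definition per usable address (254 of them for a /24).
    for addr in net.hosts():
        name = "host-" + str(addr).replace(".", "-")
        names.append(name)
        blocks.append(
            "define host {\n"
            f"    use        {host_template}\n"
            f"    host_name  {name}\n"
            f"    alias      {name}\n"
            f"    address    {addr}\n"
            "}\n"
        )

    # One hostgroup that lists every generated host as a member.
    blocks.append(
        "define hostgroup {\n"
        f"    hostgroup_name  {group_name}\n"
        f"    alias           {net}\n"
        f"    members         {','.join(names)}\n"
        "}\n"
    )

    # One ping service attached to the hostgroup instead of to each host.
    blocks.append(
        "define service {\n"
        f"    use                  {service_template}\n"
        f"    hostgroup_name       {group_name}\n"
        "    service_description  PING\n"
        "    check_command        check_ping!100.0,20%!500.0,60%\n"
        "}\n"
    )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical subnet; redirect the output into your own .cfg file.
    print(build_config("10.0.0.0/24", "10subnet"))
```

Run it once per segment with a different group name and you end up with one hostgroup and one ping service per segment.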

If you want Nagios to discover every IP on the 10.xxxx network, it’s not going to happen. Get a commercial app for around $10,000 and I’m sure it would auto-discover.
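If all you need is "does at least one address on this segment answer?", a small script outside Nagios can do the sweep. The following is only a sketch under assumptions: it shells out to the system ping with the Linux iputils flags (-c for count, -W for timeout in seconds), the function names are made up for this example, and it probes addresses one after another, so a dead /24 takes a few minutes. The return codes follow the Nagios plugin convention (0 for OK, 2 for CRITICAL).

```python
import ipaddress
import subprocess


def ping_once(address, timeout_s=1):
    """Send one ICMP echo via the system ping; True if the host answered.

    Assumes Linux iputils flags: -c <count>, -W <timeout in seconds>.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), str(address)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def sweep(network, probe=ping_once):
    """Return every usable address in `network` that answered the probe."""
    return [str(a) for a in ipaddress.ip_network(network).hosts() if probe(a)]


def segment_status(network, probe=ping_once):
    """Return (exit_code, message) in Nagios plugin style.

    Stops at the first address that answers, since one live host is
    enough to call the segment reachable.
    """
    for addr in ipaddress.ip_network(network).hosts():
        if probe(addr):
            return 0, f"OK - {addr} answered on {network}"
    return 2, f"CRITICAL - no host answered on {network}"
```

The probe is passed in as a parameter so you could swap ping for fping or arping without touching the rest. The list that sweep() returns could also be used to seed your hosts.cfg.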

I’m playing around with tkined which does discover for you.

Thu Feb 09 04:30:55 PM EST 2006
Discover xx.xx.xx.xx from localhost []
107 nodes found on network xx.xx.xx.xx in 48 seconds.
107 routes traced in 4 seconds.
109 netmasks queried in 25 seconds.
38 snmp agents queried in 25 seconds.
109 ip addresses queried in 4 seconds.
3 networks discovered in 0 seconds.
5 gateways discovered in 0 seconds.
214 links discovered in 0 seconds.
2 nodes merged in 0 seconds.
219 tkined objects created in 0 seconds.
Discover finished in 106 seconds

And then it draws out one big mother of a map that kinda looks like spaghetti, but much more structured.

Well, bottom line is, if you have a big network and you have no monitor on it now, then somebody better get to work. Unless you don’t care how long it takes to find and fix a problem with your network. Now let’s see, how does this switch get to the router? Which port? Ummm, ahh heck, just set up Nagios and get it over with. It checks our network: every switch, every router, every cable/port used to connect one to the other, every eth0 on a host and how it connects to a switch, the CPU loads, temps, fan RPMs, power supplies, and on, and on, and on…

Of course, it didn’t do this by itself. I’ve put plenty of work into it. And I can guarantee that I didn’t just tell it “look at the net and see what’s working”. No, I had to find out for myself how my Nagios PC connects to a switch, how that switch connects to another, how that switch connects to a router. I then looked at every switch in the company and figured out what port was used to connect from A to B, to the router, etc. It’s not easy, but is it worth it? You tell me. It takes me 5 minutes to discover exactly which cable has been cut/unplugged/etc. How long would it take you?
If you could not ping your router, is it your eth0 cable/card, the switch you are plugged into, or perhaps the other switch that daisy-chains to the router you can’t ping? You don’t know, unless you actually ping the switches first. Oh, but wait, you don’t remember the IPs of the switches or even which switches are in between you and the router. Ahh, heck, call someone who knows how the network is cabled up and let them fix it. NOT!!! This is NOT how you want to fix your network. Nagios will not only tell you what is broken, but you will also learn your entire network layout, because you need to know it to configure Nagios correctly.
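To make the "ping the switches first" idea concrete, here is a rough sketch of the logic. It is not Nagios code: in Nagios you express the same layout with the parents directive in each host definition. The device names and the parents mapping below are invented for illustration, and is_up stands in for whatever check you actually run.

```python
def path_to(target, parents):
    """Follow parent links from `target` back toward the monitoring box.

    `parents` maps each device to the device it hangs off of, or None if
    it is plugged straight into the monitoring host. The returned list is
    ordered from the device nearest the monitor out to `target`.
    """
    path = []
    seen = set()
    node = target
    while node is not None:
        if node in seen:
            raise ValueError(f"parent loop at {node}")
        seen.add(node)
        path.append(node)
        node = parents.get(node)
    path.reverse()
    return path


def first_failure(target, parents, is_up):
    """Check each hop from nearest to farthest; return the first one down.

    Returns None if every hop, including the target, is up.
    """
    for node in path_to(target, parents):
        if not is_up(node):
            return node
    return None


if __name__ == "__main__":
    # Hypothetical layout: monitor -> switch-a -> switch-b -> router
    parents = {"switch-a": None, "switch-b": "switch-a", "router": "switch-b"}
    down = {"switch-b", "router"}  # pretend switch-b lost power
    print(first_failure("router", parents, lambda n: n not in down))
```

With that layout the answer is switch-b, not the router: everything behind a dead switch looks dead too, and the nearest failing hop is the one to go look at.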

Get my point?