Question about distributed monitoring

subsine · January 31, 2008, 10:33am

Hi there,

I’m considering implementing a distributed Nagios monitoring system for our ~300-host DCN network. It is likely to grow considerably over the next few years, so a scalable solution is required.
I was wondering if a slave host will ‘save’ its monitoring data if its connectivity to the master host is temporarily lost, and then update the master once connectivity has resumed.
Perhaps my question is best illustrated with an example:
Suppose we have several network sites which are all self-contained (each has its own firewall), with a Nagios slave in each site monitoring the local hosts. If WAN connectivity fails, the slaves cannot communicate with the master but can all still monitor their own LANs. Will the monitoring data collected during the WAN dropout be ‘saved’ and uploaded to the master once connectivity resumes, or will all that data be ‘lost’?
I hope that’s clear.
Of course I could set this up and test it, but asking might save me considerable time.

Thanks in advance,

Max.

sharamun · February 1, 2008, 10:01pm

subsine,

Since Nagios servers retain state history for the most part, you could in theory run a Nagios server in each LAN and, while the WAN is up, have them forward all alerts to a central Nagios server.

The problem with doing it this way is maintenance. Maintaining a dozen Nagios servers is manageable, but I doubt that significantly more than that would be practical.
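
For illustration, the ‘save and upload later’ behaviour asked about in the original question can be sketched as a small store-and-forward spool running on each slave. This is only a sketch under stated assumptions, not something Nagios is known from this thread to provide: the ResultSpool class, the send hook and the shape of a check result are all hypothetical names made up for the example, and the real transport to the master is left out entirely.

```python
from collections import deque
from typing import Callable, Deque, Tuple

# Hypothetical shape of one check result: (host, service, state, output).
CheckResult = Tuple[str, str, int, str]


class ResultSpool:
    """Buffer check results locally and forward them when the master is reachable."""

    def __init__(self, send: Callable[[CheckResult], bool], max_size: int = 10000):
        # `send` is a hypothetical transport hook that returns True on success.
        self._send = send
        # Bounded queue: once max_size is reached, the oldest results are dropped.
        self._pending: Deque[CheckResult] = deque(maxlen=max_size)

    def submit(self, result: CheckResult) -> None:
        """Queue a new result, then try to drain the queue in order."""
        self._pending.append(result)
        self.flush()

    def flush(self) -> int:
        """Send queued results oldest first and stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break  # master unreachable: keep the remaining results for later
            self._pending.popleft()
            sent += 1
        return sent

    def pending(self) -> int:
        """Number of results still waiting to be forwarded."""
        return len(self._pending)
```

Each slave would call submit() for every local check result and call flush() periodically, so anything collected during a WAN dropout goes out in order once the link returns. The bounded queue is a deliberate trade-off: a very long outage discards the oldest results rather than filling the slave's memory.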