I’ve been floundering trying to find the best way to deal with this problem. We have about 700 hosts/services. Most of those sites go through one central point, and occasionally the link to that central point goes wonky. Ping times rise, and all of a sudden we’re getting 400-some warning/critical messages. Is there an easy way to have those warning/critical notifications suppressed when the central link is the real problem? Does the parents setting on the host definitions do this? I haven’t noticed that it does.
My idea to solve this was to stop getting notifications for all those hosts and just monitor the number of DOWN/WARNING/CRITICAL states instead, so that if the number of CRITs rose by a certain amount we would all get paged to that effect. I downloaded and tried the check_remote_nagios_status.pl plugin but was unable to get it to work with the Nagios 3 status log. I don’t think the plugin has been updated since 2003. Does anyone know of any other programs/plugins/methods that might accomplish this?
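To make the idea concrete, here is a rough Python sketch of the kind of check I have in mind. It assumes the Nagios 3 status file is made up of `hoststatus {` and `servicestatus {` blocks, each containing a `current_state=N` line, with the usual state codes (hosts: 0 UP, 1 DOWN, 2 UNREACHABLE; services: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function names and thresholds are made up for illustration, and it compares the absolute count of DOWN plus CRITICAL against the thresholds rather than tracking how fast the count rises.

```python
import re
from collections import Counter

# Assumed state codes as they appear in the status file.
SERVICE_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}
HOST_STATES = {0: "UP", 1: "DOWN", 2: "UNREACHABLE"}

# Matches the opening line of a host or service status block.
BLOCK_START = re.compile(r"^\s*(hoststatus|servicestatus)\s*\{\s*$")


def count_states(text):
    """Count host and service states found in status-file text."""
    counts = Counter()
    block = None  # which kind of block we are inside, if any
    for line in text.splitlines():
        match = BLOCK_START.match(line)
        if match:
            block = match.group(1)
            continue
        stripped = line.strip()
        if stripped == "}":
            block = None
            continue
        if block and stripped.startswith("current_state="):
            code = int(stripped.split("=", 1)[1])
            names = HOST_STATES if block == "hoststatus" else SERVICE_STATES
            counts[names.get(code, "UNKNOWN")] += 1
    return counts


def plugin_result(counts, warn, crit):
    """Return (exit_code, message) using Nagios plugin exit codes 0/1/2."""
    problems = counts["DOWN"] + counts["CRITICAL"]
    message = "%d DOWN, %d WARNING, %d CRITICAL" % (
        counts["DOWN"], counts["WARNING"], counts["CRITICAL"])
    if problems >= crit:
        return 2, "CRITICAL - " + message
    if problems >= warn:
        return 1, "WARNING - " + message
    return 0, "OK - " + message


def main(path, warn, crit):
    """Read the status file at `path`, print the result, return the exit code."""
    with open(path) as handle:
        code, message = plugin_result(count_states(handle.read()), warn, crit)
    print(message)
    return code
```

To run it as a plugin, something like `sys.exit(main(path_to_status_file, 50, 200))` would go at the bottom, with the path and both thresholds set to whatever suits the install. Notifications would then be enabled only for this one check instead of for the 700 individual hosts/services.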
Really I’m quite lost at the moment on the best way to accomplish this. Thank you for your help.