Something like LOTS OF STUFF JUST WENT DOWN page


#1

We just set up Nagios a few weeks ago. Today the connection between Nagios and the internet was broken for about 7 minutes. After that, Nagios sent out 118 emails and SMS pages to tell us everything was down, then another 118 emails and SMS pages to tell us everything (118 servers) was back up.

Is there a way to get Nagios to send out one mail when multiple hosts and services go down at once like… “Hey LOTS of stuff just went down” instead of: “Server went down” 118 times and “Server came up” 118 times? I searched for this and could find nothing. I found lots of people misunderstanding the question and suggesting “Flap Detection” but nothing actually related to “Everything went down at once.”

Any suggestions, work-arounds, scripts, or ideas about this?

Thank you


#2

Maybe you can accomplish what you want by setting up dependencies between services. Check the documentation on object dependencies.
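
A rough sketch of what a service dependency could look like, in case it helps. The host name webserver1 and the service descriptions PING and HTTP are made up for illustration; substitute whatever exists in your own config. The idea is that notifications for HTTP are held back while PING on the same host is in a warning, unknown, or critical state.

define servicedependency {
    host_name                       webserver1   ; hypothetical host running the master service
    service_description             PING         ; hypothetical master service
    dependent_host_name             webserver1   ; host running the dependent service
    dependent_service_description   HTTP         ; hypothetical dependent service
    execution_failure_criteria      n            ; n = never skip checks of the dependent service
    notification_failure_criteria   w,u,c        ; suppress notifications while master is warning/unknown/critical
}

Note that this only covers services. For whole hosts dropping out at once, host parents or host dependencies are probably the better fit.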


#3

First, start by setting up parent dependencies between hosts, and disable notifications for the Unknown status.
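
One possible way to do the notification part for hosts, as a sketch only. For host objects the "u" option in notification_options stands for unreachable (for services it stands for unknown), so leaving it out means a host that is merely cut off behind a failed parent does not page anyone. The template name quiet-host is invented here, and generic-host is assumed to be a template you already have (it ships with the sample config, but yours may be named differently).

define host {
    name                    quiet-host      ; hypothetical template name
    use                     generic-host    ; assumed existing template
    notification_options    d,r             ; d = down, r = recovery; "u" (unreachable) is left out
    register                0               ; template only, not a real host
}

Hosts that should behave this way would then reference it with "use quiet-host" in their own definitions.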


#4

Actually, there are 2 ways.

If you use the “parents” directive in the host definition to define which other hosts the definition is dependent on, Nagios will check those hosts before deciding whether the host is really down or just unreachable. This requires you to change the host definitions, but has the advantage that the status map will reflect the topology.
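
As a sketch of the first approach, assuming a chain of router1, router2, VirtualHost, and one guest, with router1 being the device closest to the Nagios server. The generic-host template and the addresses (taken from the documentation range 192.0.2.0/24) are placeholders, not values from a real setup.

define host {
    use         generic-host    ; assumed existing template
    host_name   router1
    address     192.0.2.1       ; placeholder address
    ; no parents line: this is the first hop as seen from the Nagios server
}

define host {
    use         generic-host
    host_name   router2
    address     192.0.2.2       ; placeholder address
    parents     router1
}

define host {
    use         generic-host
    host_name   VirtualHost
    address     192.0.2.10      ; placeholder address
    parents     router2
}

define host {
    use         generic-host
    host_name   guest1
    address     192.0.2.11      ; placeholder address
    parents     VirtualHost
}

With this in place, if router1 stops responding, the hosts behind it should be flagged as unreachable rather than down, which only cuts the number of pages if unreachable notifications are turned off for those hosts.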

Another way to do this is to create a file that contains nothing but hostdependency definitions. So you could list a router and then all hosts directly connected to it. If “inherits_parent” is “1”, each dependency also inherits the dependencies of the host it depends on, so Nagios will check the entire path.

For example, imagine you have router1->router2->VirtualHost->(guest1|guest2|guest3):

define hostdependency {
    dependent_host_name             router2
    host_name                       router1
    inherits_parent                 1
    execution_failure_criteria      u,d,p
    notification_failure_criteria   u,d,p
}

define hostdependency {
    dependent_host_name             VirtualHost
    host_name                       router2
    inherits_parent                 1
    execution_failure_criteria      u,d,p
    notification_failure_criteria   u,d,p
}

define hostdependency {
    dependent_host_name             guest1,guest2,guest3
    host_name                       VirtualHost
    inherits_parent                 1
    execution_failure_criteria      u,d,p
    notification_failure_criteria   u,d,p
}

If router1 is down or unreachable, you won’t receive notifications for any of the other systems.

SilverSpore
silverspore.com