Notification grouping/batching / notification digests


#1

Is there a way to batch up notifications and send out a single email every 5 minutes, instead of sending an email immediately after each state change? Something like:

14:30 (3) Crit
Host foohost1.nyc Ping CRIT
Host foohost2.nyc Ping CRIT
Host foohostn.nyc Ping CRIT

I realize there’s a way to structure things so that if X goes down, then assume everything else is unreachable, but this isn’t what I want to do. When something catastrophic happens to a site I’d rather get one email instead of 1500. The homegrown monitoring solution at my old job did this and it worked quite well.


#2

You can set the number of retries before a notification is sent for a host/service state change (the max_check_attempts option). There is also an option that re-sends the notification every X minutes while the problem persists (notification_interval). Both can be specified in every host/service definition.
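
As a rough sketch of how those two options look in a service definition: the host name is taken from the example above, the values 3 and 30 are arbitrary, and the other directives a real definition requires are left out. Treat it as an illustration and check it against the documentation for your Nagios version.

```
# Sketch only: required directives such as check_command and contact_groups are omitted.
define service{
        host_name               foohost1.nyc
        service_description     Ping
        max_check_attempts      3       ; values here are arbitrary: notify only after 3 failed check attempts
        notification_interval   30      ; re-notify every 30 time units (minutes by default) while the problem persists
        }
```

Setting notification_interval to 0 sends a single notification per problem instead of repeating it.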

If you wish to be informed only about the highest host/service in the hierarchy that went down/critical, then you should do it with parent/child relationships in host definitions, and with host/service dependencies:
nagios.sourceforge.net/docs/2_0/ … dependency
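
A minimal sketch of both approaches, assuming a hypothetical upstream device called nyc-router (that name is made up for illustration, and the other required host directives are omitted):

```
# Sketch only: nyc-router is a hypothetical parent; other required directives are omitted.
define host{
        host_name       foohost1.nyc
        parents         nyc-router      ; foohost1.nyc is reached through nyc-router
        }

define hostdependency{
        host_name                       nyc-router      ; the host being depended on
        dependent_host_name             foohost1.nyc
        notification_failure_criteria   d,u             ; suppress notifications while nyc-router is down or unreachable
        }
```

With the parents relationship in place, foohost1.nyc is reported as UNREACHABLE rather than DOWN when nyc-router fails, and notifications for that state can be turned off by leaving u out of the host's notification_options.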