I have a simple setup running Nagios 2.3.1 (actually NagiosVMA, configured using Groundwork Monarch). I have a host with many services on it (both remote checks performed using check_by_ssh and checks for various public services), and I get a barrage of notifications whenever something goes wrong.

So I defined dependencies. The services on a host depend on the host itself (and/or on the PING service and some other general services). However, because of timing issues I still get hammered with alerts. What happens during an outage is this:

  1. The host itself goes down, but before Nagios detects that, a few services are already detected as down individually (so I get a few service down notifications).
  2. More services that go down when the host is down do not generate any notifications - good.
  3. When the outage is over, Nagios detects the host is up.
  4. Now the services are detected as going up, and since the host is up there is no dependency to filter the notifications, so I get a whole barrage of “service OK” notifications.
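
For reference, the kind of service dependency described above can be sketched roughly as follows. This is only an illustration: the host name `web1` and the service name `HTTP` are placeholders, not taken from the actual configuration, and the failure criteria shown are just one plausible choice.

```
# Hypothetical example: HTTP on web1 depends on PING on web1.
# Host and service names are placeholders.
define servicedependency{
        host_name                       web1
        service_description             PING
        dependent_host_name             web1
        dependent_service_description   HTTP
        # never suppress the check itself
        execution_failure_criteria      n
        # suppress HTTP notifications while PING is WARNING, UNKNOWN or CRITICAL
        notification_failure_criteria   w,u,c
        }
```

With a definition like this, notifications for the dependent service are only held back while the master service is in one of the listed states, which matches the behaviour in steps 2 and 4: once PING is OK again, nothing filters the recovery notifications.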

These are killing my cellphone :(. Can anyone point me to a better way to configure this?