Nagios escalating prematurely


#1

Hi all, we have Nagios 2.9 running with three escalation levels:

  1. second line support
  2. third line support
  3. all engineers.

What seems to happen is that if a service/host is in a warning state for a while and then goes to critical, it escalates directly to step 3, all engineers.
This annoys the people who get woken up by an SMS :)

So is this a bug or a feature? :) Or did I configure something wrong? The escalation definitions are below. I’d love to know how to ignore the warnings in the notification counters.

[code]define hostescalation {
host_name .*
contact_groups level_1
first_notification 3
last_notification 0
notification_interval 10
escalation_options d,u,r
}

define hostescalation {
host_name .*
contact_groups level_1,level_2
first_notification 6
last_notification 0
notification_interval 10
escalation_options d,u,r
}

define hostescalation {
host_name .*
contact_groups level_1,level_2,level_3
first_notification 9
last_notification 0
notification_interval 10
escalation_options d,u,r
}

define serviceescalation {
host_name .*
service_description .*
contact_groups level_1
first_notification 3
last_notification 0
notification_interval 10
escalation_options c,u,r
}

define serviceescalation {
host_name .*
service_description .*
contact_groups level_1,level_2
first_notification 6
last_notification 0
notification_interval 10
escalation_options c,u,r
}

define serviceescalation {
host_name .*
service_description .*
contact_groups level_1,level_2,level_3
first_notification 9
last_notification 0
notification_interval 10
escalation_options c,u,r
}[/code]


#2

Well, it really shouldn’t send the first critical notification to the level 3 contacts. Weird. The configuration seems OK, although you could use ranges: 3rd to 5th notification for level 1, 6th to 8th for level 2, and 9th onward (last_notification 0) for level 3. That shouldn’t change the notification policy compared with the one you’ve specified. I would try the same configuration on another server, just to be sure it isn’t a fault of the server or of the Nagios installation…
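
As a rough sketch of the ranged layout suggested above, the three service escalations could be written as follows. The contact groups, wildcard names and 10-minute interval are carried over from the original post; only last_notification changes. This is untested, and as noted it is not expected to change which notification number triggers each level.

[code]# Level 1: notifications 3 to 5
define serviceescalation {
host_name .*
service_description .*
contact_groups level_1
first_notification 3
last_notification 5
notification_interval 10
escalation_options c,u,r
}

# Level 2: notifications 6 to 8
define serviceescalation {
host_name .*
service_description .*
contact_groups level_1,level_2
first_notification 6
last_notification 8
notification_interval 10
escalation_options c,u,r
}

# Level 3: notification 9 onward (0 means no upper limit)
define serviceescalation {
host_name .*
service_description .*
contact_groups level_1,level_2,level_3
first_notification 9
last_notification 0
notification_interval 10
escalation_options c,u,r
}[/code]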

How to ignore warnings in the notification counter? It depends on the plugin you’re using for the checks. If it is a built-in plugin from nagios-plugins, I assume it has warning and critical thresholds. Setting them to the same value sidesteps the warning state, because the plugin automatically returns a critical status as soon as the threshold is reached.
If it is a custom plugin of your own, then make all of its problem exit codes critical.
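
For example, a command definition along these lines would give check_disk the same warning and critical threshold. The command name, the 10% value and the / partition are assumptions made purely for illustration; $USER1$ is the usual macro for the plugin directory. Treat it as a sketch to adapt, not a tested configuration.

[code]# Hypothetical command name. Equal -w and -c thresholds mean the
# check goes straight from OK to CRITICAL without passing through WARNING.
define command {
command_name check_disk_critical_only
command_line $USER1$/check_disk -w 10% -c 10% -p /
}[/code]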


#3

We are currently using Nagios 1.3 and are facing the same issue. When an alert is in the warning state and then moves to critical, it is escalated directly to the L2, L3 and L4 escalations. Nagios appears to treat the time the alert spent in the warning state as unacknowledged time (even when it was acknowledged), and it follows the L2, L3 escalation path according to the times we have defined for the escalations.

Is this a bug or a feature of Nagios, and is there a way to fix it? Escalation to top management is a major thing for us because it directly impacts our SLAs.


#4

I just ran into this exact same problem. I believe the escalation logic only checks that the state of the current notification matches escalation_options. So if I set the escalation to start at the second notification and get one warning then one critical, it matches the escalation and pages everyone. I want it to escalate only when the notifications counted toward first_notification were all of a type listed in escalation_options. Is this possible, or is it just a “feature” of Nagios?