Service Escalations won't work


#1

I started out with a much more complex config and have reduced it to something very minimal to troubleshoot this. I cannot get service escalations to fire, no matter what I do. My host escalations work fine.

It strikes me as weird that it says “N/A” for last notification… I have it enabled everywhere, as far as I can tell.

I set up “testing” as a service that will always fail.

Any help appreciated.

Escalation Template

define serviceescalation{
name noc-esc
first_notification 1
last_notification 0
contact_groups NOC
notification_interval 10
escalation_options w,u,c,r
register 0
}

Real escalation

define serviceescalation{
use noc-esc
host_name sv-ld0
service_description testing
}

Service

define service{
use generic-service
host_name sv-ld0
servicegroups notifications
service_description testing
is_volatile 0
check_period 24x7
max_check_attempts 4
normal_check_interval 1
retry_check_interval 1
contact_groups notifications
notification_interval 300
notification_period 24x7
check_command check_http
}

Host

define host {
use lg-generic-host
host_name sv-ld0
address xx.xx.xx.xx
}

Output in the web interface

Current Status:
CRITICAL
Status Information: CRITICAL - Socket timeout after 10 seconds
Performance Data:
Current Attempt: 4/4
State Type: HARD
Last Check Type: ACTIVE
Last Check Time: 01-19-2006 13:39:07
Status Data Age: 0d 0h 0m 56s
Next Scheduled Active Check: 01-19-2006 13:40:07
Latency: 0.314 seconds
Check Duration: 10.011 seconds
Last State Change: 01-19-2006 13:13:17
Current State Duration: 0d 0h 26m 46s
Last Service Notification: N/A
Current Notification Number: 0
Is This Service Flapping? N/A
Percent State Change: N/A
In Scheduled Downtime?
NO


#2

It doesn’t make much sense to have an escalation defined on the first notification… anyway, you have a notification interval of 300 minutes on the service. That’s how often notifications will be generated, so that’s when you will get them in the escalations…
Try setting the service notification interval to 10, first_notification to 2 and last_notification to 3, and see what happens… you should get a notification on failure to group notifications (4 minutes later, in fact) and 2 subsequent notifications to NOC.

Escalations don’t generate notifications. They use the notifications generated by the service they work on and relay them if some rules match.
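
As a rough illustration of that relay rule, here is a minimal Python sketch of how a notification number could be matched against escalation ranges. It is not Nagios code: the `Escalation` class and `recipients` function are hypothetical names, `escalation_options` and time periods are left out, and it assumes that `last_notification 0` means "no upper bound" and that a matching escalation replaces the service's own contact groups.

```python
from dataclasses import dataclass


@dataclass
class Escalation:
    """Hypothetical stand-in for a serviceescalation definition."""
    first_notification: int
    last_notification: int  # assumption: 0 means "no upper bound"
    contact_groups: tuple

    def matches(self, n):
        # An escalation applies only inside its notification-number range.
        if n < self.first_notification:
            return False
        return self.last_notification == 0 or n <= self.last_notification


def recipients(n, service_groups, escalations):
    """Return the contact groups for notification number n."""
    if n < 1:
        # The service has not sent any notification, so there is
        # nothing for an escalation to relay.
        return []
    escalated = []
    for esc in escalations:
        if esc.matches(n):
            for group in esc.contact_groups:
                if group not in escalated:
                    escalated.append(group)
    # Assumption: matching escalations replace the service's own groups.
    return escalated or list(service_groups)


noc = Escalation(first_notification=2, last_notification=3,
                 contact_groups=("NOC",))
for n in range(0, 5):
    print(n, recipients(n, ["notifications"], [noc]))
```

Under these assumptions, notification 1 goes to notifications, 2 and 3 go to NOC, and 4 falls back to notifications. A notification number of 0, as shown in the status output in this thread, gives an empty list: no escalation can fire until the service itself starts notifying.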

Luca


#3

Thinking I had a misunderstanding about how escalations work, I took your advice. Here is what my config looked like:

define service{
use generic-service
host_name sv-ld1
servicegroups notifications
service_description testing
is_volatile 0
check_period 24x7
max_check_attempts 4
normal_check_interval 1
retry_check_interval 1
contact_groups notifications
notification_interval 10
notification_period 24x7
check_command check_http
}

define serviceescalation{
name noc-esc
first_notification 2
last_notification 3
contact_groups NOC
notification_interval 10
escalation_options w,u,c,r
register 0
}

test

define serviceescalation{
use noc-esc
host_name sv-ld1
service_description testing
}

After 15 mins, no notifications at all. I let it sit for another 15, just in case. It still says “N/A”. I should have certainly seen something by this time.

Current Status:
CRITICAL
Status Information: CRITICAL - Socket timeout after 10 seconds
Performance Data:
Current Attempt: 4/4
State Type: HARD
Last Check Type: ACTIVE
Last Check Time: 01-20-2006 10:35:53
Status Data Age: 0d 0h 0m 42s
Next Scheduled Active Check: 01-20-2006 10:36:53
Latency: 0.215 seconds
Check Duration: 10.012 seconds
Last State Change: 01-20-2006 10:21:53
Current State Duration: 0d 0h 14m 42s
Last Service Notification: N/A
Current Notification Number: 0
Is This Service Flapping? N/A
Percent State Change: N/A
In Scheduled Downtime?
NO
Last Update: 01-20-2006 10:36:25

Any ideas? I’m especially confused as a similar host escalation works just fine, it’s only service escalations which won’t work.


#4

Please disable notifications on the host via the web interface, wait a couple of minutes until the option changes to “enable notifications” for this host/service, and then enable it. At this point you can be sure notifications for the host/service are enabled, so try again.

Luca


#5

Luca’s suggestion is due to this:
nagios.sourceforge.net/docs/1_0/ … tion_notes

It’s good reading and vital if you want to understand why you need the web interface at times.
[Edited Sun Jan 22 2006, 07:20PM]


#6

I finally tracked this down, and will explain so others can benefit from my folly:

I realized that I couldn’t recall if I’d proven that I could receive service notifications, regardless of escalation. So I removed all escalations, and tried again.

Failure.

Turns out, I’d mis-defined my generic-service template: it had no notification_options directive. So naturally, no service notifications went out, and there was nothing to escalate.

I’ve remedied that, and things are working now. Thanks to those who tried to help, much appreciated!
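
For anyone wanting to catch the same mistake early, the check could be sketched in Python roughly as below. This is an assumption-laden illustration, not a Nagios tool: `parse_objects`, `resolve` and `services_missing` are hypothetical helpers, the `SAMPLE` config is made up to mirror the situation described above (it is not the poster's actual generic-service), and only a single template per `use` line is handled.

```python
import re

# Hypothetical config: the template lacks notification_options.
SAMPLE = """
define service{
    name generic-service
    notifications_enabled 1
    register 0
}

define service{
    use generic-service
    host_name sv-ld0
    service_description testing
    contact_groups notifications
}
"""


def parse_objects(text):
    """Parse 'define <type>{ ... }' blocks into (type, directives) pairs."""
    objects = []
    for kind, body in re.findall(r"define\s+(\w+)\s*\{(.*?)\}", text, re.S):
        attrs = {}
        for line in body.splitlines():
            line = line.split(";", 1)[0].strip()  # drop inline ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            attrs[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
        objects.append((kind, attrs))
    return objects


def resolve(attrs, templates, seen=()):
    """Merge directives inherited through a single-template 'use' chain."""
    parent = attrs.get("use")
    if parent is None or parent not in templates or parent in seen:
        return dict(attrs)
    inherited = resolve(templates[parent], templates, seen + (parent,))
    # 'name' and 'register' belong to the template itself, not its children.
    merged = {k: v for k, v in inherited.items() if k not in ("name", "register")}
    merged.update(attrs)
    return merged


def services_missing(text, directive="notification_options"):
    """List (host_name, service_description) of services lacking a directive."""
    objs = parse_objects(text)
    templates = {a["name"]: a for k, a in objs if k == "service" and "name" in a}
    missing = []
    for kind, attrs in objs:
        if kind != "service" or attrs.get("register") == "0":
            continue  # skip other object types and unregistered templates
        full = resolve(attrs, templates)
        if directive not in full:
            missing.append((full.get("host_name"), full.get("service_description")))
    return missing


print(services_missing(SAMPLE))
```

With the sample above, the testing service on sv-ld0 is reported. Adding a line such as `notification_options w,u,c,r` to the template makes the list come back empty.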