I have a script that emails our Remedy ticketing server and creates tickets for specific service alerts. I only want Nagios to run that “notification” script once, when the service first goes to Warning or Critical (depending on the check). So I set up a “serviceescalation” definition with notification options that differ from those in the service definition:
define service{
    use generic-template
    host_name hp1,hp2,hp4,hp6
    service_description 4-cpu HP-UX load average
    contact_groups uab,dba
    check_command check_by_ssh!"/usr/local/nagios/libexec/check_load -w 100,100,3 -c 100,100,4"
    notification_options c
    }
define serviceescalation{
    host_name hp1,hp2,hp4,hp6
    service_description 4-cpu HP-UX load average
    contact_groups dba-ticket,dba
    first_notification 1
    last_notification 1
    notification_interval 0
    escalation_options c
    }
According to the documentation for Nagios (we’re on 2.9), the above service configuration (combined with our overall settings for reminder notifications every 2 hours) should email uab and dba with reminder notifications while the service is critical. The serviceescalation configuration should email dba-ticket and dba only for the first Critical notification and never do reminders.
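To make that expectation concrete, here is a rough Python sketch of who I expected to be emailed for each Critical notification. It only models my reading of the docs, not what Nagios actually does internally; the function name and constants are made up for illustration, and the group names are copied from the definitions above.

```python
# Model of the behavior I EXPECTED from the two definitions above.
# This is an assumption-laden sketch, not Nagios's real notification logic.

SERVICE_GROUPS = ["uab", "dba"]            # contact_groups in the service definition
ESCALATION_GROUPS = ["dba-ticket", "dba"]  # contact_groups in the serviceescalation
FIRST_NOTIFICATION = 1                     # first_notification in the escalation
LAST_NOTIFICATION = 1                      # last_notification in the escalation


def expected_recipients(notification_number):
    """Return the contact groups I expected for the Nth Critical notification."""
    # The service's own groups get every notification, including reminders.
    groups = list(SERVICE_GROUPS)
    # The escalation's groups are added only inside its notification range.
    if FIRST_NOTIFICATION <= notification_number <= LAST_NOTIFICATION:
        for group in ESCALATION_GROUPS:
            if group not in groups:
                groups.append(group)
    return groups


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, expected_recipients(n))
```

In other words, I expected dba-ticket to show up only on notification 1, and uab and dba to keep getting the two-hour reminders after that.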
Here ends the theoretical.
What happened this weekend was that hp2 went critical on 5/24 at 8:40 and hp1 went critical on 5/25 at 14:48. They both stayed critical until today (why people didn’t follow up sooner is another topic entirely…).
For hp1, Nagios faithfully sent reminders every two hours to dba and uab but never sent anything to dba-ticket.
For hp2, Nagios sent to dba-ticket and dba but never sent anything to uab and never sent any reminders.
So it looks like when the serviceescalation notification settings conflict with the service’s notification settings, it’s a toss-up which one you get? Has anyone else ever seen anything like this? Or can anyone suggest a way to get dependable service escalations that only notify the first time something goes Critical?
Many thanks!