Delays between notifications


#1

Hello,

I have set up Nagios to send notification emails when a problem occurs, with an escalation that calls two phones using Asterisk and a notification script (SIP calls are free with my ISP, SMS messages aren't).

My problem is that while the three emails are sent at approximately the same time, the phone calls are not, and I don't know why. The first call reaches the first phone, and the second call reaches the second phone several minutes later.

Here are my log file and my host, contact and command configuration:

LOG:

[1227175589] HOST NOTIFICATION: fde;FAILURE_TEST4;DOWN;notify-host-by-email;PING CRITICAL - Paquets perdus = 100%
[1227175589] HOST NOTIFICATION: dja;FAILURE_TEST4;DOWN;notify-host-by-email;PING CRITICAL - Paquets perdus = 100%
[1227175589] HOST NOTIFICATION: cki;FAILURE_TEST4;DOWN;notify-host-by-email;PING CRITICAL - Paquets perdus = 100%
[1227176296] HOST NOTIFICATION: cki-phone;FAILURE_TEST4;DOWN;notify-host-by-phone;PING CRITICAL - Paquets perdus = 100%
[1227176976] HOST NOTIFICATION: dja-phone;FAILURE_TEST4;DOWN;notify-host-by-phone;PING CRITICAL - Paquets perdus = 100%
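
The gaps between these entries can be worked out from the epoch timestamps at the start of each line. The following is a minimal sketch, not part of the original post: it assumes the two phone timestamps read 1227176296 and 1227176976 once the forum's bold markup is removed, and the function name `notification_delays` is made up for illustration.

```python
import re

# Log lines copied from the post. Assumption: the "[b]" inside the two
# phone timestamps is forum markup and is not part of the number.
LOG = """\
[1227175589] HOST NOTIFICATION: fde;FAILURE_TEST4;DOWN;notify-host-by-email;PING CRITICAL - Paquets perdus = 100%
[1227175589] HOST NOTIFICATION: dja;FAILURE_TEST4;DOWN;notify-host-by-email;PING CRITICAL - Paquets perdus = 100%
[1227175589] HOST NOTIFICATION: cki;FAILURE_TEST4;DOWN;notify-host-by-email;PING CRITICAL - Paquets perdus = 100%
[1227176296] HOST NOTIFICATION: cki-phone;FAILURE_TEST4;DOWN;notify-host-by-phone;PING CRITICAL - Paquets perdus = 100%
[1227176976] HOST NOTIFICATION: dja-phone;FAILURE_TEST4;DOWN;notify-host-by-phone;PING CRITICAL - Paquets perdus = 100%
"""


def notification_delays(log_text):
    """Return (contact, seconds since the first notification) per log entry."""
    entries = []
    for line in log_text.splitlines():
        match = re.match(r"\[(\d+)\] HOST NOTIFICATION: ([^;]+);", line)
        if match:
            entries.append((match.group(2), int(match.group(1))))
    if not entries:
        return []
    first = entries[0][1]
    return [(contact, ts - first) for contact, ts in entries]


for contact, delay in notification_delays(LOG):
    print(f"{contact:10s} +{delay} s")
```

Under that reading, the first call is logged 707 seconds (about 11 min 47 s) after the emails, and the second call 680 seconds (about 11 min 20 s) after the first one.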

#hosts

define host{
name generic-host ; The name of this host template
notifications_enabled 1 ; Host notifications are enabled
event_handler_enabled 1 ; Host event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
failure_prediction_enabled 1 ; Failure prediction is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
check_command check-host-alive
max_check_attempts 10
notification_interval 0
notification_period 24x7
notification_options d,u,r
contact_groups admins
normal_check_interval 5
retry_check_interval 1
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}

first interval of 10 min between mail and phone escalation

define hostescalation{
hostgroup_name all
first_notification 1
last_notification 1
notification_interval 10
contact_groups admins
}

phone escalation

define hostescalation{
hostgroup_name all
first_notification 2
last_notification 2
notification_interval 60
contact_groups admins-phones
}

mail every hour

define hostescalation{
hostgroup_name all
first_notification 3
last_notification 0
notification_interval 60
contact_groups admins
}
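
To make the intent of the three escalations explicit, here is a rough sketch of the schedule they describe. It rests on two assumptions about Nagios behaviour: that `last_notification 0` means "no upper limit", and that the `notification_interval` of the escalation matching the current notification sets the wait before the next one. The `ESCALATIONS` table and function names are illustrative only and are not Nagios code.

```python
# The three hostescalation definitions from the post, as plain data.
ESCALATIONS = [
    {"first": 1, "last": 1, "interval": 10, "groups": ["admins"]},
    {"first": 2, "last": 2, "interval": 60, "groups": ["admins-phones"]},
    {"first": 3, "last": 0, "interval": 60, "groups": ["admins"]},
]


def matching(number):
    """Escalations covering notification `number` (last == 0: no upper limit)."""
    return [
        e for e in ESCALATIONS
        if e["first"] <= number and (e["last"] == 0 or number <= e["last"])
    ]


def schedule(count):
    """Return (notification number, minutes since first, contact groups)."""
    rows, minutes = [], 0
    for number in range(1, count + 1):
        active = matching(number)
        groups = sorted({g for e in active for g in e["groups"]})
        rows.append((number, minutes, groups))
        # Assumed rule: the shortest interval among matching escalations applies.
        minutes += min(e["interval"] for e in active)
    return rows


for number, minutes, groups in schedule(4):
    print(f"notification {number}: +{minutes} min -> {', '.join(groups)}")
```

Under this reading, both phone contacts belong to notification 2 and would be called together about 10 minutes after the emails, which is the behaviour the poster expected.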

‘FAILURE’ host definition

define host{
use generic-host ; Name of host template to use
host_name FAILURE_TEST4
alias FAILURE_TEST4
address 192.168.182.4
parents switch_serveur_netgear_7212
check_command check-host-alive
max_check_attempts 2
notification_interval 60
check_period 24x7
notification_period 24x7
notification_options d,u,r
contact_groups admins
}

#contact

‘Christophe’ contact definition

define contact{
contact_name cki
alias Christophe
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,c,r
host_notification_options d,r
service_notification_commands notify-service-by-email
host_notification_commands notify-host-by-email
email christophe@xxx.xx
}

‘Christophe Contact Par Telephone’ contact definition

define contact{
contact_name cki-phone
alias Christophe
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,c,r
host_notification_options d,r
service_notification_commands notify-service-by-phone
host_notification_commands notify-host-by-phone
email christophe@xxx.xx
address2 SIP/cki
address3 SIP/cki-sec
}

‘David’ contact definition

define contact{
contact_name dja
alias David
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,c,r
host_notification_options d,r
service_notification_commands notify-service-by-email
host_notification_commands notify-host-by-email
email dja@xxx.xx
}

‘David Contact par Telephone’ contact definition

define contact{
contact_name dja-phone
alias David
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,c,r
host_notification_options d,r
service_notification_commands notify-service-by-phone
host_notification_commands notify-host-by-phone
email dja@xxx.xx
address2 SIP/dja
address3 SIP/dja-sec
}

‘Francois’ contact definition

define contact{
contact_name fde
alias Francois
service_notification_period workhours
host_notification_period workhours
service_notification_options c,r
host_notification_options d,r
service_notification_commands notify-service-by-email
host_notification_commands notify-host-by-email
email fde@xxx.xx
}

define contactgroup{
contactgroup_name admins
alias Nagios Administrators
members cki,dja,fde
}

define contactgroup{
contactgroup_name admins-phones
alias Nagios Administrators
members dja-phone,cki-phone
}

#commands

‘notify-host-by-email’ command definition

define command{
command_name notify-host-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HO$
}

‘notify-service-by-email’ command definition

define command{
command_name notify-service-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddre$
}

‘notify-service-by-phone’ command definition

define command{
command_name notify-service-by-phone
command_line /scripts/notify-by-phone-launcher.sh $CONTACTADDRESS2$ $CONTACTADDRESS3$ "Alarme Nagios ! Type de notification: $NOTIFICATIONTYPE$. Servi$
}

‘notify-host-by-phone’ command definition

define command{
command_name notify-host-by-phone
command_line /scripts/notify-by-phone-launcher.sh $CONTACTADDRESS2$ $CONTACTADDRESS3$ "Alarme Nagios ! Type de notification : $NOTIFICATIONTYPE$. Ser$
}

I should point out that notify-by-phone-launcher.sh doesn't make Nagios wait: the process that the script launches is detached.

If anyone has any ideas, I'd be grateful.

Thanks a lot.
Christophe.


#2

Does this happen always with the same delay between notifications?

What happens when you manually start the script? How long does it need to be executed?


#3

Thank you for your reply.

Yes, the delay is approximately the same every time.

If I run the script from the command line, it takes no more than 1 second, because /scripts/notify-by-phone-launcher.sh just launches the real script, /scripts/notify-by-phone.sh, as a detached process.
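
One way to put a number on "how long does it need to be executed" is to time the launcher from a small wrapper. This is only a sketch: `time_command` is a made-up helper, and the demo call times a trivial Python command rather than the real script, whose exact arguments are not fully shown in the thread.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv and return (exit code, wall-clock seconds until it returns)."""
    start = time.monotonic()
    # Output goes to DEVNULL so a detached child holding a pipe open
    # cannot keep this call waiting.
    result = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode, time.monotonic() - start


# Stand-in command for the demo. To time the real launcher, pass something like
# ["/scripts/notify-by-phone-launcher.sh", "SIP/cki", "SIP/cki-sec", "test"]
# (the message argument here is an assumption).
code, elapsed = time_command([sys.executable, "-c", "pass"])
print(f"exit={code} elapsed={elapsed:.2f}s")
```

If the launcher really detaches its child, the elapsed time should stay close to the one second reported above even while the call itself is still in progress.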


#4

Check your nagios.cfg file and search for the setting named:
command_check_interval=-1
It should be set to -1 so that the command file is checked as often as possible. Maybe it is set to some other value in your config, which could delay processing of the command file. That said, this setting applies to external commands, and I'm not sure whether notification commands go through that file as well.
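
A quick way to confirm the value is to read it out of the config text. The sketch below is illustrative only: `read_setting` is a hypothetical helper, the sample text is not the poster's actual file, and returning the last assignment when a key appears twice is a choice made for this sketch rather than documented Nagios behaviour.

```python
def read_setting(cfg_text, name):
    """Return the last value assigned to `name` in key=value text, or None."""
    value = None
    for line in cfg_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, val = line.partition("=")
        if sep and key.strip() == name:
            value = val.strip()
    return value


# Sample fragment for the demo, NOT the poster's real nagios.cfg.
SAMPLE = """\
# external command options
check_external_commands=1
command_check_interval=-1
"""

print(read_setting(SAMPLE, "command_check_interval"))
```

To check a real installation, read the file first, for example `read_setting(open(path).read(), "command_check_interval")`, where `path` points at your nagios.cfg (its location depends on how Nagios was installed).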


#5

I've checked nagios.cfg and it's OK: command_check_interval is set to -1…

The more I search, the more confused I get…


#6

This is a mystery :(
I haven't got a clue about this. The only thing left for now is to try reinstalling Nagios, maybe on another server?


#7

Maybe, yes. I'll try another version if I can find some time in the next few weeks.

Thank you for the help.