Nagios sends notifications much too fast!


#1

server05 HTTP OK 04-24-2009 21:18:12 mobiel_ahmed notify-by-sms HTTP OK HTTP/1.1 200 OK - 3779 bytes in 0.030 seconds
server05 HTTP OK 04-24-2009 21:18:12 mobiel_john notify-by-sms HTTP OK HTTP/1.1 200 OK - 3779 bytes in 0.030 seconds
server05 HTTP OK 04-24-2009 21:18:12 nagios-admin notify-by-email HTTP OK HTTP/1.1 200 OK - 3779 bytes in 0.030 seconds
server05 HTTP CRITICAL 04-24-2009 21:18:02 mobiel_ahmed notify-by-sms CRITICAL - Socket timeout after 10 seconds
server05 HTTP CRITICAL 04-24-2009 21:18:02 mobiel_john notify-by-sms CRITICAL - Socket timeout after 10 seconds
server05 HTTP CRITICAL 04-24-2009 21:18:02 nagios-admin notify-by-email CRITICAL - Socket timeout after 10 seconds
server06 HTTP OK 04-24-2009 20:46:12 mobiel_ahmed notify-by-sms HTTP OK HTTP/1.1 200 OK - 3779 bytes in 0.020 seconds
server06 HTTP OK 04-24-2009 20:46:12 mobiel_john notify-by-sms HTTP OK HTTP/1.1 200 OK - 3779 bytes in 0.020 seconds
server06 HTTP OK 04-24-2009 20:46:12 nagios-admin notify-by-email HTTP OK HTTP/1.1 200 OK - 3779 bytes in 0.020 seconds
server06 HTTP CRITICAL 04-24-2009 20:46:02 mobiel_ahmed notify-by-sms CRITICAL - Socket timeout after 10 seconds
server06 HTTP CRITICAL 04-24-2009 20:46:02 mobiel_john notify-by-sms CRITICAL - Socket timeout after 10 seconds
server06 HTTP CRITICAL 04-24-2009 20:46:02 nagios-admin notify-by-email CRITICAL - Socket timeout after 10 seconds
localhost HTTP OK 04-24-2009 16:01:24 company-pager notify-by-epager HTTP OK HTTP/1.1 200 OK - 4384 bytes in 0.022 seconds
localhost HTTP OK 04-24-2009 16:01:24 mobiel_ahmed notify-by-sms HTTP OK HTTP/1.1 200 OK - 4384 bytes in 0.022 seconds
localhost HTTP OK 04-24-2009 16:01:24 mobiel_john notify-by-sms HTTP OK HTTP/1.1 200 OK - 4384 bytes in 0.022 seconds
localhost HTTP OK 04-24-2009 16:01:24 nagios-admin notify-by-email HTTP OK HTTP/1.1 200 OK - 4384 bytes in 0.022 seconds
localhost HTTP CRITICAL 04-24-2009 16:00:44 company-pager notify-by-epager Connection refused
localhost HTTP CRITICAL 04-24-2009 16:00:44 mobiel_ahmed notify-by-sms Connection refused
localhost HTTP CRITICAL 04-24-2009 16:00:44 mobiel_john notify-by-sms Connection refused
localhost HTTP CRITICAL 04-24-2009 16:00:44 nagios-admin notify-by-email Connection refused
localhost HTTP OK 04-24-2009 15:51:31 company-pager notify-by-epager HTTP OK HTTP/1.1 200 OK - 4384 bytes in 0.073 seconds
localhost HTTP OK 04-24-2009 15:51:31 nagios-admin notify-by-email HTTP OK HTTP/1.1 200 OK - 4384 bytes in 0.073 seconds
localhost HTTP OK 04-24-2009 15:51:31 sms notify-by-sms HTTP OK HTTP/1.1 200 OK - 4384 bytes in 0.073 seconds
localhost HTTP CRITICAL 04-24-2009 15:49:31 company-pager notify-by-epager Connection refused
localhost HTTP CRITICAL 04-24-2009 15:49:31 nagios-admin notify-by-email Connection refused
localhost HTTP CRITICAL 04-24-2009 15:49:31 sms notify-by-sms Connection refused

[code][04-24-2009 21:20:42] SERVICE ALERT: server05;HTTP;OK;SOFT;3;HTTP OK HTTP/1.1 200 OK - 3779 bytes in 0.060 seconds
[04-24-2009 21:20:32] SERVICE ALERT: server05;MySQL;OK;SOFT;3;Uptime: 5 Threads: 2 Questions: 30 Slow queries: 0 Opens: 14 Flush tables: 1 Open tables: 8 Queries per second avg: 6.000
[04-24-2009 21:20:32] SERVICE ALERT: server05;HTTP;CRITICAL;SOFT;2;Connection refused
[04-24-2009 21:20:22] SERVICE ALERT: server05;MySQL;CRITICAL;SOFT;2;Can't connect to MySQL server on '82.87.212.12' (111)
[04-24-2009 21:20:22] SERVICE ALERT: server05;HTTP;CRITICAL;SOFT;1;Connection refused
[04-24-2009 21:20:12] SERVICE ALERT: server05;MySQL;CRITICAL;SOFT;1;Can't connect to MySQL server on '82.87.212.12' (111)
[04-24-2009 21:18:12] SERVICE NOTIFICATION: mobiel_ahmed;server05;HTTP;OK;notify-by-sms;HTTP OK HTTP/1.1 200 OK - 3779 bytes in 0.030 seconds
[04-24-2009 21:18:12] SERVICE NOTIFICATION: mobiel_john;server05;HTTP;OK;notify-by-sms;HTTP OK HTTP/1.1 200 OK - 3779 bytes in 0.030 seconds
[04-24-2009 21:18:12] SERVICE NOTIFICATION: nagios-admin;server05;HTTP;OK;notify-by-email;HTTP OK HTTP/1.1 200 OK - 3779 bytes in 0.030 seconds
[04-24-2009 21:18:12] SERVICE ALERT: server05;HTTP;OK;HARD;3;HTTP OK HTTP/1.1 200 OK - 3779 bytes in 0.030 seconds
[04-24-2009 21:18:02] SERVICE NOTIFICATION: mobiel_ahmed;server05;HTTP;CRITICAL;notify-by-sms;CRITICAL - Socket timeout after 10 seconds
[04-24-2009 21:18:02] SERVICE NOTIFICATION: mobiel_john;server05;HTTP;CRITICAL;notify-by-sms;CRITICAL - Socket timeout after 10 seconds
[04-24-2009 21:18:02] SERVICE NOTIFICATION: nagios-admin;server05;HTTP;CRITICAL;notify-by-email;CRITICAL - Socket timeout after 10 seconds
[04-24-2009 21:18:02] SERVICE ALERT: server05;HTTP;CRITICAL;HARD;3;CRITICAL - Socket timeout after 10 seconds
[04-24-2009 21:17:42] SERVICE ALERT: server05;HTTP;CRITICAL;SOFT;2;Connection refused
[04-24-2009 21:17:32] SERVICE ALERT: server05;HTTP;CRITICAL;SOFT;1;HTTP CRITICAL - No data received from host
[04-24-2009 21:04:12] Auto-save of retention data completed successfully.

April 24, 2009 20:00 	

[04-24-2009 20:46:12] SERVICE NOTIFICATION: mobiel_ahmed;server06;HTTP;OK;notify-by-sms;HTTP OK HTTP/1.1 200 OK - 3779 bytes in 0.020 seconds
[04-24-2009 20:46:12] SERVICE NOTIFICATION: mobiel_john;server06;HTTP;OK;notify-by-sms;HTTP OK HTTP/1.1 200 OK - 3779 bytes in 0.020 seconds
[04-24-2009 20:46:12] SERVICE NOTIFICATION: nagios-admin;server06;HTTP;OK;notify-by-email;HTTP OK HTTP/1.1 200 OK - 3779 bytes in 0.020 seconds
[04-24-2009 20:46:12] SERVICE ALERT: server06;HTTP;OK;HARD;3;HTTP OK HTTP/1.1 200 OK - 3779 bytes in 0.020 seconds
[04-24-2009 20:46:02] SERVICE NOTIFICATION: mobiel_ahmed;server06;HTTP;CRITICAL;notify-by-sms;CRITICAL - Socket timeout after 10 seconds
[04-24-2009 20:46:02] SERVICE NOTIFICATION: mobiel_john;server06;HTTP;CRITICAL;notify-by-sms;CRITICAL - Socket timeout after 10 seconds
[04-24-2009 20:46:02] SERVICE NOTIFICATION: nagios-admin;server06;HTTP;CRITICAL;notify-by-email;CRITICAL - Socket timeout after 10 seconds
[04-24-2009 20:46:02] SERVICE ALERT: server06;HTTP;CRITICAL;HARD;3;CRITICAL - Socket timeout after 10 seconds
[04-24-2009 20:45:42] SERVICE ALERT: server06;HTTP;CRITICAL;SOFT;2;Connection refused
[04-24-2009 20:45:32] SERVICE ALERT: server06;HTTP;CRITICAL;SOFT;1;HTTP CRITICAL - No data received from host
[04-24-2009 20:04:12] Auto-save of retention data completed successfully.[/code]
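
To put a number on "too fast", here is a rough Python sketch that reads SERVICE ALERT lines like the ones above and measures how long a service took to go from its first SOFT CRITICAL to HARD CRITICAL. The function name `seconds_to_hard_state` and the `SAMPLE` string are made up for illustration, and the regular expression only assumes the line layout visible in this log.

```python
import re
from datetime import datetime

# Layout assumed from the log above:
# [MM-DD-YYYY HH:MM:SS] SERVICE ALERT: host;service;state;state_type;attempt;output
ALERT_RE = re.compile(
    r"\[(\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})\] SERVICE ALERT: "
    r"([^;]+);([^;]+);([^;]+);(SOFT|HARD);(\d+);"
)

# Three alert lines copied from the server06 part of the log (newest first).
SAMPLE = """\
[04-24-2009 20:46:02] SERVICE ALERT: server06;HTTP;CRITICAL;HARD;3;CRITICAL - Socket timeout after 10 seconds
[04-24-2009 20:45:42] SERVICE ALERT: server06;HTTP;CRITICAL;SOFT;2;Connection refused
[04-24-2009 20:45:32] SERVICE ALERT: server06;HTTP;CRITICAL;SOFT;1;HTTP CRITICAL - No data received from host
"""


def seconds_to_hard_state(log_text, host, service):
    """Seconds from the first SOFT;1 CRITICAL alert to the next HARD CRITICAL
    alert for one host/service pair, or None if no such pair is found."""
    events = []
    for line in log_text.splitlines():
        m = ALERT_RE.search(line)
        if m and m.group(2) == host and m.group(3) == service:
            ts = datetime.strptime(m.group(1), "%m-%d-%Y %H:%M:%S")
            events.append((ts, m.group(4), m.group(5), int(m.group(6))))
    events.sort(key=lambda e: e[0])  # the log view lists newest entries first

    first_soft = None
    for ts, state, state_type, attempt in events:
        if state != "CRITICAL":
            continue
        if state_type == "SOFT" and attempt == 1 and first_soft is None:
            first_soft = ts
        elif state_type == "HARD" and first_soft is not None:
            return int((ts - first_soft).total_seconds())
    return None


print(seconds_to_hard_state(SAMPLE, "server06", "HTTP"))  # prints 30
```

For the server06 lines that is 30 seconds between the first failed check and the hard state that triggers the notifications.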

This is my configuration:

define service{
        name                            local-service           ; The name of this service template
        use                             generic-service         ; Inherit default values from the generic-service definition
        check_period                    24x7                    ; The service can be checked at any time of the day
        max_check_attempts              3                       ; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval           1                       ; Check the service every 5 minutes under normal conditions
        retry_check_interval            1                       ; Re-check the service every minute until a hard state can be determined
        contact_groups                  admins                  ; Notifications get sent out to everyone in the 'admins' group
        notification_options            w,c,r                   ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60                      ; Re-notify about service problems every hour
        notification_period             24x7                    ; Notifications can be sent out at any time
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }

[root@dns2 etc]# grep _check_ nagios.cfg
#command_check_interval=15s
command_check_interval=40
service_inter_check_delay_method=s
max_service_check_spread=10
host_inter_check_delay_method=s
max_host_check_spread=10
service_check_timeout=15
host_check_timeout=10
service_freshness_check_interval=10
host_freshness_check_interval=60

I thought I had set everything correctly… but I guess not. I would like to receive downtime e-mails and text messages only after 5 minutes of downtime, not sooner!


#2

Did you modify interval_length in nagios.cfg? (It should be 60.)

Luca


#3

[quote="luca"]did you modify interval_length in nagios.cfg? (should be =60)

Luca[/quote]

Luca, I have indeed changed interval_length to 25. Before that I had set it to 15, because Nagios was configured to send a notification only after 15 to 20 minutes, which is much too slow for our production servers. I will restore interval_length to 60 so that I can think in seconds and minutes again. Calculating times with an interval_length of 25 gets very confusing.
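
To make that arithmetic concrete, here is a small Python sketch of how I understand the time units to work: every *_interval value in the service definition is multiplied by interval_length to get seconds. The function name `nominal_timings` is made up, and the figures are nominal only; they ignore check scheduling, latency and plugin timeouts.

```python
def nominal_timings(max_check_attempts, retry_check_interval,
                    notification_interval, interval_length):
    """Nominal timings in seconds, assuming each interval setting is a count
    of 'time units' that are interval_length seconds long."""
    return {
        # After the first failed check, Nagios retries until attempt number
        # max_check_attempts, so there are (max_check_attempts - 1) retries.
        "soft_to_hard_s": (max_check_attempts - 1)
                          * retry_check_interval * interval_length,
        # Delay between repeated notifications for an unresolved problem.
        "renotify_s": notification_interval * interval_length,
    }


# Values from the local-service template, with interval_length=25 ...
print(nominal_timings(3, 1, 60, 25))  # {'soft_to_hard_s': 50, 'renotify_s': 1500}
# ... and with interval_length restored to 60.
print(nominal_timings(3, 1, 60, 60))  # {'soft_to_hard_s': 120, 'renotify_s': 3600}
```

So with interval_length at 25, "notification_interval 60" means a re-notification every 25 minutes rather than every hour, and a hard state is reached in well under a minute.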

I hope Nagios will send notifications, and especially resend them, at sensible times after this change.
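
For the 5-minute target, this is a quick Python sketch of the inverse calculation: how many check attempts are needed before a problem goes hard, given the retry interval. The helper name `attempts_for_delay` is hypothetical and the result is only a nominal estimate; it assumes the delay is (max_check_attempts - 1) retries, each retry_check_interval time units apart.

```python
import math


def attempts_for_delay(target_seconds, retry_check_interval=1, interval_length=60):
    """Smallest max_check_attempts whose retries span at least target_seconds."""
    retry_seconds = retry_check_interval * interval_length
    retries = math.ceil(target_seconds / retry_seconds)
    return retries + 1  # the first failed check counts as attempt 1


print(attempts_for_delay(300))                      # 6  (interval_length=60)
print(attempts_for_delay(300, interval_length=25))  # 13 (interval_length=25)
```

With interval_length back at 60 and retry_check_interval at 1, that would suggest max_check_attempts 6 instead of 3 in the template, if I have the semantics right.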

Thanks for your help.