Ping Frequency Change


#1

I don’t know how this can be done, so any help would be appreciated.

I have set up Nagios 3.0.6 on Ubuntu Server 8.10.

I’ve added a list of hosts to monitor with the command “check-host-alive”. This is the only command being used for any of the hosts.

I’ve downloaded and installed postfix, and enabled mail notifications as well. The e-mails work, and the hosts ping.

What I would like to do next is shorten the interval between pings.

For example, when testing, I unplugged the Ethernet cable from one of the monitored hosts. After 5 minutes Nagios retried the ping, and it timed out (see the logs below). Once Nagios realized the host was down, it pinged the host every 90 seconds, 10 times in a row. I know that lowering max_check_attempts will make it ping fewer times before e-mailing, but I want Nagios to re-ping sooner than 90 seconds. I want it to ping 10 seconds after it realizes the host is down, with max_check_attempts set to 5.

I know that in the config below I will need to change max_check_attempts to 5, and that I will also need to lower notification_interval below 15 to get the e-mail out faster. From the time the host went down to the time I received the e-mail was 20 minutes. I need it a whole lot sooner when we go live with this box.

How do I change the ping interval from 90 seconds, though?

Thanks for the help!

Part of current config file:

define host {
    use                     generic-host        ; Name of host template to use
    host_name               memrsql             ; DNS hostname
    alias                   Memphis RSQL        ; Alias name for host
    address                 10.12.1.126         ; IP Address
    check_command           check-host-alive    ; Cmd used to check host (default)
    max_check_attempts      10                  ; Number of check attempts
    contact_groups          admins              ; cit-operations
    notification_interval   15                  ; Interval between notifications
    notification_period     24x7                ; Notification period
    notification_options    d,r                 ; Options, D = down, R = Recover
}

Log file from Nagios:

[02-27-2009 11:19:34] HOST ALERT: hp-it;DOWN;SOFT;1;(Host Check Timed Out)
[02-27-2009 11:21:04] HOST ALERT: hp-it;DOWN;SOFT;2;(Host Check Timed Out)
[02-27-2009 11:22:34] HOST ALERT: hp-it;DOWN;SOFT;3;(Host Check Timed Out)
[02-27-2009 11:24:04] HOST ALERT: hp-it;DOWN;SOFT;4;(Host Check Timed Out)
[02-27-2009 11:25:34] HOST ALERT: hp-it;DOWN;SOFT;5;(Host Check Timed Out)
[02-27-2009 11:27:04] HOST ALERT: hp-it;DOWN;SOFT;6;(Host Check Timed Out)
[02-27-2009 11:28:34] HOST ALERT: hp-it;DOWN;SOFT;7;(Host Check Timed Out)
[02-27-2009 11:30:04] HOST ALERT: hp-it;DOWN;SOFT;8;(Host Check Timed Out)
[02-27-2009 11:31:44] HOST ALERT: hp-it;DOWN;SOFT;9;(Host Check Timed Out)
[02-27-2009 11:33:14] HOST ALERT: hp-it;DOWN;HARD;10;(Host Check Timed Out)
[02-27-2009 11:33:14] HOST NOTIFICATION: nagiosadmin;hp-it;DOWN;notify-host-by-email;(Host Check Timed Out)
[02-27-2009 11:38:24] HOST ALERT: hp-it;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.81 ms
[02-27-2009 11:38:24] HOST NOTIFICATION: nagiosadmin;hp-it;UP;notify-host-by-email;PING OK - Packet loss = 0%, RTA = 0.81 ms


#2

Howdy

You need to look at altering the following directives in the host object:

[quote]check_interval: This directive is used to define the number of “time units” between regularly scheduled checks of the host. Unless you’ve changed the interval_length directive from the default value of 60, this number will mean minutes. More information on this value can be found in the check scheduling documentation.
retry_interval: This directive is used to define the number of “time units” to wait before scheduling a re-check of the hosts. Hosts are rescheduled at the retry interval when they have changed to a non-UP state. Once the host has been retried max_check_attempts times without a change in its status, it will revert to being scheduled at its “normal” rate as defined by the check_interval value. Unless you’ve changed the interval_length directive from the default value of 60, this number will mean minutes. More information on this value can be found in the check scheduling documentation. [/quote]

nagios.sourceforge.net/docs/3_0/ … .html#host

For example, if you want a check_interval or retry_interval of 15 seconds (assuming your interval_length is the default of 60), you set them to 0.25.
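
To make that concrete, here is a rough sketch of how the host definition from the first post might look with those two directives added. It assumes interval_length is still the default of 60, so a value is just seconds divided by 60 (15 / 60 = 0.25). The 0.25 values and the max_check_attempts of 5 are only illustrative picks based on what was asked, and none of this has been tested on your setup.

```
define host {
    use                     generic-host
    host_name               memrsql
    alias                   Memphis RSQL
    address                 10.12.1.126
    check_command           check-host-alive
    check_interval          0.25    ; assumption: 0.25 x 60 s = 15 s between regular checks
    retry_interval          0.25    ; assumption: 15 s between re-checks while the host is not UP
    max_check_attempts      5       ; value requested in the first post
    contact_groups          admins
    notification_interval   15      ; left unchanged from the first post
    notification_period     24x7
    notification_options    d,r
}
```

For the 10-second retry mentioned in the first post, the same arithmetic gives 10 / 60, or roughly 0.17. Note that notification_interval sets how often Nagios re-notifies while the host stays down, so it should not change how quickly the first e-mail goes out; that is driven by retry_interval and max_check_attempts. It is worth verifying the config (the -v option of the nagios binary) before restarting.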

HTH

/S