Hello,
I’ve set up Nagios distributed monitoring using the NSCA addon.
Everything is working OK, but there is one thing I cannot figure out.
Normally, when a service fails for the first time (SOFT 1), the next check is scheduled using the retry check interval.
So if you set up a service with a normal check interval of 5 minutes and a retry check interval of 1 minute, it should normally take about 7 minutes (5 + 1 + 1, with three check attempts) before an alert is sent.
But in my situation it takes 15 minutes before an alert is sent.
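The arithmetic behind those two numbers can be sketched in Python. This is only an illustration: the function name is hypothetical, the 3 check attempts are an assumption read off the HARD;3 entry in the central log, and the worst case assumes the failure happens right after a successful check.

```python
def minutes_until_hard_state(normal_interval, retry_interval, max_attempts):
    """Worst-case minutes from a failure to the HARD state.

    Assumes the first failing check (SOFT 1) comes one full normal interval
    after the failure, and each further attempt follows one retry interval
    after the previous one.
    """
    return normal_interval + (max_attempts - 1) * retry_interval


# Assumed values: 5 minute normal interval, 1 minute retry, 3 attempts.
expected = minutes_until_hard_state(5, 1, 3)

# What the delay looks like if retries also run at the 5 minute interval.
observed = minutes_until_hard_state(5, 5, 3)

print(expected, observed)  # 7 15
```

The 15 minutes I am seeing is what you would get if the retry checks were spaced at the normal check interval instead of the retry check interval.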
Does anyone have any ideas?
This is the log of the central Nagios server:
[07-24-2008 07:17:57] SERVICE NOTIFICATION: NB_mail;test;PING;CRITICAL;notify-by-email;CRITICAL - Host Unreachable (192.168.17.204)
[07-24-2008 07:17:57] SERVICE ALERT: test;PING;CRITICAL;HARD;3;CRITICAL - Host Unreachable (192.168.17.204)
[07-24-2008 07:12:57] SERVICE ALERT: test;PING;CRITICAL;SOFT;2;CRITICAL - Host Unreachable (192.168.17.204)
[07-24-2008 07:07:57] SERVICE ALERT: test;PING;CRITICAL;SOFT;1;CRITICAL - Host Unreachable (192.168.17.204)
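To make the spacing in the central log easier to see, here is a small Python sketch that parses the three SERVICE ALERT lines above and prints the gap between consecutive state changes. The helper name and the month-day-year timestamp format are assumptions based on how these particular lines look.

```python
from datetime import datetime

# The three SERVICE ALERT lines copied from the central log above.
CENTRAL_LOG = """\
[07-24-2008 07:17:57] SERVICE ALERT: test;PING;CRITICAL;HARD;3;CRITICAL - Host Unreachable (192.168.17.204)
[07-24-2008 07:12:57] SERVICE ALERT: test;PING;CRITICAL;SOFT;2;CRITICAL - Host Unreachable (192.168.17.204)
[07-24-2008 07:07:57] SERVICE ALERT: test;PING;CRITICAL;SOFT;1;CRITICAL - Host Unreachable (192.168.17.204)
"""


def alert_gaps_minutes(log_text, time_format="%m-%d-%Y %H:%M:%S"):
    """Return the minutes between consecutive SERVICE ALERT entries."""
    times = []
    for line in log_text.splitlines():
        if "SERVICE ALERT:" not in line:
            continue
        # The timestamp sits between the leading "[" and the first "]".
        stamp = line[1:line.index("]")]
        times.append(datetime.strptime(stamp, time_format))
    times.sort()  # the log is newest-first, so put it in time order
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]


gaps = alert_gaps_minutes(CENTRAL_LOG)
print(gaps)  # [5.0, 5.0]
```

Both gaps are 5 minutes, which matches the normal check interval rather than the 1 minute retry check interval.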
This is the log of the Nagios server that sends the check results to the central Nagios server:
[24-07-2008 07:14:59] HOST ALERT: test;DOWN;SOFT;9;CRITICAL - Host Unreachable (192.168.17.204)
[24-07-2008 07:13:49] HOST ALERT: test;DOWN;SOFT;8;CRITICAL - Host Unreachable (192.168.17.204)
[24-07-2008 07:12:39] HOST ALERT: test;DOWN;SOFT;7;CRITICAL - Host Unreachable (192.168.17.204)
[24-07-2008 07:11:29] HOST ALERT: test;DOWN;SOFT;6;CRITICAL - Host Unreachable (192.168.17.204)
[24-07-2008 07:10:21] HOST ALERT: test;DOWN;SOFT;5;CRITICAL - Host Unreachable (192.168.17.204)
[24-07-2008 07:09:09] HOST ALERT: test;DOWN;SOFT;4;CRITICAL - Host Unreachable (192.168.17.204)
[24-07-2008 07:07:59] HOST ALERT: test;DOWN;SOFT;3;CRITICAL - Host Unreachable (192.168.17.204)
[24-07-2008 07:07:49] SERVICE ALERT: test;PING;CRITICAL;HARD;1;CRITICAL - Host Unreachable (192.168.17.204)
[24-07-2008 07:06:49] HOST ALERT: test;DOWN;SOFT;2;CRITICAL - Host Unreachable (192.168.17.204)
[24-07-2008 07:05:39] HOST ALERT: test;DOWN;SOFT;1;PING CRITICAL - Packet loss = 100%