Frequent "Host Check Timed Out" on slow servers

ItalianPenguin · February 9, 2010, 8:10am

Hi all!
We have some servers that are slow (maybe network problems), and sometimes I get
a “host check timed out”.
With Nagstamon this is rather annoying because it shows a false error (in fact, after a minute
Nagios rechecks the host and it is up).
Is there a parameter to increase the time before the host is considered “Timed Out”?
Regards to all
Italian Penguin

luca · February 9, 2010, 11:54am

The standard check-host-alive command uses check_ping. See its options with:
./check_ping --help

ItalianPenguin · February 9, 2010, 1:06pm

Yes, I saw the definition of check-host-alive in commands.cfg,
and it uses check_ping with the parameter -p 5.
Should I increase that parameter, or just omit it?
Thanks once more
Ciao
Italian Penguin

luca · February 9, 2010, 1:12pm

I don’t know the default timeout… but that is what you were looking for.
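
For reference, going by ./check_ping --help: -p is the number of ICMP ECHO packets to send (default 5), while -t is the plugin timeout in seconds (default 10). Below is a minimal sketch of a command definition that raises the plugin timeout. It assumes the stock check-host-alive name, and the value 20 is only an illustration, not a recommendation:

```
# Sketch only: -p = number of packets to send, -t = plugin timeout in seconds.
# The value 20 is an assumed example; tune it for your network.
define command {
    command_name    check-host-alive
    command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5 -t 20
}
```

It is probably wise to keep -t below host_check_timeout in nagios.cfg, since Nagios kills any host check that runs longer than that.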

ItalianPenguin · February 9, 2010, 2:18pm

No luck… I tried setting the parameter to -p 15 or higher:
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 15

and I even removed that parameter:
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100%

It does not work: I keep getting “host check timed out”…
Any other suggestions?
Regards

luca · February 9, 2010, 2:33pm

Do pings to those hosts work when the Nagios check fails?

ItalianPenguin · February 9, 2010, 2:37pm

These hosts are ESX hosts… so of course they are always up and running, but for some reason
check_ping gives a “host check timed out”. It’s rather annoying, because if we plan to send an SMS for
the event “esx host down”, it is not reliable.
Thanks for any suggestions
Regards
Italian Penguin

luca · February 9, 2010, 3:02pm

Notifications shouldn’t start on the first check failure.
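
That comes down to max_check_attempts on the host: Nagios only notifies once the host reaches a hard state, that is, after that many consecutive failed checks. Also, as far as I know, the "(Host Check Timed Out)" text is produced by Nagios itself when a host check runs longer than host_check_timeout in nagios.cfg (default 30 seconds), so that may be the setting to raise. Below is a minimal sketch, assuming Nagios 3.x syntax. The host name, address and template are hypothetical, and the numbers are only illustrative:

```
# nagios.cfg (main config file)
# Seconds Nagios waits before it kills a host check; 60 is an assumed example value.
host_check_timeout=60

# hosts.cfg (object config file)
# Hypothetical ESX host; "generic-host" is assumed to be an existing template.
define host {
    use                   generic-host
    host_name             esx01
    address               192.0.2.10
    check_command         check-host-alive
    max_check_attempts    5      ; failed checks needed before a hard DOWN state
    retry_interval        1      ; time units between rechecks while in a soft state
}
```

With settings like these, a single timed-out check would only put the host in a soft DOWN state, and an SMS would go out only if the rechecks keep failing. Restart Nagios after changing nagios.cfg.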