I have a problem with Nagios when one of the hosts I monitor goes down.
1 Nagios server
8 proxies being monitored, with 79 service checks in total distributed over the proxies.
All are OpenBSD 3.8 systems.
The check scripts are all custom (I wrote them myself). I use SSH with public-key authentication to transfer the scripts from the Nagios server to the Nagios clients, execute them, get the results back, and then remove the scripts from the clients.
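To make the copy, run, clean-up cycle concrete, here is a rough sketch of it in Python. It is not my actual scripts: the function names, the `/tmp` remote directory, and the timeout values are all made up for illustration. It assumes the standard OpenSSH `scp` and `ssh` clients with the `BatchMode` and `ConnectTimeout` options, and it returns the usual Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import os
import shlex
import subprocess

UNKNOWN = 3  # Nagios plugin exit code for "could not determine state"


def build_commands(host, local_script, remote_dir="/tmp", connect_timeout=10):
    """Build the scp and ssh command lines for one copy-run-remove cycle.

    remote_dir and connect_timeout are illustrative defaults, not values
    taken from the real setup.
    """
    remote_path = f"{remote_dir}/{os.path.basename(local_script)}"
    # BatchMode=yes: never prompt for a password (key authentication only).
    # ConnectTimeout: give up quickly when the host does not answer.
    ssh_opts = ["-o", "BatchMode=yes", "-o", f"ConnectTimeout={connect_timeout}"]
    copy = ["scp", *ssh_opts, local_script, f"{host}:{remote_path}"]
    quoted = shlex.quote(remote_path)
    # Run the script, remember its exit code, remove it, return the code.
    remote_cmd = f"{quoted}; rc=$?; rm -f {quoted}; exit $rc"
    run = ["ssh", *ssh_opts, host, remote_cmd]
    return copy, run


def run_check(host, local_script, timeout=30):
    """Copy the script to the host, run it there, and return (code, output)."""
    copy, run = build_commands(host, local_script)
    try:
        subprocess.run(copy, check=True, capture_output=True, timeout=timeout)
        result = subprocess.run(run, capture_output=True, text=True,
                                timeout=timeout)
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired,
            OSError) as exc:
        return UNKNOWN, f"UNKNOWN: could not run check on {host}: {exc}"
    if result.returncode == 255:
        # ssh itself exits with 255 when the connection fails.
        return UNKNOWN, f"UNKNOWN: ssh to {host} failed"
    return result.returncode, result.stdout.strip()
```

A wrapper plugin would call something like `run_check("proxy1", "/path/to/check_script")` (both names hypothetical), print the returned text, and exit with the returned code. The timeouts are there so that one unreachable host cannot keep a check hanging for long.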
If a service goes down, Nagios reports it very quickly (for testing purposes I have set normal_check_interval to 1 minute on all services). The problem occurs when a whole host goes down.
A down host seems to hold up all other service checks. On the forum I found this thread:
meulie.net/portal_plugins/fo … c.php?6965
I read it, and I am currently using check_icmp for both the host check and the ping service check. It is much faster than the regular check_ping binary.
But Nagios still seems to stall when hosts go down. Does anyone know how Nagios currently deals with down hosts? Are all service checks on hosts that appear to be down put on hold? That would be fine, but I want the service checks on ‘healthy’ hosts to continue. Is there a way to achieve this?
Are other users experiencing this kind of problem? If so, what are your solutions?
Thanks in advance!