I’m in the process of setting up a new Nagios server, and so far, I have things working, but not the way I want.
I created (edited) the generic host template, with a check_command of check-host-alive. I then created hosts using that template. When I apply the configuration, Nagios warns that no services are associated with the hosts. Even so, every host readily changes from PENDING to UP.
Now, back to the services thing… I create a generic service template, again using check-host-alive as the check_command. I then create a service and set host=* to apply it to all hosts. I don’t want duplicate checks, so I remove the check_command from the host template. However, when I check the status, every service is UP, but the hosts never change from PENDING.
So, back in the host template, I put the check_command back in and restart Nagios, and every host moves from PENDING to UP. Every service is UP. Beautiful!
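For reference, the configuration at this point looks roughly like the sketch below. The object names (generic-host, generic-service, PING) and the max_check_attempts values are placeholders rather than my exact settings, and required directives such as contacts, check periods and notification settings are left out for brevity.

```
# Host template: carries the host check (names are placeholders)
define host {
    name                    generic-host
    check_command           check-host-alive   ; the host check that moves hosts from PENDING to UP
    max_check_attempts      3                   ; assumed value
    register                0                   ; template only, not a real host
}

# Service template: reuses the same ping-style command as a service check
define service {
    name                    generic-service
    check_command           check-host-alive
    max_check_attempts      3                   ; assumed value
    register                0                   ; template only, not a real service
}

# One service applied to every host via the wildcard
define service {
    use                     generic-service
    host_name               *
    service_description     PING                ; placeholder description
}
```

With both definitions in place, each host has a host check and a service check that run the same command, which is the duplication I’m asking about below.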
The next step was to simulate an outage… and the results are undesirable: Nagios sent a notification for both the host and the service. I really don’t need both.
And now for the questions:
If I have both a host check and a service check defined to ping the host, is Nagios actually doing the check twice?
If a host is down, and a notification is sent… why would Nagios also send notifications for the services on that host? Does Nagios not understand that if a host is dead, its services are also dead?
Why does Nagios issue warnings if a host doesn’t have a service? At this point, I only care if the host is alive or not. If I omit services, will I run into any problems with reliability or accuracy?
Thanks!