We have been using Nagios for some time now, and just recently the other admins and I have noticed that we are no longer receiving notifications when a host or service comes back up. I have version 2.1.0 installed, and when I open the configuration page for my contacts via the web page, under Service Notification Options and Host Notification Options, I have selected “Recovery”.
I have checked in hosts and contacts. I have the defaults set for Down, Unreachable and Recovery … with that said, I have selected “Default” as my option. So when I view what has been selected using Nagmin, it shows “*Down, Unreachable, Recovery”. Can I safely assume that those 3 options are used?
Also, I have rechecked my event logs. When a device enters the Down state (using ping checks), the event log shows the device going down, and then all of the paging goes out … then when the device comes up again, Nagios shows the Up state for the device and that is it … no paging is sent.
Lucas and everyone else … I REALLY appreciate the help that you all give!
I don’t use Nagmin, so I don’t know what could be going wrong. Try checking the config files to see whether you always have the r in notification_options.
Let’s wait for somebody else who uses Nagmin.
I have checked my host.cfg, contact.cfg, and service.cfg, and all have the r option (along with w, u, c for service_notification_options and d and u for host_notification_options).
I have checked the logs, and Nagios will send out a notification when the device or service is down, but will not send one out when the device or service comes back up.
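For anyone who wants to double-check the files by script rather than by eye, here is a minimal sketch of that check. It assumes the standard `define <type>{ ... }` object syntax; the function name `find_missing_recovery`, the host `ftp1` and the contact `admin` are made up for illustration and do not come from this thread.

```python
import re

# Directives that carry notification flags in Nagios object definitions.
OPTION_KEYS = (
    "notification_options",          # host and service definitions
    "host_notification_options",     # contact definitions
    "service_notification_options",  # contact definitions
)

def find_missing_recovery(cfg_text):
    """Return (object_type, directive, value) for each directive lacking the r flag."""
    missing = []
    # Match blocks of the form: define <type>{ ... }
    for obj_type, body in re.findall(r"define\s+(\w+)\s*\{(.*?)\}", cfg_text, re.S):
        for line in body.splitlines():
            line = line.split(";", 1)[0].strip()  # drop inline ; comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            if len(parts) == 2 and parts[0] in OPTION_KEYS:
                flags = [flag.strip() for flag in parts[1].split(",")]
                if "r" not in flags:
                    missing.append((obj_type, parts[0], parts[1]))
    return missing

# Hypothetical sample; in practice, read the text of host.cfg, contact.cfg, etc.
sample = """
define host{
    host_name               ftp1
    notification_options    d,u
    }
define contact{
    contact_name                    admin
    host_notification_options       d,u,r
    service_notification_options    w,u,c,r
    }
"""

for item in find_missing_recovery(sample):
    print(item)
```

An empty result only means the r flag is present everywhere in the text that was scanned. It says nothing about what Nagmin actually writes out or which files Nagios loads.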
Thanks for the help Lucas … I will get this thing to work …
Were you notified of a service recovering? If so, then it’s assumed that if the service recovered, the host did also.
Example: I powered off my ftp server. I am notified of the HOST being down, because first the service check failed, then Nagios looked to see if the host was ok, and it wasn’t. So Nagios knows that it’s not a service problem but a host problem, and I’m notified of that fact.
Now, I power on the ftp server. Nagios is still looking in its schedule of checks to make, and all those checks are for SERVICES ONLY. This time, the ftp service check passed, and since that is a state change, I am notified of it: the notification says my ftp service is ok on host “whatever”. So of course, I have to assume that the host has recovered.
Now, as another example, to find out if what you say is true, try this:
Example: I do the same as above, except that when I power back on, the ftp service is still not working. Nagios does its checks again, and sees that the service still fails. It then checks to see if the host is still bad, and it is found to be OK. So again, this is a state change and you are notified that the HOST has recovered. You are also notified that the service ftpd is not running.
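The two examples above can be sketched as a toy model. This is only an illustration of the behaviour described in this post, not Nagios's actual logic: it ignores retries, soft/hard states and notification intervals, and the names `process_service_check` and `state` are made up for the sketch.

```python
def process_service_check(state, service_ok, host_ok):
    """Toy model of the two examples above; not Nagios's real implementation.

    state holds the last known 'host_up' and 'service_ok' values.
    host_ok is only consulted when the service check fails, mirroring
    the on-demand host check described in the examples.
    """
    notes = []
    if service_ok:
        if not state["service_ok"]:
            notes.append("SERVICE RECOVERY")
        # A passing service check implies the host is up; no host check runs.
        state["service_ok"] = True
        state["host_up"] = True
    else:
        if host_ok:
            if not state["host_up"]:
                notes.append("HOST RECOVERY")
            # Report the service problem if it was hidden behind a host outage
            # or if the service has only just failed.
            if state["service_ok"] or not state["host_up"]:
                notes.append("SERVICE PROBLEM")
            state["host_up"] = True
        else:
            if state["host_up"]:
                notes.append("HOST DOWN")
            state["host_up"] = False
        state["service_ok"] = False
    return notes

# Example 1: power off, then power on with ftp working again.
s1 = {"host_up": True, "service_ok": True}
print(process_service_check(s1, service_ok=False, host_ok=False))
print(process_service_check(s1, service_ok=True, host_ok=True))

# Example 2: power off, then power on with ftp still broken.
s2 = {"host_up": True, "service_ok": True}
print(process_service_check(s2, service_ok=False, host_ok=False))
print(process_service_check(s2, service_ok=False, host_ok=True))
```

In this model the first example never produces a host recovery message, only a service recovery, while the second example produces a host recovery followed by a service problem.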