When I shut down a server I am monitoring (ping), the service detail window shows “critical”.
But at the same time, when I look at the host detail window, I see the server is in state "UP".
Has anyone got an idea what causes this?
Does it stay that way all the time, or does it get correct data after a while?
Luca
“At the same time”…
Mayhap that’s your problem? Are you sure you’re allowing Nagios enough time to check that the host is down once it finds that ping is failing? Wait a while and see if Nagios is still showing the host to be up. It might just be that Nagios is executing the failed PING check and has yet to execute the host check.
It remains all the time in status “UP”.
The ping service I monitor says "critical".
The server has been down for three days now and the host status still isn’t updated, so I think Nagios has had enough time to update.
Any ideas??
10x Geronnimo00111
I think you have more than one nagios running.
Do this:
ps -ef|grep nagios
/etc/rc.d/init.d/nagios stop
ps -ef|grep nagios
Does the second ps still show any nagios processes running?
If so, kill them. They will look like /usr/local/nagios2/bin/nagios -d /usr/local/nagios2/etc/nagios.cfg
Now start nagios again and your problem should be gone.
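The check above can be sketched as a small shell helper. This is only a sketch: the function name count_nagios_procs is made up for illustration, and the pattern it matches assumes the daemon shows up in ps as ".../bin/nagios -d ...", as in the example line above.

```shell
#!/bin/sh
# Hypothetical helper: count Nagios daemon processes in `ps -ef` style
# output read from stdin. Matching on "/bin/nagios -d" (an assumption based
# on the process line quoted above) keeps the "grep nagios" line itself
# from being counted.
count_nagios_procs() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps the function's exit status clean in that case.
    grep -c -- '/bin/nagios -d' || true
}

# Typical use (not run here):
#   ps -ef | count_nagios_procs
```

After stopping Nagios, piping ps -ef into it should print 0. Anything higher means a stray daemon is still running and needs to be killed before you start Nagios again.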
10x, but that ain’t the prob; there’s just one nagios process running.
So, you didn’t shut it down as I suggested? OK, well, can’t help ya then, since that should clear up the problem if they are all killed.
jakkedup,
I did shut down as you suggested.
At that time there was no nagios ps running.
I started the process again, but that didn’t solve the problem.
Found the problem.
There was a problem in my template in host.cfg.
10x a lot guys
Would you mind sharing your problem/fix? It might help someone who comes across this problem later on.
The first config of my host.cfg was the default you get with the Nagios sample:
define host{
name generic-host ; The name of this host template
notifications_enabled 1 ; Host notifications are enabled
event_handler_enabled 0 ; Host event handler is disabled
flap_detection_enabled 1 ; Flap detection is enabled
failure_prediction_enabled 1 ; Failure prediction is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}
To solve the problem I created a new template with a check command.
# ‘HTemplate_LAN’ host definition
define host{
name generic-host
check_command check-host-alive
max_check_attempts 3
checks_enabled 1
event_handler_enabled 0
flap_detection_enabled 1
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
notification_interval 0
notification_period 24x7
notification_options d,u,r
notifications_enabled 1
register 0
}
This solved the problem.
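For anyone hitting the same thing: as far as I understand it, a host with no check_command is never actively checked, so Nagios just keeps assuming it is UP, which matches what happened here. Below is a rough sketch for spotting host definitions that have no check_command of their own. The function name hosts_without_check_command is made up for illustration. It looks at each define host block in isolation and does not follow "use" inheritance, so a host that inherits its check_command from a template will still be listed.

```shell
#!/bin/sh
# Hypothetical helper: print the name (or host_name) of every
# "define host{ ... }" block that contains no check_command line.
# Assumption: one directive per line, as in the sample config above.
# It does NOT resolve "use" inheritance between templates and hosts.
hosts_without_check_command() {
    awk '
        # Start of a host block: reset the per-block state.
        /^[[:space:]]*define[[:space:]]+host[[:space:]]*[{]/ {
            in_host = 1; has_cmd = 0; label = "(unnamed)"; next
        }
        in_host && $1 == "check_command" { has_cmd = 1 }
        in_host && ($1 == "name" || $1 == "host_name") { label = $2 }
        # End of the block: report it if no check_command was seen.
        in_host && /^[[:space:]]*[}]/ {
            if (!has_cmd) print label
            in_host = 0
        }
    ' "$@"
}

# Typical use (not run here; the path is an assumption):
#   hosts_without_check_command /usr/local/nagios2/etc/host.cfg
```

Run against the first template above it would list generic-host; against the fixed one it prints nothing. After changing the config it is worth running nagios -v on your nagios.cfg before restarting.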