Hi
I am new to Nagios.
I have a server that hosts Nagios. I would like to use Nagios to monitor the status of a radius service on a different server and restart that service when it goes into a HARD failure state.
#Restart the radiusd service in the event of termination
define service{
        use                     local-service
        host_name               radius2
        service_description     radiusd
        check_command           check_procs_radius
        max_check_attempts      2
        event_handler           restart-radiusd
        }
Command object:
# Event handler command that restarts radiusd
define command{
        command_name    restart-radiusd
        command_line    $USER2$/restart-radiusd -H $HOSTADDRESS$ $SERVICESTATE$ $STATETYPE$ $SERVICEATTEMPT$
        }
Script:
#!/bin/sh
#
# Event handler script for restarting the RADIUS server on the local machine
#
# Note: This script will only restart the RADIUS server if the service is
#       retried 3 times (in a "soft" state) or if the RADIUS service somehow
#       manages to fall into a "hard" error state.
#

# What state is the RADIUS service in?
case "$1" in
OK)
    # The service just came back up, so don't do anything...
    ;;
WARNING)
    # We don't really care about warning states, since the service is probably still running...
    ;;
UNKNOWN)
    # We don't know what might be causing an unknown error, so don't do anything...
    ;;
CRITICAL)
    # The RADIUS service appears to have a problem - perhaps we should restart the server...
    # Is this a "soft" or a "hard" state?
    case "$2" in

    # We're in a "soft" state, meaning that Nagios is in the middle of retrying the
    # check before it turns into a "hard" state and contacts get notified...
    SOFT)
        # What check attempt are we on? We don't want to restart the RADIUS server on the first
        # check, because it may just be a fluke!
        case "$3" in

        # Wait until the check has been tried 3 times before restarting the RADIUS server.
        # If the check fails on the 4th time (after we restart the RADIUS server), the state
        # type will turn to "hard" and contacts will be notified of the problem.
        # Hopefully this will restart the RADIUS server successfully, so the 4th check will
        # result in a "soft" recovery. If that happens no one gets notified because we
        # fixed the problem!
        3)
            #echo -n "Restarting Radius service (3rd soft critical state)..."
            # Call the init script to bring the RADIUS server back up
            /etc/init.d/radiusd start
            ;;
        esac
        ;;

    # The RADIUS service somehow managed to turn into a hard error without getting fixed.
    # It should have been restarted by the code above, but for some reason it didn't.
    # Let's give it one last try, shall we?
    # Note: Contacts have already been notified of a problem with the service at this
    # point (unless you disabled notifications for this service)
    HARD)
        #echo -n "Restarting Radius service..."
        # Call the init script to bring the RADIUS server back up
        /etc/init.d/radiusd start
        ;;
    esac
    ;;
esac
exit 0
The problem is that the service does not appear to be restarted on the radius machine. I think the event handler is being run as a local operation on the Nagios server instead. Can anyone help me get Nagios to restart the radius service on the remote host?
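For illustration, here is a minimal sketch of what a handler that acts on the remote host might look like. Nagios runs event handler commands on the Nagios server itself, so the restart has to be sent to the monitored host in some way. Assumptions that are not part of the setup above: the Nagios user can log in to radius2 over ssh with a key and is allowed to run the init script there, and the names restart_action, handle_event and REMOTE_RUN are made up for this example. With the command_line as shown, $1 is -H and $2 is the host address, so the sketch consumes those two arguments before looking at the state.

```shell
#!/bin/sh
# Sketch only: restart_action, handle_event and REMOTE_RUN are hypothetical
# names, and ssh access from the Nagios server to the host is an assumption.

# restart_action STATE STATETYPE ATTEMPT
# Prints "restart" if the handler should act, "none" otherwise.
restart_action() {
    case "${1:-}" in
    CRITICAL)
        case "${2:-}" in
        SOFT)
            # Act only on the 3rd soft attempt, as in the script above.
            if [ "${3:-}" = "3" ]; then echo restart; else echo none; fi
            ;;
        HARD)
            echo restart
            ;;
        *)
            echo none
            ;;
        esac
        ;;
    *)
        # OK, WARNING and UNKNOWN need no action.
        echo none
        ;;
    esac
}

# handle_event -H HOST STATE STATETYPE ATTEMPT
# Follows the argument order of the command_line shown earlier.
handle_event() {
    if [ "${1:-}" != "-H" ] || [ -z "${2:-}" ]; then
        echo "usage: handle_event -H host state statetype attempt" >&2
        return 2
    fi
    host="$2"
    shift 2
    if [ "$(restart_action "${1:-}" "${2:-}" "${3:-}")" = "restart" ]; then
        # Run the init script on the monitored host, not on the Nagios server.
        # REMOTE_RUN defaults to ssh; set it to echo for a dry run.
        ${REMOTE_RUN:-ssh} "$host" /etc/init.d/radiusd start
    fi
}
```

Ending the script with handle_event "$@" would make it usable as the restart-radiusd handler. Setting REMOTE_RUN=echo prints the remote command instead of running it, which helps when checking the argument order by hand. With max_check_attempts 2 as in the service definition, the state turns HARD on the second failed check, so the SOFT branch for attempt 3 would never fire and only the HARD branch matters.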
Regards