Nagios problem?


#1

My name is Harry. I am trying to use Nagios's check_processes script (see the bottom of this message for the script) to check my environment. The check_processes script works perfectly from the command line.
E.g: nagios@sftest#/usr/local/nagios/libexec/check_processes
Not OK - 2 processes NOT running: cvs mysql

My machine doesn’t have cvs and mysql running.

However, when I apply this script in the Nagios configuration, I don't see the “Not OK - 2 processes NOT running: cvs mysql” message in the Nagios web interface. Here is what I did:

  1. Copy the check_processes file to /usr/local/nagios/libexec/ (where all the plugins are located)
  2. Add lines in /usr/local/nagios/etc/checkcommands.cfg
    define command {
    command_name check_processes
    command_line /usr/local/nagios/libexec/check_processes
    }
  3. Add lines in /usr/local/nagios/etc/services.cfg
    define service {
    use generic-service
    host_name dbinhouse
    service_description Check processes
    is_volatile 0
    check_period 24x7
    max_check_attempts 3
    normal_check_interval 5
    retry_check_interval 1
    contact_groups sveng-admin
    notification_interval 60
    notification_period 24x7
    notification_options c,r
    check_command check_processes
    }

Here is what i see from Nagios Web interface:

  1. Check processes CRITICAL 03-06-2008 14:35:53 3d 2h 48m 37s 3/3 -------------------------------------------
    Only the dot-dot line shows in the “Status Information” column.
  2. Nothing is listed like what I get when I run it from the command line (e.g. Not OK - 2 processes NOT running: cvs mysql)

Any help is much appreciated. Did I miss anything?

Thanks,
Harry-

check_processes script starts here:

# You may have to change this, depending on where you installed your
# Nagios plugins

PROGNAME="check_processes"
PATH="/usr/bin:/usr/sbin:/bin:/sbin"
LIBEXEC="/usr/local/nagios/libexec"
. $LIBEXEC/utils.sh

# DEFINING THE PROCESS LIST

#LIST="init"
LIST="nagios cvs sendmail httpd mysql"

# REQUISITE NAGIOS COMMAND LINE STUFF

print_usage() {
echo "Usage: $PROGNAME"
echo "Usage: $PROGNAME --help"
}

print_help() {
echo ""
print_usage
echo ""
echo "Basic processes list monitor plugin for Nagios"
echo ""
echo "This plugin not developped by the Nagios Plugin group."
echo "Please do not e-mail them for support on this plugin, since"
echo "they won’t know what you’re talking about …"
echo ""
echo "For contact info, read the plugin itself..."
}

while test -n "$1"
do
case "$1" in
--help) print_help; exit $STATE_OK;;
-h) print_help; exit $STATE_OK;;
*) print_usage; exit $STATE_UNKNOWN;;
esac
done

# FINALLY THE MAIN ROUTINE

COUNT="0"
DOWN=""

for PROCESS in `echo $LIST`
do
if [ `ps -ef | grep -i $PROCESS | grep -v grep | wc -l` -lt 1 ]
then
let COUNT=$COUNT+1
DOWN="$DOWN $PROCESS"

    fi

echo "-------------------------------------------"
echo "Processes checked: $PROCESS "

done
echo "-------------------------------------------"
if [ $COUNT -gt 0 ]
then
echo "Not OK - $COUNT processes NOT running: $DOWN"
exit $STATE_CRITICAL
fi

# Nothing caused us to exit early, so we're okay.

echo "OK - All requisite processes running."
exit $STATE_OK

End here ---------


#2

i’d be tempted to get rid of:
echo "-------------------------------------------"
echo "Processes checked: $PROCESS "

done
echo "-------------------------------------------"
HTH

/S


#3

Thanks for your quick response.
Per your suggestion, I commented them out, and now I get a “(No output!)” message in the web interface. I don't see the dot-dot anymore. When I run ./check_processes from the command line, I get the error “./check_processes: line 90: syntax error: unexpected end of file”. Line 90 is the line with echo "Processes checked: $PROCESS ".

Any help??
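
An "unexpected end of file" error usually means the shell reached the end of the script while a block was still open. Here that is most likely the for/do loop, whose closing done was commented out along with the echo lines. The sketch below shows one way to confirm this without running the plugin, using the shell's -n option, which parses a script without executing it. The two tiny scripts and their file names are made up for the demonstration.

```shell
#!/bin/sh
# Demonstration only: two throwaway scripts, one with its "done" commented out.
tmpdir=$(mktemp -d)

cat > "$tmpdir/broken.sh" <<'EOF'
for PROCESS in nagios cvs
do
  echo "checking $PROCESS"
#done
EOF

cat > "$tmpdir/fixed.sh" <<'EOF'
for PROCESS in nagios cvs
do
  echo "checking $PROCESS"
done
EOF

# sh -n only parses the file; a non-zero exit status means a syntax error.
if sh -n "$tmpdir/broken.sh" 2>/dev/null; then broken_status=ok; else broken_status=syntax-error; fi
if sh -n "$tmpdir/fixed.sh" 2>/dev/null; then fixed_status=ok; else fixed_status=syntax-error; fi

echo "broken.sh: $broken_status"
echo "fixed.sh: $fixed_status"
rm -rf "$tmpdir"
```

Running sh -n /usr/local/nagios/libexec/check_processes after each edit should catch this kind of mistake before Nagios runs the plugin.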


#4

Oops, my bad… try this instead.
Get rid of:

echo "-------------------------------------------"
echo "Processes checked: $PROCESS "

Leave in:

done

Get rid of:

echo "-------------------------------------------"

Sorry bout that, must've been half asleep.

HTH
/S
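
The reason this helps: Nagios shows the first line a plugin prints in the Status Information column, so a plugin that prints a row of dashes first will show only the dashes. Below is a minimal sketch of the main routine that prints a single status line. It is not the original plugin: is_running and check_list are names invented for this example, and the exit codes are written out by hand where the real script gets them from utils.sh.

```shell
#!/bin/sh
# Standard Nagios plugin exit codes (utils.sh normally defines these).
STATE_OK=0
STATE_CRITICAL=2

# Hypothetical helper: succeeds when at least one process matches the name.
is_running() {
    ps -ef | grep -i "$1" | grep -v grep > /dev/null
}

# Prints exactly one status line and returns the matching Nagios state.
check_list() {
    count=0
    down=""
    for process in "$@"
    do
        if ! is_running "$process"
        then
            count=$((count + 1))
            down="$down $process"
        fi
    done
    if [ "$count" -gt 0 ]
    then
        echo "Not OK - $count processes NOT running:$down"
        return $STATE_CRITICAL
    fi
    echo "OK - All requisite processes running."
    return $STATE_OK
}

# Process names taken from the plugin's LIST variable.
status=$STATE_OK
check_list nagios cvs sendmail httpd mysql || status=$?
# A real plugin would finish with: exit $status
```

Any extra detail, such as the list of processes checked, is better placed after the status line or left out.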


#5

Finally, it works… It can now display the message about all non-running services in the web interface.
However, the check_processes script only checks services on the Nagios server, NOT on the client (in my case, dbinhouse). I want it to check services on my client and report back to the web interface.

How can I make that work? Do I need to edit the script?
What should I do on the client side?

Thanks,

Harry-


#6

Hi

This script will only check the processes on the box on which it is running, so you need to copy it to your target client and execute it from there somehow. Options for doing that include installing NRPE (the daemon on the client and the check_nrpe plugin on the server), or possibly using the slightly more CPU-intensive check_by_ssh check. Take a look at the NRPE documentation on the Nagios website.

HTH

/S
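
For the NRPE route, the pieces might look roughly like the fragment below. This is a sketch under assumptions, not a tested setup: it assumes NRPE is already running on dbinhouse, that check_processes and utils.sh were copied to the same libexec path on the client, and that check_nrpe is installed in the server's libexec directory. The location of nrpe.cfg depends on how NRPE was installed, and its allowed_hosts setting must include the Nagios server's address.

```
# On the client (dbinhouse), in nrpe.cfg: expose the plugin under a command name.
command[check_processes]=/usr/local/nagios/libexec/check_processes

# On the Nagios server, in checkcommands.cfg: ask NRPE on the target host
# to run the command named in the first argument.
define command {
    command_name check_nrpe
    command_line /usr/local/nagios/libexec/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}

# On the Nagios server, in services.cfg: point the existing service at it.
#   check_command check_nrpe!check_processes
```

Before changing the service, it is worth running /usr/local/nagios/libexec/check_nrpe -H dbinhouse -c check_processes by hand on the server to confirm that the client answers with the expected status line.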