Check Apache and MySQL - Restart them if down


#1

Is Nagios overkill for this? Any other options?

I have it set up to monitor Apache and MySQL. If Apache is down, I can’t see the Nagios web interface, but Nagios itself should still be running … correct?

I’ve used an example restart-httpd script (and modified a copy for MySQL as well), but my event_handler is not triggered when I stop these services for testing. How should I debug this?

Thanks,
sg


#2

No, Nagios is not overkill for this usage.
The reason the event handler doesn’t trigger may be state retention: settings retained from a previous run can take precedence over what the config files say. Use the CGI pages to enable the event handler for the service and/or host. The link is named “Enable event handler for this service”.
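
If the CGI link is hard to reach (for example because Apache itself is the service that is down), the same toggle can probably be sent through the external command file. A minimal sketch, assuming external commands are enabled (check_external_commands=1) and that the command file is at /usr/local/nagios/var/rw/nagios.cmd, which is the usual location for a source install but may differ on your system. The helper name enable_svc_event_handler is made up for this example; the host and service names are the ones that appear in the log lines in this thread.

```shell
#!/bin/sh
# Assumed path for a source install; match it to command_file in nagios.cfg.
CMD_FILE="${CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

# Write an ENABLE_SVC_EVENT_HANDLER external command for one service.
# $1 = host_name, $2 = service_description (as in the object definitions)
enable_svc_event_handler() {
    now=$(date +%s)
    printf '[%s] ENABLE_SVC_EVENT_HANDLER;%s;%s\n' "$now" "$1" "$2" >> "$CMD_FILE"
}

# Example call (run as a user allowed to write to the command file):
#   enable_svc_event_handler krd2 HTTP
```

Afterwards the service detail page should show the event handler as enabled, which is the same thing the CGI link does.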


#3

Thanks for the info. I have enabled event handlers in nagios.cfg, and the CGIs show them as enabled (green).

I tried su - nagios
and then ran the same command that the script uses, but the nagios user doesn’t have permission:

Starting httpd: (13)Permission denied: make_sock: could not bind to address [::]:80

Starting httpd: (13)Permission denied: make_sock: could not bind to address [::]:443

Starting httpd: Syntax error on line 117 of /etc/httpd/conf.d/ssl.conf:
SSLCertificateFile: file ‘/etc/httpd/conf/ssl.crt/server.crt’ does not exist or is empty

What is the best way to give the nagios user permission to start Apache? The docs mention that you have to figure this out yourself, possibly with sudo. Once nagios is allowed to use sudo, what comes next? Or is there an easier way?

Thanks again,
sg


#4

Another question: my e-mail alerts report the host as ‘localhost’ even though I’ve specified a name in the host definition and in the service definition. Is that because I use the IP 127.0.0.1 in the host definition, so that it gets interpreted and reported as localhost? I would rather see the hostname. Also, is there any way to change the from address, which is currently nagios @ localhost.localdomain?

nagios @ localhost.localdomain
***** Nagios *****
Notification Type: PROBLEM
Service: HTTP
Host: localhost
Address: 127.0.0.1
State: CRITICAL
Date/Time: Sat Aug 12 10:54:05 EDT 2006
Additional Info:
Connection refused

Also, regarding event handlers: when MySQL is stopped (the nagios user cannot restart it either), I get this in the log:

[1155394174] SERVICE EVENT HANDLER: krd2;MYSQL;CRITICAL;HARD;3;restart-mysqld
[1155394205] Warning: Service event handler command ‘/usr/local/nagios/libexec/eventhandlers/restart-mysqld CRITICAL HARD 3’ timed out after 30 seconds

But when apache is stopped I only get this:

[1155394445] SERVICE EVENT HANDLER: krd2;HTTP;CRITICAL;HARD;3;restart-httpd

Should I get more debug output for Apache, as I do for MySQL? Or is MySQL providing more info to Nagios, or is it a difference in verbosity between the plugins (check_mysql vs. check_http)?
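
For comparing the two cases it may help to pull only the event handler entries for one service out of the log. A rough sketch, assuming the log is at /usr/local/nagios/var/nagios.log (the usual place for a source install; check log_file in nagios.cfg). The function name handler_entries is made up for this example.

```shell
#!/bin/sh
# Assumed log location for a source install; check log_file in nagios.cfg.
NAGIOS_LOG="${NAGIOS_LOG:-/usr/local/nagios/var/nagios.log}"

# Print every event handler entry (handler runs and timeout warnings)
# that mentions the given service.
# $1 = service name as it appears in the log, e.g. HTTP or MYSQL
handler_entries() {
    grep -i 'event handler' "$NAGIOS_LOG" | grep -i -- "$1"
}

# Example calls:
#   handler_entries MYSQL
#   handler_entries HTTP
```

The "timed out after 30 seconds" warning comes from Nagios itself, not from the plugin, and the 30 seconds corresponds to event_handler_timeout in nagios.cfg. If the HTTP handler shows no such warning, the script most likely returned within the timeout; whether it actually restarted Apache has to be checked separately.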

Thanks,
sg


#5

Try sudo:
sudo /usr/local/nagios/libexec/eventhandlers/restart-httpd CRITICAL SOFT 3
If it doesn’t work, fix your sudoers file. Run
visudo
and add these lines to the file:
%nagios ALL = NOPASSWD: /usr/local/nagios/libexec/eventhandlers/restart-httpd CRITICAL SOFT 3
%nagios ALL = NOPASSWD: /usr/local/nagios/libexec/eventhandlers/restart-httpd CRITICAL HARD 4
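
Another arrangement is to leave the handler itself unprivileged and put sudo only on the restart command inside it. A minimal sketch, loosely following the example handler in the Nagios docs. The restart command (/sbin/service httpd restart) and the decide_action helper are assumptions made for this example, not something taken from your setup.

```shell
#!/bin/sh
# Sketch of an event handler, called by Nagios as:
#   restart-httpd <state> <state type> <attempt>
# RESTART_CMD is an assumption; use whatever restarts Apache on your system.
RESTART_CMD="${RESTART_CMD:-sudo /sbin/service httpd restart}"

# Decide whether this state/type/attempt combination warrants a restart.
# Prints "restart" or "skip".
decide_action() {
    state=$1; state_type=$2; attempt=$3
    if [ "$state" != "CRITICAL" ]; then
        echo skip
        return
    fi
    case "$state_type" in
        SOFT)
            # Act on the third soft failure only, before the state goes hard.
            if [ "$attempt" -eq 3 ]; then echo restart; else echo skip; fi ;;
        HARD)
            echo restart ;;
        *)
            echo skip ;;
    esac
}

# Only act when called with the three handler arguments.
if [ "$#" -ge 3 ] && [ "$(decide_action "$1" "$2" "$3")" = "restart" ]; then
    $RESTART_CMD
fi
```

With this layout the sudoers entry would need to cover the restart command rather than the handler script and its arguments.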


#6

Thanks - that worked, although I had to extend the sudo entries to cover SOFT 1, SOFT 2, HARD 3, and HARD 4, and prepend ‘sudo’ to the command definitions.

Thanks again.
sg

p.s. Now my notify-by-epager (to Cingular) seems to fail (errors from maillog are below).

I was confused by the (501/501), but I think this is the nagios user’s uid/gid and not a mail error, or is it? I’m still a little confused, because the messages appear to be accepted for delivery but I never receive the pages, and some attempts show 421 errors. Do I need to change anything in sendmail.mc?
I think I need to set up a reverse DNS name for this machine; that may solve the 421s …
e.g. mail-archive.com/nagios-user … 05887.html

(k7D3IWkS032403 Message accepted for delivery)
Aug 12 23:18:32 hostname sendmail[32405]: k7D3IWkS032403: to=[email protected], delay=00:00:00, xdelay=00:00:00, mailer=esmtp, pri=120497, relay=atlsmtp.cingularme.net. [66.102.165.114], dsn=4.0.0, stat=Deferred: 421 Service not available
Aug 12 23:18:32 hostname sendmail[32397]: k7D3IWsr032395: to=[email protected], delay=00:00:00, xdelay=00:00:00, mailer=esmtp, pri=120564, relay=gmail-smtp-in.l.google.com. [66.249.83.114], dsn=2.0.0, stat=Sent (OK 1155439122 i13si5706559wxd)
Aug 12 23:20:52 hostname sendmail[32525]: k7D3Kqh2032525: from=nagios, size=289, class=0, nrcpts=1, msgid=[email protected], [email protected]
Aug 12 23:20:52 hostname sendmail[32526]: k7D3KqEN032526: from=[email protected], size=557, class=0, nrcpts=1, msgid=[email protected], proto=ESMTP, daemon=MTA, relay=mydomain.com [127.0.0.1]
Aug 12 23:20:52 hostname sendmail[32525]: k7D3Kqh2032525: [email protected], ctladdr=nagios (501/501), delay=00:00:00, xdelay=00:00:00, mailer=relay, pri=30289, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (k7D3KqEN032526 Message accepted for delivery)
Aug 12 23:20:52 hostname sendmail[32533]: k7D3KqoN032533: from=nagios, size=214, class=0, nrcpts=1, msgid=[email protected], [email protected]
Aug 12 23:20:52 hostname sendmail[32534]: k7D3KqLX032534: from=[email protected], size=489, class=0, nrcpts=1, msgid=[email protected], proto=ESMTP, daemon=MTA, relay=mydomain.com [127.0.0.1]
Aug 12 23:20:52 hostname sendmail[32533]: k7D3KqoN032533: [email protected], ctladdr=nagios (501/501), delay=00:00:00, xdelay=00:00:00, mailer=relay, pri=30214, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (k7D3KqLX032534 Message accepted for delivery)
Aug 12 23:20:52 hostname sendmail[32536]: k7D3KqLX032534: to=[email protected], delay=00:00:00, xdelay=00:00:00, mailer=esmtp, pri=120489, relay=atlsmtp.cingularme.net. [66.102.165.114], dsn=4.0.0, stat=Deferred: 421 Service not available
Aug 12 23:20:52 hostname sendmail[32528]: k7D3KqEN032526: to=[email protected], delay=00:00:00, xdelay=00:00:00, mailer=esmtp, pri=120557, relay=gmail-smtp-in.l.google.com. [66.249.83.114], dsn=2.0.0, stat=Sent (OK 1155439262 h8si6030418wxd)
Aug 12 23:21:32 hostname sendmail[32550]: k7D3LWCU032550: from=nagios, size=370, class=0, nrcpts=1, msgid=[email protected], [email protected]
Aug 12 23:21:32 hostname sendmail[32551]: k7D3LWVx032551: from=[email protected], size=638, class=0, nrcpts=1, msgid=[email protected], proto=ESMTP, daemon=MTA, relay=mydomain.com [127.0.0.1]
Aug 12 23:21:32 hostname sendmail[32550]: k7D3LWCU032550: [email protected], ctladdr=nagios (501/501), delay=00:00:00, xdelay=00:00:00, mailer=relay, pri=30370, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (k7D3LWVx032551 Message accepted for delivery)
Aug 12 23:21:32 hostname sendmail[32558]: k7D3LWqh032558: from=nagios, size=295, class=0, nrcpts=1, msgid=[email protected], [email protected]
Aug 12 23:21:32 hostname sendmail[32559]: k7D3LWs9032559: from=[email protected], size=570, class=0, nrcpts=1, msgid=[email protected], proto=ESMTP, daemon=MTA, relay=mydomain.com [127.0.0.1]
Aug 12 23:21:32 hostname sendmail[32558]: k7D3LWqh032558: [email protected], ctladdr=nagios (501/501), delay=00:00:00, xdelay=00:00:00, mailer=relay, pri=30295, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (k7D3LWs9032559 Message accepted for delivery)
Aug 12 23:21:32 hostname sendmail[32561]: k7D3LWs9032559: to=[email protected], delay=00:00:00, xdelay=00:00:00, mailer=esmtp, pri=120570, relay=atlsmtp.cingularme.net. [66.102.165.114], dsn=4.0.0, stat=Deferred: 421 Service not available
Aug 12 23:21:34 hostname sendmail[32553]: k7D3LWVx032551: to=[email protected], delay=00:00:02, xdelay=00:00:02, mailer=esmtp, pri=120638, relay=gmail-smtp-in.l.google.com. [66.249.83.114], dsn=2.0.0, stat=Sent (OK 1155439304 i13si5709231wxd)
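
To see at a glance which relays are deferring mail, the log can be reduced to just those relay names. A small sketch, assuming sendmail log lines in the format shown above; the function name deferred_relays is made up for this example, and /var/log/maillog in the usage comment is an assumed path.

```shell
#!/bin/sh
# List the distinct relay hosts that returned a deferred status.
# $1 = path to the mail log
deferred_relays() {
    grep 'stat=Deferred' "$1" \
        | sed -n 's/.*relay=\([^ ,]*\).*/\1/p' \
        | sort -u
}

# Example call (path is an assumption):
#   deferred_relays /var/log/maillog
```

This only summarizes what the log already shows; it does not say why the relay answers with 421.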