Weird Notification Problem


#1

Hi everyone,
I’m configuring Nagios 3.0.6 to send notifications via email. I’ve configured commands.cfg, contacts.cfg, nagios.cfg, etc.
This is what gets written to nagios.log:

[1230743045] SERVICE NOTIFICATION: ops;AG010-US;Proxy 8008;OK;notify-service-by-email;HTTP OK HTTP/1.1 200 OK - 320 bytes in 0.045 seconds
[1230743076] Warning: Contact 'ops' service notification command '/usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: RECOVERY\n\nService: Proxy 8008\nHost: AG010-US\nAddress: 72.11.128.231\nState: OK\n\nDate/Time: Wed Dec 31 09:04:05 PST 2008\n\nAdditional Info:\n\nHTTP OK HTTP/1.1 200 OK - 320 bytes in 0.045 seconds" | /bin/mail -s "** RECOVERY Service Alert: AG010-US/Proxy 8008 is OK **" seo-ops@inquirogroup.org' timed out after 30 seconds
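
The two epoch timestamps in the log excerpt above put a number on the delay. A quick check (the variable names are made up for illustration) suggests the warning was logged just after the 30-second limit expired:

```shell
# Timestamps copied from the two log entries above (seconds since the epoch).
notified=1230743045
timed_out=1230743076

# Seconds between the notification entry and the timeout warning.
elapsed=$((timed_out - notified))
echo "$elapsed"
```

That gap is what you would expect if Nagios waited out the full timeout and then gave up on the command.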

/var/log/maillog suggests that the request never reaches sendmail: I grepped the maillog file for the email address (seo-ops@inquirogroup.org) and found nothing.
I have a nagiosadmin contact whose address is nagios@localhost. Mail to that address is the only mail that gets delivered.

If I ran the command myself, as user nagios or any other, it works fine and the mail arrives. The command I ran was:
/usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: RECOVERY\n\nService: Proxy 8008\nHost: AG010-US\nAddress: 72.11.128.231\nState: OK\n\nDate/Time: Wed Dec 31 09:04:05 PST 2008\n\nAdditional Info:\n\nHTTP OK HTTP/1.1 200 OK - 320 bytes in 0.045 seconds" | /bin/mail -s "** RECOVERY Service Alert: AG010-US/Proxy 8008 is OK **" seo-ops@inquirogroup.org

Please help, I can’t even tell if this is a Nagios problem or a sendmail one…
Thanks


#2

This is purely a sendmail problem: Nagios is able to send email to localhost but not to an external address. Please check the sendmail configuration, make the changes needed to relay those domains, and check sendmail.mc as well.


#3

From aperez199:

          "If i ran the command myself, as user nagios or any other, it works fine and the mail arrives."

This would certainly appear to rule out a sendmail configuration problem, unless I’m missing something? I suddenly find myself experiencing a similar problem - the nagios alerts are not being delivered, but I can cut and paste the command from the “host delivery notification timed out” error in the nagios.log, and it will deliver the notice with no problems.

When the notice was queued by Nagios, my maillog indicates that a connection was initiated but there was no message body and no recipients:

         "from=nagios, size=0, class=0, nrcpts=0, relay=nagios@localhost"

When I paste the failed command from the nagios.log at the command line, the maillog looks good:

        "from=nagios, size=436, class=0, nrcpts=1, msgid=<"SENDMAIL ID HERE">, relay=nagios@localhost"

And a few seconds later, the message is in the appropriate inbox. I get the exact same result regardless of the destination address (local system, corporate notes server, gmail address) or mailer (/bin/mail, /bin/mailx): it always results in a timeout in the nagios.log, and it always works from the command line. This host also delivers local mail from other applications, relays external notices for other domains, and delivers external notices from various cron jobs.
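
For anyone who wants to repeat this test, the expanded command can be pulled out of the log instead of copied by hand. This is only a sketch: the function name is made up, and it assumes the warning format shown in the first post ("... notification command '<command>' timed out after N seconds").

```shell
# Print the fully expanded command from each "timed out" warning in a Nagios log.
# Assumes the warning format quoted in the first post of this thread.
# $1 = path to the Nagios log file
extract_timed_out_commands() {
    sed -n "s/.* notification command '\(.*\)' timed out after [0-9][0-9]* seconds.*/\1/p" "$1"
}

# Example (the log path is an assumption; adjust it to your install):
# extract_timed_out_commands /usr/local/nagios/var/nagios.log
```

Running the printed command as the nagios user, and then comparing the size= and nrcpts= fields in the maillog, reproduces the comparison described above.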

Anyone have any thoughts that don’t involve dismissing this offhand as a “sendmail problem”?


#4

Hello, I have the same problem. I am using sendEmail http://caspian.dotconf.net/menu/Software/SendEmail/
I am able to send email as the nagios user, but when I use it in misccommands.cfg it doesn’t work. In my case the log file doesn’t contain any errors either.

Is there a way of inspecting how Nagios executes commands? Or is there a way of seeing what the actual command that Nagios executes will look like after it replaces the macros?


#5

Hi again,

I’ve managed to troubleshoot this a little more. I’ve determined that when Nagios runs the command, a connection is made to the MTA, but no recipients are passed, and no message body is passed. Why this would be the case from a simple ‘printf | mail’ shell statement isn’t clear at all.

I ended up working around the problem (in the ugliest way possible IMNSHO) by running a cronjob that scanned the /usr/local/nagios/var/nagios.log file for “NOTIFICATION” entries, and then mailed them. So basically, Nagios is trying (and still failing to properly execute) the ‘/bin/mailx’ with a recipient and message body, but the script I wrote is parsing the log and sending off the message(s) that have accumulated since the last pass (using ‘printf | mail’) as the nagios user, so the customer is perfectly happy … and I’ve moved on to my next deployment.
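
A workaround along these lines might look roughly like the sketch below. It is not the script described above: the function name, the state file, and the recipient address are all made up for illustration, and it assumes each log entry starts with a bracketed epoch timestamp as in the first post.

```shell
# Print NOTIFICATION entries logged after a given epoch timestamp.
# $1 = log file, $2 = timestamp of the last pass (seconds since the epoch)
new_notifications() {
    awk -v since="$2" '
        /NOTIFICATION:/ {
            # $1 looks like "[1230743045]"; drop the surrounding brackets
            ts = substr($1, 2, length($1) - 2)
            if (ts + 0 > since + 0) print
        }' "$1"
}

# Hypothetical cron usage, run as the nagios user (state file and address are placeholders):
# state=/var/tmp/nagios-notify.last
# last=$(cat "$state" 2>/dev/null || echo 0)
# date +%s > "$state"
# new_notifications /usr/local/nagios/var/nagios.log "$last" | /bin/mail -s "Nagios notifications" ops@example.org
```

Filtering on the timestamp rather than on line counts keeps the sketch working across log rotation, at the cost of possibly missing entries written in the same second as the previous pass.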

The really weird part is, when I first set up Nagios, notifications (actual, and test) were working. The notifications spontaneously stopped in week 3 of production operation … and I kludged up the script that greps the notices out of the log, cron’d it as nagios, and away we went.

HTH/HAND;


#7

Go into nagios.cfg and raise the notification timeout (notification_timeout) from the default of 30 seconds to something more reasonable. I reset mine to 120 seconds and suddenly my alerts are working again. I guess sendmail is just a little too busy to respond to the initial handshake in a non-interactive shell within the default amount of time, so Nagios figures it’s not there and never sends the rest of the message. On my machine 120 seconds is enough time for sendmail to get around to responding.

I guess sendmail is getting old and needs more time to do some tasks… we’ll all get that way eventually :wink:
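
As a rough sketch of that change, the helper below rewrites the notification_timeout line in a copy-protected way. The function name is made up, the paths in the usage comments assume a default source install under /usr/local/nagios, and the helper does nothing if the directive is not already present in the file.

```shell
# Set notification_timeout in a Nagios main config file, keeping a .bak copy.
# $1 = path to nagios.cfg, $2 = new timeout in seconds
set_notification_timeout() {
    cp "$1" "$1.bak" &&
    sed "s/^notification_timeout=.*/notification_timeout=$2/" "$1.bak" > "$1"
}

# Example, assuming the default source-install layout:
# set_notification_timeout /usr/local/nagios/etc/nagios.cfg 120
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg   # verify the config
```

Nagios has to be restarted or reloaded afterwards for the new value to take effect.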


#8

I was looking for a fix for that exact same issue and upping the timeout to 120 like you did fixed my issues… :slight_smile:

Thanks for the advice.