Nagios does not send notifications by email after a restart


#1

Hello all,

I have been keeping an eye on my Nagios system after making some major changes to the main host configuration file. Over the past couple of days I have noticed that Nagios is not sending notifications like it is supposed to. The thing is, I did not stop Nagios when I loaded the new config file; I just did a reload while it was running. Could that have caused some problems with the system? If yes, what can I do to fix this?

Thanks ahead of time for your help.

[EDIT: I looked at the nagios log files and I found the following:

[1214391211] SERVICE NOTIFICATION: webmaster;***host***;Total_Processes_rp50_-nrpe-;CRITICAL;service-notify-by-email;CHECK_NRPE: Socket timeout after 10 seconds.
[1214391211] SERVICE NOTIFICATION: phe;***host***;Total_Processes_rp50_-nrpe-;CRITICAL;service-notify-by-email;CHECK_NRPE: Socket timeout after 10 seconds.

So Nagios is trying to send notifications by email, but I am not receiving any… why?]
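
For anyone wanting to pull these entries out of the log, here is a rough sketch. It assumes one log entry per line in the format shown above; the function name `list_service_notifications` and the sample host `web01` are made up for illustration.

```shell
# Hypothetical helper: read Nagios log lines on stdin and print
# "timestamp contact state command" for each SERVICE NOTIFICATION entry.
list_service_notifications() {
    awk -F'SERVICE NOTIFICATION: ' '
        /SERVICE NOTIFICATION: / {
            ts = $1
            gsub(/[^0-9]/, "", ts)      # keep only the epoch seconds from "[1214391211] "
            split($2, f, ";")           # contact;host;service;state;command;output
            print ts, f[1], f[4], f[5]
        }'
}

# Example with a made-up host name:
printf '%s\n' '[1214391211] SERVICE NOTIFICATION: webmaster;web01;Total_Processes_rp50_-nrpe-;CRITICAL;service-notify-by-email;CHECK_NRPE: Socket timeout after 10 seconds.' \
    | list_service_notifications
```

Feeding it the real log instead (the location is set by `log_file` in nagios.cfg) shows which contacts were notified and which notification command was used for each.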


#2

Hi Geo

Yeah, I reckon reload could do that… I gave up using it because it caused me all sorts of random and bizarre issues, seemingly from 2 nagios processes running at once (you’d ack a service problem, then the ack would disappear, or sometimes notifications would come out twice or not at all, etc.). Now I only use restart, but that’s not ideal either. I have many service checks running at 15s intervals, and after a restart some of them often get their ‘next check’ time pushed out to something ridiculous like 10 minutes away. That’s no good as I’m graphing the perfmon data, so I have to go into each service group, make sure the checks are all firing as they should, and reschedule those that aren’t.

I would suggest maybe that you do a stop first, then a killall -9 nagios a couple of times to make sure your nagios processes are gone, before finally issuing the start.
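
Roughly, that sequence could look like the sketch below. The init script path `/etc/init.d/nagios` is an assumption (yours may live elsewhere), and `wait_for_exit` and `clean_restart_nagios` are just hypothetical helper names. It tries a plain kill before resorting to -9.

```shell
# Hypothetical helper: wait up to $2 seconds for all processes named $1 to exit.
# Returns 0 once pgrep finds none, 1 if some are still running after the timeout.
wait_for_exit() {
    name=$1
    tries=$2
    while pgrep -x "$name" >/dev/null 2>&1; do
        [ "$tries" -le 0 ] && return 1
        sleep 1
        tries=$((tries - 1))
    done
    return 0
}

# Hypothetical wrapper; run as root. Nothing happens until you call it.
clean_restart_nagios() {
    /etc/init.d/nagios stop                 # assumed init script location
    if ! wait_for_exit nagios 10; then
        killall nagios                      # polite SIGTERM first
        # SIGKILL only as a last resort, then wait again
        wait_for_exit nagios 5 || { killall -9 nagios; wait_for_exit nagios 5; }
    fi
    /etc/init.d/nagios start
}
```

Checking that nothing is left before the start is the point here; a leftover process alongside the new one would fit the double or missing notifications described above.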

HTH

/S


#3

Hmmmm, I did that several times and I also reset a couple of commands to see if nagios would send out a notification, but I am still not receiving any notifications from Nagios. Here are the commands for the email notifications in my config; let me know if there is anything wrong with them:

[code]
# 'notify-host-by-email' command definition
define command{
    command_name    host-notify-by-email
    command_line    /usr/bin/printf "%b" "***** Nagios ***\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /usr/bin/mail -s " $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
    }

# 'notify-service-by-email' command definition
define command{
    command_name    service-notify-by-email
    command_line    /usr/bin/printf "%b" "***** Nagios ***\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$" | /usr/bin/mail -s " $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
    }
[/code]
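
One way to test a definition like this outside Nagios is to run the same printf-and-mail pipeline by hand with the macros filled in. A minimal sketch, assuming a shell on the Nagios box; `render_host_body` and the sample values (PROBLEM, web01, DOWN, the address) are made up for illustration:

```shell
# Hypothetical helper: build a cut-down host notification body the same way
# the command_line does, with $1=type, $2=host, $3=state in place of the macros.
render_host_body() {
    printf "%b" "***** Nagios ***\n\nNotification Type: $1\nHost: $2\nState: $3\n"
}

# Print the body to check that %b turns each \n into a real line break.
render_host_body PROBLEM web01 DOWN

# To send it for real (ideally as the user Nagios runs as), pipe it to mail:
#   render_host_body PROBLEM web01 DOWN | /usr/bin/mail -s "test from nagios" you@example.com
```

If the body prints correctly but the mail never arrives, the command definition is probably fine and the mail system is the next place to look.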

If worst comes to worst, I will just recompile nagios after backing it all up and see if that works. Is there anything else that you can think of?


#4

Hmm… seems alright to me. Perhaps some other things you could try (in no particular order):

[list]
[*]send a custom notification from a host to see if that works
[*]check /var/log/maillog (or the appropriate log for your mail system) and /var/log/messages for indication of any errors
[*]check /var/spool/(mqueue or clientmqueue - can’t remember… maybe even somewhere else depending on your mail implementation) to see if mail is stuck spooled
[*]send yourself a test mail from the commandline
[*]check and ensure you can talk to your smtp server by, e.g. telnet to it on port 25
[/list]
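
Those checks could be run roughly like this. It is only a sketch: the log paths are common defaults rather than anything confirmed for this box, and `first_readable`, `run_mail_checks`, and their arguments are hypothetical names.

```shell
# Hypothetical helper: print the first of the given files that is readable.
first_readable() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Hypothetical wrapper: $1 = address for the test mail, $2 = SMTP server.
# Nothing runs until you call it.
run_mail_checks() {
    # Recent mail log entries (the path differs between distros and mail systems).
    log=$(first_readable /var/log/maillog /var/log/mail.log) && tail -n 20 "$log"

    # Anything stuck in the queue?
    mailq

    # Test mail from the command line.
    printf 'test body\n' | mail -s "test from the command line" "$1"

    # Can we reach the SMTP server? (interactive; type QUIT to leave)
    telnet "$2" 25
}
```

For example, `run_mail_checks you@example.com localhost`, with your own address and mail server substituted.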

It may be that none of that helps and your only recourse is to undo your changes, although I have to say that it does indeed look like nagios is trying to email, so I’d be inclined to lay the blame elsewhere…
I guess ultimately it depends on whether you can resist taking an axe to it if, after you put all the nagios changes you made back to square one, it still doesn’t work.

I’ll cross my fingers for you

/S


#5

OK, I think that I have found the problem… checking the mail logs worked. :frowning:

P.S. What the heck is going on with the crappy design of this forum? I would be inclined to ask the admin to redesign it so it is easier to use…


#6

That makes sense. Good news, glad I could assist. :frowning:
As for the forum, I think it’s the use of the “code” tags that stops it word-wrapping the content within them. Bugs the hell outa me too.


#7

Hi,

I have a similar problem. Here is the result of the test:

Please advise.
Thanks


#8

Hi

It would help if you posted your notify command objects (indicate which work and which don’t) and also your contact object definition.

/S


#9

Here’s my contacts.cfg:

Here’s the commands.cfg:

The notify-service-by-email and notify-host-by-email commands do not work.