Nagios doesn't send notification emails


#1

I am using sSMTP with Nagios 2.9 on FreeBSD. If I do this:

Mail -s "Test" root

It works fine.

I actually have a host down right now, and the Nagios web interface shows that it's down, but I didn’t receive any emails. I have a minimal.cfg configured for everything, so I’m assuming there is an error somewhere in the file even though Nagios loads and works. Here is my minimal.cfg, in case someone can see something out of the ordinary. I have checked various logs and I don’t see any errors, unless I’m missing something.

###############################################################################

MINIMAL.CFG

MINIMALISTIC OBJECT CONFIG FILE (Template-Based Object File Format)

Last Modified: 03-23-2005

NOTE: This config file is intended to be used to test a Nagios installation

that has been compiled with support for the template-based object

configuration files.

This config file is intended to serve as an extremely simple

example of how you can create your object configuration file(s).

If you’re interested in more complex object configuration files for

Nagios, look in the sample-config/template-object/ subdirectory of

the distribution.

###############################################################################

###############################################################################
###############################################################################
#TIME PERIODS

###############################################################################
###############################################################################

This defines a timeperiod where all times are valid for checks,

notifications, etc. The classic “24x7” support nightmare. :-(

define timeperiod{
timeperiod_name 24x7
alias 24 Hours A Day, 7 Days A Week
sunday 00:00-24:00
monday 00:00-24:00
tuesday 00:00-24:00
wednesday 00:00-24:00
thursday 00:00-24:00
friday 00:00-24:00
saturday 00:00-24:00
}

###############################################################################
###############################################################################
#COMMANDS

###############################################################################
###############################################################################

This is a sample service notification command that can be used to send email

notifications (about service alerts) to contacts.

define command{
command_name notify-by-email
command_line /usr/bin/printf “%b” "***** Nagios @VERSION@ *****\n\nNotification Type: $NOTIFICATIONTYPE$
}

This is a sample host notification command that can be used to send email

notifications (about host alerts) to contacts.

define command{
command_name host-notify-by-email
command_line /usr/bin/printf “%b” "***** Nagios @VERSION@ *****\n\nNotification Type: $NOTIFICATIONTYPE$
}

Command to check to see if a host is “alive” (up) by pinging it

define command{
command_name check-host-alive
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w 99,99% -c 100,100% -p 1
}

Generic command to check a device by pinging it

define command{
command_name check_ping
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
}

Command used to check disk space usage on local partitions

define command{
command_name check_local_disk
command_line $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
}

Command used to check the number of currently logged in users on the

local machine

define command{
command_name check_local_users
command_line $USER1$/check_users -w $ARG1$ -c $ARG2$
}

Command to check the number of running processes on the local machine

define command{
command_name check_local_procs
command_line $USER1$/check_procs -w $ARG1$ -c $ARG2$
}

Command to check the load on the local machine

define command{
command_name check_local_load
command_line $USER1$/check_load -w $ARG1$ -c $ARG2$
}

###############################################################################
###############################################################################
#CONTACTS

###############################################################################
###############################################################################

In this simple config file, a single contact will receive all alerts.

This assumes that you have an account (or email alias) called

“@nagios_user@-admin” on the local host.

define contact{
contact_name alert
alias Alert
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r
host_notification_options d,r
service_notification_commands notify-by-email
host_notification_commands host-notify-by-email
email alert@mydomain.com
}

###############################################################################
###############################################################################
#CONTACT GROUPS

###############################################################################
###############################################################################

We only have one contact in this simple configuration file, so there is

no need to create more than one contact group.

define contactgroup{
contactgroup_name admins
alias Nagios Administrators
members alert
}

###############################################################################
###############################################################################
#HOSTS

###############################################################################
###############################################################################

Generic host definition template - This is NOT a real host, just a template!

define host{
name generic-host ; The name of this host template
notifications_enabled 1 ; Host notifications are enabled
event_handler_enabled 1 ; Host event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
failure_prediction_enabled 1 ; Failure prediction is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST
}

Since this is a simple configuration file, we only monitor one host - the

define host{
use generic-host ; Name of host template to use
host_name localhost
alias localhost
address 127.0.0.1
check_command check-host-alive
max_check_attempts 10
notification_interval 120
notification_period 24x7
notification_options d,r
contact_groups admins
}

define host{
use generic-host ; Name of host template to use
host_name server1
alias server1
address 10.1.2.100
check_command check-host-alive
max_check_attempts 10
notification_interval 120
notification_period 24x7
notification_options d,r
contact_groups admins
}

define host{
use generic-host ; Name of host template to use
host_name server2
alias server2
address 10.1.2.200
check_command check-host-alive
max_check_attempts 10
notification_interval 120
notification_period 24x7
notification_options d,r
contact_groups admins
}

###############################################################################
###############################################################################
#HOST GROUPS

###############################################################################
###############################################################################

We only have one host in our simple config file, so there is no need to

create more than one hostgroup.

define hostgroup{
hostgroup_name hausmann
alias Hausmann Servers
members localhost,server1,server2
}
###############################################################################
###############################################################################
#SERVICES

###############################################################################
###############################################################################

Generic service definition template - This is NOT a real service, just a template!

define service{
name generic-service ; The ‘name’ of this service template
active_checks_enabled 1 ; Active service checks are enabled
passive_checks_enabled 1 ; Passive service checks are enabled/accepted
parallelize_check 1 ; Active service checks should be parallelized (disabling
obsess_over_service 1 ; We should obsess over this service (if necessary)
check_freshness 0 ; Default is to NOT check service 'freshness'
notifications_enabled 1 ; Service notifications are enabled
event_handler_enabled 1 ; Service event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
failure_prediction_enabled 1 ; Failure prediction is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE,
}

Define a service to “ping” the local machine

define service{
use generic-service ; Name of service template to use
host_name localhost,server1,server2
service_description PING
is_volatile 0
check_period 24x7
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
contact_groups admins
notification_interval 960
notification_period 24x7
check_command check_ping!100.0,20%!500.0,60%
}

Define a service to check the disk space of the root partition

on the local machine. Warning if < 20% free, critical if

< 10% free space on partition.

define service{
use generic-service ; Name of service template to use
host_name localhost,server1,server2
service_description Root Partition
is_volatile 0
check_period 24x7
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
contact_groups admins
notification_interval 960
notification_period 24x7
check_command check_local_disk!20%!10%!/
}

Define a service to check the number of currently logged in

users on the local machine. Warning if > 20 users, critical

if > 50 users.

define service{
use generic-service ; Name of service template to use
host_name localhost,server1,server2
service_description Current Users
is_volatile 0
check_period 24x7
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
contact_groups admins
notification_interval 960
notification_period 24x7
check_command check_local_users!20!50
}

Define a service to check the number of currently running procs

on the local machine. Warning if > 250 processes, critical if

> 400 processes.

define service{
use generic-service ; Name of service template to use
host_name localhost,server1,server2
service_description Total Processes
is_volatile 0
check_period 24x7
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
contact_groups admins
notification_interval 960
notification_period 24x7
check_command check_local_procs!250!400
}

Define a service to check the load on the local machine.

define service{
use generic-service ; Name of service template to use
host_name localhost,server1,server2
service_description Current Load
is_volatile 0
check_period 24x7
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
contact_groups admins
notification_interval 960
notification_period 24x7
check_command check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
}


#2

Hi,

Yep, there’s something wrong in your definitions:

This is a **sample** service notification command that can be used to send email

notifications (about service alerts) to contacts.

define command{
command_name notify-by-email
command_line /usr/bin/printf “%b” "***** Nagios @VERSION@ *****\n\nNotification Type: $NOTIFICATIONTYPE$
}

This is a **sample** host notification command that can be used to send email

notifications (about host alerts) to contacts.

define command{
command_name host-notify-by-email
command_line /usr/bin/printf “%b” "***** Nagios @VERSION@ *****\n\nNotification Type: $NOTIFICATIONTYPE$
}

As you can see, these lines are “samples” that “can be used” to send e-mails… As written they are incomplete and send nothing: the command line only runs printf, which prints the text but never passes it to a mail program (see “man printf” to verify that printf doesn’t send e-mails by itself :)).
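A minimal sketch of the difference, assuming a POSIX shell. `fake_mail` is a hypothetical stand-in for a real mailer such as mail or mailx, so this can be run without sending anything; the host name and address are just the ones from your config:

```shell
#!/bin/sh
# Hypothetical stand-in for a real mailer (mail, mailx, ...).
# It takes "-s SUBJECT RECIPIENT", reads the body from standard input,
# and only prints what a mailer would have been asked to send.
fake_mail() {
    subject="$2"
    recipient="$3"
    body=$(cat)
    printf 'To: %s\nSubject: %s\n\n%s\n' "$recipient" "$subject" "$body"
}

# On its own, printf just writes the formatted text to standard output.
printf "%b" "Notification Type: PROBLEM\nHost: server1\n"

# Piped into a mailer, the same text becomes the body of a message.
printf "%b" "Notification Type: PROBLEM\nHost: server1\n" \
    | fake_mail -s "Host DOWN alert for server1" alert@mydomain.com
```

In a real command_line the part after the pipe would be the actual mail program and `$CONTACTEMAIL$`, which Nagios fills in for each contact.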

To help you, here is my definition:
define command{
command_name notify-by-email
command_line /usr/bin/printf "%b" "Groupe\t: $HOSTALIAS$\nAdresse\t: $HOSTADDRESS$\nState\t\t: $SERVICESTATE$\nDate/Hour\t: $DATETIME$\n\nInfo\t\t: $SERVICEOUTPUT$" | /usr/bin/mailx -s "$NOTIFICATIONTYPE$ / $HOSTNAME$ / $SERVICEDESC$ is $SERVICESTATE$" $CONTACTEMAIL$
}

‘host-notify-by-email’ command definition

define command{
command_name host-notify-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $DATETIME$\n" | /usr/bin/mail -s "Host $HOSTSTATE$ alert for $HOSTNAME$!" $CONTACTEMAIL$
}
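If you want to preview what such a message body will look like, here is a rough sketch that fills in a few macros by hand. Nagios does this substitution itself before it runs the command; the `expand_macros` helper and the sample values below are only assumptions for illustration, not something Nagios provides.

```shell
#!/bin/sh
# Replace a handful of Nagios macros with made-up sample values.
# Reads text on standard input and writes the expanded text to standard output.
expand_macros() {
    sed -e 's/\$NOTIFICATIONTYPE\$/PROBLEM/g' \
        -e 's/\$HOSTNAME\$/server1/g' \
        -e 's/\$HOSTSTATE\$/DOWN/g' \
        -e 's/\$HOSTADDRESS\$/10.1.2.100/g'
}

# Single quotes keep the macros and the \n escapes literal at this point.
template='Notification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\n'

# Expand the sample values, then let printf turn the \n escapes into newlines.
printf "%b" "$(printf '%s' "$template" | expand_macros)"
```

Piping that last line into your mail program with a test subject and your own address should show whether delivery works outside of Nagios.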

BTW, I didn’t check all your definitions: I stopped at the beginning, where the command definitions are written, so there may be other errors, or not :)


#3

That fixed it. Thanks!