Current event and notifications


#1

Hi.

I need to keep an eye on a temperature monitor on my network, so I wrote my own shell script and put it in the check commands folder as usual.

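#!/bin/sh

# Full reading, including the fractional part, for the status message
origtemp=$(/sbin/gettemp)
# Integer part only, because -gt compares whole numbers
temp=$(/sbin/gettemp | sed 's/\..*//')

if [ "$temp" -gt 100 ]; then
    # Exit code 2 reports CRITICAL to Nagios
    echo "Critical Temperature: $origtemp"
    exit 2
else
    # Exit code 0 reports OK to Nagios
    echo "Temperature OK: $origtemp"
    exit 0
fi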

However, when Nagios runs the script I never get notifications, even though I have the service set up just like every other service we monitor. From what I know about writing check scripts, Nagios takes the service state from the exit code, which is what my script uses.

I also noticed that the current attempt never changes from 1/2 and is always in a HARD state.

Current Attempt: 1/2 (HARD state)

Here is my config:

define service{
use service
hostgroup_name server_room_monitor
service_description ENVIROMUX-TEMP
check_command check_enviromux_temp
notifications_enabled 1
max_check_attempts 2
is_volatile 1
notification_interval 5
}

define serviceescalation{
host_name server-room-monitor
hostgroup_name server_room_monitor
service_description ENVIROMUX-TEMP
contacts nagiosadmin
contact_groups admin
first_notification 2
last_notification 0
notification_interval 5
escalation_period 24x7
escalation_options w,u,c,r
}

What is wrong? Why won't it send me any emails when the service is in a critical state?

Please help.


#2

Here is what is in the rest of the services area:

Generic service template

define service{
name generic-service
active_checks_enabled 1
passive_checks_enabled 1
parallelize_check 1 ; Disabling this can lead to performance problems
obsess_over_service 1
check_freshness 0
notifications_enabled 1 ; Notify by default
event_handler_enabled 1
flap_detection_enabled 1
failure_prediction_enabled 1
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
is_volatile 0 ; service is not volatile
check_period 24x7
max_check_attempts 3
check_interval 5
normal_check_interval 5 ; check every 5 minutes
retry_check_interval 2
contact_groups admins
notification_options w,u,c,r ; warning, unknown, critical,recovery
notification_interval 60 ; re-notify every hour
notification_period 24x7
register 0 ; Is a template
}

Service template

define service{
name service
use generic-service
max_check_attempts 4
normal_check_interval 5
retry_check_interval 1
register 0
first_notification_delay 0
}


#3

Anyone know?


#4

Have you seen it go from OK to Critical? Notifications will only be triggered initially by a state change.
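
One way to check this by hand is to feed the check logic a known reading and look at the exit code that comes back. Below is a minimal sketch, assuming the same threshold of 100 as the script in #1. Note that check_temp is a hypothetical helper written for this test, not something Nagios provides, and it takes the reading as an argument instead of calling /sbin/gettemp so it can be run on any machine:

```shell
#!/bin/sh

# Hypothetical helper: classify a reading the way the check in #1 is meant to.
# $1 = temperature reading (e.g. 72.4), $2 = optional threshold (assumed 100).
check_temp() {
    reading=$1
    limit=${2:-100}
    whole=${reading%%.*}    # drop the fractional part, like the sed call
    if [ "$whole" -gt "$limit" ]; then
        echo "Critical Temperature: $reading"
        return 2            # 2 is the CRITICAL return code for plugins
    fi
    echo "Temperature OK: $reading"
    return 0                # 0 is the OK return code for plugins
}

check_temp 72.4;  echo "exit code: $?"
check_temp 104.9; echo "exit code: $?"
```

The first call should report exit code 0 and the second exit code 2. If the real script does not return 2 for a high reading in the same way, Nagios never sees a CRITICAL result, so there is no state change for it to notify on.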