Receiving service outages instead of host outages


Howdy –

I am working with an existing installation of Nagios. I am, for the most part, a Nagios newbie, but I'm very familiar with SNMP monitoring practices in general.

One problem I’m trying to tackle: when a host goes down, I get outages for each of its services instead of a single host outage. When I look at the web GUI, I see the following for all of my host configs:

I’ve reformatted the output and taken a few things out to make it more readable. I’ve also highlighted the settings that look suspect to me.
Host Name: atest01
Max. Check Attempts: 3
**Check Interval: 0h 4m 0s**
Host Check Command: check-host-alive
Obsess Over: Yes
**Enable Active Checks: No**
**Enable Passive Checks: Yes**
**Check Freshness: No**
Freshness Threshold: Auto-determined value
Default Contact Groups: corpemail
Notification Interval: 48h 0m 0s
Notification Options: Down, Unreachable, Recovery
Notification Period: 24x7

Per the docs, setting a check interval for hosts is not recommended because, “among other reasons”, it’s bad for performance. Could my problem be one of those other reasons?

Having “Enable Active Checks” disabled also seems like a bad thing to me, since the docs describe “on-demand” checks as falling under the active checks category.

I’m referencing the host options section of this page: nagios.sourceforge.net/docs/2_0/xodtemplate.html

Here is my generic host config:

define host{
     name                           generic-host     ; The name of this host template
     notifications_enabled          1     ; Host notifications are enabled
     event_handler_enabled          1     ; Host event handler is enabled
     flap_detection_enabled         1     ; Flap detection is enabled
     process_perf_data              1     ; Process performance data
     retain_status_information      1     ; Retain status information across program restarts
     retain_nonstatus_information   1     ; Retain non-status information across program restarts
     check_command                  check-host-alive
     contact_groups                 corpemail
     check_interval                 4
     max_check_attempts             3
     notification_interval          2880     ; 0=noRepeats, If set to Zero host escalation will not work
     notification_period            24x7
     notification_options           d,u,r

     register               0     ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
     }
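
To make my suspicion concrete, here is a rough sketch of the change I'm considering but have not tested. The template name generic-host-active is made up for illustration. active_checks_enabled and passive_checks_enabled are host directives listed on the docs page I referenced, and I'm assuming active_checks_enabled is the setting behind the GUI's "Enable Active Checks" field.

```
define host{
     name                           generic-host-active     ; Hypothetical template name, not in my current config
     use                            generic-host     ; Inherit everything from the template above
     active_checks_enabled          1     ; Assumption: lets Nagios run on-demand host checks
     passive_checks_enabled         1     ; Keep accepting passive results, as the GUI shows today
     register                       0     ; Still only a template, not a real host
     }
```

Separately, I would try removing the check_interval line from generic-host, since the docs advise against regularly scheduled host checks.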

Here is my generic service config:

define service{
     name    generic-service       ; The 'name' of this service template, referenced in other service definitions
     active_checks_enabled         1 ; Active service checks are enabled
     passive_checks_enabled        1 ; Passive service checks are enabled/accepted
     parallelize_check             1 ; Active service checks should be parallelized (disabling this can lead to major performance problems)
     obsess_over_service           1 ; We should obsess over this service (if necessary)
     check_freshness               0 ; Default is to NOT check service 'freshness'
     notifications_enabled         1 ; Service notifications are enabled
     event_handler_enabled         1 ; Service event handler is enabled
     flap_detection_enabled        1 ; Flap detection is enabled
     process_perf_data             1 ; Process performance data
     retain_status_information     1 ; Retain status information across program restarts
     retain_nonstatus_information  1 ; Retain non-status information across program restarts

     is_volatile                     0
     check_period                    24x7
     max_check_attempts              3
     normal_check_interval           4
     retry_check_interval            2
     contact_groups                  rt-warn-q
     notification_interval           2880 ; 0=norepeats, no-repeats disables escalations
     notification_period             24x7
     notification_options            w,u,c,r
     register                      0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
     }

If anyone has any insight, I’d really appreciate it. Unfortunately I don’t have a good “test” box to try changes on, so I’d like some confirmation (or not) as to whether my suspicions are truly suspicious :slight_smile: