Check interval seems like it's not doing what I ask

avonnieda · January 25, 2006, 3:08pm

Hi folks,

I’m sure I have a misconfiguration, as I don’t understand why Nagios is not scheduling a check when I would expect it to. Here’s the situation. I had a service go down at approx. 4:00am this morning, and a notification followed soon after (within several minutes). Now the service is back up, but the next scheduled check is showing as about 45 minutes from now. Since the normal_check_interval is set to 5, I would expect to get a notification that the service is back up within 5 minutes of when it came back up. Any advice would be appreciated; here’s some info. I’m running version 1.2.

Thanks,

    -Adam vonNieda

Service state information

Last Check Time: 01-25-2006 07:56:54
Status Data Age: 0d 0h 55m 4s
Next Scheduled Active Check: 01-25-2006 09:29:26 <-- ??
Last State Change: 01-25-2006 04:02:01
Current State Duration: 0d 4h 50m 27s
Last Service Notification: 01-25-2006 04:02:01
Last Update: 01-25-2006 08:52:16
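
To put a number on the gap in the state information above, here is a rough sketch that compares the time between the last check and the next scheduled check with what normal_check_interval would suggest. It assumes interval_length is at its usual default of 60 seconds (that setting is not shown in this thread), so treat the "expected" figure as an assumption rather than a confirmed value.

```python
from datetime import datetime, timedelta

FMT = "%m-%d-%Y %H:%M:%S"

# Timestamps copied from the service state information above.
last_check = datetime.strptime("01-25-2006 07:56:54", FMT)
next_check = datetime.strptime("01-25-2006 09:29:26", FMT)

# Assumption: interval_length=60, so normal_check_interval=5 means 5 minutes.
interval_length = 60
normal_check_interval = 5
expected_gap = timedelta(seconds=normal_check_interval * interval_length)

# Actual time between the last check and the next scheduled one.
actual_gap = next_check - last_check

print("expected gap:", expected_gap)
print("actual gap:  ", actual_gap)
print("ratio:       ", round(actual_gap / expected_gap, 1))
```

Under that assumption the next check is scheduled well over an hour after the last one, which is far more than a single 5-minute interval and is the discrepancy being asked about.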

Configuration for the service

Service definition

define service{
    use                             production-service      ; Name of service template to use
    host_name                       vader
    service_description             Oracle Availability
    contact_groups                  sysadmins,dbas
    normal_check_interval           5
    check_command                   check_oracle_instance!username!password!vader_oracle
    }

define service{
    name                            production-service      ; The 'name' of this service template, referenced in other service definitions
    active_checks_enabled           1       ; Active service checks are enabled
    passive_checks_enabled          1       ; Passive service checks are enabled/accepted
    parallelize_check               1       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
    obsess_over_service             1       ; We should obsess over this service (if necessary)
    check_freshness                 0       ; Default is to NOT check service 'freshness'
    notifications_enabled           1       ; Service notifications are enabled
    event_handler_enabled           1       ; Service event handler is enabled
    flap_detection_enabled          1       ; Flap detection is enabled
    process_perf_data               1       ; Process performance data
    retain_status_information       1       ; Retain status information across program restarts
    retain_nonstatus_information    1       ; Retain non-status information across program restarts
    register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!

    is_volatile                     0
    check_period                    24x7
    max_check_attempts              3
    normal_check_interval           5
    retry_check_interval            1
    notification_interval           480     ; 8 hours
    notification_period             24x7
    notification_options            w,u,c,r

    }
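
As a sanity check on the two definitions above, the following sketch models the `use` directive as a plain dictionary merge: template values first, then the service's own directives on top. This is a simplified illustration of template inheritance, not how Nagios itself parses its configuration, and only the scheduling-related directives are copied in.

```python
# Scheduling-related directives from the production-service template.
template = {
    "check_period": "24x7",
    "max_check_attempts": 3,
    "normal_check_interval": 5,
    "retry_check_interval": 1,
    "notification_interval": 480,
}

# Directives set directly on the Oracle Availability service.
service = {
    "host_name": "vader",
    "service_description": "Oracle Availability",
    "normal_check_interval": 5,
}

# Simplified model of 'use': the service's own values override the template's.
effective = {**template, **service}

for key in ("normal_check_interval", "retry_check_interval", "max_check_attempts"):
    print(f"{key} = {effective[key]}")
```

In this model the service ends up with normal_check_interval 5 either way, since the template and the service agree, so the definitions shown do not by themselves explain a longer gap.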

luca · January 25, 2006, 3:43pm

Could somebody have rescheduled the next active check for that time from the web interface? It’s the only idea I have…

Luca

avonnieda · January 25, 2006, 4:32pm

Nope, I’m the only one who uses it.

Thanks for the reply though. Anyone else have any ideas?

Thanks,

  -Adam vonNieda

system · January 25, 2006, 6:25pm

Do you have any escalations defined? I’m not sure whether or not you can define different check intervals in escalation configs, but it might be worth looking into.

avonnieda · January 25, 2006, 9:49pm

No, I have no escalations defined either.

Thanks for the reply,

   -Adam vonNieda

jakkedup · January 26, 2006, 6:49pm

obsess_over_service 1: why do you have that set, and if so, what is your ocsp_command=???

What is inter_check_delay_method=???

avonnieda · January 30, 2006, 3:41pm

To be honest, I don’t know why that option is set. Maybe I didn’t understand what it was for when I first set this stuff up. I’m setting it to 0 now.

inter_check_delay_method=s

I’ve got 65 services over 23 hosts configured. I’ll look over the above parameter and see if that answers my question.

-Adam
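
For reference, a rough sketch of what inter_check_delay_method=s ("smart") would work out to for the setup described above. As I understand the Nagios documentation, the smart delay is the average check interval divided by the total number of services, and it is used to spread out the initial checks after a (re)start. The sketch assumes all 65 services use normal_check_interval=5 with interval_length=60, which the thread does not confirm.

```python
total_services = 65

# Assumption: every service is checked every 5 minutes (5 * 60 seconds).
average_check_interval_s = 5 * 60

# Smart method: average check interval / total number of services.
inter_check_delay_s = average_check_interval_s / total_services

print(f"inter-check delay: {inter_check_delay_s:.2f} s")
```

A delay of a few seconds between initial checks would not account for a gap of more than an hour, so under these assumptions this setting looks like an unlikely cause on its own.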