Check interval seems like it's not doing what I ask


#1

Hi folks,

I’m sure I have a misconfiguration, as I don’t understand why Nagios is not scheduling a check when I would expect. Here’s the situation. I had a service go down at approx. 4:00 am this morning, and a notification followed soon after (within several minutes). Now the service is back up, but the next scheduled check is showing as about 45 minutes from now. Since normal_check_interval is set to 5, I would expect to get a notification that the service is back up within 5 minutes of when it came back up. Any advice would be appreciated. Here’s some info. I’m running version 1.2.

Thanks,

    -Adam vonNieda

Service state information

Last Check Time: 01-25-2006 07:56:54
Status Data Age: 0d 0h 55m 4s
Next Scheduled Active Check: 01-25-2006 09:29:26 <-- ??
Last State Change: 01-25-2006 04:02:01
Current State Duration: 0d 4h 50m 27s
Last Service Notification: 01-25-2006 04:02:01
Last Update: 01-25-2006 08:52:16
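
To put a number on the gap, here is a minimal sketch that compares the expected check spacing with the spacing implied by the timestamps above. It assumes interval_length is at its default of 60 seconds (that setting is not shown in this thread), and the helper name check_gap_seconds is made up for illustration.

```python
from datetime import datetime

# Timestamp format used on the Nagios status page above.
FMT = "%m-%d-%Y %H:%M:%S"

def check_gap_seconds(last_check: str, next_check: str) -> int:
    """Return the number of seconds between two status-page timestamps."""
    delta = datetime.strptime(next_check, FMT) - datetime.strptime(last_check, FMT)
    return int(delta.total_seconds())

# Assumption: interval_length is left at the default of 60 seconds.
INTERVAL_LENGTH = 60
NORMAL_CHECK_INTERVAL = 5  # from the service definition

expected = NORMAL_CHECK_INTERVAL * INTERVAL_LENGTH
actual = check_gap_seconds("01-25-2006 07:56:54", "01-25-2006 09:29:26")

print(f"expected spacing: {expected} s")
print(f"actual spacing:   {actual} s ({actual / 60:.1f} min)")
```

Under that assumption the expected spacing is 300 seconds, while the two timestamps are 5552 seconds (about 92.5 minutes) apart.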

Configuration for the service

Service definition

define service{
    use                             production-service      ; Name of service template to use

    host_name                       vader
    service_description             Oracle Availability
    contact_groups                  sysadmins,dbas
    normal_check_interval           5
    check_command                   check_oracle_instance!username!password!vader_oracle
    }

define service{
    name                            production-service      ; The 'name' of this service template, referenced in other service definitions
    active_checks_enabled           1       ; Active service checks are enabled
    passive_checks_enabled          1       ; Passive service checks are enabled/accepted
    parallelize_check               1       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
    obsess_over_service             1       ; We should obsess over this service (if necessary)
    check_freshness                 0       ; Default is to NOT check service 'freshness'
    notifications_enabled           1       ; Service notifications are enabled
    event_handler_enabled           1       ; Service event handler is enabled
    flap_detection_enabled          1       ; Flap detection is enabled
    process_perf_data               1       ; Process performance data
    retain_status_information       1       ; Retain status information across program restarts
    retain_nonstatus_information    1       ; Retain non-status information across program restarts

    register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!

    is_volatile                     0
    check_period                    24x7
    max_check_attempts              3
    normal_check_interval           5
    retry_check_interval            1
    notification_interval           480     ; 8 hours
    notification_period             24x7
    notification_options            w,u,c,r

    }

#2

Could somebody have rescheduled the next active check for that time from the web interface? That’s the only idea I have…

Luca


#3

Nope, I’m the only one who uses it.

Thanks for the reply though. Anyone else have any ideas?

Thanks,

  -Adam vonNieda

#4

Do you have any escalations defined? I’m not sure whether or not you can define different check intervals in escalation configs, but it might be worth looking into.


#5

No, I have no escalations defined either.

Thanks for the reply,

   -Adam vonNieda

#6

You have obsess_over_service set to 1. Why do you have that, and what is your ocsp_command set to?

What is inter_check_delay_method set to?


#7

To be honest, I don’t know why that option is set. Maybe I didn’t understand what it was for when I first set this stuff up. I’m setting it to 0 now.

inter_check_delay_method=s

I’ve got 65 services over 23 hosts configured. I’ll look over the above parameter and see if that answers my question.

-Adam
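
For reference, a rough sketch of what inter_check_delay_method=s ("smart") works out to, based on the formula in the Nagios check scheduling documentation: the average check interval across all services divided by the total number of services. It assumes all 65 services use the 5-minute interval with interval_length at its default of 60 seconds, neither of which is confirmed in this thread, and the function name is made up for illustration.

```python
def smart_inter_check_delay(check_intervals_s):
    """Approximate the 'smart' inter-check delay in seconds.

    Uses the documented formula:
    (average check interval of all services) / (total number of services).
    """
    if not check_intervals_s:
        raise ValueError("at least one service is required")
    average = sum(check_intervals_s) / len(check_intervals_s)
    return average / len(check_intervals_s)

# Assumption: all 65 services are checked every 5 minutes (300 seconds).
delay = smart_inter_check_delay([300.0] * 65)
print(f"inter-check delay: {delay:.2f} s")
```

Under those assumptions the delay is only a few seconds per service, and as far as the documentation describes it, this setting governs how the initial checks are spread out at startup, so on its own it may not account for a gap of over an hour.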