Hi!
I recently installed and set up Nagios on a server running AIX 5.3. The install went (more or less) smoothly, and I got things configured and running. Things look good at first whenever I start or restart Nagios.
I’m doing very simple checks, using check_ping as the “service” for each host, and check-host-alive to do the host checks.
The problem I’m having is that once a host goes down, the scheduling gets thrown off: several hosts don’t get checked in time and drop out of the queue. They never get rescheduled, which is an obvious problem.
I’ve tried changing the scheduling parameters around and following what the docs say (including using the “smart” option for the inter-check delay), but I can’t find a way around this.
Here’s the output of the -s switch:
Nagios 2.0b3
Copyright © 1999-2005 Ethan Galstad (www.nagios.org)
Last Modified: 04-03-2005
License: GPL
Projected scheduling information for host and service
checks is listed below. This information assumes that
you are going to start running Nagios with your current
config files.
HOST SCHEDULING INFORMATION
Total hosts: 246
Total scheduled hosts: 0
Host inter-check delay method: SMART
Average host check interval: 0.00 sec
Host inter-check delay: 0.00 sec
Max host check spread: 30 min
First scheduled check: N/A
Last scheduled check: N/A
SERVICE SCHEDULING INFORMATION
Total services: 246
Total scheduled services: 246
Service inter-check delay method: USER-SUPPLIED VALUE
Inter-check delay: 0.30 sec
Interleave factor method: SMART
Average services per host: 1.00
Service interleave factor: 1
Max service check spread: 30 min
First scheduled check: Mon May 23 16:42:41 2005
Last scheduled check: Mon May 23 16:43:54 2005
CHECK PROCESSING INFORMATION
Service check reaper interval: 10 sec
Max concurrent service checks: 100
PERFORMANCE SUGGESTIONS
I have no suggestions - things look okay.
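As a sanity check on the service numbers above, here’s a quick Python sketch that compares the reported first and last check times with what an evenly spaced schedule would give. It assumes the initial checks are simply spaced one inter-check delay apart; the variable and function names are mine, not anything from Nagios.

```python
from datetime import datetime

# Values copied from the -s output above.
TOTAL_SERVICES = 246
INTER_CHECK_DELAY = 0.30  # seconds
TIME_FORMAT = "%a %b %d %H:%M:%S %Y"
first_check = datetime.strptime("Mon May 23 16:42:41 2005", TIME_FORMAT)
last_check = datetime.strptime("Mon May 23 16:43:54 2005", TIME_FORMAT)


def expected_spread(num_checks, delay):
    """Seconds between the first and last check if they are evenly spaced.

    Assumption: one check is launched every `delay` seconds, so there are
    (num_checks - 1) gaps between the first and the last check.
    """
    return (num_checks - 1) * delay


expected = expected_spread(TOTAL_SERVICES, INTER_CHECK_DELAY)
reported = (last_check - first_check).total_seconds()

print(f"expected ~{expected:.1f}s, reported {reported:.0f}s")
```

The two figures agree to within a second, so the initial service schedule at least looks consistent with the 0.30 sec delay.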
Here’s my services.cfg:
define service{
	host_name			*
	service_description		PING
	contact_groups			admins,on_call_egh
	notification_options		c,w,r
	check_command			check_ping
	active_checks_enabled		1
	passive_checks_enabled		1
	parallelize_check		1
	obsess_over_service		0
	check_freshness			0
	notifications_enabled		1
	event_handler_enabled		1
	flap_detection_enabled		0
	process_perf_data		1
	retain_status_information	1
	retain_nonstatus_information	1
	check_period			24x7
	max_check_attempts		3
	normal_check_interval		3
	retry_check_interval		1
	notification_interval		0
	notification_period		24x7
	}
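For comparison, here’s a rough Python sketch of what I understand the “smart” service inter-check delay to be from the docs: the average check interval divided by the total number of services. It assumes interval_length is left at its default of 60 seconds (my nagios.cfg isn’t shown here) and ignores any cap from the max service check spread; the function name is just for illustration.

```python
def smart_inter_check_delay(check_interval_units, interval_length_sec, total_services):
    """Approximate the "smart" inter-check delay in seconds.

    Assumption: delay = average check interval / total number of services,
    where every service uses the same normal_check_interval.
    """
    avg_interval_sec = check_interval_units * interval_length_sec
    return avg_interval_sec / total_services


# normal_check_interval 3 from the service definition above, 246 services
# from the -s output, and an assumed interval_length of 60 seconds.
delay = smart_inter_check_delay(3, 60, 246)

print(f"smart delay would be about {delay:.2f} sec (I currently use 0.30 sec)")
```

If that formula is right, my user-supplied 0.30 sec delay packs the initial checks into a bit over a minute instead of spreading them across the full 3-minute interval.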
Here’s my “standard” host config (used as an include):
define host{
	name				windows_default
	max_check_attempts		2
	notification_interval		0
	notification_period		24x7
	notification_options		d,r
	check_command			check-host-alive
	register			0
	}
Thanks for any help!
- Tony