Ping Warnings

I am running Nagios 2 on Slackware 10.2 and monitoring around 60 hosts. Right now I am only doing ping checks. The ping checks work, but I get random ping warnings all the time. It's never the same host and never at the same time.

Here is the check command I am using
check_command check_ping!1000.0,50%!2000.0,80%

Here is the output when run from the command line
PING OK - Packet loss = 0%, RTA = 0.30 ms

I set it to report out to a log file and here is one of the warning pings
PING WARNING - Packet loss = 0%, RTA = 0.68 ms

I have read through every thread on here dealing with ping warnings and tweaking things. I have my intervals set to the following

    max_check_attempts              3
    normal_check_interval           5
    retry_check_interval            1

I am stumped at this point. I have everything working the way I want it except for these darn warnings.

PING WARNING - Packet loss = 0%, RTA = 0.68 ms

duh… that really shouldn’t happen…

Are you sure you defined check_ping only once? Possibly it is defined somewhere else for some hosts with strange values?

Luca

After checking everything again, I did notice that I had ping defined twice. I had check-host-alive in the host definition, and I also had a service that ran check_ping on every host. I removed check-host-alive from the host definition of each device and left my ping service intact. I was hoping this would solve the problem, but I am still getting the same warning messages.

I did remember to restart nagios after I made the changes.

Defining a Ping check and having a ping check as your check-host-alive command shouldn’t cause any problems. I have that same setup on many of my machines (Nagios 2.0b4).
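As far as I recall, the sample config defines check-host-alive as its own command with its own hard-coded thresholds, so the host check and the PING service never share warning levels. A rough sketch follows; the exact threshold values and the -p 1 packet count are assumptions from memory, so compare against your own command definitions:

define command{
    command_name check-host-alive
    command_line $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 1
    }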

Your issue appears to be elsewhere. How is your check_ping command defined? Perhaps you’ve got warning and critical levels set incorrectly?
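For comparison, a typical check_ping command definition looks roughly like the sketch below. The $USER1$ plugin path and the -p 5 packet count are assumptions about your install, so check them against your own file. The two "!"-separated arguments in your service become $ARG1$ (warning) and $ARG2$ (critical), each in the form <average round trip in ms>,<packet loss>%.

define command{
    command_name check_ping
    command_line $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
    }

With 1000.0,50% passed as $ARG1$, a WARNING should only show up when the average round trip climbs to around 1000 ms or packet loss to around 50%, which is nowhere near 0.68 ms with 0% loss.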

Here is the check ping service that runs for all hosts

define service{
    use generic-service ; Name of service template to use
    host_name *
    service_description PING
    is_volatile 0
    check_period 24x7
    max_check_attempts 3
    normal_check_interval 5
    retry_check_interval 1
    contact_groups admins
    notification_options w,u,c,r
    notification_interval 960
    notification_period 24x7
    check_command check_ping!1000.0,50%!2000.0,80%
}

Also, as soon as the next check interval runs, the warnings clear themselves and I get a recovery email notification.

What does the template look like for your generic-service definition?

define service{
    name generic-service ; The 'name' of this service template
    active_checks_enabled 1 ; Active service checks are enabled
    passive_checks_enabled 1 ; Passive service checks are enabled/accepted
    parallelize_check 1 ; Active service checks should be parallelized (disabling thi$
    obsess_over_service 1 ; We should obsess over this service (if necessary)
    check_freshness 0 ; Default is to NOT check service 'freshness'
    notifications_enabled 1 ; Service notifications are enabled
    event_handler_enabled 1 ; Service event handler is enabled
    flap_detection_enabled 1 ; Flap detection is enabled
    failure_prediction_enabled 1 ; Failure prediction is enabled
    process_perf_data 1 ; Process performance data
    retain_status_information 1 ; Retain status information across program restarts
    retain_nonstatus_information 1 ; Retain non-status information across program restarts
    register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUS$
}

Replace your ping plugin with check_fping. It’s probably a problem with the plugin itself, which might be fixed if you get the latest version from CVS.
cvs.sourceforge.net/viewcvs.py/n … g/plugins/
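If you want to try it, a command definition along these lines might work as a drop-in. This is only a sketch: it assumes the fping binary is installed, that check_fping was built with your plugins, and that your version accepts -H and the same rta,pl% threshold format as check_ping. The -n 5 packet count is my own choice, not something from your setup.

define command{
    command_name check_fping
    command_line $USER1$/check_fping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -n 5
    }

Your service would then use check_command check_fping!1000.0,50%!2000.0,80% in place of the check_ping line.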

Can you have Nagios ping test both the internal IP (which is set up by default) and the external IP of a (Windows) server without making a whole new host definition? My host definition points to the internal IP.