Can I stagger checks?


#1

Hi,

  I have four checks, each of which runs every 180 seconds (all against the same host). Is there any way to force them to be 45 seconds apart?

  I have tried auto_reschedule_checks without any luck.

Thanks,
Jeff


#2

Yes, you can.
I guess the host/service definition for the check you mentioned has its check interval set to 3.

That means 180 seconds. If you want to lower that to a minute, just change the interval to 1.

But if you still want exactly 45 seconds, you can do the following.

In the default Nagios configuration (check the nagios.cfg file) there is a variable that says interval_length=60, which means that every interval defined in a host/service definition lasts 60 seconds. That means if you have defined, for example, a notification interval of 1, that notification will repeat after 1 interval = 60 seconds = 1 minute.

Be aware that changing the interval_length variable in nagios.cfg affects every configuration in Nagios. You cannot specify interval_length in service or host definitions, only on a program-wide basis in nagios.cfg.
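
To illustrate the arithmetic, here is a minimal Python sketch. The function names are made up for this example and are not part of Nagios. It assumes only what is described above: the real duration of an interval is its value in the definition multiplied by interval_length.

```python
def interval_seconds(interval_units, interval_length=60):
    """Real duration of an interval: units multiplied by interval_length."""
    return interval_units * interval_length


def rescale_units(interval_units, old_length=60, new_length=1):
    """Units needed under new_length to keep the same real duration."""
    return interval_units * old_length / new_length


if __name__ == "__main__":
    print(interval_seconds(3))      # 3 units with interval_length=60
    print(interval_seconds(45, 1))  # 45 units with interval_length=1
    print(rescale_units(3))         # the same 3-unit check expressed in 1-second units
```

Because the setting is program-wide, every existing interval value would need the same rescaling if interval_length were changed.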


#3

Thanks for replying!!

My problem is that I have four different service checks (in this case four web servers that are all running with the same back end database). I want to check each server once every three minutes, but would like them spaced evenly so that the back end database is “verified” once every 45 seconds. No matter what I do, Nagios seems to “resync” my four web server checks so they all occur at about the same time.


#4

I don’t know whether this kind of balancing of service checks is even achievable through ‘normal’ Nagios scheduling, so it might be worth investigating external commands: nagios.org/developerinfo/ext … and_id=129

My idea would be to turn off active checks of those services and use cron to run a script every 3 minutes that submits a SCHEDULE_FORCED_SVC_CHECK for servicecheck#1 now, servicecheck#2 at now+45s, servicecheck#3 at now+90s and servicecheck#4 at now+135s.
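
A rough Python sketch of such a script might look like the following. It is untested against a live Nagios. The command file path, the host names and the service descriptions are all assumptions (the real path is whatever command_file is set to in nagios.cfg). The line format is the usual external command form, `[time] COMMAND;host;service;check_time`.

```python
import time

# Assumed path; the real location is set by command_file in nagios.cfg.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def staggered_commands(host_services, spacing=45, now=None):
    """Build one SCHEDULE_FORCED_SVC_CHECK line per (host, service) pair,
    each scheduled `spacing` seconds after the previous one."""
    now = int(time.time()) if now is None else int(now)
    lines = []
    for index, (host, service) in enumerate(host_services):
        check_time = now + index * spacing
        lines.append(
            f"[{now}] SCHEDULE_FORCED_SVC_CHECK;{host};{service};{check_time}"
        )
    return lines


def submit(lines, command_file=COMMAND_FILE):
    """Write the command lines to the Nagios external command file."""
    with open(command_file, "w") as handle:
        for line in lines:
            handle.write(line + "\n")


if __name__ == "__main__":
    # Hypothetical host and service names.
    checks = [("web1", "HTTP"), ("web2", "HTTP"),
              ("web3", "HTTP"), ("web4", "HTTP")]
    # Printed here for inspection; call submit(...) instead to send them.
    for line in staggered_commands(checks):
        print(line)
```

Run from cron every 3 minutes, this would ask for the four checks at now, now+45s, now+90s and now+135s.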

That might work for you

HTH

/S


#5

You can write a shell (or any other) script that does all the checks, sends each response back to Nagios and sleeps in between, in an order like this:

  1. check the response of the first webserver
  2. depending on the response you then execute submit_check_result to Nagios for first webserver
  3. sleep 45
  4. check the response of the second webserver
  5. depending on the response you then execute submit_check_result to Nagios for second webserver
  6. sleep 45
  7. check the response of the third webserver
  8. depending on the response you then execute submit_check_result to Nagios for third webserver

You can create one virtual service on the Nagios server that uses this script as its check_command (executing every three minutes in your example), and the three webserver services can be passive services that only accept submit_check_result from the script.
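
The steps above could be sketched in Python roughly as follows. This is only an outline under stated assumptions: `check` and `submit` are hypothetical callables supplied by the caller (one probes a webserver, the other delivers the line to Nagios), and the code formats a PROCESS_SERVICE_CHECK_RESULT external command itself instead of calling submit_check_result.

```python
import time


def passive_result(host, service, return_code, output, now=None):
    """Format a PROCESS_SERVICE_CHECK_RESULT external command line.
    Return codes follow the plugin convention: 0 OK, 1 WARNING,
    2 CRITICAL, 3 UNKNOWN."""
    now = int(time.time()) if now is None else int(now)
    return (f"[{now}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{return_code};{output}")


def run_round(servers, check, submit, spacing=45, sleep=time.sleep):
    """Check each server in turn, submit its result, and pause between
    servers (but not after the last one)."""
    for index, (host, service) in enumerate(servers):
        # `check` is a caller-supplied function returning (code, text).
        return_code, output = check(host)
        submit(passive_result(host, service, return_code, output))
        if index < len(servers) - 1:
            sleep(spacing)
```

The same loop works for four servers by passing four (host, service) pairs. Since the script sleeps for well over a minute in total, the service check timeout in nagios.cfg may need to be raised for the virtual service.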

That is the simplest solution I can think of right now.