Need help with dynamic checks

Ken · September 17, 2008, 3:01am

Hi Folks,

I hope I’ve posted this to the correct forum as I do not see a plug-in developer’s forum.

Here’s the situation: I have several different boxes running dedicated applications, and I want to monitor various aspects of them.

The good news is that most of these applications publish many of the parameters I wish to monitor as XML, which I can easily get by directing the right request to a specific port on the box, so very little in the way of scripts actually needs to be deployed on the box itself. The bad news is that the port, the request format and the information returned all vary depending on which version of the application is running on the box.

I’ve already developed a Perl plug-in that queries what type of application, and what version, is on any particular box, but it takes about two seconds to run. I figure I only need to check about once a week to keep pace with what is on any particular box in the network, so the run time at that frequency is negligible. How can I use the information it returns to determine the proper checks to run in order to retrieve the rest of the information? I would like to do this as an active check and avoid putting a phalanx of scripts on the boxes themselves. Is there any way to do this on the Nagios monitor side that doesn’t involve passive checks or constant updating of host.cfg and/or service.cfg? I’m also limited to Perl or batch scripting in the solution. Links to examples are welcome.

Thanks in advance.

Loose · September 17, 2008, 8:11am

Hi!

I’m not sure this solution will satisfy you, since you said “no constant updating of the cfg files”, but it’s still a good idea:
-first, put all the checks for these boxes in a separate file (like “check_boxes.cfg” or whatever :))
-write a script that runs once a week; it first does what your current Perl script does (gets the application, version… of the boxes) and then, depending on the results, generates a new “check_boxes.cfg” file
-then the script replaces the old cfg file and reloads/restarts Nagios

Just add that to crontab and there you go: you don’t have to do anything, and the monitoring will adapt itself to the boxes.
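
To make the steps above concrete, here is a rough sh sketch of the generate-then-reload idea. Everything named in it is an assumption for illustration: the inventory format (one "host application version" line per box), the generic-service template, the check_<application>_v<version> command names, the paths, and the nagios binary and init script being reachable as shown. Treat it as a starting point, not a finished tool.

```sh
#!/bin/sh
# Sketch only: template, command names and paths are assumptions.

# Read "host application version" lines on stdin and print one Nagios
# service definition per line on stdout.
render_services() {
    while read -r host app version; do
        # Skip blank lines and comment lines.
        case "$host" in ''|'#'*) continue ;; esac
        cat <<EOF
define service {
    use                  generic-service
    host_name            $host
    service_description  $app status
    check_command        check_${app}_v${version}
}

EOF
    done
}

# Generate the new file, put it in place, verify the whole configuration,
# and reload Nagios only if verification passes.
install_and_reload() {
    inventory=$1   # output of the weekly discovery run
    target=$2      # e.g. /usr/local/nagios/etc/objects/check_boxes.cfg
    main_cfg=$3    # e.g. /usr/local/nagios/etc/nagios.cfg
    render_services < "$inventory" > "$target.new" || return 1
    [ -f "$target" ] && cp -p "$target" "$target.bak"
    mv "$target.new" "$target"
    if nagios -v "$main_cfg" >/dev/null; then
        /etc/init.d/nagios reload
    else
        # Verification failed: put the previous file back.
        [ -f "$target.bak" ] && mv "$target.bak" "$target"
        return 1
    fi
}
```

A script that calls install_and_reload could then be scheduled weekly with a crontab line such as "0 3 * * 0 /usr/local/bin/update_check_boxes.sh" (the script path is made up). Verifying with "nagios -v" before the reload means a bad discovery run cannot take monitoring down.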

It’s fairly easy to do that in Perl/sh, so you shouldn’t have any problems :))

Hope this helps.

If you have any questions or need help implementing that, don’t hesitate to ask.

MP1 · September 17, 2008, 10:41pm

What I would recommend is similar to Loose’s suggestion, but it doesn’t require a Nagios reload.

You could make your Perl script generate a file listing each hostname with its application and version number. You would then modify your service definitions to also pass the host as an argument (check_command check_application!$HOSTNAME$) to whatever check command script you are running. You would then make your check command plugin smart enough to take the host argument passed to it, look that host up in the file of hosts and version numbers, and then run the appropriate script for that version of the application.
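
As a rough illustration of that wrapper, here is a minimal sh sketch. The file location, the "host application version" line format, the application names (appA, appB) and the per-version plugin names are all invented for the example, and it assumes each real plugin accepts "-H host". Exit code 3 is the standard Nagios UNKNOWN state.

```sh
#!/bin/sh
# check_application: hypothetical wrapper, called by Nagios with a host name.
# Paths, file format and plugin names below are assumptions.

VERSIONS_FILE=${VERSIONS_FILE:-/usr/local/nagios/var/app_versions.txt}
PLUGIN_DIR=${PLUGIN_DIR:-/usr/local/nagios/libexec}

# Print "application version" for host $1 from file $2; fail if not listed.
lookup_version() {
    awk -v h="$1" '$1 == h { print $2, $3; found = 1; exit }
                   END { exit !found }' "$2"
}

# Map application $1 and version $2 to the plugin that knows the right
# port and request format for it.
plugin_for() {
    case "$1/$2" in
        appA/1.*) echo check_appA_v1 ;;
        appA/2.*) echo check_appA_v2 ;;
        appB/*)   echo check_appB ;;
        *)        return 1 ;;
    esac
}

# Look the host up, pick the plugin, and run it; its output and exit code
# become the result of the Nagios check.
run_check() {
    host=$1
    if ! info=$(lookup_version "$host" "$VERSIONS_FILE"); then
        echo "UNKNOWN: no recorded application version for $host"
        return 3
    fi
    set -- $info   # now $1 = application, $2 = version
    if ! plugin=$(plugin_for "$1" "$2"); then
        echo "UNKNOWN: no check defined for $1 $2 on $host"
        return 3
    fi
    "$PLUGIN_DIR/$plugin" -H "$host"
}

# Run only when a host name was given, so the functions can also be sourced.
if [ "$#" -gt 0 ]; then
    run_check "$1"
    exit $?
fi
```

Supporting a new application version then only means adding a line to plugin_for; the Nagios service definitions stay untouched.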

You could either make the Perl script that outputs hosts and version numbers a Nagios check that runs every day or so, or throw it in a daily cron job if you have a list of hosts external to the Nagios configs.
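
For the cron variant, the refresh job might look something like the sh sketch below. It assumes the existing Perl plug-in can be run as "DISCOVER_CMD host" and prints "application version" on success; that interface, the default path and the output file format are guesses, so adjust them to whatever the real script does.

```sh
#!/bin/sh
# Sketch only: DISCOVER_CMD stands in for the existing Perl plug-in and is
# assumed to print "application version" for the host given as its argument.

DISCOVER_CMD=${DISCOVER_CMD:-/usr/local/nagios/libexec/discover_app.pl}

# Read host names on stdin (one per line) and write
# "host application version" lines to the file named in $1.
refresh_versions() {
    out=$1
    new="$out.new.$$"
    : > "$new" || return 1
    while read -r host; do
        [ -n "$host" ] || continue
        # </dev/null keeps the plug-in from swallowing the host list.
        if result=$("$DISCOVER_CMD" "$host" </dev/null); then
            echo "$host $result" >> "$new"
        elif [ -f "$out" ]; then
            # Box did not answer: keep its last known entry.
            awk -v h="$host" '$1 == h' "$out" >> "$new"
        fi
    done
    # Rename in one step so a running check never reads a half-written file.
    mv "$new" "$out"
}
```

It could be called as "refresh_versions /usr/local/nagios/var/app_versions.txt < hosts.txt" from a small script, scheduled daily with a crontab line like "0 2 * * * /usr/local/bin/refresh_app_versions.sh" (both paths are made up).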

Good luck!

Ken · September 18, 2008, 12:38am

Hi Folks,

My present leaning is to do something akin to what MP is suggesting: implement an abstraction layer between the service and the call that substitutes the appropriate call based on the last known application version returned from the box. That is, unless anyone has a simpler or better idea.