I manage a network with about 200 hosts. Some of the hosts are only up some of the time, and that’s fine with me. If a host is down, I don’t want to see it on my service problems display; but if the host is up, I want to make sure that the display will show any services that are not OK. Is there a way to do this?
Well, that depends on what you’re trying to achieve. If you know the times at which the host is supposed to be monitored, you could use timeperiods for checking that host and for its notifications. Alternatively, if you want to turn off host checks manually, you can do it in the web GUI by selecting the host and clicking Disable active host checks and Disable checks of all services on this host. When you need the checks again, you can enable them again.
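If clicking through the GUI for every host gets tedious, the same two actions can probably be scripted through the Nagios external command file. This is only a rough sketch: the command file path below is an assumption (check command_file in your nagios.cfg), external commands have to be enabled, and host_check_commands is just a made-up helper name. DISABLE_HOST_CHECK and DISABLE_HOST_SVC_CHECKS (and their ENABLE_ counterparts) are the external commands that correspond to the two GUI actions mentioned above.

```shell
#!/bin/sh
# Assumed location of the external command file; adjust to your install.
NAGIOS_CMD_FILE="${NAGIOS_CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

# Print the two external-command lines for one host.
#   $1 = DISABLE or ENABLE
#   $2 = host name as defined in the Nagios config
host_check_commands() {
    now=$(date +%s)   # external commands are prefixed with a Unix timestamp
    printf '[%s] %s_HOST_CHECK;%s\n' "$now" "$1" "$2"
    printf '[%s] %s_HOST_SVC_CHECKS;%s\n' "$now" "$1" "$2"
}

# Example (hypothetical host name), run as a user allowed to write the file:
#   host_check_commands DISABLE ws042 > "$NAGIOS_CMD_FILE"
#   host_check_commands ENABLE  ws042 > "$NAGIOS_CMD_FILE"
```

The function only prints the lines, so you can look at them before redirecting anything into the command file.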
Thanks. Unfortunately, I don’t know when hosts will be up or down. These are a variety of workstations (Unix/Linux/Irix/DGUX/Tru64/HPUX/AIX, etc.), and if a system is turned on, I want to make sure that it is configured and running properly (NFS, load, disk space, clock, permissions, etc.), but I am trying to avoid screens of “CRITICAL” hosts which are “just turned off”. I’m leaning towards a cron job that modifies my .cfg files, but was hoping for something cleaner.
Thanks - I’ve tested it and this will work. The biggest problem I have now is knowing which service checks to disable for each host. Is there an easy way to get that list without parsing the CFG file(s)?
Parsing the config files would be yuck… Luckily, given a hostname, you can use a wget+grep+sed one-liner to pull the info straight out of the service-detail webpage for that host: grep picks out the HTML lines that name the services, and sed removes all the tags, which helpfully leaves just the service name.
[root@localhost ~]# wget --quiet --output-document=- --http-user=foo --http-password=bar "http://127.0.0.1/nagios/cgi-bin/status.cgi?host=localhost" | grep "HREF='extinfo.cgi?type=2&host=localhost&service=" | sed -e :a -e 's/<[^>]*>//g;/</N;//ba'
Current Load
Current Users
HTTP
PING
Root Partition
SMS3 Processes - smsd
SSH
Swap Percent Usage
Swap Usage
Total Processes
Var Partition
[root@localhost ~]#
As all else (username, password, server IP) remains static, your script just needs to push the hostname in question into the one-liner at the appropriate places (the URL and the grep pattern).
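For what it’s worth, a wrapper along those lines might look something like this. It is a minimal sketch, assuming the same URL and credentials as the one-liner above; the function names (extract_services, list_services) and the NAGIOS_* variables are made up for illustration, and the grep pattern depends on the HTML your Nagios version produces.

```shell
#!/bin/sh
# Placeholders taken from the one-liner above; override via the environment.
NAGIOS_URL="${NAGIOS_URL:-http://127.0.0.1/nagios/cgi-bin/status.cgi}"
NAGIOS_USER="${NAGIOS_USER:-foo}"
NAGIOS_PASS="${NAGIOS_PASS:-bar}"

# Read status.cgi HTML on stdin and print one service name per line.
#   $1 = host name
extract_services() {
    # Keep only the lines that link to a service of this host,
    # then strip every HTML tag so just the service name is left.
    grep "HREF='extinfo.cgi?type=2&host=$1&service=" |
        sed -e :a -e 's/<[^>]*>//g;/</N;//ba'
}

# Fetch the service-detail page for a host and list its services.
#   $1 = host name
list_services() {
    wget --quiet --output-document=- \
         --http-user="$NAGIOS_USER" --http-password="$NAGIOS_PASS" \
         "$NAGIOS_URL?host=$1" | extract_services "$1"
}

# Example: list_services localhost
```

Splitting the parsing from the fetching means you can try extract_services on a saved copy of the page before pointing it at the live server.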