I am trying to determine if this is possible.
I am monitoring a Linux cluster consisting of 658 nodes.
Each node has a job queue and a queue status, and I want to report the current state of both. The kicker is that this information comes from querying the master node, so if I run the check once per node I effectively run 658 queries against a single machine, which is not very efficient.
What I want to do is run the query once, on the master node only, but still report the status and queue length for each node individually.
The output of the command is this:
HOST_NAME  STATUS      JL/U  MAX  NJOBS
host1      ok          2     2    0
host2      closed_Adm  -     -    0
host658    ok          2     2    2
Here STATUS and NJOBS are the columns of interest.
Any input is greatly appreciated.
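To make the goal concrete, here is a rough sketch of the parsing half of the problem: take the text from a single run of the command and split it into one status/job-count record per host. The names `parse_host_table`, `HostState`, and `SAMPLE` are made up for illustration, and the sample text is just the output shown above. How the result would be fed back into the monitoring system as per-node results is not covered here.

```python
from typing import Dict, NamedTuple


class HostState(NamedTuple):
    """The two fields of interest for one host (hypothetical helper type)."""
    status: str
    njobs: int


def parse_host_table(text: str) -> Dict[str, HostState]:
    """Parse the whitespace-separated table into {host_name: HostState}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    # Look columns up by name so extra or reordered columns do not break parsing.
    host_col = header.index("HOST_NAME")
    status_col = header.index("STATUS")
    njobs_col = header.index("NJOBS")

    hosts: Dict[str, HostState] = {}
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) < len(header):
            continue  # skip rows that do not have every column
        hosts[fields[host_col]] = HostState(
            status=fields[status_col],
            njobs=int(fields[njobs_col]),
        )
    return hosts


# Sample input copied from the output shown above.
SAMPLE = """\
HOST_NAME STATUS JL/U MAX NJOBS
host1 ok 2 2 0
host2 closed_Adm - - 0
host658 ok 2 2 2
"""

if __name__ == "__main__":
    for host, state in parse_host_table(SAMPLE).items():
        print(f"{host}: status={state.status} njobs={state.njobs}")
```

With something like this, the command would only need to run once per polling interval on the master node, and each per-node check could look up its own entry in the parsed result instead of issuing a new query.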