In the Nagios.log file I am getting this message.
“SERVICE ALERT: Thorium;Avg CPU Load 15min;UNKNOWN;SOFT;2;Critical Process Count must be an integer!”
Does this make any sense?
Also, on the same server I am struggling to get the email working and to monitor the partitions. Could I get some pointers?
So why did you paste the check disk command? Your problem was with the check_procs plugin, wasn’t it?
Your check disk looks fine to me, so what’s the problem? You pasted the command definition for check_procs and the error, but not the command. You pasted the command for the check disk, but not the definition and the error. What gives? Are we working on this in some sort of new method?
" service_description Avg CPU Load 15min "
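I don’t know why you are calling it CPU load average and then using the check_procs plugin. That plugin does not report the 15-minute load average. Its warning and critical thresholds are process counts, which is why it complains that the critical process count must be an integer.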
I’m assuming here that you are just trying to monitor the local CPU load. The main function of check_procs is to check processes, not processors. The CPU metric is used to see whether one of the processes you are monitoring is using more than a specified amount of the CPU. From check_procs --help:
check_procs -w 10 -c 20 --metric=CPU
Alert if cpu of any processes over 10% or 20%
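To illustrate why the error in the log appears, here is a rough sketch of the kind of value check_procs will take for -w and -c: a whole number or a min:max range. The helper name is_procs_threshold is made up for this post and only approximates the plugin's own parsing; it is not part of Nagios. My guess is that load-style thresholds such as 10.0,6.0,4.0 are being handed to check_procs, which would produce exactly that "must be an integer" message.

```shell
# Hypothetical helper (not part of Nagios): roughly approximates what
# check_procs accepts as a -w / -c value, i.e. N, N:M, N: or :M.
is_procs_threshold() {
    printf '%s\n' "$1" | grep -Eq '^([0-9]+|[0-9]+:[0-9]*|:[0-9]+)$'
}

# Try a few candidate thresholds, including a load-average style triple.
for value in 20 2:1024 10.0,6.0,4.0; do
    if is_procs_threshold "$value"; then
        echo "$value: looks like a process count or range"
    else
        echo "$value: not an integer, check_procs would likely reject it"
    fi
done
```

If the thresholds in your service definition look like the last one, they belong to a load check rather than a process check.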
To get the local CPU usage, you could try check_load. A good place to look for an overview of the commands is here
(basically it’s a big list of check_command --help output)
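As a minimal sketch of running check_load by hand: the plugin takes three comma-separated values for each of -w and -c, covering the 1, 5 and 15 minute load averages in that order. The plugin directory and the threshold numbers below are assumptions for illustration only, so adjust them to your install and your hardware.

```shell
# Assumed plugin location; change this to wherever your plugins live.
PLUGIN_DIR="${PLUGIN_DIR:-/usr/local/nagios/libexec}"

# Example thresholds only: 1, 5 and 15 minute load averages, in that order.
WARN="5.0,4.0,3.0"
CRIT="10.0,6.0,4.0"

if [ -x "$PLUGIN_DIR/check_load" ]; then
    status=0
    "$PLUGIN_DIR/check_load" -w "$WARN" -c "$CRIT" || status=$?
    # Nagios reads the exit status: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
    echo "exit status: $status"
else
    echo "check_load not found in $PLUGIN_DIR"
fi
```

Once it behaves on the command line, the same -w and -c values can go into the command definition for the "Avg CPU Load 15min" service in place of check_procs.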
I’ll post this just in case anyone is searching the archives for multi-CPU stuff. If it was a remote multi-CPU NT/2000 box that you wanted to check, then you can use check_nt as below.
This matters because the normal CPU check using check_nt/nsclient averages out the load across the CPUs (i.e. on a dual-CPU machine one CPU could be at 100% and the other at 0%, and you’d get a 50% reading and no warning).
[ Edited Mon Aug 01 2005, 12:33AM ]