I’m configuring Nagios to monitor my HP ProCurve switches. I found excellent command and service definitions at www.nagiosexchange.org and all is working wonderfully except for the service that monitors free memory.
The definitions that I’m using are copied and pasted directly from the above-mentioned site. They are:
The switches list about 150MB of total memory, with about 109MB free when I view status from the switch console itself. Nagios is correctly reporting the free 109MB, but is showing the state as critical.
I’ve done a good bit of googling to try to understand how the “2000:30000000” and “1000:30000000” sections work. I realize that those are ARG2 and ARG3, and that ARG2 is the warning level and ARG3 is the critical level. What I don’t understand is how to adjust those numbers to get the levels at which I want warning and critical status on my particular switches. I’ve found info that states that two numbers separated by a colon are a range, and other info that says they are a less-than:higher-than definition for when to return the state defined by the command.
What I’d like is to have the following:
-60MB or more of free memory = OK
-Between 40MB and 60MB of free memory = Warning
-Less than 40MB of free memory = Critical
I will likely adjust those values once I get a better idea of memory usage under different loads.
I’d like to understand how to adjust the numbers in the service definition so that my service monitors will work as listed above. Can someone explain this, or point me to a resource that helps explain what the colon separated numbers mean on this particular command? I haven’t had any luck in my searching, but I’m continuing to try to find as much information as I can to understand this.
From the output of the command “./check_snmp --help”:
-w, --warning=INTEGER_RANGE(s)
Range(s) which will not result in a WARNING status
-c, --critical=INTEGER_RANGE(s)
Range(s) which will not result in a CRITICAL status
Well, here you go.
Yes, that’s a very weird way to do thresholds, and it’s really confusing!
My advice: think about it the “normal” way, then input the opposite, because the range you give is the range that will NOT raise the alert.
i.e., you want a warning between 60MB and 40MB? That means you want a warning as soon as the value drops below 60MB,
so just input “-w 60:” (or 60000:, or with more zeros, depending on the unit the switch reports; I don’t know which :))
You want a critical between 0 and 40MB => “-c 40:”
Even now, I’m not sure of the values I gave above … I had to correct them twice … it really is a weird way to do it.
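To sanity-check values like these, here is a minimal Python sketch of the inverted reading from the help text above. It is not the plugin's actual code. The function names are made up for illustration, it only handles the colon form with an optional bound on either side, and it assumes the switch reports free memory in bytes.

```python
def parse_range(spec):
    """Split 'min:max' into (low, high); an omitted bound becomes None."""
    low_text, _, high_text = spec.partition(":")
    low = int(low_text) if low_text else None
    high = int(high_text) if high_text else None
    return low, high


def is_inside(value, spec):
    """True when value lies within the inclusive range, i.e. no alert is raised."""
    low, high = parse_range(spec)
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True


# Assumption: roughly the 109MB of free memory from the question, in bytes.
free_bytes = 109_000_000

print(is_inside(free_bytes, "1000:30000000"))  # False: outside the range, so it alerts
print(is_inside(free_bytes, "40000000:"))      # True: above the 40MB floor, no alert
```

If this reading is right, it would also explain the original symptom: 109MB free is above the 30000000 upper bound, so it falls outside the "no CRITICAL" range.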
Anyway, try to understand how it works and you’ll do fine.
(Also, you can get rid of this plugin and script a new one :))
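If you do go the route of scripting your own check, the decision logic can stay very small. The following is only a rough sketch under some assumptions: the free-memory value has already been fetched over SNMP by some other means, it is in bytes, and a megabyte is taken as 1,000,000 bytes. The names memory_status and plugin_line are hypothetical; only the exit codes 0, 1 and 2 for OK, WARNING and CRITICAL are the standard Nagios convention.

```python
OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}

MB = 1_000_000  # assumption: decimal megabytes, value reported in bytes


def memory_status(free_bytes, warn_below=60 * MB, crit_below=40 * MB):
    """Return the Nagios state for a free-memory reading (thresholds as in the question)."""
    if free_bytes < crit_below:
        return CRITICAL
    if free_bytes < warn_below:
        return WARNING
    return OK


def plugin_line(free_bytes):
    """Build the state and the one-line status text a plugin would print."""
    state = memory_status(free_bytes)
    return state, f"MEMORY {LABELS[state]} - {free_bytes // MB}MB free"


# A real plugin would print the text and then call sys.exit(state).
state, text = plugin_line(109 * MB)
print(state, text)
```

Writing the thresholds as "below this value" floors avoids the inverted range notation entirely, at the cost of maintaining the script yourself.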
Recompile the plugins… when you run the configure command it tells you something like WARNING: net-snmp missing…
You’ll need to install the snmp package (install snmpd too, so that at least you can test SNMP against your localhost).
How to install these depends on the distribution you are running…
When you run the configure step you have to check the output: its warnings tell you which plugins will NOT be compiled.
Until you get the SNMP warning fixed, the SNMP plugin will not be compiled.
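You’ll probably need to install the snmp, snmpd and snmplib packages. On Debian you use aptitude. Ubuntu is Debian-based, so it installs .deb packages (not RPMs) with apt-get or aptitude, but I’m not sure of the exact package names there,
so I can’t help you with that.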