Hey all -
I work for a small cable provider and am looking for direction on something I’d like to do with Nagios but don’t know how to approach. Here is what has been going on. Lately we have been having intermittent problems with our DHCP server communicating with our CMTS. When this happens, several hundred modems drop off the uBR, and the CMTS shows these cable modems as unregistered. This of course takes the customer offline, since their DHCP request isn’t completed and they don’t get an IP. From the CMTS, I run the following command:
CMTS]# sh cable modem sum total
I get an output like what you see below:
        Total   Registered   Unregistered   Offline
Total:  10000   9000         500            500
What I would like to do with Nagios is monitor this, perhaps by logging into the CMTS, running that command, and then setting a threshold to alert me when the Unregistered or Offline modems reach 300 or something like that. Is this something that can be done with Nagios? If so, where would I start?
I’m guessing that I would have to first write a script that logs me into the CMTS and runs that command. Then, given the output, search for the unregistered number and report if it’s past a certain threshold. Then just add that script to be checked every 5 or 10 minutes in Nagios?
Anyway, is that how you guys would do it or is there an easier way somehow?
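For anyone following along, the parse-and-threshold half of that idea could be sketched roughly like this in Python. It assumes the output looks exactly like the sample above and that the text has already been fetched from the CMTS somehow (the login step is not shown). The function names and the critical threshold of 500 are made up for illustration; only the 300 figure comes from the post. The exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import re

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def parse_modem_summary(text):
    """Return the counts from the 'Total:' row as a dict, or None if not found."""
    for line in text.splitlines():
        match = re.match(r"\s*Total:\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)", line)
        if match:
            total, registered, unregistered, offline = map(int, match.groups())
            return {"total": total, "registered": registered,
                    "unregistered": unregistered, "offline": offline}
    return None


def check_modems(text, warn=300, crit=500):
    """Compare unregistered/offline counts to thresholds (crit=500 is an assumption)."""
    counts = parse_modem_summary(text)
    if counts is None:
        return UNKNOWN, "UNKNOWN - could not find the Total: row"
    worst = max(counts["unregistered"], counts["offline"])
    message = "unregistered=%d offline=%d" % (counts["unregistered"], counts["offline"])
    if worst >= crit:
        return CRITICAL, "CRITICAL - " + message
    if worst >= warn:
        return WARNING, "WARNING - " + message
    return OK, "OK - " + message


# Sample text copied from the output shown in the post.
sample = """Total Registered Unregistered Offline
Total: 10000 9000 500 500"""
code, message = check_modems(sample)
print(code, message)  # prints: 2 CRITICAL - unregistered=500 offline=500
```

A real plugin would print the message and exit with the code, so Nagios can pick up the state from the process exit status.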
I wouldn’t do it that way.
Instead of forcing Nagios to run the script (which will log in, make the check, and grab the results), I would have the remote system (the CMTS) perform the check itself. The results of this check would then be passed to the Nagios server via the send_nsca script. nagios.sourceforge.net/docs/2_0/distributed.html
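As a rough illustration of the passive route, here is a Python sketch of handing a result to send_nsca. It assumes the check runs on a box that can run scripts and has the NSCA client installed. send_nsca reads one tab-delimited line per service result on stdin (host name, service description, return code, plugin output). The host name, service description, and config path below are placeholders, not values from this thread.

```python
import subprocess


def format_nsca_result(host, service, return_code, output):
    """Build one service result line in the tab-delimited form send_nsca reads."""
    return "%s\t%s\t%d\t%s\n" % (host, service, return_code, output)


def submit_passive_result(line, nagios_server,
                          config="/etc/nagios/send_nsca.cfg",  # assumed path
                          send_nsca="send_nsca"):              # assumed to be on PATH
    """Pipe a single result line to send_nsca; raises if send_nsca fails."""
    subprocess.run([send_nsca, "-H", nagios_server, "-c", config],
                   input=line.encode(), check=True)


# Hypothetical host and service names, for illustration only.
line = format_nsca_result("cmts01", "Cable Modems", 2,
                          "CRITICAL - unregistered=500 offline=500")
print(repr(line))
```

The host name and service description have to match a host and a passive service already defined on the Nagios server, otherwise the result is dropped.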
This would mean that I would have to load the NSCA client on the Cisco CMTS? First, would you know if this application can be loaded on a Cisco 10k series uBR?
If so, I will still have to go through my admin before I could load anything on our CMTS.
Like I said, I have no idea what these things are. If it’s a switch or something, then obviously you can’t load anything on it. In that case, perform an active check:
check_snmp against any of these devices that can’t have something loaded on them.
If you can load something on it, decide between active and passive Nagios checks: a daemon running on the remote system (active checks), or a daemon running on the Nagios system (passive checks).
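To make the check_snmp option a bit more concrete, here is a small Python sketch that assembles the plugin's command line. The flags used (-H host, -C community, -o OID, -w warning, -c critical) are standard check_snmp options. The plugin path is only the usual location for a source install, and the host, community string, and OID are placeholders you would swap for your own.

```python
def build_check_snmp_args(host, community, oid, warn, crit,
                          plugin="/usr/local/nagios/libexec/check_snmp"):
    """Return the argv list for one check_snmp call (plugin path is an assumption)."""
    return [plugin, "-H", host, "-C", community, "-o", oid,
            "-w", str(warn), "-c", str(crit)]


# Placeholder values: the real OID has to come from the device's MIBs.
args = build_check_snmp_args("192.0.2.10", "public", "<OID-you-found>", 300, 500)
print(" ".join(args))
```

In Nagios itself the same arguments would normally live in a command definition, with the host address and thresholds passed in as macros.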
jakkedup,
Yeah, I don’t know what I was thinking. Obviously, since this is a router, nothing can be loaded on it, so this would have to be an active check.
I know there is an OID that will pull the information that I’m trying to get from the router. I just have to find what that OID is and set it up in Nagios.
To assist anyone in finding out what OID to use in the check_snmp check, I would suggest you find yourself a MIB browser.
For Winblows, getif
For 'nix types, get mbrowse
Google for mbrowse and set it up (it was kinda hard, but well worth it).
Now download the MIB text files from your vendors for all of your equipment, then browse away and pick and choose what you want to monitor.
I was amazed at how much info a lot of our routers/switches/etc. have.
Redundant and primary power supply status, fan RPM, chassis temp, fan exhaust temp, voltages, and of course interface status, and on and on… just amazing. I added as much of it as made sense (the backup power supply check has actually proven VERY helpful and prevented system downtime).
Yeah, I am using a MIB browser on my Windows client: MIB Browser by iReasoning. I issue an snmpwalk against the router and get a lot of information back. However, I can’t seem to find the OID that I’m looking for.
I will need the OID that gathers the total number of cable modems that are marked ‘unregistered’.
I will keep plugging away at that and shoot you another post if I run into a question relating to it.
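One way to narrow down a big walk, sketched in Python: save the snmpwalk output to text and filter the object names by keyword. This assumes the common "NAME = TYPE: value" line format that net-snmp's snmpwalk prints with MIBs loaded. The EXAMPLE-MIB line in the sample is invented purely to show the filter working and is not a real object on any device.

```python
def find_candidates(walk_output, keywords):
    """Return (object, value) pairs whose object name contains any keyword."""
    hits = []
    for line in walk_output.splitlines():
        if " = " not in line:
            continue  # skip blank or continuation lines
        name, _, value = line.partition(" = ")
        if any(k.lower() in name.lower() for k in keywords):
            hits.append((name.strip(), value.strip()))
    return hits


# Illustrative sample only; the EXAMPLE-MIB object is hypothetical.
sample = """IF-MIB::ifDescr.1 = STRING: Cable5/0/0
IF-MIB::ifOperStatus.1 = INTEGER: up(1)
EXAMPLE-MIB::exampleModemUnregistered.0 = Gauge32: 500"""

for name, value in find_candidates(sample, ["modem", "unreg"]):
    print(name, "=>", value)
```

Names only show up like this if the vendor MIB files are loaded; otherwise the walk prints raw numeric OIDs and a keyword search will find nothing.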