Can Nagios monitor a switch port for errors?


#1

I was wondering if there is a way for Nagios to monitor switch/router ports for errors and kick off an alert if it sees a certain number of errors within a set time?

Thanks.

Jason


#2

[quote=“jwilliams31”]I was wondering if there is a way for Nagios to monitor switch/router ports for errors and kick off an alert if it sees a certain number of errors within a set time?

Thanks.

Jason[/quote]

Whilst I haven’t done this myself, my first thought would be SNMP.

I.e., if you know the SNMP object (OID) you want to query, Nagios can monitor it.


#3

I guess the problem is that I don’t know how to monitor it. Is there some kind of macro like there is for the port status?

Thanks.


#4

Short answer, no. You will have to read up on SNMP and learn how to query it, then write a Nagios command that runs that query. Be warned that there really isn’t anything simple about the Simple Network Management Protocol unless you compare it to the alternatives, so if you have never used it you are in for quite a learning curve. I’d love to help more, but I have to learn more about it myself.


#5

I’d suggest using snmpwalk and friends with a network MIB loaded to look at all the possible items the switch is capable of reporting on, then choosing one and setting the relevant test on it.

E.g. snmpwalk -v 1 -c community 192.168.1.2

This queries the device at 192.168.1.2 using SNMP version 1 and a “community” (think password) of “community”, returning data like:

SNMPv2-MIB::sysDescr.0 = STRING: 3Com Baseline Switch 2916-SFP Plus
SNMPv2-MIB::sysObjectID.0 = OID: SNMPv2-SMI::enterprises.43.1.8.60
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (15898660) 1 day, 20:09:46.60
SNMPv2-MIB::sysContact.0 = STRING:
SNMPv2-MIB::sysName.0 = STRING:
SNMPv2-MIB::sysLocation.0 = STRING:
SNMPv2-MIB::sysServices.0 = INTEGER: 2
SNMPv2-MIB::sysORLastChange.0 = Timeticks: (0) 0:00:00.00
SNMPv2-MIB::sysORID.1 = OID: SNMPv2-SMI::enterprises.89.73
SNMPv2-MIB::sysORDescr.1 = STRING: RS capabilities
SNMPv2-MIB::sysORUpTime.1 = Timeticks: (0) 0:00:00.00
....
IF-MIB::ifOperStatus.1 = INTEGER: up(1)
IF-MIB::ifOperStatus.2 = INTEGER: up(1)
IF-MIB::ifOperStatus.3 = INTEGER: up(1)
IF-MIB::ifOperStatus.4 = INTEGER: down(2)
IF-MIB::ifOperStatus.5 = INTEGER: down(2)
IF-MIB::ifOperStatus.6 = INTEGER: up(1)
IF-MIB::ifOperStatus.7 = INTEGER: up(1)
IF-MIB::ifOperStatus.8 = INTEGER: up(1)
IF-MIB::ifOperStatus.9 = INTEGER: up(1)
IF-MIB::ifOperStatus.10 = INTEGER: up(1)
IF-MIB::ifOperStatus.11 = INTEGER: up(1)
IF-MIB::ifOperStatus.12 = INTEGER: up(1)
IF-MIB::ifOperStatus.13 = INTEGER: up(1)
IF-MIB::ifOperStatus.14 = INTEGER: up(1)
IF-MIB::ifOperStatus.15 = INTEGER: down(2)
IF-MIB::ifOperStatus.16 = INTEGER: up(1)
IF-MIB::ifOperStatus.17 = INTEGER: notPresent(6)
IF-MIB::ifOperStatus.18 = INTEGER: notPresent(6)
IF-MIB::ifOperStatus.19 = INTEGER: notPresent(6)
IF-MIB::ifOperStatus.20 = INTEGER: notPresent(6)
IF-MIB::ifOperStatus.21 = INTEGER: notPresent(6)
IF-MIB::ifOperStatus.22 = INTEGER: notPresent(6)
IF-MIB::ifOperStatus.23 = INTEGER: notPresent(6)
IF-MIB::ifOperStatus.24 = INTEGER: notPresent(6)
...
IF-MIB::ifSpeed.1 = Gauge32: 100000000
IF-MIB::ifSpeed.2 = Gauge32: 100000000
IF-MIB::ifSpeed.3 = Gauge32: 100000000
IF-MIB::ifSpeed.4 = Gauge32: 1000000000
IF-MIB::ifSpeed.5 = Gauge32: 1000000000
IF-MIB::ifSpeed.6 = Gauge32: 100000000
IF-MIB::ifSpeed.7 = Gauge32: 100000000
IF-MIB::ifSpeed.8 = Gauge32: 1000000000
IF-MIB::ifSpeed.9 = Gauge32: 100000000
IF-MIB::ifSpeed.10 = Gauge32: 100000000
IF-MIB::ifSpeed.11 = Gauge32: 100000000
IF-MIB::ifSpeed.12 = Gauge32: 100000000
IF-MIB::ifSpeed.13 = Gauge32: 1000000000
IF-MIB::ifSpeed.14 = Gauge32: 1000000000
IF-MIB::ifSpeed.15 = Gauge32: 1000000000
IF-MIB::ifSpeed.16 = Gauge32: 1000000000
IF-MIB::ifSpeed.17 = Gauge32: 0
IF-MIB::ifSpeed.18 = Gauge32: 0
IF-MIB::ifSpeed.19 = Gauge32: 0
IF-MIB::ifSpeed.20 = Gauge32: 0
IF-MIB::ifSpeed.21 = Gauge32: 0
IF-MIB::ifSpeed.22 = Gauge32: 0
IF-MIB::ifSpeed.23 = Gauge32: 0
IF-MIB::ifSpeed.24 = Gauge32: 0
...

and so on and so forth. The example above shows port link status, port link speed, etc. What is shown will vary depending on the switch manufacturer. You can test each of these values with check_snmp:
[code]$ /usr/local/nagios/libexec/check_snmp -H 192.168.1.2 -C community -o IF-MIB::ifOperStatus.1 -v
/usr/bin/snmpget -t 1 -r 5 -m ALL -v 1 -c community 192.168.1.2:161 IF-MIB::ifOperStatus.1
IF-MIB::ifOperStatus.1 = INTEGER: up(1)

SNMP OK - up(1) |
$ /usr/local/nagios/libexec/check_snmp -H 192.168.1.2 -C community -o IF-MIB::ifOperStatus.4 -v
/usr/bin/snmpget -t 1 -r 5 -m ALL -v 1 -c community 192.168.1.2:161 IF-MIB::ifOperStatus.4
IF-MIB::ifOperStatus.4 = INTEGER: down(2)

SNMP OK - down(2) | [/code]
Here I’m looking at the link state of the port, but as I haven’t set any thresholds, it just reports OK.
[code]$ /usr/local/nagios/libexec/check_snmp -H 192.168.1.2 -C community -o IF-MIB::ifOperStatus.4 -c 1 -v
/usr/bin/snmpget -t 1 -r 5 -m ALL -v 1 -c community 192.168.1.2:161 IF-MIB::ifOperStatus.4
IF-MIB::ifOperStatus.4 = INTEGER: down(2)

SNMP CRITICAL - 2 | IF-MIB::ifOperStatus.4=2 [/code]
The example above shows that the result is NOT critical when the returned value is 1 (-c 1); because the value here is 2, the check reports a critical state. For more info, consult check_snmp’s --help switch.
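
To make the threshold behaviour above a bit more concrete, here is a minimal shell sketch of the same decision made by hand. The function name oper_status_check is made up for illustration and is not part of the Nagios plugins. It assumes the value is printed in the name(N) form seen in the output above, and it uses the usual plugin exit codes (0 OK, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/sh
# Hypothetical helper, NOT shipped with Nagios or its plugins.
# Takes one line of snmpget output for ifOperStatus, prints a
# Nagios-style status line and returns the matching exit code.
oper_status_check() {
    line=$1
    # Pull the number out of a trailing "(N)", e.g. "down(2)" -> 2.
    value=$(printf '%s\n' "$line" | sed -n 's/.*(\([0-9][0-9]*\))[[:space:]]*$/\1/p')
    if [ -z "$value" ]; then
        echo "UNKNOWN - could not parse: $line"
        return 3
    fi
    # Only up(1) is treated as healthy in this sketch.
    if [ "$value" -eq 1 ]; then
        echo "OK - ifOperStatus is $value"
        return 0
    fi
    echo "CRITICAL - ifOperStatus is $value"
    return 2
}
```

You could feed it a live value with something like oper_status_check "$(snmpget -v 1 -c community 192.168.1.2 IF-MIB::ifOperStatus.4)". In practice check_snmp already does this for you; the sketch is only meant to show what the threshold is deciding.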

In short, review your switch’s snmpwalk output, determine which variables and values are relevant for your test requirements, then set the relevant warning/critical etc. ranges to match the states appropriately…
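
For the original question (alerting on errors within a set time), the values to watch are counters rather than states, so the test is on how much the counter grew between two checks. Below is a rough sketch of that comparison, assuming your switch exposes an error counter such as IF-MIB::ifInErrors (check your own snmpwalk output for this) and that it is a 32-bit counter. The function name error_delta_check and the threshold arguments are hypothetical, and storing the previous sample between check runs is left out.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of the Nagios plugins.
# Compares two samples of an error counter (e.g. IF-MIB::ifInErrors)
# and returns a Nagios exit code based on how much it increased.
# Usage: error_delta_check PREVIOUS CURRENT WARN CRIT
error_delta_check() {
    prev=$1
    cur=$2
    warn=$3
    crit=$4
    if [ "$cur" -ge "$prev" ]; then
        delta=$((cur - prev))
    else
        # Assumption: a smaller value means a 32-bit counter wrapped past 2^32.
        # A device reboot would also reset the counter; this sketch ignores that.
        delta=$((cur + 4294967296 - prev))
    fi
    if [ "$delta" -ge "$crit" ]; then
        echo "CRITICAL - $delta new errors | errors=$delta"
        return 2
    elif [ "$delta" -ge "$warn" ]; then
        echo "WARNING - $delta new errors | errors=$delta"
        return 1
    fi
    echo "OK - $delta new errors | errors=$delta"
    return 0
}
```

For example, error_delta_check 100 120 10 50 reports a warning, because 20 new errors is at or above the warning threshold of 10 but below the critical threshold of 50. The "set time" is then simply the interval at which Nagios runs the check.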


#6

You need to read up on SNMP and learn how to query it, then write a Nagios command that runs that query. Be warned that there is really nothing simple about the Simple Network Management Protocol unless you compare it to the alternatives.