I’m currently in the process of setting up Nagios to monitor our internal network. As part of this process, I want to monitor switch ports and probably make them the parent of the device connected to them.
What is the best way to go about monitoring switch ports? In particular, if the individual ports have to be set up as hosts, surely the check_host_alive script (used for normal hosts) is not relevant? What I really want is to check that specific ports are enabled and that traffic is flowing to and from them. Does this mean that I only monitor the ports I have purposely enabled and don’t bother monitoring the ones that are not enabled?
There are a couple of ways you could do it, and either approach would work.
1. Keep the ‘standard’ way of determining whether the host is up or down via ICMP (pinging it). In this approach you write your own plugin, which takes parameters, and define it as a service on the host, say a service named Link Status or something similar. When this check fails and returns critical, Nagios pings the host; if that fails as well, it knows the host is down.
2. Write your own script for checking whether the host is up or down, one that takes some arguments and tests the status of the link. You just need to write the script so that it accepts arguments for which switch port to test, and in your host definition, where the check_command goes, you pass in those parameters via $ARG1$, $ARG2$, and so on.
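As a rough illustration, the link-status plugin could look something like the sketch below. It rests on several assumptions that are not part of the answer above: the switch is assumed to answer SNMP v2c, net-snmp's snmpget is assumed to be installed on the Nagios server, the script name check_switch_port.py is made up, and the mapping of link states to WARNING or CRITICAL is only one reasonable choice. The same script could serve as the Link Status service in the first approach or as the host check in the second.

```python
#!/usr/bin/env python3
"""Sketch of a Nagios plugin that reports the link state of one switch port."""
import subprocess
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# IF-MIB ifOperStatus column; the port's ifIndex is appended to it.
IF_OPER_STATUS_OID = "1.3.6.1.2.1.2.2.1.8"

# ifOperStatus values as defined by IF-MIB.
OPER_STATUS = {
    1: "up", 2: "down", 3: "testing", 4: "unknown",
    5: "dormant", 6: "notPresent", 7: "lowerLayerDown",
}


def evaluate(raw_value):
    """Map raw snmpget output to an (exit_code, message) pair."""
    try:
        status = int(raw_value.strip())
    except ValueError:
        return UNKNOWN, f"UNKNOWN - unexpected SNMP reply: {raw_value.strip()!r}"
    name = OPER_STATUS.get(status, f"undefined({status})")
    # Severity mapping below is an assumption; adjust to taste.
    if status == 1:
        return OK, f"OK - link is {name}"
    if status in (2, 6, 7):
        return CRITICAL, f"CRITICAL - link is {name}"
    if status in (3, 5):
        return WARNING, f"WARNING - link is {name}"
    return UNKNOWN, f"UNKNOWN - link is {name}"


def query_port(switch, community, if_index, timeout=10):
    """Fetch ifOperStatus for one port using net-snmp's snmpget (assumed installed)."""
    # -Oqve: print only the value, with enumerations shown as numbers.
    cmd = ["snmpget", "-v2c", "-c", community, "-Oqve",
           switch, f"{IF_OPER_STATUS_OID}.{if_index}"]
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip() or "snmpget failed")
    return result.stdout


def main(argv):
    """Check one port and return the Nagios exit code."""
    if len(argv) != 3:
        print("UNKNOWN - usage: check_switch_port.py SWITCH COMMUNITY IFINDEX")
        return UNKNOWN
    switch, community, if_index = argv
    try:
        raw = query_port(switch, community, if_index)
    except (RuntimeError, OSError, subprocess.TimeoutExpired) as exc:
        print(f"UNKNOWN - SNMP query failed: {exc}")
        return UNKNOWN
    code, message = evaluate(raw)
    print(f"{message} (ifIndex {if_index} on {switch})")
    return code


# Run only when arguments are supplied, so the file can also be imported.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

A command definition could then call it along the lines of check_switch_port.py $HOSTADDRESS$ $ARG1$ $ARG2$, with the community string and the port's ifIndex supplied after the command name in check_command, separated by exclamation marks.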
Both approaches work and accomplish the same thing. The only downside of approach #2 is that you might not be notified properly when a host is down: there are plenty of cases where the OS crashes while all the network links stay active. In that scenario, you’d be alerted that all of the services are not responding, but not that the host is down.