Personally, I think it’s a mistake to take the hostgroup or comma-delimited shortcut route. For each host/router you should have a parent host defined for it in the hosts.cfg file. This will give you a good idea of which router has failed; otherwise, unreachable routers may be reported as “Down” when in fact they are simply “unreachable” because another router along the path has failed.
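As a rough sketch of what that looks like in hosts.cfg: the two routers (core-rtr and branch-rtr), their addresses, and the generic-host template are all placeholders I made up for illustration, and the template is assumed to supply the other directives your Nagios version requires.

```
# hosts.cfg (sketch; host names, addresses and the generic-host template are placeholders)
define host{
        use             generic-host    ; assumed template carrying the remaining required directives
        host_name       core-rtr
        alias           Core router
        address         192.0.2.1
        }

define host{
        use             generic-host
        host_name       branch-rtr
        alias           Branch router behind core-rtr
        address         192.0.2.2
        parents         core-rtr        ; the path to branch-rtr goes through core-rtr
        }
```

With something like this in place, if core-rtr stops answering, Nagios should flag branch-rtr as unreachable rather than down, which points you at the router that actually failed.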
The other thing is, for each and every router, you should be graphing the output of “ping” with nagiostat. This may not seem important right now, but it may help you find trouble on the network in the future. In my case, I’ve noticed that since Sunday at 6PM there has been an unexplained increase in rta values for most of our network. Without the graphs, nobody would even know that our rta has gone from a maximum of 50ms to a maximum of 200ms.
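To have ping data to graph in the first place, each router needs a ping service defined. A minimal sketch, assuming the stock check_ping command definition that takes the warning and critical thresholds as its two arguments; the host name, the generic-service template, and the threshold values are placeholders, and the nagiostat side is not shown here.

```
# services.cfg (sketch; host name, template and thresholds are placeholders)
define service{
        use                     generic-service     ; assumed template carrying the remaining required directives
        host_name               branch-rtr
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%   ; warn at 100 ms rta or 20% loss, critical at 500 ms or 60% loss
        }
```

Pick thresholds that fit your own baseline; with a normal maximum around 50ms, a warning somewhere above that would catch the kind of creep described above.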
Also, for each and every router, there can be many other checks that are very helpful. For example: power supplies 1 and 2 up, RPMs for fans 1, 2 and 3, and others.
If you take the hostgroup route, let us know how it works out. We are dying to hear whether it actually works, since it’s not documented, although others have said that yes, it does work.