If you want to reduce notifications, the first thing you should think about is parent/child relationships in your hosts.cfg file.
Say I have 100 hosts (which are nothing but switches that all connect to each other in a daisy chain) and the first switch in the chain dies. Without parent definitions, all the other switches are going to be reported as DOWN too. But if you say that switch2’s parent is switch1, switch3’s parent is switch2, and so on, then when switch1 dies you get only ONE device reporting DOWN. The rest are UNREACHABLE. You then set your contact to not receive UNREACHABLE notifications.
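Here is a rough sketch of what that could look like in hosts.cfg. The host names, addresses, the generic-switch and generic-contact templates, and the netadmin contact are all made up for illustration, so swap in whatever your own config uses:

```
# Hypothetical daisy chain: switch1 <- switch2 <- switch3
define host {
    use         generic-switch   ; assumed template name
    host_name   switch1
    address     192.0.2.1        ; example address
}

define host {
    use         generic-switch
    host_name   switch2
    address     192.0.2.2
    parents     switch1          ; UNREACHABLE, not DOWN, when switch1 is DOWN
}

define host {
    use         generic-switch
    host_name   switch3
    address     192.0.2.3
    parents     switch2
}

# Hypothetical contact: notify on DOWN (d) and RECOVERY (r) only.
# Leaving out "u" is what suppresses UNREACHABLE notifications.
define contact {
    use                         generic-contact   ; assumed template name
    contact_name                netadmin
    host_notification_options   d,r
}
```

With something like this in place, only switch1 should generate a host notification for that contact when the head of the chain goes down.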
Host/service dependencies can be used where it’s a real dependency. Like you said, if the box won’t ping, then of course the FTP server isn’t going to be able to be queried.
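As an example of a dependency that is real, something along these lines would tie an FTP check to the PING check on the same box. The host name ftp1 and the service descriptions PING and FTP are assumptions, they have to match whatever your service definitions are actually called:

```
# Hypothetical: FTP on ftp1 depends on PING on ftp1
define servicedependency {
    host_name                       ftp1
    service_description             PING    ; the master service
    dependent_host_name             ftp1
    dependent_service_description   FTP     ; the service that depends on it
    execution_failure_criteria      n       ; never skip the FTP check itself
    notification_failure_criteria   w,u,c   ; no FTP notifications while PING is WARNING, UNKNOWN or CRITICAL
}
```

Here the FTP check keeps running, but its notifications are held back while PING is failing, so you only hear about the problem once.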
But the DNS being down? Sounds kinda weak. The problem with having over 1000 service checks and building a service dependency tree is time. When Nagios finds a service that does not respond, it walks the dependency tree until it finds the device that is actually DOWN, and not just unreachable.
Depending on how many nodes you have on your network, this could take a lot of time. But if you don’t have many, then it may work just fine.
But they should be legitimate dependencies. Just because the DNS server is not running DNS doesn’t mean that the FTP server isn’t running.
You want Nagios to tell you what is broken, not lie to you.