My problem is this. I have two Windows AD servers. One of the services I check on the AD servers is DNS. On Monday night, I re-IP’d both servers to another subnet (the same subnet for each server). Now when I check DNS on one, it claims that the address I am searching for does not exist, while its partner checks as working.
When I query the “broken” DNS server directly (i.e. with dig @server) I get the correct information back. Similarly, clients using the DNS server routinely get valid information. So the indication is that check_dns is not working for some reason, even though the DNS server actually is.
Any ideas what might be wrong, or how I might go about fixing it?
I think it does. The only thing that’s different is the IP address of the DNS server, and the IP address of the corresponding entry in Nagios. The rest of the world thinks this DNS server is working, so why does Nagios think it isn’t? And why does Nagios think that its twin, which underwent a virtually identical move, is still working?
Can you run the check from the command line? If you use the -v option from the command line you should see a more verbose output and may be able to see what is going on with the nslookup query.
The problem is that the IP address of the server I was checking had no reverse name associated with it. So when I ran the query in verbose mode, I’d get:
/usr/sbin/nslookup google.ca dns-ph-10
*** Can’t find server name for address 10.0.0.10: Non-existent host/domain
Domain google.ca was not found by the server
I fixed the reverse pointer (PTR) record for the server’s new address and the check started working again.
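For anyone who hits the same thing after re-IPing a server, here is a rough sketch of how you could check whether an address has a reverse entry before blaming the plugin. It is only an illustration: the function names reverse_name and has_ptr are made up for this post, and has_ptr asks whatever resolver the local machine is configured to use (which may also consult the hosts file), so it only tells you something useful if that resolver sees the same reverse zone as the server being checked.

```python
import ipaddress
import socket


def reverse_name(ip: str) -> str:
    """Return the in-addr.arpa (or ip6.arpa) name that a PTR lookup asks for."""
    return ipaddress.ip_address(ip).reverse_pointer


def has_ptr(ip: str) -> bool:
    """Return True if the local resolver can map the address back to a name.

    Assumption: the local resolver (not necessarily the DNS server under
    test) is the one being asked, and it may also consult the hosts file.
    """
    try:
        socket.gethostbyaddr(ip)
        return True
    except (socket.herror, socket.gaierror):
        # No reverse mapping was found for this address.
        return False
```

Calling reverse_name("10.0.0.10") gives the name of the PTR record that was missing in my case, and has_ptr("10.0.0.10") should return False until that record exists.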