Performance very poor, not preserving host states across re



Our nagios server had a hard drive failure yesterday and we had to migrate to a new machine. I saved the config directories, but not the html/cgi directories. This morning when I went to reinstall everything on the new machine I noticed that nagios 2.4 is no longer available and grabbed 2.5 instead. Unfortunately I'm having serious problems with 2.5 and the http plugin.

When I start the daemon it takes over five minutes to check 19 hosts with about 101 services between them. This is far longer than it took with 2.4. Further, it will detect several http and https services as down even though manual tests of those services using the check_http command from the CLI show that they are fine. In the log the failures are due to the plugin timing out, but when I test the plugin manually I get a sub-second response time. Check_http seems to be the only plugin affected. I've run a tcpdump between the nagios server and the host being monitored; nagios never actually sends any http/https traffic to the host that the services are being marked down on. I ran a bunch of 'ps -ax | grep nagios' at the same time and saw that nagios is in fact calling the plugin, there's just no traffic. I even copy/pasted the exact command nagios is calling from the ps -ax output to my command line as a test. Not only did the plugin work perfectly, but I saw the traffic in my tcpdump.
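For anyone repeating this kind of manual test, here is a rough Python sketch of running a plugin command several times with a timeout and recording how long each run took, since a single successful run can hide an intermittent failure. The function name run_check and the timeout value are made up for illustration; the only Nagios-specific fact assumed is the standard plugin exit code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import subprocess
import time

# Standard Nagios plugin exit codes.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(argv, timeout=10.0):
    """Run a plugin command; return (state, elapsed_seconds, first_output_line).

    run_check is a hypothetical helper, not part of Nagios.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run before this is raised.
        return "TIMEOUT", time.monotonic() - start, ""
    elapsed = time.monotonic() - start
    lines = result.stdout.splitlines()
    first_line = lines[0] if lines else ""
    # Any exit code outside 0-3 is treated as UNKNOWN here.
    return STATES.get(result.returncode, "UNKNOWN"), elapsed, first_line
```

Something like run_check(["/usr/local/nagios/libexec/check_http", "-H", "www.example.com"]) called in a loop would show whether some runs time out while others return in under a second. The plugin path and host name there are placeholders, and it is probably worth running it as the same user the daemon runs as.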

Any help that could be provided would be greatly appreciated. I have two hosts that nagios is marking down even though they are operating perfectly.



I should have known better than to question the most excellent nagios code. My resolv.conf was not configured properly, hence an obvious source of plugin weirdness as it’s only intermittently able to resolve the hostnames.

Not sure what nagios tool I could have used to troubleshoot this; no errors were being printed by the plugin that might have indicated the problem.
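In hindsight, something outside nagios might have caught it. Below is a minimal Python sketch, not a nagios tool, that resolves a hostname several times in a row and counts how many lookups fail and how long the slowest one took. The name probe_resolution and the default of five attempts are my own choices, and a local DNS cache could mask the problem, so treat it as a starting point only.

```python
import socket
import time


def probe_resolution(hostname, attempts=5, resolver=socket.getaddrinfo):
    """Resolve hostname repeatedly; return (failures, slowest_seconds).

    probe_resolution is a hypothetical helper. The resolver argument exists
    only so the lookup function can be swapped out for testing.
    """
    failures = 0
    slowest = 0.0
    for _ in range(attempts):
        start = time.monotonic()
        try:
            resolver(hostname, None)
        except OSError:
            # socket.gaierror (lookup failure) is a subclass of OSError.
            failures += 1
        slowest = max(slowest, time.monotonic() - start)
    return failures, slowest
```

Running probe_resolution("www.example.com") on the nagios server (the host name is a placeholder) and seeing a nonzero failure count, or a slowest lookup close to the plugin timeout, would point at name resolution rather than at the plugin or the web server.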



yea, that is strange. But live and learn.