HTTP check failed while Apache log reports success


#1

Nagios reports a problem with the web server while the Apache log records a successful GET from Nagios. Has anyone else encountered the same behavior and found a solution?

I don't know exactly how Nagios checks the HTTP service, but I assume it issues an HTTP GET request from the Nagios server and waits for a response. I am running Nagios 1.2 on Linux 2.4. In every instance where Nagios reports a failure, the web log reports a successful GET (200). The Nagios and Apache logs are below.

Nagios

Service Ok[10-16-2008 07:51:41] SERVICE ALERT: web001;HTTP;OK;SOFT;3;HTTP OK HTTP/1.1 200 OK - 406 bytes in 9.051 seconds
Service Critical[10-16-2008 07:49:41] SERVICE ALERT: web001;HTTP;CRITICAL;SOFT;2;CRITICAL - Socket timeout after 10 seconds
Service Critical[10-16-2008 07:47:31] SERVICE ALERT: web001;HTTP;CRITICAL;SOFT;1;CRITICAL - Socket timeout after 10 seconds
Service Ok[10-16-2008 07:31:31] SERVICE ALERT: web001;HTTP;OK;HARD;3;HTTP OK HTTP/1.1 200 OK - 406 bytes in 2.851 seconds
Service Critical[10-16-2008 07:15:31] SERVICE ALERT: web001;HTTP;CRITICAL;HARD;3;CRITICAL - Socket timeout after 10 seconds
Service Critical[10-16-2008 07:13:31] SERVICE ALERT: web001;HTTP;CRITICAL;SOFT;2;CRITICAL - Socket timeout after 10 seconds
Service Critical[10-16-2008 07:11:41] SERVICE ALERT: web001;HTTP;CRITICAL;SOFT;1;CRITICAL - Socket timeout after 10 seconds

Apache log

192.168.1.45 - - [16/Oct/2008:06:55:21 -0700] "GET / HTTP/1.0" 200 163
192.168.1.45 - - [16/Oct/2008:07:03:21 -0700] "GET / HTTP/1.0" 200 163
192.168.1.45 - - [16/Oct/2008:07:11:23 -0700] "GET / HTTP/1.0" 200 163
192.168.1.45 - - [16/Oct/2008:07:13:21 -0700] "GET / HTTP/1.0" 200 163
192.168.1.45 - - [16/Oct/2008:07:15:21 -0700] "GET / HTTP/1.0" 200 163
192.168.1.45 - - [16/Oct/2008:07:23:21 -0700] "GET / HTTP/1.0" 200 163
192.168.1.45 - - [16/Oct/2008:07:31:22 -0700] "GET / HTTP/1.0" 200 163
192.168.1.45 - - [16/Oct/2008:07:39:23 -0700] "GET / HTTP/1.0" 200 163
192.168.1.45 - - [16/Oct/2008:07:47:23 -0700] "GET / HTTP/1.0" 200 163
192.168.1.45 - - [16/Oct/2008:07:49:22 -0700] "GET / HTTP/1.0" 200 163
192.168.1.45 - - [16/Oct/2008:07:51:23 -0700] "GET / HTTP/1.0" 200 163
192.168.1.45 - - [16/Oct/2008:07:59:21 -0700] "GET / HTTP/1.0" 200 163
192.168.1.45 - - [16/Oct/2008:08:07:24 -0700] "GET / HTTP/1.0" 200 163


#2

Do you have some delay in the network? As far as I can see, the most recent OK in the Nagios log took 9 seconds to complete. Maybe the timeout should be set higher than 10 seconds, if that suits your needs. Or find out why your HTTP responses are so slow…
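
To see how close the successful checks come to the timeout, the response times can be pulled out of the Nagios alert lines. The following is only a rough Python sketch: the function names, the regular expression, and the 80% margin are my own assumptions, not anything Nagios provides, and the sample lines are copied from the log in post #1.

```python
import re

# Matches the tail of an OK result such as "... 406 bytes in 9.051 seconds".
RESPONSE_TIME = re.compile(r"in (\d+(?:\.\d+)?) seconds")


def response_times(lines):
    """Return the response times (in seconds) reported by OK check results."""
    times = []
    for line in lines:
        if ";OK;" not in line:
            continue  # CRITICAL lines carry no response time
        match = RESPONSE_TIME.search(line)
        if match:
            times.append(float(match.group(1)))
    return times


def near_timeout(times, timeout=10.0, margin=0.8):
    """Return the times at or above margin * timeout (hypothetical threshold)."""
    return [t for t in times if t >= timeout * margin]


# Sample lines taken from the Nagios log quoted in post #1.
sample = [
    "[10-16-2008 07:51:41] SERVICE ALERT: web001;HTTP;OK;SOFT;3;HTTP OK HTTP/1.1 200 OK - 406 bytes in 9.051 seconds",
    "[10-16-2008 07:49:41] SERVICE ALERT: web001;HTTP;CRITICAL;SOFT;2;CRITICAL - Socket timeout after 10 seconds",
    "[10-16-2008 07:31:31] SERVICE ALERT: web001;HTTP;OK;HARD;3;HTTP OK HTTP/1.1 200 OK - 406 bytes in 2.851 seconds",
]

times = response_times(sample)
slow = near_timeout(times)
```

If many of the OK results sit just under the limit, the checks that fail are probably the same slow responses landing on the other side of the 10 second timeout, which would also fit Apache logging a 200 for them.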


#3

Log into the Nagios server and run the check_http script… /usr/local/nagios/libexec/check_http -H

If you get an error, check the Nagios logs on the server, and also check connectivity from the Nagios server to the monitored server.
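
If you want to time the request independently of the plugin, something like the Python sketch below may help. It is not how check_http is implemented; it just sends a minimal HTTP/1.0 GET from the Nagios server and measures how long the full response takes. The function names are made up for illustration, and the timeout here applies to each socket operation, which may differ from how the real plugin enforces its limit.

```python
import socket
import time


def parse_status_line(raw):
    """Return the first line of a raw HTTP response, e.g. 'HTTP/1.1 200 OK'."""
    return raw.split(b"\r\n", 1)[0].decode("latin-1")


def timed_http_get(host, port=80, path="/", timeout=10.0):
    """Send a minimal HTTP/1.0 GET and time the whole exchange.

    Returns (status_line, elapsed_seconds). status_line is None if a socket
    operation timed out. Other errors (refused connection, DNS failure)
    are left to propagate so they stay visible.
    """
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    start = time.monotonic()
    chunks = []
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(request)
            while True:
                data = sock.recv(4096)
                if not data:  # server closed the connection: response complete
                    break
                chunks.append(data)
    except socket.timeout:
        return None, time.monotonic() - start
    return parse_status_line(b"".join(chunks)), time.monotonic() - start


# Example call (the host name is a placeholder, substitute your web server):
#   status, elapsed = timed_http_get("web001", timeout=10.0)
#   print(status, round(elapsed, 3))
```

Running this a few times around the moments when Nagios reports CRITICAL should show whether the response really takes close to 10 seconds from the Nagios server, even though Apache records the request as a 200.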