I have recently set up Nagios 3 on CentOS 5.5. It appears to be running correctly and is reporting as I’d expect.
The problem that I have relates to the monitoring of some servers (and services) on a remote site via a VPN.
The servers generally report with no problems, but at seemingly random times check_nt returns an error and raises an alert.
The most common error is:
(Return code of 139 is out of bounds)
Then, 2 minutes later, when the check is made again, the script receives a valid response and the alert is closed.
Another common output is:
No data was received from host!
Again, 2 minutes later the same command works with no problem.
I have enabled debugging, but it doesn't give me any further information.
Running check_nt from the command line always returns a valid response.
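Since the failure is intermittent, a single manual run may simply miss it. Below is a minimal sketch of a loop that runs the check repeatedly from the command line and logs every result that is not OK. It assumes Python 3.5 or later; the plugin path, host address, port and the CLIENTVERSION variable are placeholders to be replaced with the values from the real command definition. It also decodes exit statuses by the usual shell convention, where a status above 128 means the process was killed by signal (status - 128), so 139 would correspond to signal 11.

```python
import signal
import subprocess
import time

# Placeholder values (assumptions): adjust to match the real check command.
PLUGIN = "/usr/lib/nagios/plugins/check_nt"
HOST = "192.0.2.10"
CMD = [PLUGIN, "-H", HOST, "-p", "12489", "-v", "CLIENTVERSION"]


def describe_exit(code):
    """Translate an exit status into a readable description."""
    # Nagios accepts only 0-3 as plugin return codes.
    states = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}
    if code in states:
        return states[code]
    # subprocess reports death by signal N as -N; a shell reports 128 + N.
    if code < 0:
        signum = -code
    elif code > 128:
        signum = code - 128
    else:
        return "out of bounds (%d)" % code
    try:
        name = signal.Signals(signum).name
    except ValueError:
        name = "signal %d" % signum
    return "killed by %s (status %d)" % (name, code)


def run_check(cmd, timeout=30):
    """Run the command once; return (exit status or None on timeout, output)."""
    try:
        proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None, "timed out after %d seconds" % timeout
    return proc.returncode, proc.stdout.decode(errors="replace").strip()


if __name__ == "__main__":
    # Poll every 10 seconds and print only the runs that did not return OK.
    while True:
        code, output = run_check(CMD)
        if code != 0:
            status = "TIMEOUT" if code is None else describe_exit(code)
            print(time.strftime("%Y-%m-%d %H:%M:%S"), status, output, sep=" | ")
        time.sleep(10)
```

Running this as the nagios user for a few hours, ideally alongside the normal Nagios checks, should show whether the plugin also fails outside Nagios and with what status.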
I don't think it is a firewall issue, as I would expect it never to work if the firewall were blocking it.
If the VPN were going down, I would expect ALL hosts to have problems (along with the users connecting across it).
I don't think it is a timeout issue, as from what I’ve read that would return a timeout error.
Is there anything else that I should be checking or looking for?
Let me know if you need more information.