Weird Nagios Error


#1

Hi,

We have an odd issue with our Nagios monitoring system.
Our internal network is pretty classic: switches, a mail and file server, a DMZ…

All hosts are configured in Nagios with NO dependencies (the current Nagios setup is still pretty basic).
Everything works great: services are being monitored and alerts are sent when something goes wrong.

The strange thing is, if I reboot my mail server, for example, Nagios reports all my network switches as down (which is obviously not the case).
The error I get on my switches is:

“popen timeout received, but no child process”

I can’t find a lot of info on this error message on the net. Maybe you guys can help me?

Specs:
Nagios 3.0rc3
FreeBSD 6.0

Cheers,

A


#2

What do you see in the Nagios log? What is the check output for the switches that are reported as down?
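
If the log is large, a small filter can pull out just the relevant entries. This is only a rough sketch, assuming Python 3 is available on the Nagios box; the function name `find_timeout_entries` is made up for illustration, and the log file location depends on your install, so it is left as a placeholder.

```python
def find_timeout_entries(log_lines, needle="popen timeout"):
    """Return the log lines that mention `needle`, ignoring case."""
    needle = needle.lower()
    # Strip trailing newlines so the results print cleanly.
    return [line.rstrip("\n") for line in log_lines if needle in line.lower()]


# Usage (the path is an assumption; point it at your own nagios.log):
#
#     with open("/path/to/nagios.log") as handle:
#         for entry in find_timeout_entries(handle):
#             print(entry)
```

Comparing the timestamps of the matching entries with the time of the mail server reboot should show whether the two really coincide.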

Did you configure them right? Post the mail server host definition and the host definition of one of the (down) switches, so we can see if anything is defined incorrectly.


#3

Mailserver config:

define host{
    use                     windows-server      ; Inherit default values from a template
    host_name               [SERVER]_xxxxx      ; The name we’re giving to this host
    alias                   xxxxx               ; A longer name associated with the host
    address                 xxxxxxxxxx          ; IP address of the host
    hostgroups              windows-servers     ; Host groups this host is associated with
    contact_groups          admins
    notification_interval   120
    notification_period     24x7
    notification_options    d,u,r
    }

Switch config:

define host{
    use                     generic-switch      ; Inherit default values from a template
    host_name               [SWITCH]_Cisco_I    ; The name we’re giving to this switch
    alias                   CISCO_1             ; A longer name associated with the switch
    address                 xxxxxxx             ; IP address of the switch
    hostgroups              switches            ; Host groups this switch is associated with
    contact_groups          admins
    notification_interval   120
    notification_period     24x7
    notification_options    d,u,r
    }


#4

Hm, nothing weird here. Everything seems OK. You would get better debugging info if you ran a traceroute from the Nagios machine to one of your switches while the mail server is down.
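
Since the reboot window is short, it may be easier to script the capture than to type it by hand. Below is a minimal sketch, assuming Python 3.7+ and the standard `traceroute` binary are present on the Nagios machine. The helper names `build_traceroute_cmd` and `capture_route` are invented for this example, and the address in the usage comment is a placeholder from the documentation range, not one of your switches.

```python
import datetime
import subprocess


def build_traceroute_cmd(address, max_hops=15, wait_seconds=2):
    """Return the traceroute argument list for one target address."""
    # -n skips reverse DNS lookups, -m caps the hop count,
    # -w sets how long to wait for each probe reply.
    return ["traceroute", "-n", "-m", str(max_hops),
            "-w", str(wait_seconds), address]


def capture_route(address, runner=subprocess.run):
    """Run traceroute once and return a timestamped text report."""
    result = runner(
        build_traceroute_cmd(address),
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode the output to str
        timeout=120,          # give up if traceroute itself hangs
    )
    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    return f"=== {stamp} {address} ===\n{result.stdout}{result.stderr}"


# Usage (placeholder address; substitute a real switch IP):
#
#     print(capture_route("192.0.2.10"))
```

Running it once while everything is up and once during the reboot gives two traces to compare: if the path to the switch changes or stops at a particular hop, that hop is the place to look.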

Is your mail server also some sort of gateway for your Nagios box? That could lead to such a problem.