I have a distributed Nagios installation with two pollers that report back to the central Nagios server through encrypted NSCA.
NSCA runs from xinetd, and the problem is that xinetd spawns a new nsca process for each new connection from the pollers and those processes never exit. Pretty soon I end up with thousands of processes, the server runs out of memory, the OOM killer starts killing random processes, and eventually the server hangs.
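For anyone who wants to watch the same pile-up on their own box, here is a rough sketch of how the lingering processes could be counted. It assumes a procps-style `ps` that accepts `-eo pid,stat,comm` and that the processes show up under the command name `nsca`; the function names `count_states` and `snapshot` are made up for this example and are not part of NSCA or Nagios.

```python
import subprocess
from collections import Counter


def count_states(ps_output, name="nsca"):
    """Count processes named `name`, grouped by the first ps state letter.

    Expects the text produced by `ps -eo pid,stat,comm` (header row included).
    Typical state letters: S = sleeping, R = running, D = uninterruptible,
    Z = zombie.
    """
    states = Counter()
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)         # pid, stat, rest of line (comm)
        if len(fields) != 3:
            continue
        _pid, stat, comm = fields
        # Zombies are listed as "nsca <defunct>", so compare the first word only.
        if comm.split()[0] == name:
            states[stat[0]] += 1
    return states


def snapshot(name="nsca"):
    """Run ps once and return the per-state counts for `name` (assumes procps ps)."""
    out = subprocess.run(
        ["ps", "-eo", "pid,stat,comm"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return count_states(out, name)
```

Calling `snapshot()` every minute or so and logging `sum(counts.values())` should show whether the count only ever grows. The state letters also hint at what is going on: mostly `Z` would suggest children that exited but were never reaped, while mostly `S` would suggest processes that are still waiting on a connection that never closes.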
This is my xinetd.d/nsca config from the central Nagios:
[root@nagios~]# cat /etc/xinetd.d/nsca
# default: on
# description: NSCA
flags = REUSE
socket_type = stream
wait = no
user = nagios
per_source = UNLIMITED
instances = UNLIMITED
group = nagcmd
server = /usr/local/nagios/sbin/nsca
server_args = -c /usr/local/nagios/etc/nsca.cfg --inetd
log_on_failure += USERID
disable = no
only_from = X.X.X.X Y.Y.Y.Y
I tried limiting instances to 100, but once that limit is reached, new connections get dropped.
I'm running the latest version of NSCA (2.7.2) with Nagios 3.2.0:
NSCA - Nagios Service Check Acceptor
Copyright (c) 2000-2007 Ethan Galstad (nagios.org)
Last Modified: 07-03-2007
License: GPL v2
Encryption Routines: AVAILABLE
TCP Wrappers Available
I found an old post (mail-archive.com/nagios-user ... 18055.html) that describes the same kind of issue, but I was hoping it had been fixed by now. Do other users with distributed monitoring setups see the same problem, and if so, what solution worked for you?
Thank you in advance.