Nagios Forking Error

Barley · June 24, 2008, 2:42pm

I am getting the following error every other week or so on my CentOS 5 server running Nagios version 3.0a4:

nagios: Warning: The check of host could not be performed due to a fork() error.

I wouldn’t mind the error, except that when it occurs it is written to the messages file thousands of times every second and fills up the hard drive on the server. Any assistance would be greatly appreciated.

Loose · June 25, 2008, 8:11am

Hi!

I’ve had this kind of error for two entirely different reasons:

On one server, we were reaching the maximum number of processes that a single user (nagios) can launch: the limit was set at 120 processes and we were above it, hence the error. The limit had to be increased on this server (I don’t know how to do that myself, so we had to ask for it to be increased :)).
The check_ping plugin was not compiled properly for this server: we had copied check_ping from another server that should have been exactly the same … In the end, the plugin was not working correctly, and we got this kind of error around 0.3% of the time. As we couldn’t compile on this server, I had to write a small .pl script that calls /usr/bin/ping and parses the result (which is what the check_ping plugin does anyway).
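For the first case, a rough way to see how close a user is to its process limit is to compare the per-user limit (RLIMIT_NPROC) with the number of processes that user currently owns. The sketch below is in Python and assumes Linux with /proc mounted; it has to be run as the user in question (for example through sudo -u nagios) so that the limit it reads is that user's. The helper names count_user_processes, headroom and report are made up for illustration.

```python
import os
import resource


def count_user_processes(uid, proc_root="/proc"):
    """Count processes owned by uid by looking at the numeric entries in /proc.

    Linux only. The kernel also counts threads against RLIMIT_NPROC,
    so this number is a lower bound on what the limit actually sees.
    """
    count = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            if os.stat(os.path.join(proc_root, entry)).st_uid == uid:
                count += 1
        except OSError:
            # The process exited between listdir() and stat(); ignore it.
            continue
    return count


def headroom(used, soft_limit):
    """Return how many more processes fit under the soft limit, or None if unlimited."""
    if soft_limit == resource.RLIM_INFINITY:
        return None
    return soft_limit - used


def report():
    """Print the current user's process limit and usage."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    used = count_user_processes(os.getuid())
    print("soft limit:", soft, "hard limit:", hard, "in use:", used)
    print("headroom:", headroom(used, soft))
```

Calling report() as the nagios user shows whether the headroom is close to zero or negative around the time the fork() errors appear. On many Linux systems the limit itself is set through pam_limits in /etc/security/limits.conf (the nproc item), but whether that applies depends on how Nagios is started, so treat it as a place to look rather than a guaranteed fix.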

I hope this will help you;
if you’re not in one of these cases, I can’t help you further.

Barley · June 25, 2008, 7:49pm

On further study, the error is caused by a memory leak in the Nagios process. It grows bigger and bigger until it runs out of both physical and virtual memory, at which point it gives the forking error and fills the hard drive with error messages.
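One way to confirm this kind of growth is to log the resident memory of the Nagios process at intervals and watch the trend. The sketch below is in Python and assumes Linux, where /proc/<pid>/status contains a VmRSS line in kB; the function names and the one-minute interval are arbitrary choices for illustration, not anything Nagios provides.

```python
import re
import time


def parse_vmrss_kb(status_text):
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status.

    Returns None when the line is missing (kernel threads have no VmRSS).
    """
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid):
    """Read the current resident set size of a process in kB (Linux only)."""
    with open("/proc/%d/status" % pid) as handle:
        return parse_vmrss_kb(handle.read())


def watch(pid, interval_seconds=60, samples=10):
    """Print a timestamped RSS reading every interval_seconds, samples times."""
    for _ in range(samples):
        print(time.strftime("%Y-%m-%d %H:%M:%S"), read_rss_kb(pid), "kB")
        time.sleep(interval_seconds)
```

Running something like watch(nagios_pid) over a few days, where nagios_pid is the PID of the main Nagios process, should show a steadily climbing figure if the process really is leaking.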

I could just set up a cron job to restart Nagios every night, but that seems like a duct tape fix. I might try running Nagios without embedded Perl modules. Any other suggestions?