Nagios Service not starting

Hello, I am having issues with the Nagios service not starting…the pre-flight check comes back with no errors. When I run ps -ef | grep nagios I get this:

root 3625 2978 0 14:39 pts/1 00:00:00 grep nagios

Also when I restart the service I get this:

Running configuration check…done
Stopping network monitor: nagios
/etc/init.d/nagios: line 68: kill: (3523) - No such process
Waiting for nagios to exit . done.
Starting network monitor: nagios

Is there some kind of error or permissions issue I am overlooking here?

Thanks for your help!

Okay…well, the only match in your ps -ef | grep nagios output is the grep command itself, which tells you that Nagios isn’t running, so you tell Nagios to restart. Because Nagios is not running at that point, the stop step returns ***“no such process”***. The line two lines down from that tells you that it’s starting Nagios…so unless you get an error message or something, Nagios should be running. Does the ps command still show that Nagios isn’t running even after you try a restart or start of Nagios?
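The lone "grep nagios" line in that ps output is grep matching its own command line. Here is a minimal sketch of a check that avoids this, assuming pgrep (from the procps package) is installed; the helper name is_running is made up for illustration.

```shell
#!/bin/sh
# is_running NAME: succeed if a process whose name is exactly NAME exists.
# pgrep never reports itself, so there is no stray "grep nagios" match.
is_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# Example: report on the nagios daemon.
if is_running nagios; then
    echo "nagios is running"
else
    echo "nagios is not running"
fi
```

If pgrep is not available, ps -ef | grep '[n]agios' is a common alternative, since the bracketed pattern does not match the grep command line itself.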

Your lock file for nagios still exists.

The lock file tells Nagios that it is already running and records its PID (process ID), which in your case is 3523.

What you should do is remove the file nagios.lock. On a standard install it is located at /usr/local/nagios/var/nagios.lock

Remove that and then start the program with /etc/init.d/nagios start and you should be golden.
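Before deleting the lock file, it may be worth confirming that the PID stored in it really is dead. A rough sketch, assuming the lock file holds the PID on its first line and the script is run as root (for a non-root user, a permission error from kill looks the same as a missing process); lock_is_stale is a hypothetical helper name.

```shell
#!/bin/sh
# lock_is_stale FILE: succeed if FILE exists but the PID stored in it
# does not belong to a live process.
lock_is_stale() {
    [ -f "$1" ] || return 1
    pid=$(head -n 1 "$1" | tr -d '[:space:]')
    case "$pid" in
        ''|*[!0-9]*) return 0 ;;   # empty or not a number: treat as stale
    esac
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    ! kill -0 "$pid" 2>/dev/null
}

# Default path as given above for a standard install; override with LOCK=...
LOCK=${LOCK:-/usr/local/nagios/var/nagios.lock}
if lock_is_stale "$LOCK"; then
    echo "stale lock file: $LOCK"
    # rm -f "$LOCK"   # uncomment to actually delete it
fi
```

The rm line is left commented out on purpose, so the script only reports until you decide to remove the file.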

Good luck!

P.S. Write back and tell me if it worked!
[ Edited Thu Dec 29 2005, 11:30PM ]

Ok, the lock file disappears when I stop the service…but why, when I run service nagios status, does it come back with:
PID TTY TIME CMD
and nothing under those?

I still am not sure if the service is really running or not.
And yes…when I stop and start the service I get the same result for the ps command, just with a different number in the second column after root.

Try this… stop the service, but instead of starting it again with /etc/init.d/nagios start, try running it directly like this:

/usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg

Sometimes the pre-flight check doesn’t catch everything, and running the program in the foreground instead of as a daemon may shed some light on the situation.

Ok cool… getting an error…

Error: Could not create external command file ‘/usr/local/nagios/var/rw/nagios.cmd’ as named pipe: (2) -> No such file or directory. If this file already exists and you are sure that another copy of Nagios is not running, you should delete this file.
Bailing out due to errors encountered while trying to initialize the external command file… (PID=4077)

Thanks!!

I also noticed that the rw directory doesn’t exist…maybe this is being pointed to by an external command directory location within a configuration file…any idea which one?

Ok, I noticed within the nagios.cfg file the entry is as follows…

command_file=/usr/local/nagios/var/rw/nagios.cmd

But that path doesn’t exist…neither does nagios.cmd… should I disable the external command files… the only thing I really want to work that is not out of the box is the EM01 temp/hum/illum sensor…
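A quick way to see where the config points and whether that directory is there, sketched under the assumption that nagios.cfg uses plain key=value lines; cfg_value is a made-up helper, and taking the last matching line is my choice here, not something the thread confirms about how Nagios reads duplicates.

```shell
#!/bin/sh
# cfg_value FILE KEY: print the value of the last uncommented KEY=VALUE line.
cfg_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Config path as used earlier in the thread; override with CFG=...
CFG=${CFG:-/usr/local/nagios/etc/nagios.cfg}
if [ -r "$CFG" ]; then
    cmd_file=$(cfg_value "$CFG" command_file)
    if [ -n "$cmd_file" ]; then
        cmd_dir=$(dirname "$cmd_file")
        if [ -d "$cmd_dir" ]; then
            echo "directory exists: $cmd_dir"
        else
            echo "directory missing: $cmd_dir"
        fi
    fi
fi
```

With the entry quoted above and no rw directory, this would report the directory as missing, which matches the "No such file or directory" part of the error.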

Again, thanks for the help

You can disable it if you want but it’s good that you got that error. I got the same thing when I used the program. If you want to allow external commands just type in the following two commands:

mkdir /usr/local/nagios/var/rw/
touch /usr/local/nagios/var/rw/nagios.cmd

It doesn’t matter that the file is completely blank. It just doesn’t know how to make the file itself; as long as it exists, it’ll write to it.

If you remove the line in the nagios.cfg about the command_file then it won’t accept external commands and your nagios will start up just the same as if you had done those previous two commands.

After that run it again the long way to make sure there are no more silly little errors, and then you can start it as a daemon with the /etc/init.d/nagios script.

Good luck!

It all works!! What I ended up doing was setting check_external_commands=0, because even with the created nagios.cmd file it still came back with the same error. I guess I can keep messing with that part until it works… I would like to use the external command functionality. Thanks so much for your help and persistence.
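One possible reason the touch approach did not help: the error text says Nagios wants to create the command file "as named pipe", and a file made with touch is a regular file, not a FIFO. Below is a hedged sketch that only prepares the directory and clears a regular file out of the way, on the assumption that Nagios then creates the pipe itself at startup. prepare_cmd_dir is a hypothetical helper, and the nagios:nagios owner in the usage comment is an assumption to check against nagios_user and nagios_group in nagios.cfg.

```shell
#!/bin/sh
# prepare_cmd_dir PATH: make sure the directory for the command file exists
# and that no regular file is sitting where the named pipe should go.
prepare_cmd_dir() {
    cmd_file=$1
    mkdir -p "$(dirname "$cmd_file")" || return 1
    # -f is true only for a regular file, so an existing FIFO is left alone.
    if [ -f "$cmd_file" ]; then
        rm -f "$cmd_file"
    fi
}

# Typical use, as root (path taken from the nagios.cfg entry in this thread;
# the owner is an assumption):
#   prepare_cmd_dir /usr/local/nagios/var/rw/nagios.cmd
#   chown nagios:nagios /usr/local/nagios/var/rw
```

After that, running Nagios in the foreground again should show whether the pipe gets created; test -p on the command file path tells you if it is a FIFO.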

Oh sure, got to justify getting paid at work somehow :smiley:

I hear ya… any chance you know much about iptables? This Nagios setup is my first Linux-based project, and the previous admin loved scripts, Linux, and not documenting. I, on the other hand, am more on the Cisco and NT side of the house as an MCSE… so I have had a nice learning curve for the past few months. What I am trying to do is make our firewall external-only and utilize the layer 3 switch for VLAN routing and DHCP relay. We have some Dell PowerConnect switches… I don’t like the fact that when the single box that handles iptables and the internal routing/DHCP relay goes down (and it has), all internal routing is hosed! Let me know if you can assist and maybe we can work something out…

Let’s take this into PMs so that we don’t have to keep bumping this solved thread to the top. :smiley: