CGI Hanging on comments, reschedule of check, etc


#1

The Nagios CGIs hang whenever I try to do anything that needs to write to the nagios.cmd file.

My setup is basically as follows:

The web front end is on a different server, which accesses the status files, configuration files, etc. via an NFS mount to my NFS server.

Nagios writes and reads everything to the NFS server as well. Everything else works great, no issues. The CGIs are able to read everything and output as expected. It is only when I try to add comments, reschedule checks, etc. that the CGI hangs, and eventually I get a “page cannot be displayed” error.

The Nagios server process runs as ‘nagios’, and the Apache process runs as ‘daemon’.

Here are the permissions set for the directories and files:

drwxr-xr-x 3 nagios nagios 512 Feb 4 14:37 var
drwxr-xr-x 2 nagios daemon 512 Feb 3 15:28 rw
prwxrwxr-x 1 nagios daemon 0 Feb 3 15:28 nagios.cmd

I’m sure it’s some simple mistake I am missing, but I have been banging my head over this one for a while.

Any help or suggestions will be greatly appreciated!


#2

According to the docs, your rw folder permissions are wrong.
Also, the group is wrong on the rw folder.
In the docs, they use nobody for the apache user, so substitute your user, daemon, in its place.
Please see the docs for further info.
nagios.sourceforge.net/docs/1_0/commandfile.html
The docs give a step by step, but if you can’t follow it then report back. Let us know how it went.


#3

Ok, I tried that again; I’ve gone through that doc a couple of times… The permissions might not have looked like what the doc wanted because I have been trying different settings.

So, what I did was I killed the directory and started over. Here is what it looks like now:

The RW directory:

drwxrws---  2 nagios  daemon     512 Feb  5 15:12 rw

The Actual File:
~~~~~~~~~~~~~~~~
prw-rw----  1 nagios  daemon    0 Feb  5 15:12 nagios.cmd


Still hangs...  I know that it can read the file because I am not getting the error message about not being able to access it.  

I went ahead and tried to manually set the permissions on the nagios.cmd file as well, but this did no good either.  

I appreciate the assistance.  Do you think this could be caused because it is going across an NFS mount?  But then again the nagios process has no issue writing to it, so I would think there would not be an issue there either.

Maybe looking in the wrong place, perhaps something on the Apache config?  I am running Apache/2.2.0. 

Here is what the config looks like for the vHOST:

<VirtualHost ********:80>

<Directory "/usr/www">
    Options Indexes FollowSymLinks

    AllowOverride None
    AuthName "Nagios Login"
    AuthType Basic
    AuthUserFile /usr/nagios/etc/htpasswd.users
    Require valid-user

    Order allow,deny
    Allow from all

</Directory>

<Directory "/usr/nagios/sbin">
   Options ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "Nagios Login"
    AuthType Basic
    AuthUserFile /usr/nagios/etc/htpasswd.users
    Require valid-user
</Directory>

    ScriptAlias /nagios/cgi-bin "/usr/nagios/sbin/"

    DocumentRoot /usr/www
    ServerName ********
</VirtualHost>


Thank you for your help, I really do appreciate it.  This one has had me puzzled for a little bit..

#4

Ok, so I found out what the problem is. Nagios uses the file “nagios.cmd” as a named pipe, and because that file sits on an NFS partition, the CGI hangs when it tries to write to it. As far as I know, you cannot write from one host to another host’s named pipe across NFS.
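
For anyone checking the same thing: a FIFO lives in the kernel of the host that opens it, and NFS only shares its name. A writer on the web server therefore never meets the reader on the Nagios server, and an open for writing simply blocks. Below is a minimal sketch for confirming what kind of file the command file is. The function name `describe_cmd_file` is made up for illustration and is not part of Nagios.

```shell
#!/bin/sh
# describe_cmd_file PATH: hypothetical helper, prints the type of the command file.
describe_cmd_file() {
    if [ -p "$1" ]; then
        echo "fifo"        # named pipe: reader and writer must be on the same host
    elif [ -f "$1" ]; then
        echo "regular"     # ordinary file: contents are shared over NFS
    elif [ -e "$1" ]; then
        echo "other"
    else
        echo "missing"
    fi
}
```

Running `describe_cmd_file` against the nagios.cmd path on the web server prints `fifo` when the file is a named pipe, which is the case that hangs.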

I have an idea for a workaround, will see if it works. It would be nice if I didn’t need to do this, but from my research it doesn’t look like I’ll be able to get around it. Unless there is something I am missing. :slight_smile:

Thanks


#5

[quote=“opey”]
The RW directory:

drwxrws---  2 nagios  daemon     512 Feb  5 15:12 rw
..  [/quote]



Still not right.
The docs state to make another group called nagiocmd.  You are to add the nagios and apache users to that group in /etc/group.  You will then chgrp the rw folder so that its owner is nagios and its group is nagiocmd.

drwxrwsr-x  2 nagios nagiocmd  4096 Feb  5 16:39 rw
grep nagiocmd /etc/group
nagiocmd:x:502:nagios,apache

#6

In your case, it would be
grep nagiocmd /etc/group
nagiocmd:x:502:nagios,daemon
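
Roughly, the steps come down to something like the sketch below. The group and user commands are left as comments because they need root and differ between systems (`groupadd` and `usermod` are the Linux tools; other systems have their own equivalents). The helper name `fix_rw_dir` and the example path are hypothetical, so treat this as an outline rather than the exact procedure from the docs.

```shell
#!/bin/sh
# As root, roughly (Linux-style tools, adjust for your OS):
#   groupadd nagiocmd
#   usermod -a -G nagiocmd nagios
#   usermod -a -G nagiocmd daemon
#   chown nagios:nagiocmd /path/to/nagios/var/rw

# fix_rw_dir DIR: hypothetical helper that sets the mode shown above
# (drwxrwsr-x). Owner and group get rwx, others get r-x, and the setgid
# bit makes new files in the directory inherit the directory's group.
fix_rw_dir() {
    chmod u+rwx,g+rwx,o+rx,o-w "$1" && chmod g+s "$1"
}
```

After running `fix_rw_dir` on the rw directory, `ls -ld` on it should show `drwxrwsr-x`, provided the user running it is allowed to set the setgid bit.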


#7

[quote=“jakkedup”]

Just for S&Gs I went ahead and did this, but it still does not work it hangs.

See my post about named pipes across NFS. That is why it hangs and then eventually times out. It has nothing to do with permissions, since the file is accessible and writable (or else we would get an error stating that the file is unwritable).

I am working on a “work around” for this, and will post what I did if it actually works. Again, I found the issue to be the fact that I am going across an NFS mount trying to write to a named pipe file (nagios.cmd).

Thanks a bunch


#8

Well, I got things working with a “workaround” or a hack, whatever you want to call it. :slight_smile:

Here is what I did. On the web server that runs the CGIs, I pointed them at a separate nagios.cfg file, which tells them to write to a different command file (note: a regular file, nagios-web.cmd) instead of nagios.cmd. That file lives on the NFS server, on a mountpoint that both the web server and the Nagios server can access. Set the appropriate permissions, etc.
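
To double-check which command file each side is really using, something like this sketch can be run against both config files. It assumes the usual `command_file=<path>` line in nagios.cfg with no spaces around the `=`. The function name `get_command_file` is made up for illustration.

```shell
#!/bin/sh
# get_command_file CFG: hypothetical helper, prints the value of the
# command_file directive in a nagios.cfg-style file. If the directive is
# repeated, the last one is printed; which one Nagios itself honors is
# not checked here.
get_command_file() {
    sed -n 's/^command_file=//p' "$1" | tail -n 1
}
```

On the web server this should print the path of nagios-web.cmd when run against the separate nagios.cfg, while on the Nagios server the normal nagios.cfg should still point at the real nagios.cmd pipe.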

On the nagios server itself, I used this script that I hacked together (mind you I am not much of a shell scripter, so if anyone has a better way of doing this I wouldn’t mind seeing it…:slight_smile: ):

#!/usr/local/bin/bash
# Filename nag-workaround
# change the paths and filename to whatever you created
if [ -e {path to filename set for nagios-web.cmd} ]; then
    while [ -e {path to filename set for nagios-web.cmd} ]
    do
        sleep 3
        # copy the last lines of the web command file into the real named pipe
        tail /nagios-web.cmd > {path to actual nagios.cmd file}/nagios.cmd
        #did this to clear the file to avoid infinite number of entries
        truncate -s -100 {path to filename set}/nagios-web.cmd
    done
fi

And of course I have that running in the background. Like so

./nag-workaround &

Doing it this way, things are working the way I’d like them to. Like I said, I am not much of a shell scripter, so there is definitely room for improvement on this method.

Anyone have any thoughts or suggestions? As far as I can tell this is the only way to get it to work.
[Edited Mon Feb 06 2006, 06:06PM]


#9

I don’t write scripts either, but I have to wonder now: if you set up 500 passive service checks coming in from remote systems, would they be able to write to the file correctly?


#10

Good point - fortunately, at this time there are not that many changes being made, although eventually there very well could be. I imagine with a better script it would not be an issue - but here is to crossing fingers that it keeps on working well! :?
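
For what it’s worth, a slightly more careful forwarding step might look like the sketch below. It only writes to the pipe when the spool file actually has content, sends the whole file rather than just its tail, and empties it only after the write succeeded. The function name `forward_once` and the paths in the commented loop are placeholders. It assumes the CGIs leave whole command lines in the regular file, and it still has a small window in which a command written between the copy and the truncation is lost, so it is not a real answer to the 500-passive-checks question.

```shell
#!/bin/sh
# forward_once SPOOL PIPE: hypothetical helper.
# Copies whatever has been written to the regular spool file into the
# real command pipe, then empties the spool so nothing is sent twice.
forward_once() {
    spool=$1
    pipe=$2
    # Skip this cycle when the spool is missing or empty.
    [ -s "$spool" ] || return 0
    # Send the whole spool; only clear it if the write succeeded.
    # Note: a command written between the cat and the truncation is lost.
    cat "$spool" > "$pipe" && : > "$spool"
}

# Example loop (commented out; the paths are placeholders):
# while :; do
#     forward_once /path/to/nagios-web.cmd /path/to/rw/nagios.cmd
#     sleep 3
# done
```

Keep in mind that the write to the pipe blocks while Nagios is not running, since nothing is reading from nagios.cmd at that point.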