Running NSCA and NRPE on server under XINETD

Pichi · June 2, 2008, 11:02am

Hello,

I have a working Nagios 3.0.1 server with the 1.4.12 plugins running on Debian Etch 4.02, and I have set up NRPE 2.12 for hosts in my local network. Everything is working fine.

Now I need to introduce alerts for remote machines (reached over the Internet, and behind NAT devices outside my control). So I looked around and found NSCA. This seemed like the right solution, but then I found this in the README for NSCA:

NOTE: If you run nsca under inetd or xinetd, the server_port
and allowed_hosts variables in the nrpe configuration file are
ignored.

On a test server that was set up with NRPE running under xinetd, I installed NSCA and ran it under xinetd as well, and BOOM, NRPE was no longer listening on port 5666. So any NRPE clients trying to send their data would not be able to. This would break my current setup.

Questions:

Is it a situation of all or nothing? I mean, can you use NSCA or NRPE under xinetd, but not both on the same server?
If you wanted to use both, how could you do it?

Thanks for any help you might be able to give.

Pete

MP1 · June 2, 2008, 9:51pm

NRPE binds to port 5666. NSCA binds to port 5667. You shouldn’t be having a problem if your xinetd files are set up properly.

In your /etc/xinetd.d/nrpe and /etc/xinetd.d/nsca you should have the ports defined as such. You should NOT be running the nsca or nrpe binaries manually (that’s daemon mode).

Here’s an example xinetd.d/nsca:
service nsca
{
flags = REUSE
type = UNLISTED
port = 5667
socket_type = stream
wait = no
user = nagios
group = nagios
server = /usr/sbin/nsca
server_args = -c /etc/nagios/nsca.cfg --inetd
log_on_failure += USERID
disable = no
only_from = 123.123.123.123 222.222.111.111
cps = 300 1
}

That will cause xinetd to spawn /usr/sbin/nsca -c /etc/nagios/nsca.cfg --inetd when it gets a connection on port 5667.

You can check whether your xinetd entries are correct by running $ netstat -pantu |grep 566 and making sure xinetd is assigned to ports 5666 and 5667. Restart xinetd after any changes, and check /var/log/secure and /var/log/messages for errors after restarting it.
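A minimal sketch of that check as a shell function, assuming netstat prints the usual Linux layout (local address, foreign address, state, PID/program name). The function name check_xinetd_ports is made up for illustration, and netstat generally shows the owning program only when run as root.

```shell
# Reads "netstat -pant" style output on stdin and prints one line per port,
# saying whether a LISTEN socket owned by xinetd was found for it.
# (check_xinetd_ports is a hypothetical helper, not part of Nagios or xinetd.)
check_xinetd_ports() {
    input=$(cat)
    for port in 5666 5667; do
        if printf '%s\n' "$input" | grep -q -E ":${port}[[:space:]].*LISTEN.*xinetd"; then
            echo "port $port: xinetd is listening"
        else
            echo "port $port: not found"
        fi
    done
}

# Example:
#   netstat -pant | check_xinetd_ports
```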

xinetd LISTENS for requests on certain ports, then runs the binary it has associated with that service. xinetd doesn’t SEND anything; it responds to connections on those ports. So, if you’re receiving nsca data on port 5667, you only need xinetd/nsca set up on the host receiving that data (i.e. your central nagios).

Pichi · June 3, 2008, 6:37am

MP,

Thanks for your post. You are absolutely correct. I decided to redo the test server, and this time I was able to get both NRPE and NSCA listening on their respective ports. They are both running under xinetd. The note in the README threw me, though, and when things didn’t work the first time I freaked.

However, when I began to think about the architecture of Nagios, I soon realized that I really didn’t need to be listening on 5666 (nrpe) on the main server, because NRPE PULLS from the clients. And like you said in your post, NSCA will passively listen on port 5667. It was a lightbulb moment.

Thanks for your help!

Pete

Pichi · June 3, 2008, 7:40am

OK I am back because I am quite lost to tell the truth.

This is what I have working:

The Nagios main server is listening on port 5667 for external clients to give it information.

So how the heck do I get my remote clients to report back to the main server? Believe me, I have been looking all over the Internet and in forums, and I can’t find any documentation on how to do this that is understandable, at least to me. A real how-to that explains all the pieces is what I am looking for. Most of what I see is how to send a test to the localhost. And yes, I have read the manuals, but I still don’t get it.

My setup is pretty straightforward: I have remote clients behind NAT routers (with dynamic IPs) that need to send their status to the Nagios server in my office.

Thanks again,

Pete

Pichi · June 3, 2008, 12:04pm

OK here is an update.

The server is set up and has passive checks configured with check_freshness and freshness_threshold, plus a script that tells me when the nsca clients have stopped sending their status for 30 minutes. This is working: Nagios is reporting that it has not received any reports from this host in 30 minutes, because:

When I run this from the command line on the remote client:

./send_nsca -H 211.211.33.9 -c /usr/local/nagios/etc/send_nsca.cfg < test

I get this error:

Error: Server closed connection before init packet was received
Error: Could not read init packet from server

Where 211.211.33.9 is the main Nagios server listening on port 5667. And the file test has:

www2 bar 0 0

www2 has a valid host.cfg, which includes the passive service I mentioned before.

In the file /etc/xinetd.d/nsca on the main server:
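For reference, send_nsca reads one result per line from standard input, with the host name, service description, return code and plugin output separated by TAB characters rather than spaces. A small sketch that writes such a line with explicit tabs; the helper name nsca_line is made up for illustration.

```shell
# Print one passive service check result in the layout send_nsca reads:
#   host name <TAB> service description <TAB> return code <TAB> plugin output
# (nsca_line is a hypothetical helper name.)
nsca_line() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Example: recreate the test file from this post with real tabs, then send it.
#   nsca_line www2 bar 0 "0" > test
#   ./send_nsca -H 211.211.33.9 -c /usr/local/nagios/etc/send_nsca.cfg < test
```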

# default: on
# description: NRPE (Nagios Remote Plugin Executor)

service nsca
{
flags = REUSE
socket_type = stream
wait = no
user = nagios
group = nagios
server = /usr/local/nagios/bin/nsca
server_args = -c /usr/local/nagios/etc/nsca.cfg --inetd
log_on_failure += USERID
disable = no
only_from = 99.99.99.1
only_from = 211.211.33.9
}

Where 99.99.99.1 is the public IP of the remote router and 211.211.33.9 is the public IP of the local firewall, included in case the daemon sees only the last-hop address as the connection source.

Also, the /usr/local/nagios/etc/nsca.cfg file on the main server lists the remote addresses of the server and router in the allowed_hosts= directive, but this should make no difference, since that file only configures the daemon, which is not going to be used on the remote host. All I want to use on the remote host is the send_nsca binary. Correct?

On the remote host, in the send_nsca.cfg file, I have a password set and 0 for encryption. On the main server, in the nsca.cfg file, I have the same password set and 0 for encryption.
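One way to rule out a mismatch between the two files is to compare them mechanically. The sketch below assumes both files have been copied to one machine and use plain key=value lines, with encryption_method in send_nsca.cfg and decryption_method in nsca.cfg. The helper names cfg_value and compare_nsca_cfg are made up for illustration.

```shell
# Print the last value assigned to KEY ($1) in a key=value file ($2).
# Commented-out lines do not match because they start with '#'.
cfg_value() {
    key=$1
    file=$2
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" | tail -n 1
}

# Compare a send_nsca.cfg ($1) with an nsca.cfg ($2).
# Prints what differs and returns 1 on any mismatch, 0 otherwise.
compare_nsca_cfg() {
    client=$1
    server=$2
    status=0
    if [ "$(cfg_value password "$client")" != "$(cfg_value password "$server")" ]; then
        echo "password differs"
        status=1
    fi
    if [ "$(cfg_value encryption_method "$client")" != "$(cfg_value decryption_method "$server")" ]; then
        echo "encryption_method and decryption_method differ"
        status=1
    fi
    return $status
}

# Example:
#   compare_nsca_cfg ./send_nsca.cfg ./nsca.cfg && echo "settings match"
```

A check like this also catches a letter O typed in place of the digit 0.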

I can see the first packets get to the main Nagios server but the connection gets dropped after that. This error message has been written about quite a bit but I have seen no answers … just frustration …which now includes me…

The boss is starting to ask questions…any ideas?

Many Thanks,

Pete

MP1 · June 5, 2008, 11:23pm

Hey Pete,

First off, I’d just remove the only_from entries in your nsca xinetd file for now (you can add them back later once you determine that’s not the problem). Remember to restart xinetd.

So, here’s the thing: it sounds like you’re thinking NSCA does the same thing as NRPE, except that it can traverse NATs and firewalls. Is this correct? You’re not finding any examples in the docs because it doesn’t work that way :).

NSCA is typically used in a DISTRIBUTED nagios setup. That means you’ve got your central main nagios, which handles sending out alerts, does some checks on the network it has access to, and does all your reporting and web interface stuff. Then you’ve got your distributed nagios servers. These are nagios boxes across the internet that have access to the servers behind the NAT/firewall and run service checks against all those hosts. The distributed box does your service checks, then uses NSCA to send the results back to its parent nagios server (which has a duplicate set of service definitions, except that they are marked as passive). All of the “distributed nagios” setup docs that involve NSCA describe how to set up something like this. Here’s a shoddy text diagram:

|Nag-Central|<---NSCA---|INTORWEBS/NAT/FIREWALL|<---NSCA---[Dist-Nag]<---NRPE--->Host1

And a picture:
nagios.sourceforge.net/docs/3_0/ … ibuted.png

Do you have any servers behind these NATted networks that you could install a distributed nagios on? If you don’t have access to the routers or firewall configs, you need to have a distributed nagios process (with access to the internet, back to your central nagios) with access to those hosts, running service checks for you. Any other solution would require firewall rules/routing rules to be modified so your central nagios can run nrpe against remote hosts.

This doc to set up distributed nagios is pretty good:
nagios.sourceforge.net/docs/3_0/distributed.html

Basically, you just enable your OCSP command on the distributed nagios box, and it runs /usr/bin/printf "properly-formatted-nsca-result" | send_nsca -H 211.211.33.9 -c /usr/local/nagios/etc/send_nsca.cfg after every single service check result it receives. NSCA on central receives that info and echoes it (properly formatted) into the nagios.cmd file on your central box. The nagios process watches nagios.cmd and processes anything written there like any other service check result (it just received it passively rather than running the check itself).
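A rough sketch of what such an OCSP wrapper script could look like, assuming Nagios passes it the host name, service description, state string and plugin output (for example through the $HOSTNAME$, $SERVICEDESC$, $SERVICESTATE$ and $SERVICEOUTPUT$ macros). The function names, the send_nsca binary path and the environment variable overrides are assumptions for illustration; mapping an unrecognized state to 3 (UNKNOWN) is a choice made here, not something taken from the thread.

```shell
# Map a Nagios service state string to its numeric return code.
# Anything unrecognized is reported as 3 (UNKNOWN); that fallback is an assumption.
state_to_code() {
    case "$1" in
        OK)       echo 0 ;;
        WARNING)  echo 1 ;;
        CRITICAL) echo 2 ;;
        *)        echo 3 ;;
    esac
}

# Format one result line: host, service, return code, plugin output (tab separated).
format_result() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$(state_to_code "$3")" "$4"
}

# Pipe one formatted result to send_nsca on the distributed box.
# The defaults below are assumptions based on the paths used in this thread.
submit_check_result() {
    send_bin=${SEND_NSCA:-/usr/local/nagios/bin/send_nsca}
    central=${CENTRAL:-211.211.33.9}
    send_cfg=${SEND_CFG:-/usr/local/nagios/etc/send_nsca.cfg}
    format_result "$1" "$2" "$3" "$4" | "$send_bin" -H "$central" -c "$send_cfg"
}

# Example, as the OCSP command would call it:
#   submit_check_result www2 bar OK "everything fine"
```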

Let me know if that makes things any clearer,

-MP