A few quick questions

schetle · January 17, 2008, 3:49pm

I’m not quite sure if this issue has been addressed in this forum, or elsewhere, but I haven’t found it.

I’ve got 3.0rc1 installed on an r5k debian box by pretty much following the ubuntu guide, with a few things changed to compensate. It was a clean install, and everything checked out, including a very basic config, which I double-checked against the online documentation. Permissions were originally an issue, but that is resolved and I can access all but one CGI (one which isn’t run as a CGI for some reason, probably some ‘make’ parameter I missed).

My primary question: generally speaking, how long does it take for nagios to actively check a host, and how exactly does that work? Nagios is installed, but it is not doing anything. The daemon is running (32 processes, specifically). I’ve done a tail -f on the nagios.cmd file while issuing commands through the web interface, and they are correctly written to nagios.cmd, but somewhere along the line those commands aren’t being seen by the nagios daemon. A poor attempt at solving this consisted of modifying the check intervals and the time nagios takes to check a host.

One hint that tells me nagiosd isn’t getting anything: I can write any number of comments about a host (localhost) and see them being written to the nagios.cmd file, but nothing is ever retained… status.dat shows no significant change in data. No settings I set through the interface are retained. The interface appears to be doing everything it should, but the daemon is completely ignoring it.

I get no errors in config validation, I can manually run the CGI executables/scripts over the command line. I don’t get a single error in the interface while doing anything. I’ll have to double check to make sure I don’t have SE Linux enabled.

I also attempted an install on a FreeBSD 6.2 box, but I ended up pausing that one for a minute to get this one working first.

Thanks for taking the time to help.

schetle · January 17, 2008, 3:53pm

Also, I wanted to know whether there is a verbose mode to see if the daemon IS getting anything. I want something more descriptive than just “[xx-xx-xxxx xx:xx:xx] Finished daemonizing… (New PID=xxxxx)”.

(For fun, I’ve been ignoring the nagios server for 2 days, and the localhost information is still “pending”; with everything running)

Loose · January 17, 2008, 4:04pm

Hi!

It seems you’re mixing up a few Nagios concepts ^^

Firstly, everything you are trying to do (I mean: everything that implies writing in the nagios.cmd file) is known as either:
external command
or passive check

which means that if you want those commands/checks to be read and understood, you have to set the variable
check_external_commands to 1
(then, restart nagios; not just a reload).
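For reference, here is a rough Python sketch of what a line written to nagios.cmd looks like. It assumes the documented external command layout ([epoch time] COMMAND;arg;arg) and uses ADD_HOST_COMMENT as the example; the helper names, the author name and the comment text are made up for illustration.

```python
import time


def format_external_command(name, args, timestamp=None):
    """Build one external command line in the form [epoch] NAME;arg1;arg2."""
    if timestamp is None:
        timestamp = int(time.time())
    fields = [name] + [str(a) for a in args]
    return "[{}] {}\n".format(timestamp, ";".join(fields))


def submit(command_file, line):
    """Write a command line to the Nagios command file (a named pipe).

    The daemon only reads the pipe when check_external_commands=1.
    Opening the pipe blocks until a reader (the daemon) has it open.
    """
    with open(command_file, "w") as pipe:
        pipe.write(line)


# ADD_HOST_COMMENT;<host_name>;<persistent>;<author>;<comment>
example = format_external_command(
    "ADD_HOST_COMMENT",
    ["localhost", 1, "schetle", "test comment"],
    timestamp=1200603316,  # fixed value so the result is reproducible
)
```

If lines like this show up in the pipe but the comment never appears in the interface, the daemon is most likely not reading the command file at all.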

As for host checks: by default, nagios does not check hosts on a regular schedule.
Instead, nagios checks the services on your hosts; if a service fails, nagios then checks the host.

By the way, just to make sure:
don’t forget to run nagios-dir/bin/nagios -v nagios-dir/etc/nagios.cfg every time you modify your config files, and reload nagios only if the check passes
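The verify-before-reload habit can be scripted. This is only a sketch: the /usr/local/nagios prefix stands in for nagios-dir and is an assumption, as is treating a non-zero exit status from the -v run as a failed check.

```python
import subprocess


def verify_command(nagios_dir):
    """Return the argv for running nagios -v against the main config file."""
    return [nagios_dir + "/bin/nagios", "-v", nagios_dir + "/etc/nagios.cfg"]


def config_is_valid(nagios_dir):
    """Run the pre-flight check; True when it exits with status 0 (assumed)."""
    result = subprocess.run(verify_command(nagios_dir))
    return result.returncode == 0


# Hypothetical install prefix; substitute your own nagios-dir.
argv = verify_command("/usr/local/nagios")
```

Calling config_is_valid("/usr/local/nagios") before every reload keeps a broken config from ever reaching the running daemon.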

I hope this will help you; don’t hesitate to ask other questions if you’re stuck somewhere

schetle · January 17, 2008, 4:14pm

check_external_commands is enabled (=1) in the configuration file.

I had also changed command_check_interval from “15s” to “-1”, and then tried “5s” as well.
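A quick way to confirm which values are actually in the file is to pull the directives out with a few lines of Python. This is a sketch only: the sample text is a made-up excerpt, not my real nagios.cfg, and the function name is arbitrary.

```python
def read_directives(text, wanted):
    """Collect name=value lines from nagios.cfg-style text.

    Blank lines and lines starting with '#' are skipped. If a directive
    appears more than once, the last occurrence wins.
    """
    found = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() in wanted:
            found[name.strip()] = value.strip()
    return found


# Made-up excerpt for illustration only.
sample = """
# external command settings
check_external_commands=1
command_check_interval=15s
command_file=/usr/local/nagios/var/rw/nagios.cmd
"""

settings = read_directives(
    sample, {"check_external_commands", "command_check_interval"}
)
```

Reading the real file with open(path).read() and passing it in place of sample shows the values the daemon would pick up at its next restart.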

I don’t suppose “Starting nagios: No directory, logging in with HOME=/
done.” has anything to do with it

schetle · January 17, 2008, 8:38pm

I fixed the “No directory” problem, but still nothing as far as I can tell.

Loose · January 18, 2008, 9:16am

Well; I don’t really have any idea of what the problem could be.

Maybe you could check these (from the nagios cgi):

the scheduling queue: is it showing anything? does it change when you refresh it after a few minutes?
the performance info: are any active/passive service checks being processed? same for hosts?
the event log: are there any errors logged?

I hope these indicators may help you find the cause of your problems;
post some extracts if you don’t know what to do with them

schetle · January 18, 2008, 3:55pm

The scheduling queue never changes. It stays the exact same once it is set at the initial start point.

Everything in the Performance Info CGI is null (or 0, or 0%, take your pick).

…and the only thing the event log shows for this instance (or for any nagios session I start) is:
[01-17-2008 14:55:16] Finished daemonizing… (New PID=2614)
[01-17-2008 14:55:16] LOG VERSION: 2.0
[01-17-2008 14:55:16] Local time is Thu Jan 17 14:55:16 CST 2008
[01-17-2008 14:55:16] Nagios 3.0rc1 starting… (PID=2613)

schetle · January 18, 2008, 5:56pm

I’m going to remove all traces of 3.0rc1, and try out 2.0 stable to see what I can get going.

schetle · January 18, 2008, 8:49pm

I’ve got her up and running. I put a shotgun to all the nagios files, then cleaned out my source trees and started from scratch. Now to configure my hosts and services

Thanks for your time, Loose.

schetle · January 19, 2008, 2:14am

It appears that I can run 2.0, but not 3.0rc1, on my setup. I’m curious to find out what changed between the two that might cause it to blow up.