Nagios Unable to Generate Availability Logs


#1

Dear all,

The Nagios network monitor at my site, which mainly monitors our leased lines, has lately stopped generating host log entries, so we are unable to determine the uptime and downtime (availability) of a particular link.

The host log entries show the following:

Event Start Time     Event End Time       Event Duration  Event/State Type   Event/State Information
04-19-2005 00:44:23  04-19-2005 10:42:53  0d 9h 58m 30s   HOST UP            PING OK - Packet loss = 0%, RTA = 26.53 ms
04-19-2005 10:42:53  04-19-2005 10:42:53  0d 0h 0m 0s     PROGRAM (RE)START  Program start
04-19-2005 10:42:53  04-19-2005 11:06:01  0d 0h 23m 8s    PROGRAM END        Abnormal program termination
04-19-2005 11:06:01  04-19-2005 11:06:01  0d 0h 0m 0s     PROGRAM (RE)START  Program start
04-19-2005 11:06:01  04-19-2005 11:25:19  0d 0h 19m 18s   PROGRAM END        Abnormal program termination
04-19-2005 11:25:19  04-19-2005 11:25:19  0d 0h 0m 0s     PROGRAM (RE)START  Program start
04-19-2005 11:25:19  04-20-2005 06:43:27  0d 19h 18m 8s   PROGRAM END        Abnormal program termination
04-20-2005 06:43:27  04-20-2005 06:43:27  0d 0h 0m 0s     PROGRAM (RE)START  Program start
04-20-2005 06:43:27  04-20-2005 16:17:42  0d 9h 34m 15s   PROGRAM END        Abnormal program termination
04-20-2005 16:17:42  04-20-2005 16:17:42  0d 0h 0m 0s     PROGRAM (RE)START  Program start
04-20-2005 16:17:42  04-21-2005 15:16:19  0d 22h 58m 37s  PROGRAM END        Abnormal program termination
04-21-2005 15:16:19  04-21-2005 15:16:19  0d 0h 0m 0s     PROGRAM (RE)START  Program start
04-21-2005 15:16:19  04-27-2005 10:16:16  5d 18h 59m 57s  PROGRAM END        Abnormal program termination
04-27-2005 10:16:16  04-27-2005 10:16:16  0d 0h 0m 0s     PROGRAM (RE)START  Program start
04-27-2005 10:16:16  04-27-2005 16:27:32  0d 6h 11m 16s   PROGRAM END        Abnormal program termination
04-27-2005 16:27:32  05-02-2005 11:32:35  4d 19h 5m 3s+   PROGRAM END        Abnormal program termination
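
To see how much of the period is covered by each state, the rows above can be totalled with a short script. This is only a rough sketch: it assumes the rows are pasted as plain text in the layout shown, and the list of state names in the pattern (in particular HOST DOWN and HOST UNREACHABLE, which do not appear in this log) is an assumption. The function name `seconds_by_state` is made up for the example.

```python
import re
from datetime import datetime

# One log row: start time, end time, duration, state type, free-text info.
# The state names listed here are an assumption; only HOST UP,
# PROGRAM (RE)START and PROGRAM END actually appear in the log above.
ROW = re.compile(
    r"^(\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})\s+"
    r"(\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})\s+"
    r"\d+d \d+h \d+m \d+s\+?\s+"
    r"(HOST UP|HOST DOWN|HOST UNREACHABLE|PROGRAM \(RE\)START|PROGRAM END)"
    r"\s+(.*)$"
)
TIME_FORMAT = "%m-%d-%Y %H:%M:%S"


def seconds_by_state(lines):
    """Sum (end - start) in seconds for each state type found in the rows."""
    totals = {}
    for line in lines:
        match = ROW.match(line.strip())
        if not match:
            continue  # header line or anything that is not a log row
        start = datetime.strptime(match.group(1), TIME_FORMAT)
        end = datetime.strptime(match.group(2), TIME_FORMAT)
        state = match.group(3)
        totals[state] = totals.get(state, 0.0) + (end - start).total_seconds()
    return totals


# First three rows of the log above, copied as-is.
sample = [
    "04-19-2005 00:44:23 04-19-2005 10:42:53 0d 9h 58m 30s HOST UP PING OK - Packet loss = 0%, RTA = 26.53 ms",
    "04-19-2005 10:42:53 04-19-2005 10:42:53 0d 0h 0m 0s PROGRAM (RE)START Program start",
    "04-19-2005 10:42:53 04-19-2005 11:06:01 0d 0h 23m 8s PROGRAM END Abnormal program termination",
]
totals = seconds_by_state(sample)
for state, seconds in sorted(totals.items()):
    print(state, seconds)
```

Run over the full log, this would show almost all of the time after 04-19-2005 10:42:53 falling under PROGRAM END rather than under a host state, which is why no availability figure can be worked out for the link.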

Any help on this would be highly appreciated.

Thanks



#2

Hi,
Does Nagios stop by itself, or did you restart it?

If it stops by itself, try to find out why. Perhaps it happens when it tries to check a particular service or host?

Use /etc/init.d/nagios stop to stop it, not kill.


#3

Hi Guano,
Thank you very much for your reply. I have done as you advised, and I will also look for other likely causes of the “Abnormal program termination” entries.

Thanks again. I really appreciate your help. :)

Have a nice day!


#4

Please post the output of the following command:
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg


#5

Hi J,

I am sorry I could not reply sooner; I was busy with other work. I did what Guano suggested. The program no longer terminates and restarts as frequently, but Nagios still isn't generating uptime/downtime logs.

Anyway, here is the output of the command you asked for:

Nagios 1.2
Copyright © 1999-2004 Ethan Galstad ([email protected])
Last Modified: 02-02-2004
License: GPL

Reading configuration data…

Running pre-flight check on configuration data…

Checking services…
Checked 14 services.
Checking hosts…
Checked 14 hosts.
Checking host groups…
Checked 3 host groups.
Checking contacts…
Checked 1 contacts.
Checking contact groups…
Checked 1 contact groups.
Checking service escalations…
Checked 0 service escalations.
Checking service escalations…
Checked 0 service escalations.
Checking host group escalations…
Checked 0 host group escalations.
Checking service dependencies…
Checked 0 service dependencies.
Checking host escalations…
Checked 0 host escalations.
Checking host dependencies…
Checked 0 host dependencies.
Checking commands…
Checked 23 commands.
Checking time periods…
Checked 4 time periods.
Checking for circular paths between hosts…
Checking for circular service execution dependencies…
Checking global event handlers…
Checking obsessive compulsive service processor command…
Checking misc settings…

Total Warnings: 0
Total Errors: 0

I am not very well versed in Linux, so please pardon any stupidity or idiosyncrasy on my part. :)

Thanks so much.


#6

Perhaps nagios isn’t even making any checks. Are active checks enabled?
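
One way to check is to look at the main config file. Below is a rough sketch that reads `name=value` lines and reports whether a given option is set to 1. I am assuming the relevant option is `execute_service_checks`; please verify the name against the documentation for your Nagios version. The helper names are made up for the example.

```python
def read_directives(text):
    """Parse 'name=value' lines, skipping blank lines and '#' comments.

    If a name occurs more than once (e.g. cfg_file), the last value wins,
    which is good enough for looking up a single on/off option.
    """
    directives = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        directives[name.strip()] = value.strip()
    return directives


def active_checks_enabled(text, option="execute_service_checks"):
    """Return True/False for the option, or None if it is not set at all.

    The default option name is an assumption, not taken from this thread.
    """
    value = read_directives(text).get(option)
    if value is None:
        return None
    return value == "1"


# Made-up config fragment, only to show the call.
sample_cfg = """
# sample only
execute_service_checks=0
"""
print(active_checks_enabled(sample_cfg))

# On the real machine the text would come from the config file, e.g.:
#   with open("/usr/local/nagios/etc/nagios.cfg") as handle:
#       print(active_checks_enabled(handle.read()))
```

A result of False or None would be worth following up before looking anywhere else.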


#7

Here are some other things to try; I hope they help:

  • Check how many Nagios processes are running: "ps -e | grep nagios"
    There should be only one.

  • Check syslog: "more /var/log/messages"
    Perhaps Nagios has logged why it stopped this way.
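
If you prefer to script the two checks, here is a minimal sketch. It assumes `ps -e` prints the usual four columns (PID, TTY, TIME, CMD) and that syslog lines mention "nagios" somewhere in the text. The sample data is made up for illustration and the function names are hypothetical.

```python
def count_processes(ps_output, name="nagios"):
    """Count rows of `ps -e` output whose CMD column is exactly `name`.

    Assumes the four-column layout PID TTY TIME CMD.
    """
    count = 0
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[3] == name:
            count += 1
    return count


def lines_mentioning(log_lines, needle="nagios"):
    """Return the log lines that contain `needle`, ignoring case."""
    return [line.rstrip("\n") for line in log_lines if needle in line.lower()]


# Made-up sample data, only to show the calls.
sample_ps = """  PID TTY          TIME CMD
    1 ?        00:00:01 init
 2041 ?        00:00:07 nagios
 2310 ?        00:00:00 sshd
"""
sample_log = [
    "May  2 11:30:00 host sshd[2310]: (made-up line)\n",
    "May  2 11:32:35 host nagios: (made-up line)\n",
]
print(count_processes(sample_ps))
print(lines_mentioning(sample_log))

# On the real machine the inputs would come from the system, e.g.:
#   import subprocess
#   ps_text = subprocess.run(["ps", "-e"], capture_output=True, text=True).stdout
#   with open("/var/log/messages") as handle:
#       matches = lines_mentioning(handle)
```

A count other than 1 while no checks are in progress would suggest a stale or duplicate daemon.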

Good luck