Errors in Eventlog check - Persistence


#1

Hi there,

I have the following issue with my Nagios check for Windows event log errors: old errors persist and keep being sent by email on a regular basis.

As an example, I still receive notifications for a backup error that occurred one month ago.

Is there any way to control this?

Many thanks for your help,
Aurore.


#2

Possibly a stuck nagios process…
/etc/init.d/nagios stop
ps -ef | grep nagios
(kill any surviving nagios processes)
/etc/init.d/nagios start

Then see if the problem persists.

(This is usually due to a restart of nagios from the web interface)

Luca


#3

Hi,
Thanks for your answer.
Unfortunately, it does not seem to clear all my issues.
I have an event log check on a server:

define service {
host_name jungfrau
service_description eventlog error
check_command check_eventlog!>6m!warning
name eventlog error
max_check_attempts 1
normal_check_interval 5
retry_check_interval 1
check_period 24x7
notification_interval 240
notification_period 24x7
notification_options c,r
contact_groups sql-admins
}

where

define command {
command_name check_eventlog
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -p 5666 -c checkeventlog -a file=system file=application MaxWarn=1 MaxCrit=1 descriptions filter-generated=$ARG1$ filter-eventType=>$ARG2$ filter=out filter=any truncate=1000
}
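
To see what the agent is actually asked, it can help to expand the macros by hand. The sketch below is only an illustration in Python, not part of Nagios: `expand_check_command` is a hypothetical helper, it ignores escaped `\!` separators, and it leaves $USER1$ and $HOSTADDRESS$ untouched. The command_line is copied from the definition above with its line wrap joined.

```python
def expand_check_command(check_command: str, command_line: str) -> str:
    """Substitute $ARGn$ macros from the '!'-separated check_command fields."""
    # The first field is the command name; the rest become $ARG1$, $ARG2$, ...
    _name, *args = check_command.split("!")
    expanded = command_line
    for index, value in enumerate(args, start=1):
        expanded = expanded.replace(f"$ARG{index}$", value)
    return expanded


command_line = (
    "$USER1$/check_nrpe -H $HOSTADDRESS$ -p 5666 -c checkeventlog -a "
    "file=system file=application MaxWarn=1 MaxCrit=1 descriptions "
    "filter-generated=$ARG1$ filter-eventType=>$ARG2$ "
    "filter=out filter=any truncate=1000"
)

expanded = expand_check_command("check_eventlog!>6m!warning", command_line)
print(expanded)
```

With the service above, the agent receives filter-generated=>6m and filter-eventType=>warning. How the agent interprets those two filters together with filter=out and filter=any is worth checking against its documentation.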

But I regularly receive notification messages such as:

2006-02-23 17:34:32 2006-02-23 17:39:32 0d 0h 5m 0s SERVICE CRITICAL MSSQLSERVER(error)Error: 9002, Severity: 17, State: 6

whereas there is NO corresponding message in the event log of that machine.

Can you help again?


#4

Could the fact that my server is clustered be the cause of my problems?


#5

Could it be that the error is from another machine in the cluster?
It must be in some logfile…


#6

The cluster event log receives error messages from the two nodes and from the referee.

The errors do exist, but with old dates! They were generated on 19.02.2006 (last weekend), on the SQL Server clustered instance…


#8

The problem appears to be the plugin performing the check. If the plugin is not written correctly, it searches the ENTIRE event log every time the check runs, so it finds the same old error on every run and sends an alert each time. The only ways to fix that would be to find a different plugin, or to save and clear the event log daily, or something like that.
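
As a rough illustration of the difference, here is a minimal Python sketch of a check that only counts errors generated inside a recent time window instead of rescanning the whole log. The records, the field names and the `recent_errors` helper are all made up for the example; this is not the plugin's code. The 6 minute window is an assumption that mirrors the >6m argument of the service from post #3.

```python
from datetime import datetime, timedelta


def recent_errors(records, now, window):
    """Return only error records generated within `window` before `now`."""
    cutoff = now - window
    return [
        record
        for record in records
        if record["type"] == "error" and record["generated"] >= cutoff
    ]


# Hypothetical records: an old error from 19.02.2006, a fresh error,
# and a fresh informational entry.
records = [
    {"generated": datetime(2006, 2, 19, 3, 0), "type": "error"},
    {"generated": datetime(2006, 2, 23, 17, 32), "type": "error"},
    {"generated": datetime(2006, 2, 23, 17, 33), "type": "information"},
]

now = datetime(2006, 2, 23, 17, 34, 32)
alerts = recent_errors(records, now, timedelta(minutes=6))
print(len(alerts))  # prints 1: only the fresh error is counted
```

A check without the cutoff would count the error from 19.02.2006 on every run, which matches the repeated notifications described in this thread. Whether the plugin in use can be made to behave like this depends on its filter options and is not confirmed here.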