Having Nagios monitoring Microsoft SQL Server 200x?


#1

Hi,
we are considering to install a new monitoring system to our almost 100% purely based Microsoft environment.

I would lika to have Nagios as the new monitoring system for several reasons (one is that I am somewhat familiar with it) althoug I am not sure because we have many Microsoft SQL Server Databases (2000 and 2005) so I am wondering if you can monitor jobs via Nagios?

Short explenation:
We have some jobs that are running between 5-60 minutes, sometimes 120 minutes, but if they exceed 180 minutes I want an alarm on this to my mail?

Is this possible?

I found this page on Nagios homepage:
www.nagios.org/faqs/viewfaq.php.?faq_id=156

But it is very out-to-date so if you have some newer information about MSSQL-monitoring by Nagios , please let me know =)

Thanks in advance,
KFluffie


#2

Does no one have any idea regarding this? =)


#3

Hi KFluffie

i’ve been using scripts from this page : http://www.sqlservercentral.com/articles/Administering/sevenmonitoringscripts/1024/

and wrapping small shell scripts around them that i call from nagios.

here’s an example of one i use for checking for failed jobs;

[code]#!/bin/bash

check_mssql_jobs.sh

version date

1.0 18 Sep 2007 k release

if ! $1 ]
then
echo "Usage: $0 server [jobname]"
exit 3
fi

SERVER=echo $1 | awk 'BEGIN{FS="."}{print $1}'
SP="msdb.dbo.sp_help_job"
if $2 ]
then
jobname=echo $2 | awk 'BEGIN{FS="."}{print $1}'
SP="msdb.dbo.sp_help_job @job_name = ‘$jobname’"
fi

SQL=“select name, run_status, substring(cast(run_date as varchar(8)),7,2) + ‘/’ + substring(cast(run_date as varchar(8)),5,2) + ‘/’ +
substring(cast(run_date as varchar(8)),1,4) + ’ ’ + case len(cast(run_time as varchar(6))) when 6 then left(cast(run_time as
varchar(6)),2) + ‘:’ + substring(cast(run_time as varchar(6)),3,2) when 5 then left(cast(run_time as varchar(6)),1) + ‘:’ +
substring(cast(run_time as varchar(6)),2,2) else ‘00:00’ end as ‘RunTime’ from msdb.dbo.sysjobhistory sjh join msdb.dbo.sysjobs sjv
on sjh.job_id = sjv.job_id where instance_id in (select top 1 instance_id from msdb.dbo.sysjobhistory sjh2 where step_name = ‘(Job
outcome)’ and sjh.job_id = sjh2.job_id order by instance_id desc) and run_status = 1 order by run_status, instance_id DESC”

export USER=xxxxxx
export PASS=xxxxxx
export OUTFILE="/tmp/check_mssql_jobs_$1.tmp"

/usr/local/bin/tsql -H $SERVER -p 1433 -U $USER -P $PASS > $OUTFILE << EOF
set nocount on
go
$SQL
go
exit
EOF

grep “Login failed” $OUTFILE >> /dev/null
if $? -eq 0 ]
then
echo "CRITICAL - LOGIN FAILED!! $SERVER DOWN??"
rm -f $OUTFILE
exit 2
fi
JOB=awk '{FS="\t"}{print $1}' $OUTFILE | sed -n '4p'

mv -f $OUTFILE $OUTFILE.bak

if $JOB ]
then
echo "$SERVER Sql Job(s) Failed - $JOB"
exit 2
else
echo "$SERVER Sql Jobs all OK"
exit 0
fi[/code]

not the most elegant solution i admit, but it does the job :slight_smile:


#4

Hi KS

Could you explain me why are you using the following line on your script?

JOB=awk '{FS="\t"}{print $1}' $OUTFILE | sed -n '4p'

Regards

Victor