Problem getting all performance data: Nagios - nagiostat


#1

Hello,

I’ve been searching for a while without any success :frowning:

The Nagios server is version 1.2 and the nagiostat plugin is version 1.0.

Here are extracts of the configuration files:

hostgroups.cfg

define hostgroup {
hostgroup_name                 cc-vcscluster
alias                          CC VCS Servers
contact_groups                 cc-admins
members                        svrccvcs10,svrccvcs11,svrccvcs12,svrccvcs13,svrccvcs01,svrccvcs02,svrccvcs03,svrccvcs04,svrccvcs05,svrccvcs14,svrccvcs15,svrccvcs16,svrccvcs17,ccaselsr01,ccaselsr02,ccasergy01,ccasergy03
}

hosts.cfg

define host {
use                            generic-lsf
host_name                      svrccvcs01
alias                          svrccvcs01
address                        svrccvcs01.tif.ti.com
}
define host {
use                            generic-lsf
host_name                      svrccvcs10
alias                          svrccvcs10
address                        svrccvcs10.tif.ti.com
}

services.cfg

define service {
use                            unix_24_7
hostgroup_name                 cc-vcscluster,cc-vobservers-dz
service_description            CPU load
check_command                  check_nrpe_ssl!check_load!6,5,4!12,11,10
contact_groups                 cc-admins
notification_options           u,c
process_perf_data              1
}

nagiostat.conf

InsertValue     svrccvcs10-cpuload5.rrd cpuload_5min            /svrccvcs10/    /CPU\\sload/                    cpu_load_out
InsertValue     svrccvcs01-cpuload5.rrd cpuload_5min            /svrccvcs01/    /CPU\\sload/                    cpu_load_out
Graph           svrccvcs10-cpu          svrccvcs10-cpuload5.rrd std_1year               cpu_load        default.html    "CPU load svrccvcs10"
Graph           svrccvcs01-cpu          svrccvcs01-cpuload5.rrd std_1year               cpu_load        default.html    "CPU load svrccvcs01"

Both servers are monitored (I can see the latest data polled for each). However, perf data reaches nagiostat only for svrccvcs01 and never for svrccvcs10. I checked this in the debug.log file, with DEBUGLEVEL set to 3 in nagiostat.
For example:

Wed Feb  8 09:39:15 2006
**INCOMING PERFDATA:
  LASTCHECK=1139387949
  HOSTNAME=svrccvcs01
  SERVICEDESCR="CPU load"
  SERVICESTATE="OK"
  OUTPUT="OK - load average: 0.59, 0.62, 0.64"
  PERFDATA=""
 +VALUE: 0.64
 =INSERT into 'svrccvcs01-cpuload5.rrd': 0.64 DSA-names=load
 !RRDCMDLINE: /opt/rrd/1.2.11/bin/rrdtool update /opt/nagios/nagiostat/archives/svrccvcs01-cpuload5.rrd --template load 1139387949:0.64

I tried lots of things in the config file (changing the order of the entries, that kind of thing), with no improvement.

If you have any ideas, they would be very welcome.


#2

Double check your debug output. You may have one output stepping on the other.
For example:
HOSTNAME=svrccvcs10
SERVICEDESCR="CPU load"
=INSERT into 'svrccvcs01-cpuload5.rrd'
See how the one host inserted to the WRONG rrd file?
Perhaps it is a regex problem. I had trouble with that sort of thing myself and had to change my service descriptions, even though I could see nothing wrong with the regex.
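One way to rule out the regex theory is to replay the two InsertValue rules from the first post against a few host names. The sketch below is only an approximation: it assumes nagiostat applies the host and service patterns as unanchored searches, and that the doubled backslash in the posted config is an escaping artifact for `CPU\sload`. The host `svrccvcs100` is hypothetical and is there only to show how an unanchored pattern can catch a longer name.

```python
import re

# Rules copied from the nagiostat.conf extract: (rrd file, host regex, service regex).
# Assumption: the intended service pattern is CPU\sload (single backslash).
RULES = [
    ("svrccvcs10-cpuload5.rrd", r"svrccvcs10", r"CPU\sload"),
    ("svrccvcs01-cpuload5.rrd", r"svrccvcs01", r"CPU\sload"),
]


def matching_rrds(hostname, service):
    """Return every RRD file whose host and service patterns both match.

    Assumption: patterns are unanchored, so re.search is used here.
    """
    return [
        rrd
        for rrd, host_re, svc_re in RULES
        if re.search(host_re, hostname) and re.search(svc_re, service)
    ]


# svrccvcs100 is a made-up host used to illustrate an overlapping match.
for host in ("svrccvcs01", "svrccvcs10", "svrccvcs100"):
    print(host, "->", matching_rrds(host, "CPU load"))
```

Under these assumptions the two real hosts each match only their own rule, so the patterns as posted should not step on each other. The hypothetical `svrccvcs100` would land in the svrccvcs10 file, which is the kind of overlap that anchoring the host pattern (for example `^svrccvcs10$`) would avoid.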


#3

For example, maybe change
HOSTNAME=svrccvcs10
SERVICEDESCR="CPU load"
to SERVICEDESCR="Csvrccvcs10-CPU load"


#4

Hello jakkedup,
I grepped debug.log for svrccvcs10, and nagiostat has definitely received no data for that host. :frowning:
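A slightly more thorough check than grep is to summarise the log per host, which shows both whether a host ever appears and which RRD file its values went to. This is a rough sketch that assumes debug.log entries look like the extract in the first post; the `summarize` helper and the inline `SAMPLE` text are illustrative, not part of nagiostat.

```python
import re


def summarize(log_text):
    """Map each HOSTNAME seen in the log to the RRD files it was inserted into."""
    targets = {}
    host = None
    for raw in log_text.splitlines():
        line = raw.strip()
        m = re.match(r"HOSTNAME=(\S+)", line)
        if m:
            host = m.group(1)
            targets.setdefault(host, [])  # record the host even if no insert follows
            continue
        m = re.match(r"=INSERT into '([^']+)'", line)
        if m and host is not None:
            targets[host].append(m.group(1))
    return targets


# Sample taken from the debug.log extract in the first post.
SAMPLE = """\
Wed Feb  8 09:39:15 2006
**INCOMING PERFDATA:
  LASTCHECK=1139387949
  HOSTNAME=svrccvcs01
  SERVICEDESCR="CPU load"
  SERVICESTATE="OK"
  OUTPUT="OK - load average: 0.59, 0.62, 0.64"
  PERFDATA=""
 +VALUE: 0.64
 =INSERT into 'svrccvcs01-cpuload5.rrd': 0.64 DSA-names=load
"""

summary = summarize(SAMPLE)
for wanted in ("svrccvcs01", "svrccvcs10"):
    print(wanted, "->", summary.get(wanted, "no perfdata blocks found"))
```

To run it against the real file, replace `SAMPLE` with the contents of debug.log. A host that is missing from the summary never reached nagiostat at all, which points at the Nagios side rather than at the nagiostat.conf patterns.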


#5

I have seen cases where I had to edit the MySQL database manually to force process_perf_data 1 for a particular service, even though it was already set in services.cfg for that service check.