Dear all,
I’m new to configuring Nagios.
First, I have a Nagios server called RH1, which monitors my local HP servers and switches via ping and SNMP.
Then I decided to set up another Nagios server as the master and make RH1 a remote Nagios host reached through NRPE.
Let’s say that:
Master Nagios IP is 1.1.1.1 (public IP)
Remote Nagios (RH1) IPs are 2.2.2.2 (public IP) and 3.3.3.3 (private IP)
HP server, which is connected to the remote Nagios, is 3.3.3.10 (private IP)
I configured NRPE on both the master and remote servers under xinetd.
From the master Nagios I can run direct checks for services on the remote Nagios (check_disk, check_swap, etc.) through NRPE,
but I can’t run an indirect ping check for the HP server, which is connected to the remote host RH1. It returns:
CRITICAL > CHECK_NRPE: Socket timeout after 10 seconds.
My configuration is:
Remote Nagios:
- I added the IP of the master Nagios (1.1.1.1) to /etc/xinetd.d/nrpe
- I added a 5666/tcp entry for the nrpe daemon to /etc/services
- I added a ping command to /usr/local/nagios/etc/nrpe.cfg as follows:
command[check_ping]=/usr/local/nagios/libexec/check_ping -H 2.2.2.2 -w 3000.0,80% -c 5000.0,100% -p 5
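One thing worth checking here: the check_ping entry above targets 2.2.2.2, which is RH1’s own public address, so it would not test the HP server. A minimal sketch of an nrpe.cfg entry that pings the HP server from RH1 instead, assuming that is the intent. The command name check_ping_hp is hypothetical, and the thresholds are simply copied from the PING service definition on the master:

```cfg
# nrpe.cfg on RH1 (sketch only; check_ping_hp is a hypothetical name)
# Pings the HP server's private address from RH1 instead of RH1 itself.
command[check_ping_hp]=/usr/local/nagios/libexec/check_ping -H 3.3.3.10 -w 200.0,20% -c 600.0,60% -p 5
```

With the thresholds fixed in nrpe.cfg like this, the master only needs to pass the command name to check_nrpe.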
Master Nagios:
- I added a command definition for the check_nrpe plugin to /usr/local/nagios/etc/commands.cfg
- I created host and service definitions as follows:
define host{
name RH1
use generic-host
check_period 24x7
check_interval 5
retry_interval 1
max_check_attempts 10
check_command check-host-alive
notification_period 24x7
notification_interval 30
notification_options d,r
contact_groups admins
register 0
}
##################################################################################################
define host{
use RH1
host_name RH1
address 2.2.2.2
}
###################################################################################################
define hostgroup{
hostgroup_name MediaCity
alias RH1
members RH1
}
###################################################################################################
define service{
use generic-service
host_name RH1
service_description CPU Load
check_command check_nrpe!check_load
}
define service{
use generic-service
host_name RH1
service_description Current Users
check_command check_nrpe!check_users
}
define service{
use generic-service
host_name RH1
service_description /dev/mapper/VolGroup00-LogVol00 Free Space
check_command check_nrpe!check_sda1
}
define service{
use generic-service
host_name RH1
service_description Total Processes
check_command check_nrpe!check_total_procs
}
#############################################################
HP Servers
#############################################################
define host{
use generic-switch
host_name HP server
address 3.3.3.10
hostgroups MediaCity
}
#############################################
define service{
use generic-service
host_name HP server
service_description PING
check_command check_nrpe!check_ping!200.0,20%!600.0,60%
normal_check_interval 5
retry_check_interval 1
}
#############################################
‘check_nrpe’ command definition
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
##############################################
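A possible explanation, offered as a guess rather than a confirmed diagnosis: for the HP server, $HOSTADDRESS$ expands to 3.3.3.10, so check_nrpe tries to open an NRPE connection to that private address directly from the master instead of going through RH1, which would fit the socket timeout. The extra values after check_ping in the PING service are also unused, because the command_line only references $ARG1$. A minimal sketch of an indirect check, assuming RH1’s nrpe.cfg defines a hypothetical command check_ping_hp that pings 3.3.3.10; the command name check_nrpe_via_rh1 is also hypothetical:

```cfg
# Sketch only: check_nrpe_via_rh1 and check_ping_hp are hypothetical names.
# The NRPE connection goes to RH1's public address, not to $HOSTADDRESS$.
define command{
command_name check_nrpe_via_rh1
command_line $USER1$/check_nrpe -H 2.2.2.2 -c $ARG1$
}

# RH1 runs the ping against 3.3.3.10 and returns the result to the master.
define service{
use generic-service
host_name HP server
service_description PING
check_command check_nrpe_via_rh1!check_ping_hp
normal_check_interval 5
retry_check_interval 1
}
```

If the generic-switch template uses check-host-alive, as it does in the sample configuration, the host check for the HP server would probably need the same treatment, since the master cannot ping 3.3.3.10 directly.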
Could you please help me find out why I can’t monitor the HP server from the master Nagios, and whether there are any faults in my configuration? Feel free to modify anything you see fit, as I’m not an expert.