Handling SNMP Traps on a single machine


#1

I need to get Nagios to process SNMP traps that arrive on the same machine that is running Nagios. All the docs and forum posts discuss doing this across multiple machines. I have tried to configure my server with nsca/send_nsca to work within itself. Is that the right way to go? If so, what do I need in submit_check_result? In this config, it seems to be used for two purposes: first to submit trap data to Nagios, and second to check for the trap data.

So far, I have been able to echo data into send_nsca and that appears in nagios.cmd.

If anyone is running Nagios this way, can you please post what you are using in checkcommands and the service definition?

Regards,
Casey


#2

I now have Nagios configured to submit passive checks w/o nsca since everything is on the same machine. Currently, snmptrapd will execute a specific script per OID.

e.g. /etc/snmp/snmptrapd.conf:
traphandle .1.3.6.1.4.1.311.1.13.1.23.83.101.114.118.105.99.101.32.67.111.110.116.114.111.108.32.77.97.110.97.103.101.114.0.1073748859 /usr/lib/nagios/plugins/eventhandlers/submit_check_result_w32time

submit_check_result_w32time looks like this:
#!/bin/sh
echocmd="/bin/echo"

# snmptrapd passes the trap on stdin, one item per line
read host
read ip
read OID

CommandFile="/var/log/nagios/rw/nagios.cmd"

# get the current date/time in seconds since UNIX epoch
datetime=`date +%s`

# create the command line to add to the command file
cmdline="[$datetime] PROCESS_SERVICE_CHECK_RESULT;$host;w32_time;1;\"W32Time Event\""

# append the command to the end of the command file
$echocmd "$cmdline" >> $CommandFile

/var/log/nagios/rw/nagios.cmd contains this after snmptrapd matches the oid:
[1141163054] PROCESS_SERVICE_CHECK_RESULT;fqhost.domain.com;w32_time;1;"W32Time Event"

That said, how do I get Nagios to check the command file? I don’t have any of the check commands mentioned in other posts. Is there a common check command that I should use in the service definition, or are we left to write our own?

Regards,
Casey


#3

First, you need to enable nagios to check it.
See the docs please.
nagios.sourceforge.net/docs/2_0/ … l_commands
Then you need to make sure the directory has the proper permissions.
Again, see the docs.
nagios.sourceforge.net/docs/2_0/commandfile.html
Then the command submitted must be of the proper format.
nagios.org/developerinfo/ext … ndlist.php
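
The first two points can be checked from a shell before submitting anything. The following is only a rough sketch, not something shipped with Nagios: the function name check_command_file is made up for illustration, and it assumes nagios.cfg uses plain key=value lines like the ones quoted in this thread. It relies on the fact that Nagios creates the external command file as a named pipe when it starts.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): verify that external commands
# are enabled in nagios.cfg and that the command file is a writable pipe.
# usage: check_command_file /etc/nagios/nagios.cfg
check_command_file() {
    cfg=$1

    # External command processing must be switched on.
    grep -q '^check_external_commands=1$' "$cfg" || {
        echo "check_external_commands is not set to 1 in $cfg" >&2
        return 1
    }

    # Take the first command_file= entry from the config.
    cmdfile=$(sed -n 's/^command_file=//p' "$cfg" | head -n 1)
    [ -n "$cmdfile" ] || { echo "no command_file in $cfg" >&2; return 1; }

    # The pipe only exists while Nagios is running.
    [ -p "$cmdfile" ] || { echo "$cmdfile is not a named pipe" >&2; return 1; }

    # The submitting user (here: whoever runs snmptrapd) needs write access.
    [ -w "$cmdfile" ] || { echo "$cmdfile is not writable by $(id -un)" >&2; return 1; }

    # Print the path so the caller can append to it.
    echo "$cmdfile"
}
```

Run it as the same user that snmptrapd runs as, since that is the user whose permissions matter for the trap handler.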


#4

Thanks for the info. I went through those and all the other docs that seemed relevant. I’ve got Nagios to notice the passive_check submission.

e.g. /var/log/nagios.log will show this when I submit a result
[1141332818] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;10.1.3.31;2;w32_time;W32Time Event

I can’t get it to send a notice out though. My service definition is as follows:

define service {
    use                     generic-service
    name                    passive-service
    is_volatile             1
    passive_checks_enabled  1
    max_check_attempts      10
    normal_check_interval   1
    retry_check_interval    1
    notification_interval   60 ; 60 minutes (interval_length=60)
    active_checks_enabled   0
    notification_period     24x7
    notification_options    c,r
    check_command           check_dummy!3
    check_period            24x7
    check_freshness         0 ; 1
    register                0
}


define service{
    use                     passive-service
    host_name               servername
    service_description     Trap Test
    contact_groups          admin-email
}

nagios.cfg looks like this:

log_file=/var/log/nagios/nagios.log
cfg_file=/etc/nagios/checkcommands.cfg
cfg_file=/etc/nagios/misccommands.cfg
cfg_file=/etc/nagios/contactgroups.cfg
cfg_file=/etc/nagios/contacts.cfg
cfg_file=/etc/nagios/hosts.cfg
cfg_file=/etc/nagios/services.cfg
cfg_file=/etc/nagios/servicegroups.cfg
cfg_file=/etc/nagios/hostgroups.cfg
object_cache_file=/var/log/nagios/objects.cache
resource_file=/etc/nagios/resource.cfg
status_file=/var/log/nagios/status.dat
nagios_user=nagios
nagios_group=nagios
check_external_commands=1
command_check_interval=1
command_file=/var/log/nagios/rw/nagios.cmd
comment_file=/var/log/nagios/comments.dat
downtime_file=/var/log/nagios/downtime.dat
lock_file=/var/run/nagios.pid
temp_file=/var/log/nagios/nagios.tmp
event_broker_options=-1
log_rotation_method=d
log_archive_path=/var/log/nagios/archives
use_syslog=0
log_notifications=1
log_service_retries=1
log_host_retries=1
log_event_handlers=1
log_initial_states=0
log_external_commands=1
log_passive_checks=1
service_inter_check_delay_method=s
max_service_check_spread=10
service_interleave_factor=s
host_inter_check_delay_method=s
max_host_check_spread=30
max_concurrent_checks=0
service_reaper_frequency=10
auto_reschedule_checks=0
auto_rescheduling_interval=30
auto_rescheduling_window=180
sleep_time=0.25
service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5
retain_state_information=1
state_retention_file=/var/log/nagios/retention.dat
retention_update_interval=60
use_retained_program_state=1
use_retained_scheduling_info=0
interval_length=60
use_aggressive_host_checking=0
execute_service_checks=1
accept_passive_service_checks=1
execute_host_checks=1
accept_passive_host_checks=1
enable_notifications=1
enable_event_handlers=1
process_performance_data=0
obsess_over_services=0
check_for_orphaned_services=0
check_service_freshness=1
service_freshness_check_interval=60
check_host_freshness=1
host_freshness_check_interval=60
aggregate_status_updates=1
status_update_interval=15
enable_flap_detection=0
low_service_flap_threshold=5.0
high_service_flap_threshold=20.0
low_host_flap_threshold=5.0
high_host_flap_threshold=20.0
date_format=us
p1_file=/usr/bin/p1.pl
illegal_object_name_chars=~!$%^&*|'"<>?,()=
illegal_macro_output_chars=~$&|'"<>
use_regexp_matching=0
use_true_regexp_matching=0
admin_email=nagios
admin_pager=pagenagios
daemon_dumps_core=0


#5

OK, so now I gather that the check is working, and that it simply is not sending out notifications.
You MUST click the link and disable, then re-enable, notifications for each service check that should send notifications.
The reason is explained here:
nagios.sourceforge.net/docs/2_0/ … tion_notes


#6

Notifications were enabled on the service. To be sure, I disabled then enabled notifications on the Trap service.

Active Checks: DISABLED
Passive Checks: ENABLED
Obsessing: ENABLED
Notifications: ENABLED
Event Handler: ENABLED
Flap Detection: ENABLED

Nagios isn’t sending any notifications though.


#7

The output return code is 1, warning, and your notification_options is set to c,r
Therefore, you won’t get notifications for warnings, only for critical states and recoveries.
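
To make the state filter concrete, here is a small sketch of how a return code lines up with the letters in notification_options. The function name would_notify is invented for this post. It only models the state letters w, c and u; Nagios applies further filters before sending anything (notification period, contacts, host state and so on), and those are not covered here. It also assumes the option list has no spaces in it.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a service return code passes the state
# filter given by notification_options (comma-separated, no spaces).
# usage: would_notify <return_code> <notification_options>
would_notify() {
    case $1 in
        1) letter=w ;;   # WARNING
        2) letter=c ;;   # CRITICAL
        3) letter=u ;;   # UNKNOWN
        *) return 1 ;;   # 0 (OK) is handled by r (recovery), not modelled here
    esac

    # Wrap both sides in commas so "c" only matches a whole list item.
    case ",$2," in
        *",$letter,"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With the service definition posted earlier, would_notify 1 c,r fails while would_notify 2 c,r succeeds, which matches the behaviour described above.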


#8

Since the first post, I have been sending the message through with a return code of 2. This is what shows up in nagios.log:
[1141402904] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;host.fqdn.com;2;w32_time;W32Time Event

Of course, the host.fqdn.com matches a specific host. Other notifications for that host do get sent. Any other ideas? I think it’s pretty close, and I really need to get this working.

Thanks,
Casey


#9

How did you do the above? By using the .cfg files or by using the CGI web page?


#10

nagios.org/developerinfo/ext … and_id=114
PROCESS_SERVICE_CHECK_RESULT must have the following format:
PROCESS_SERVICE_CHECK_RESULT;<host_name>;<service_description>;<return_code>;<plugin_output>
Now, let’s plug in your values and see if it fits.
PROCESS_SERVICE_CHECK_RESULT;host.fqdn.com;2;w32_time;W32Time Event
So, you have:
<host_name> = host.fqdn.com
<service_description> = 2
<return_code> = w32_time
and so on. Somehow, that just doesn’t look right to me.
nagios.org/developerinfo/ext … ndlist.php
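
One way to avoid swapping fields is to build the line with a small helper that takes the arguments in the documented order. This is just a sketch: build_check_result is a made-up name, not a Nagios tool, and it only prints the line so that the caller decides where to append it. The optional fifth argument (a fixed timestamp) is an addition for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): print one passive service check
# result with the fields in the documented order.
# usage: build_check_result <host_name> <service_description> <return_code> <plugin_output> [timestamp]
build_check_result() {
    host=$1
    service=$2
    code=$3
    output=$4
    # Optional fifth argument: epoch seconds; defaults to the current time.
    now=${5:-$(date +%s)}

    # Plugin return codes: 0=OK 1=WARNING 2=CRITICAL 3=UNKNOWN.
    # Rejecting anything else catches a swapped service/return-code pair.
    case $code in
        0|1|2|3) ;;
        *) echo "return code must be 0-3, got: $code" >&2; return 1 ;;
    esac

    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$now" "$host" "$service" "$code" "$output"
}
```

For example, build_check_result host.fqdn.com w32_time 2 "W32Time Event" >> /var/log/nagios/rw/nagios.cmd would submit a critical result, while passing 2 and w32_time in the wrong order is refused.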


#11

So the entry is correct now.
[1141408365] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;host.domain.com;w32_time;2;CRITICAL - W32Time Event

I also enabled notifications in the service definition with “notifications_enabled 1”, but still no notifications from Nagios.


#12

In order for a notification to be sent, there must be a host named host.domain.com
and a service for that host named:
w32_time
Check the spelling again.
Are you using mysql for nagios?
According to your earlier post, you have a host name of:
servername
and a service description for that host of:
Trap

So I don’t see how you can get a notification until you define the host and service.
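
Because the host name in the submitted command has to match a defined host_name exactly, a quick lookup in the object file can rule out a spelling or FQDN mismatch. The sketch below is only an illustration: host_is_defined is an invented name, and it assumes each host_name directive sits on its own line with a single value, as in a typical hosts.cfg.

```shell
#!/bin/sh
# Hypothetical helper: succeed if some host_name directive in the given
# object file has exactly this value (exact match, case-sensitive).
# usage: host_is_defined <host_name> <hosts.cfg>
host_is_defined() {
    awk -v want="$1" '
        # First field is the directive, second field is its value.
        $1 == "host_name" && $2 == want { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$2"
}
```

If host_is_defined host.domain.com /etc/nagios/hosts.cfg fails while the short host name succeeds, the trap handler is submitting a name that Nagios has no object for.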


#13

That did it. Works great now! The names and descriptions I was giving in previous posts were just examples, but your comment about spelling made me realize that hosts.cfg didn’t have the FQDN of the server, just the host name. Another example of focusing on the difficult and obscure instead of the simple and obvious. :( Thanks a lot for your help.

Casey


#14

hi
how do I do this in perl ?
thx


#15

Hi Casey,

Great that you got this all working. Have you tried looking at SNMPTT? I use this between snmptrapd and nagios to format the trap for the nagios command file. This bit of snmptt config reacts to linkdown traps that are not caused by being put admin down, and sends to nagios.

EVENT linkdown .1.3.6.1.6.3.1.1.5.3 "Status Events" Warning
MATCH $4: !(dmin)
FORMAT Interface $2 down. Reason: $4
EXEC echo "[$@] PROCESS_SERVICE_CHECK_RESULT;$A;check-$2-status;2;Interface $2 gone down. Reason $4" >> /usr/local/nagios/var/rw/nagios.cmd
REGEX (.newskies.net)()

This is a simple example, but it’s a great way of handling more complex traps in a scalable way.

John