Strange inconsistencies between web ui and configs



Something I changed in my nagios configs is causing some strange behavior.

I’ve changed some router interface check service configurations so that instead of being OK when up, they are OK when down. Now, please know that I’ve done this a bunch of times before with no problem, but for some reason, this time it’s not working.

I go from:

define service {
    use generic-service
    host_name Core Router
    service_description GigabitEthernet 3/13
    check_command check_snmp! -P 2c -C public -o ifOperStatus.19 -r 1
}

to:

define service {
    use generic-service
    host_name Core Router
    service_description GigabitEthernet 3/13
    check_command check_snmp! -P 2c -C public -o ifOperStatus.19 -c 2:2 -r 1
}

But since I made this change, the new behavior only shows up intermittently in the web ui. Sometimes the screen refreshes and the down state is shown as OK; sometimes it is shown as CRITICAL.

Here’s what I’ve checked:

  1. retention.dat - at first this had the old config, but then I configured nagios to refresh it once every minute, and that fixed it
  2. status.dat - at first this also had the old config, even though the new config had been in place for days. I restarted nagios again, and it now seems to have the new config
  3. objects.cache - this seems to have the correct, new config

I also checked nagios.log, and it shows the same inconsistency: the down state is sometimes logged as OK and sometimes as CRITICAL:

[1216127252] INITIAL SERVICE STATE: Core Router;GigabitEthernet 3/13;OK;HARD;1;(null)
[1216127282] SERVICE ALERT: Core Router;GigabitEthernet 3/13;CRITICAL;SOFT;1;SNMP CRITICAL - down(2)
[1216127299] SERVICE ALERT: Core Router;GigabitEthernet 3/13;OK;HARD;10;SNMP OK - 2
[1216127352] SERVICE ALERT: Core Router;GigabitEthernet 3/13;CRITICAL;SOFT;2;SNMP CRITICAL - down(2)
[1216127479] SERVICE ALERT: Core Router;GigabitEthernet 3/13;CRITICAL;SOFT;1;SNMP CRITICAL - down(2)
[1216127532] SERVICE ALERT: Core Router;GigabitEthernet 3/13;CRITICAL;SOFT;3;SNMP CRITICAL - down(2)
[1216127599] SERVICE ALERT: Core Router;GigabitEthernet 3/13;OK;SOFT;2;SNMP OK - 2
[1216127659] SERVICE ALERT: Core Router;GigabitEthernet 3/13;CRITICAL;SOFT;1;SNMP CRITICAL - down(2)
[1216127719] SERVICE ALERT: Core Router;GigabitEthernet 3/13;CRITICAL;SOFT;2;SNMP CRITICAL - down(2)
[1216127779] SERVICE ALERT: Core Router;GigabitEthernet 3/13;CRITICAL;SOFT;3;SNMP CRITICAL - down(2)
[1216127832] SERVICE ALERT: Core Router;GigabitEthernet 3/13;CRITICAL;SOFT;4;SNMP CRITICAL - down(2)
[1216127899] SERVICE ALERT: Core Router;GigabitEthernet 3/13;OK;SOFT;4;SNMP OK - 2
[1216127959] SERVICE ALERT: Core Router;GigabitEthernet 3/13;CRITICAL;SOFT;1;SNMP CRITICAL - down(2)
[1216128019] SERVICE ALERT: Core Router;GigabitEthernet 3/13;CRITICAL;SOFT;2;SNMP CRITICAL - down(2)
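
To make the pattern in these entries easier to see, here is a rough Python sketch that tallies SERVICE ALERT lines by state and plugin output. The function names and the SAMPLE list are made up for illustration (SAMPLE just repeats the first few lines from the excerpt above), and the regular expression only assumes the line layout shown in this log, so treat it as a starting point rather than a general nagios.log parser.

```python
import re
from collections import Counter

# Matches entries laid out like the excerpt above:
# [epoch] SERVICE ALERT: host;service;STATE;STATE_TYPE;attempt;plugin output
ALERT_RE = re.compile(
    r"^\[(?P<ts>\d+)\] SERVICE ALERT: "
    r"(?P<host>[^;]*);(?P<service>[^;]*);(?P<state>[^;]*);"
    r"(?P<state_type>[^;]*);(?P<attempt>\d+);(?P<output>.*)$"
)


def parse_service_alerts(lines):
    """Return one dict per SERVICE ALERT line; all other lines are skipped."""
    alerts = []
    for line in lines:
        match = ALERT_RE.match(line.strip())
        if match:
            entry = match.groupdict()
            entry["ts"] = int(entry["ts"])
            entry["attempt"] = int(entry["attempt"])
            alerts.append(entry)
    return alerts


def tally_by_state_and_output(alerts):
    """Count how often each (state, plugin output) pair occurs."""
    return Counter((a["state"], a["output"]) for a in alerts)


# Hypothetical sample: the first four lines of the log excerpt in this post.
SAMPLE = [
    "[1216127252] INITIAL SERVICE STATE: Core Router;GigabitEthernet 3/13;OK;HARD;1;(null)",
    "[1216127282] SERVICE ALERT: Core Router;GigabitEthernet 3/13;CRITICAL;SOFT;1;SNMP CRITICAL - down(2)",
    "[1216127299] SERVICE ALERT: Core Router;GigabitEthernet 3/13;OK;HARD;10;SNMP OK - 2",
    "[1216127352] SERVICE ALERT: Core Router;GigabitEthernet 3/13;CRITICAL;SOFT;2;SNMP CRITICAL - down(2)",
]

if __name__ == "__main__":
    counts = tally_by_state_and_output(parse_service_alerts(SAMPLE))
    for (state, output), count in counts.items():
        print(count, state, output)
```

In the excerpt, the OK entries report the value as 2 while the CRITICAL entries report it as down(2), so grouping on the output text as well as the state keeps the two forms apart.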

I should also say that, in addition to this issue and possibly related to it, I’ve added some OTHER new services. When the nagios web ui refreshes, they sometimes show up in the service list for the configured host and sometimes don’t.

For the life of me, I cannot figure out the problem. Any help would be greatly appreciated.

Neil