About a week ago my manager surprised me with the news that I was responsible for moving a Linux server that Nagios uses to monitor part of our customers' environment. Until then I knew Nagios only by name.
My problem at the moment is this. I installed NRPE on the new server and have moved one host so that it is monitored through the new server. The first host I moved does not use NRPE, by the way; it uses ssh to run a script on the host that does some checks on logfiles. There was a problem with the ssh command, though. The new server still had a big welcome message, which was also included in the text returned to Nagios. Nagios couldn't handle this and gave the service check the status UNKNOWN. I have removed the welcome message and the returned text is now OK, but Nagios still didn't recognize it. Only now, 24 hours later, does it see the message and start sending emails. So it seems to me that there is some internal queueing going on and that Nagios is still reading old events instead of the current one. Can I somehow clear this queue? I tried restarting Nagios, but that didn't help. This is happening for this one host only; all other hosts have no problems.
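To illustrate the welcome-message problem, here is a minimal Python sketch. It rests on assumptions: that Nagios 1.x keeps only the first line of what a check prints, and that the remote script prints a line starting with a state name such as "OK". The function names and the sample banner are made up for illustration and are not part of Nagios or of my actual setup. I also don't know exactly how the check arrived at UNKNOWN, so this only shows how a banner can end up in front of the real message.

```python
import re

# Hypothetical pattern: assumes the remote script prints a line that begins
# with a state name, e.g. "OK - no errors in logfile".
STATUS_LINE = re.compile(r"^(OK|WARNING|CRITICAL|UNKNOWN)\b")


def first_line(raw_output):
    """Return the first line of the output, i.e. the part a check that
    only reads one line would report."""
    lines = raw_output.splitlines()
    return lines[0].strip() if lines else ""


def extract_status_line(raw_output):
    """Return the first line that looks like a status message, skipping
    any banner lines in front of it. Returns None if no line matches."""
    for line in raw_output.splitlines():
        line = line.strip()
        if STATUS_LINE.match(line):
            return line
    return None


# Made-up sample: a login banner followed by the real check result.
sample = (
    "*** Welcome to the new server ***\n"
    "Authorized use only\n"
    "OK - no errors in logfile\n"
)

print(first_line(sample))
print(extract_status_line(sample))
```

With the sample above, `first_line` returns the banner line while `extract_status_line` returns the real result, which is why removing the banner on the server (or filtering it in a wrapper like this) matters.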
Nagios is displaying events, but 24 hours too late. It looks like an internal queueing problem, but I can't find a way to clear the queue. Restarting Nagios didn't help. Is there any way to clear the queue so that Nagios shows the actual current state? The problem occurs for only one host; the rest work fine.
Nagios 1.2 (I know it’s old, but upgrading is not a solution.)