Urgent help needed: ndoutils latency


#1

I have noticed a performance lag in Nagios that seems to be caused by everything being passed through ndoutils. Nagios latency with ndoutils enabled was 140 seconds; as soon as I disabled ndoutils it dropped to 3 seconds.

We tested this by issuing manual check requests from Nagios and watching the log file to see when each request was received and when it was acted on. The delay was massive.
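To put a number on that kind of test, a small script along these lines could read the log and report the gap between two entries. This is only a sketch: it relies on Nagios log lines starting with an epoch timestamp in square brackets, and the sample lines, host name (web01) and service name (HTTP) below are made up for illustration.

```python
import re

# Nagios log lines start with "[<epoch seconds>] " followed by the message.
LINE_RE = re.compile(r"^\[(\d+)\]\s+(.*)$")


def parse_line(line):
    """Return (timestamp, message) for a log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def delay_between(lines, first_marker, second_marker):
    """Seconds from the first line containing first_marker to the next
    later line containing second_marker, or None if either is missing."""
    start = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        timestamp, message = parsed
        if start is None:
            if first_marker in message:
                start = timestamp
        elif second_marker in message:
            return timestamp - start
    return None


# Hypothetical sample entries, not taken from a real log.
SAMPLE_LOG = [
    "[1190000000] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;web01;HTTP;1190000000",
    "[1190000060] Auto-save of retention data completed successfully.",
    "[1190000140] SERVICE ALERT: web01;HTTP;CRITICAL;HARD;1;Connection refused",
]

delay = delay_between(
    SAMPLE_LOG,
    "EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;web01;HTTP",
    "SERVICE ALERT: web01;HTTP",
)
print(delay)  # 140
```

Which second entry counts as "actioned" depends on what your setup actually logs, so the two marker strings would need adjusting to match your own log.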

I would like to know how I can pass only some of the data from Nagios to ndoutils and leave the rest out.

What performance enhancements can I make so that ndoutils works faster and does not hurt Nagios performance so much? A user has listed the following, but I do not know how to achieve any of it as he has not published a HOWTO! VERY HELPFUL!

"- indexes should be re-arranged so that the time column is first.
Currently, a lot of indexes have instance_id first. However, when you
are doing a delete based on time, the index is effectively useless,
so mysql has to do a complete table scan to work out which rows need
to be deleted. This will cause mysql to take a lot of time. This is
the single biggest thing that you can do.

  • reduce the number of times ndo2db calls the housekeeping routine.
    By default, it is every 60 seconds. We’ve reduced it to every 600
    seconds. It could probably be even less frequent. One thing I’ve just
    thought is to have ndo2db NOT do any housekeeping and do it yourself
    (mysql is multi-user after all)
  • reduce the amount of data sent. We stop the broker module sending
    systemcommands, log entries and passive commands
  • we’ve also patched Nagios to not send status data on a reload. By
    default, Nagios will send data to ndo about the status of all
    hosts/services on a reload. This is not required because the db
    already knows what the status of everything was before the reload!
  • we’re currently testing a de-coupling of NDOMOD from ndo2db. The
    idea is that NDOMOD writes files and then a separate daemon loads
    those files into ndo2db. This effectively means that NDO updates are
    now asynchronous, though there is now a delay in the updates

We’ve also made a patch to Nagios 2.9 (which Ethan has applied to
Nagios 3), where the status file is kept between reloads, so you
don’t get the dreaded “Could not read host and service status
information” error. That is available at
altinity.blogs.com/dotorg/2007/09/nagios-patch-da.html."
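As far as I can tell, the index point in the quote comes down to column order: a time-based delete can only use an index whose leading column is the time column. The sketch below shows the effect using SQLite from the Python standard library as a stand-in for MySQL, so the planner output is SQLite's, not MySQL's. The table name (checks), the start_time column and both index names are hypothetical; only instance_id comes from the quoted text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for one of the ndoutils history tables.
conn.execute(
    "CREATE TABLE checks (instance_id INTEGER, start_time INTEGER, output TEXT)"
)
conn.executemany(
    "INSERT INTO checks VALUES (?, ?, ?)",
    [(1, t, "ok") for t in range(1000)],
)

# A housekeeping-style delete that filters on time only.
PURGE = "DELETE FROM checks WHERE start_time < 500"


def plan(sql):
    """Return the planner's description of how it would run the statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# Index with instance_id first: the time filter cannot seek into it.
conn.execute("CREATE INDEX idx_instance_first ON checks (instance_id, start_time)")
plan_before = plan(PURGE)

# Same columns, time first: the time filter can seek directly.
conn.execute("DROP INDEX idx_instance_first")
conn.execute("CREATE INDEX idx_time_first ON checks (start_time, instance_id)")
plan_after = plan(PURGE)

print("instance_id first:", plan_before)
print("time first:       ", plan_after)

conn.execute(PURGE)
remaining = conn.execute("SELECT COUNT(*) FROM checks").fetchone()[0]
print("rows left:", remaining)
```

The exact plan wording varies between SQLite versions, but the first plan should be reported as a scan and the second as a search on idx_time_first. On MySQL the same comparison can be made by running EXPLAIN on the equivalent SELECT before and after changing the index.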

Does anyone have suggestions on how to improve ndoutils performance or handle the DB updates/queuing differently? Anything that reduces or normalises the latency would help.
Any help would be greatly appreciated.

thanks
Ali


#2

Is it the MySQL server that is slow, or do you simply see a delay between the check and the data appearing? It seems almost impossible that you have a 140-second delay and no other performance hit on the server. If I had to bet, I’d say it’s some kind of configuration error… but don’t ask me what… :-/

Have you checked that the Nagios/NDO configuration is correct?
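One setting worth checking is data_processing_options in ndomod.cfg, which as far as I know is a bitmask of the data types the broker module forwards. That would be the place to apply the "reduce the amount of data sent" suggestion quoted in #1. The sketch below works out a value that leaves out log entries, system commands and external commands. The flag values are as I recall them from ndomod.h and should be treated as assumptions to verify against your own version. Treating "passive commands" as external command data is also a guess on my part.

```python
# Flag values as recalled from ndomod.h (assumption: check your own copy).
PROCESS_LOG_DATA = 4
PROCESS_SYSTEM_COMMAND_DATA = 8
PROCESS_EXTERNAL_COMMAND_DATA = 131072
PROCESS_EVERYTHING = 67108863  # all 26 flag bits set (2**26 - 1)


def options_without(*excluded):
    """Start from 'everything' and clear the bit of each excluded flag."""
    mask = PROCESS_EVERYTHING
    for flag in excluded:
        mask &= ~flag
    return mask


value = options_without(
    PROCESS_LOG_DATA,
    PROCESS_SYSTEM_COMMAND_DATA,
    PROCESS_EXTERNAL_COMMAND_DATA,
)
print(f"data_processing_options={value}")  # data_processing_options=66977779
```

The printed line is what would go into ndomod.cfg in place of the default, which sends everything. Nagios would need a restart for the broker module to pick it up.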