Good news for you: I had "exactly" the same problem. Our Nagios performance was going down due to too many tests...
Thank you for your answer! I am glad that I am not alone. :)
We ran extensive performance tests to try to find the maximum number of services allowed.
We found that, on our server, this limit was between 2500 and 2900 services; above that, the latency would increase exponentially.
On a better server, this limit doesn't increase much...
What do you mean by "better"? More powerful? What is the configuration of your server? I don't think it matters which server we use. On my Nagios server the processors are barely loaded and there is 5 GB of RAM free!
The conclusion was that Nagios (the software itself) was not able to launch more than 5-6 service checks per second... that is likely to be your problem too.
Mine launches almost 20...
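A quick back-of-the-envelope check of these numbers. This is only a sketch: it assumes every service is checked exactly once per interval and that checks are spread evenly, which is a simplification and not how the Nagios scheduler really works. The function names are made up for illustration.

```python
def max_services(checks_per_second, interval_minutes):
    """Upper bound on services that can each be checked once per interval."""
    return int(checks_per_second * interval_minutes * 60)


def required_rate(services, interval_minutes):
    """Checks per second needed to check every service once per interval."""
    return services / (interval_minutes * 60)


# Launch rates mentioned in this thread: 5-6 per second and almost 20 per second.
for rate in (5, 6, 20):
    print(rate, "checks/s ->", max_services(rate, 5), "services per 5 min,",
          max_services(rate, 10), "services per 10 min")

# Rate needed for 3000 checks in 5 minutes.
print(required_rate(3000, 5), "checks/s")
```

Under these assumptions, 5-6 checks per second covers only 1500-1800 services in 5 minutes, while 3000 checks in 5 minutes needs a steady 10 checks per second.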
The solution?
Ours: we created 2 new accounts, nagios2 and nagios3. On each of these accounts we installed a Nagios server, and we dispatched all our tests across these new Nagios servers.
An almost identical solution was to set up 2 new Nagios servers on the same account, sharing one Apache server... but this solution is quite tricky to implement and more prone to bugs, so it's your choice.
I don't think this is very good, because it is difficult to maintain 2-3 servers with, for example, 2000 services each. If one dies... But if NOTHING else solves this problem, it will be the only solution in my situation.
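For what it's worth, the dispatching idea could be sketched roughly like this. It is only an illustration: it assumes all services of a host should stay on the same instance, and the host names, check names and the `dispatch` helper are hypothetical, not part of Nagios.

```python
from collections import defaultdict


def dispatch(services, instances):
    """Assign whole hosts to instances, always choosing the least-loaded one."""
    by_host = defaultdict(list)
    for host, service in services:
        by_host[host].append(service)

    plan = {name: [] for name in instances}  # instance -> list of hosts
    load = {name: 0 for name in instances}   # instance -> number of services

    # Place the hosts with the most services first, so the split stays balanced.
    for host in sorted(by_host, key=lambda h: (-len(by_host[h]), h)):
        target = min(instances, key=lambda name: load[name])
        plan[target].append(host)
        load[target] += len(by_host[host])
    return plan, load


# Made-up inventory: 300 hosts with 10 checks each, 3000 services in total.
services = [(f"host{i:03d}", f"check{j}") for i in range(300) for j in range(10)]
plan, load = dispatch(services, ["nagios", "nagios2", "nagios3"])
print(load)  # {'nagios': 1000, 'nagios2': 1000, 'nagios3': 1000}
```

Each host list in `plan` would then have to be turned into the configuration of the matching instance by hand or by another script.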
BTW: I tried to ask here whether Nagios 3.0 would solve the problem by itself, but I got no answer... maybe you'll get lucky. (I can't help you there: we're running Nagios 2.10, and this version has the same latency problems, if not worse.)
I will wait, with great hope, for somebody to help us :cry:
If not, I will have to deploy other monitoring software to suit our large environment.
It will be very sad for me and other Nagios users if Nagios can't perform more than 3000 checks in 5-10 min AT ALL because of VERY LARGE latency, and if this is a limitation of the software that we Nagios users and admins can't change ourselves. :cry: