I have finally found the source of my server woes… At first I thought it was nsca processes killing my central server, till in absolute desperation I turned off process_perfdata.
Here is my scenario: I have approx. 65 hosts reporting to a central server (2.4 GHz, 1 GB RAM). I set up nagiosgraph to draw graphs for, on average, 3 services per host. At first everything worked fine, but when I started adding more hosts it started taking strain…
I would really like to get those graphs working. Is there something I can do to optimise my setup, or am I asking too much?
Thanks for your time, any advice would be greatly appreciated.
I run nagiostat and have 288 graphs being generated. Nagios also runs on the same server. Perhaps it’s time to switch, since 195 graphs isn’t much. My PC specs are about the same.
I had nagiostat running with many graphs and had no problems… (it was a dual-processor SPARC with 4 GB of RAM). Today I have Cacti on a much less powerful machine inserting data into 200 graphs every 5 minutes without problems…
I can't see anything wrong with the processes. I get many nsca ones, but nothing out of the ordinary…
I have process_performance_data set to 1 in my nagios.cfg.
This obviously submits perfdata for all services. If I then add process_perf_data 0 to the services that I do not wish to graph, would this ensure that I only receive results for the stuff I want?
Turn on debug and look at the log. I know that is how nagiostat works, but I dunno about yours. In nagiostat, you can save some time by turning off perf data for individual services, like you described.
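For reference, a rough sketch of what the per-service switch looks like in the Nagios config. The host name, service descriptions, check commands and the generic-service template are only placeholders assumed for illustration, not taken from anyone's setup in this thread. Note that inside a service definition the directive is written with a space rather than an equals sign.

```
# nagios.cfg (main config): global switch, has to stay on
# for any perfdata to be processed at all
process_performance_data=1

# Service you do NOT want graphed (all names here are hypothetical)
define service {
    use                     generic-service   ; assumed template name
    host_name               web01
    service_description     SSH
    check_command           check_ssh
    process_perf_data       0                 ; skip perfdata for this service
}

# Service you DO want graphed
define service {
    use                     generic-service   ; assumed template name
    host_name               web01
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
    process_perf_data       1                 ; hand perfdata to the grapher
}
```

Nagios needs a restart or reload after the change, and the debug log of the graphing tool should then show whether it still gets called for the services that were switched off.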
nagiostat is pretty easy if you already have a handle on regex stuff. There is one config file with many examples in it that you can simply copy/paste, then add your “hostname” and “service description”, and you are rocking.