So we have this weird problem on our setup of several (15) Linux servers.
I have more or less narrowed it down to something in the check_http plugin itself or something that it interacts with
(the problem happens even if I run check_http manually from the command line).
I have machine A, from which check_http is run (normally via nrpe). It checks a static page (with no links) served by a Tomcat on machine B. (Nagios itself is not on either of these servers.)
Between midnight and noon I am getting 5ms response times.
From noon to midnight I get 130ms response times with outliers up to 8 seconds.
What is weird is that when I time how long it takes wget to pull the same page, I get a consistent 5ms throughout the day.
Now the question is: what is check_http doing that goes beyond simply pulling the page content, and that would be affected by time of day?
I do not have any cron jobs running on machine A, and the only cron jobs on machine B run once every 5 minutes.
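
To see where the extra time goes, one option is to time the phases of the request separately from machine A at different times of day. Below is a rough Python sketch of that idea. It is not what check_http does internally; the function name `time_http_phases` and its parameters are made up for illustration, and it assumes a plain HTTP (not HTTPS) page. I do not know which phase, if any, is responsible, so this only shows how to split the measurement into name lookup, TCP connect, and request/response.

```python
import socket
import time


def time_http_phases(host, port=80, path="/", timeout=10.0):
    """Return seconds spent in name lookup, TCP connect, and request/response.

    Hypothetical helper for illustration; assumes plain HTTP.
    """
    t0 = time.perf_counter()
    # Resolve the name explicitly so lookup time is measured on its own.
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    t1 = time.perf_counter()

    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    try:
        sock.connect(addr)
        t2 = time.perf_counter()

        # HTTP/1.0 with "Connection: close" so the server ends the
        # response by closing the socket.
        request = (
            f"GET {path} HTTP/1.0\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n\r\n"
        )
        sock.sendall(request.encode("ascii"))

        received = 0
        while True:
            chunk = sock.recv(65536)
            if not chunk:
                break
            received += len(chunk)
        t3 = time.perf_counter()
    finally:
        sock.close()

    return {
        "dns": t1 - t0,
        "connect": t2 - t1,
        "transfer": t3 - t2,
        "bytes": received,
    }
```

Running this with machine B's hostname and again with its IP address, once in the morning and once in the afternoon, should show whether the slowdown sits in the lookup, the connect, or the transfer.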