I'm getting pretty annoyed that Nagios shows different states whenever I refresh the GUI on my master. For instance, a given service can be CRITICAL at one moment and OK after I hit refresh (and vice versa). The duration flips as well: it can easily say the service has been CRITICAL for a few hours, but after a refresh it says it has been OK. Can someone explain to me how this can occur?
Here are some specifications of my setup:
hosts: 1000+
services: 4000+
We have one master receiving check results from 6 different slaves through NSCA. The master itself doesn't run any checks; it only processes the passive results. I can't see anything unusual in the event log or the debug log. If we scale down the number of results, the problem doesn't occur.
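In case it helps anyone narrow this down, here is a rough sketch of how the master's log could be scanned for services whose passive results keep changing state. It assumes `log_passive_checks=1` is set so that `PASSIVE SERVICE CHECK` lines are written to the log; the function name `count_state_flips` and the sample host and service names are made up for illustration, not taken from my setup.

```python
import re
from collections import defaultdict

# Matches lines such as:
# [1300000000] PASSIVE SERVICE CHECK: host;service;state;plugin output
PASSIVE_RE = re.compile(
    r"^\[(\d+)\] PASSIVE SERVICE CHECK: ([^;]+);([^;]+);(\d);(.*)$"
)


def count_state_flips(lines):
    """Count how often each (host, service) pair changes state
    between consecutive passive results in the given log lines."""
    last_state = {}
    flips = defaultdict(int)
    for line in lines:
        match = PASSIVE_RE.match(line.strip())
        if not match:
            continue  # not a passive service check line
        _timestamp, host, service, state, _output = match.groups()
        key = (host, service)
        if key in last_state and last_state[key] != state:
            flips[key] += 1
        last_state[key] = state
    return dict(flips)


# Hypothetical sample lines, only to show the expected input shape.
sample = [
    "[1300000000] PASSIVE SERVICE CHECK: web01;HTTP;2;CRITICAL - timeout",
    "[1300000060] PASSIVE SERVICE CHECK: web01;HTTP;0;OK",
    "[1300000120] PASSIVE SERVICE CHECK: web01;HTTP;2;CRITICAL - timeout",
    "[1300000130] PASSIVE SERVICE CHECK: db01;MySQL;0;OK",
]

if __name__ == "__main__":
    for (host, service), count in count_state_flips(sample).items():
        print(f"{host}/{service}: {count} state changes")
```

A service with a high count here would really be receiving alternating results. If the counts stay low while the GUI still flips, that would point at how the status is displayed or stored rather than at the incoming results.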
I hope someone has an explanation for this behavior.