Nagios Event Duration question


#1

Hello,
Can someone please tell me the meaning of “Event Duration” on an availability page?
For example, in the uploaded JPG the last entry has an event duration of just under 9 hours. Can someone please explain what that means?

Thank you


#2

Hi,

The view you’re talking about is a summary of all the events that have happened in the previous X days (X being the number you put in the “backtracked archives” option).
So, if your service is down for 2 hours, you’ll see a red line with the error and a duration of 2 hours.

As for the entry in your screenshot, this might happen if you restart your server.


#3

Thanks for the reply,
My follow up question is:
We have 2 different monitoring servers running Nagios watching the same host. They report different numbers of errors, and none of the events match up for start/end time or duration. For example, one reports an error that lasts 44m, but the other reports that the error only lasts 4m.
Also, am I right to assume that Nagios can miss an error/outage if its duration is less than a few minutes?

Thanks for the help


#4

Well, as you already know, Nagios schedules its service checks more or less “randomly”.
So, if you have a service with “normal_check_interval” set to 15 min, for example, it will take an outage of at least 15 min to be sure that Nagios will detect it.
Add to that the fact that the “max_check_attempts” option makes Nagios retry the check before signaling the error, and you’ll need an outage of at least (normal_check_interval + max_check_attempts x retry_check_interval) to be sure that Nagios will report it …
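
To put numbers on that rule of thumb, here is a minimal Python sketch that just evaluates the formula above. The function name is made up for illustration, and the values 3 and 1 for max_check_attempts and retry_check_interval are assumed examples, not settings from this thread. Since the first failed check may already count as one attempt, treat the result as a conservative estimate rather than an exact figure.

```python
def outage_needed_to_be_reported(normal_check_interval,
                                 max_check_attempts,
                                 retry_check_interval):
    """Rule-of-thumb outage length (minutes) after which Nagios is
    sure to report the problem, using the formula from this post:
    normal_check_interval + max_check_attempts x retry_check_interval.
    """
    return normal_check_interval + max_check_attempts * retry_check_interval


# 15 min interval as in the example above; 3 attempts and a
# 1 min retry interval are assumed values for illustration only.
print(outage_needed_to_be_reported(15, 3, 1))  # 18
```

With those assumed settings, an outage shorter than about 18 minutes is not guaranteed to show up in the report.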

So, it is “normal” that 2 Nagios servers don’t see exactly the same things (but their reports should still be consistent with each other).
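
If it helps, the effect can be reproduced with a small simplified model in Python. This is only a sketch, not how Nagios works internally: it assumes checks run instantly at a fixed interval, ignores retries and max_check_attempts entirely, and the function name and all the times are hypothetical. Each server differs only in when its first check happens to run.

```python
import math


def observed_outage(outage_start, outage_end, first_check, interval):
    """Return (seen_start, seen_end) in minutes as one monitoring
    server would record the outage, or None if every check misses it.
    The outage covers the half-open range [outage_start, outage_end).
    """
    # First scheduled check at or after the outage begins.
    k = max(0, math.ceil((outage_start - first_check) / interval))
    seen_start = first_check + k * interval
    if seen_start >= outage_end:
        return None  # the whole outage fell between two checks

    # First scheduled check at or after the service recovers.
    k = max(0, math.ceil((outage_end - first_check) / interval))
    seen_end = first_check + k * interval
    return seen_start, seen_end


# Same real outage (minute 10 to minute 30), same 15 min interval,
# but the two servers started checking at different offsets.
print(observed_outage(10, 30, first_check=0, interval=15))   # (15, 30)
print(observed_outage(10, 30, first_check=14, interval=15))  # (14, 44)

# A 4 minute outage that lands between two checks is never seen.
print(observed_outage(16, 20, first_check=0, interval=15))   # None
```

In this toy model the first server records a 15 minute event and the second a 30 minute event for the same 20 minute outage, and a short outage can be missed completely.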