I’m thinking about logfile monitoring with error persistence and manual handling. By that I mean that once an error occurs, the plugin should keep the service in an error state until someone resets it manually, because neither the plugin nor Nagios has any other way to know that the error has been fixed (I’m assuming the logfile contains only error messages).
Without persistence, if the next logfile check finds nothing new, the service status goes back to OK even though the original error is still unfixed. I really want to avoid this behavior.
So what form should the “manual reset” take? It seems to me that “problem acknowledgement” is the best way to do it. But that would need some kind of “acknowledgement handler” that calls a script, which in turn signals the check plugin to clear the alarm. At the moment no such handler exists, but it would be great to have one.
What do you think about this approach, or about other possible ways to achieve the same goal?