Nagios service check

Posted by DRH on Server Fault
Published on 2012-08-31T22:42:38Z Indexed on 2012/09/01 9:40 UTC

I am new to Nagios and need help with a small issue. Many of the machines we monitor can become unresponsive for a while when very CPU-intensive tasks are running. While these hosts are busy, Nagios sends warnings and alerts reporting things like 'ping timeout', 'zombie processes', and even swap space warnings, when in fact there is no problem.

Is there a way to configure Nagios not to send such alerts immediately, but instead to check x number of times over a period of time and send an alert only at the end of that period if the server in question has not recovered?

Looking at the commands.cfg file, I see entries like this:

define command{
        command_name    check_local_swap
        command_line    $USER1$/check_swap -w $ARG1$ -c $ARG2$
        }

How could I modify this example to accomplish what I describe above?
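
For context, retry behaviour in Nagios is normally controlled in the service (or host) definition that calls a command, not in the command definition itself. The following is only a sketch of the kind of service definition that might apply here. The template name, host name, description, swap thresholds, and interval values are all placeholders chosen for illustration and are not taken from the actual configuration. It also assumes the default interval_length of 60 seconds, so that intervals are counted in minutes.

define service{
        use                     local-service           ; assumed template name
        host_name               localhost               ; placeholder host
        service_description     Swap Usage              ; placeholder description
        check_command           check_local_swap!20!10  ; placeholder -w and -c thresholds
        check_interval          5       ; minutes between checks while the service is OK
        retry_interval          1       ; minutes between rechecks after a non-OK result
        max_check_attempts      10      ; consecutive non-OK results before a HARD state
        }

Under these assumptions, a failing check would be retried once a minute and a notification would go out only after ten consecutive non-OK results. A host that recovers within that window would return to OK without an alert being sent.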

Thanks

© Server Fault or respective owner