Nagios state transition and event handler issue

Posted by Dattatray on Server Fault
Published on 2012-09-07T05:58:56Z

We are using Nagios to check for duplicate processes.

define service {
    use                             local-service
    host_name                       xxx
    service_description             xxx Duplicate Processes
    check_interval                  1
    max_check_attempts              1
    contact_groups                  admins
    event_handler                   restart-dependent-processes
    check_command                   check_procs_duplicate!2!3!2!2!2
}

check_procs_duplicate checks whether there are any duplicate processes and returns the corresponding state, e.g. CRITICAL.
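To make the check concrete, here is a minimal sketch of how such a plugin might map duplicate counts to a state. The exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) are the standard Nagios plugin convention. The function name `duplicate_state` and the thresholds `warn_at` and `crit_at` are hypothetical and are not taken from the real check_procs_duplicate command or its arguments.

```python
from collections import Counter

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def duplicate_state(process_names, warn_at=2, crit_at=3):
    """Return a Nagios state for a list of running process names.

    warn_at and crit_at are hypothetical thresholds: the number of
    instances of the same process name that triggers each state.
    """
    counts = Counter(process_names)
    # Highest number of instances seen for any single process name.
    worst = max(counts.values(), default=0)
    if worst >= crit_at:
        return CRITICAL
    if worst >= warn_at:
        return WARNING
    return OK


# Example: three instances of "worker" reach the critical threshold.
state = duplicate_state(["worker", "worker", "worker", "db"])
print(state)
```

A real plugin would read the process list from the system and exit with the returned code. The sketch keeps that part out so the state logic can be read on its own.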

The event handler kills the duplicate processes and their dependent processes, then starts one instance of the process and its dependent processes. Afterwards, Nagios checks again for duplicate processes and sets the state accordingly: OK, WARNING, or CRITICAL.
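The handler's steps can be sketched roughly as follows. This is only an illustration, not the actual restart-dependent-processes script. It assumes the handler's command definition passes the `$SERVICESTATE$` and `$SERVICESTATETYPE$` macros as arguments and that the handler acts only on a HARD CRITICAL state. The function name `plan_actions` is hypothetical, and the steps are returned as descriptions rather than executed.

```python
def plan_actions(state, state_type):
    """Return the ordered steps the handler would take.

    state      -- assumed to come from $SERVICESTATE$ (OK, WARNING, CRITICAL, UNKNOWN)
    state_type -- assumed to come from $SERVICESTATETYPE$ (SOFT or HARD)
    """
    # Assumption: only a confirmed (HARD) CRITICAL state triggers a restart.
    if state != "CRITICAL" or state_type != "HARD":
        return []
    return [
        "kill duplicate processes",
        "kill dependent processes",
        "start one instance of the process",
        "start the dependent processes",
    ]


# With max_check_attempts set to 1, the first CRITICAL result is already HARD.
for step in plan_actions("CRITICAL", "HARD"):
    print(step)
```

A real handler would replace each description with the site-specific kill and start commands.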

The event handler takes some time to start the processes. If someone manually starts the process during this time, a duplicate exists again and the state remains CRITICAL.

At the next check interval, Nagios checks for duplicate processes again and finds the service still CRITICAL.

The event handler is not executed this time, because the previous and current states are both CRITICAL.
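The behaviour described above can be reproduced with a simplified model. It assumes that, with max_check_attempts set to 1, every result is a HARD state and the handler fires only when the state differs from the previous one. The function name `handler_runs` is hypothetical, and the model ignores other Nagios details such as SOFT states and retries.

```python
def handler_runs(check_results, initial_state="OK"):
    """Return, for each check result, whether the event handler would fire.

    Simplified model: the handler fires only when the state changes.
    """
    previous = initial_state
    fired = []
    for current in check_results:
        fired.append(current != previous)
        previous = current
    return fired


# First check finds duplicates; a manual start during the restart keeps
# the service CRITICAL on the following checks.
results = ["CRITICAL", "CRITICAL", "CRITICAL"]
print(handler_runs(results))  # [True, False, False]
```

Only the first CRITICAL result triggers the handler. The later ones leave the service stuck in CRITICAL with no further restart attempt.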

Any pointers on how to fix this issue?

© Server Fault or respective owner