Nagios3: Conditional operators for service checks?

Posted by Dave on Server Fault
Published on 2012-10-08T18:19:03Z Indexed on 2012/10/10 3:40 UTC

I'm trying to set up Nagios to monitor my various servers, using hostgroups to define 'machine roles' and running service checks against the machines by role. However, I'd like to use conditional operators that would let me run a service check against the intersection of two host groups rather than their union, i.e. using &&, ||, or () operators.

For example, imagine I have the following servers:

  • www-eu: Linux WWW (Apache) server, in the EU
  • www-us: Windows WWW (IIS) server, in the US (West Coast)
  • ftp-eu: Linux FTP server, in the EU
  • ftp-us: Windows FTP server, in the US

I would want to create the following host groups:

  • US-Servers: www-us, ftp-us
  • EU-Servers: www-eu, ftp-eu
  • WWW-Servers: www-us, www-eu
  • FTP-Servers: ftp-us, ftp-eu
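
As a minimal sketch, assuming the four hosts are already defined under the host names above (the alias strings are only illustrative), these groups could be declared like this:

define hostgroup{
    hostgroup_name  US-Servers
    alias           Servers in the US       ; illustrative alias
    members         www-us,ftp-us
}

define hostgroup{
    hostgroup_name  EU-Servers
    alias           Servers in the EU       ; illustrative alias
    members         www-eu,ftp-eu
}

define hostgroup{
    hostgroup_name  WWW-Servers
    alias           Web servers             ; illustrative alias
    members         www-us,www-eu
}

define hostgroup{
    hostgroup_name  FTP-Servers
    alias           FTP servers             ; illustrative alias
    members         ftp-us,ftp-eu
}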

Now say I'm interested in checking the HTTP response time of my web servers. Let's say this particular Nagios instance is running from the US (West Coast), and that I have a command called *check_http_response_time*. This command checks the responsiveness of an HTTP server and takes an argument that defines the maximum response time allowed before a critical state is raised.

My command might look like: check_http_response_time $HOSTNAME$ 50
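
A sketch of the matching command definition, assuming the check is a plugin script of the same name in the plugin directory that $USER1$ points to (the script itself and its argument order are assumptions based on the line above):

define command{
    command_name    check_http_response_time
    # $ARG1$ is filled with the value that follows "!" in a service's check_command
    command_line    $USER1$/check_http_response_time $HOSTNAME$ $ARG1$
}

With a definition along these lines, a service whose check_command is check_http_response_time!50 would pass 50 as the threshold.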

Traditionally, I can run my checks by specifying a list of hosts or hostgroups:

define service{
    use                     local-service
    hostgroup_name          WWW-Servers     ; members: www-us, www-eu
    servicegroups           WWW Checks
    service_description     Check HTTP Response Time
    check_command           check_http_response_time!50
}

However, with the above service definition, given that my Nagios instance is in US West, I could reasonably expect my EU server to return critical. What I really want is a different threshold for each region (50 for US West, 200 for EU).

I would have to duplicate my service for each host and set a custom threshold on each, or alternatively split my host groups by role and region (e.g. WWW-Servers-EU) and run my specific thresholds against those. Though the latter is better, both are much messier than I'd like...
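
A minimal sketch of that second workaround for the web servers alone. WWW-Servers-US is a name assumed by analogy with WWW-Servers-EU, and the alias strings are illustrative; the hosts and thresholds are the ones from the example above:

define hostgroup{
    hostgroup_name  WWW-Servers-US
    alias           Web servers in the US   ; illustrative alias
    members         www-us
}

define hostgroup{
    hostgroup_name  WWW-Servers-EU
    alias           Web servers in the EU   ; illustrative alias
    members         www-eu
}

define service{
    use                     local-service
    hostgroup_name          WWW-Servers-US
    servicegroups           WWW Checks
    service_description     Check HTTP Response Time
    check_command           check_http_response_time!50     ; US West threshold
}

define service{
    use                     local-service
    hostgroup_name          WWW-Servers-EU
    servicegroups           WWW Checks
    service_description     Check HTTP Response Time
    check_command           check_http_response_time!200    ; EU threshold
}

Every additional role or region means another combined group and another near-identical service definition, which is the duplication in question.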

What I would love, and what this post is asking for, is a way to use hostgroups to perform an intersection using conditional logic, rather than a simple union. It might look like:

define service{
    use                     local-service
    hostgroup_name          WWW-Servers && US-Servers
    servicegroups           WWW Checks
    service_description     Check HTTP Response Time
    check_command           check_http_response_time!50
}

It would then run the check only against servers that are in both WWW-Servers and US-Servers; in my example, just www-us. Such a feature would be a significant benefit for large-scale Nagios configurations.

Is this feature available? If it isn't, will it be available in the future? Is there an alternative way to accomplish this given the most recent Nagios version?

Any tips/suggestions are most appreciated!

  • Dave

© Server Fault or respective owner
