Hosts vs services

Joachim · May 5, 2008, 12:16pm

Hi

Can someone please help me understand the difference between hosts and services? I know hosts have states such as Down and Unreachable, while services have Warning, Critical, etc. But when I install a plugin, how do I know whether I am supposed to install it as a host plugin or a service plugin? I’m using Centreon, btw.

Albin · May 5, 2008, 1:13pm

A host is, for example, a computer on the network. When defining a host, you must supply the check-host-alive command, which in 99% of cases is check_ping. In that case it is the only plugin you’ll be using for host checks. Host checks are run less frequently than service checks.

After you define a host, you probably want to define services that check certain states on that host. For example, you could check disk usage (which would be the first service for your host), and it would use the check_disk plugin. A second service could be SSH reachability of the host, which would use the check_ssh plugin.
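As a rough illustration of the two paragraphs above, a host with one service might look like this in Nagios object configuration. This is only a sketch: `server1`, the address, the `admins` contact group and the `24x7` time period are placeholders, and `check-host-alive` and `check_ssh` are assumed to be command definitions that already exist in your configuration. The interval directive names follow Nagios 3; Nagios 2.x calls them `normal_check_interval` and `retry_check_interval`.

```
# Sketch only: names, address, contact group and time period are placeholders.
define host {
    host_name               server1
    alias                   Example server
    address                 192.0.2.10
    check_command           check-host-alive    ; host check: is the machine up at all?
    max_check_attempts      3
    check_period            24x7
    contact_groups          admins
    notification_interval   60
    notification_period     24x7
}

define service {
    host_name               server1             ; the host this service belongs to
    service_description     SSH
    check_command           check_ssh           ; service check: one feature on that host
    max_check_attempts      3
    check_interval          5
    retry_interval          1
    check_period            24x7
    contact_groups          admins
    notification_interval   60
    notification_period     24x7
}
```

A disk-usage service would be a second `define service` block of the same shape, with its own `service_description` and `check_command`.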

If you install a plugin, it will most likely check some feature of a host, so it would be used as a service plugin.

Plugins are not actually divided into service plugins and host plugins; they are just plugins. But if thinking of them that way makes it easier to understand, I hope this post helps.

Joachim · May 5, 2008, 1:39pm

Aha, so the “host” command is reserved for plugins to see if the host is up and running. That explains a lot, thanks.

Albin · May 5, 2008, 2:30pm

Hmmm, I’m not sure you got it right.

When defining a host, you tell Nagios the name of that host, its IP address, etc. In the host definition you have to supply a check-host-alive command, which is used to check whether the host is up or down. The command that you supply for check-host-alive is a plugin. For example, I mentioned check_ping, so check_ping is a plugin.

Services also use plugins, to check the availability of features on network devices (hosts). “Service” is Nagios’s name for a feature of a host, so that we can distinguish hosts from the services that run on them.

Nagios uses plugins to determine whether a host is up and to check all sorts of other features on many hosts. It could be disk usage or memory usage on computers, up/down interfaces on routers and switches, HTTP availability on a web server, etc.

In a graphical tree view it would be like this:

host 1
 |-- service 1 on host 1
 |-- service 2 on host 1
 \-- service 3 on host 1

host 2
 |-- service 1 on host 2
 \-- service 2 on host 2

To check the reachability of host 1 and host 2 we use plugins (for example, the check_ping plugin). We also use plugins to check the reachability/availability of service 1, service 2 and service 3 on host 1, and of service 1 and service 2 on host 2.

Joachim · May 5, 2008, 3:33pm

okay thanks, I think I got it

Joachim · May 6, 2008, 9:32am

Another question

I’m using Centreon and I saw that a “check_host_alive” command was already installed, so I used it. The default command line is:

$USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 1

Warning is triggered at a 3-second round-trip average (RTA) or 80% packet loss, and Critical at a 5-second RTA or 100% packet loss. Is this right?

But my real question is this. As I’m using this as a host check, my notification options are Down, Unreachable, Recovery or Flapping. There is no Warning or Critical, as defined in the default command line. How do I define Down instead of Critical, and Unreachable instead of Warning?

Albin · May 6, 2008, 9:42am

Define a service that runs check_ping against the host. I know this seems like a double check, but host checks aren’t issued as frequently as service checks. So an additional service using check_ping would give you correct and exact information about the reachability of the monitored host. And then those warning and critical thresholds would make sense.

Check-host-alive only checks whether the host is reachable or not. It looks at whether the host responded to the ping. At least that is how it is defined in my configuration:
define command {
    command_name    check-host-alive
    command_line    /usr/lib/nagios/plugins/check_ping -H $HOSTADDRESS$ -w 5000,100% -c 5000,100% -p 1
}

As I see, you have used the $USER1$ macro. I suppose that was modified in your case.

Joachim · May 6, 2008, 10:09am

Okay, so I need to define a service that uses check_ping. Can the warning and critical thresholds be the same as in my check_host_alive command? If I do define a check_ping service, will it be the one doing the notification, and the host check plugin won’t? I am still unclear on how I can define check_host_alive to use Down and Unreachable instead of Warning and Critical. In Centreon there are host and service “boxes” at the top, listing the different conditions. How do I get check_host_alive to list Down or Unreachable if my command only defines warning and critical?

I might have written this a little messy, but I hope you understand what I mean

Joachim · May 6, 2008, 10:21am

Btw, do I have to define check_ping as a service? Checking whether the network is stable isn’t all that important to me. Can I make do with just the host check, maybe changing my command to the one you listed:

$USER1$/check_ping -H $HOSTADDRESS$ -w 5000.0,100% -c 5000.0,100% -p 1

If I did this, how would the Unreachable and Down work?

Albin · May 6, 2008, 10:48am

-w 5000.0,100% -c 5000.0,100% -p 1
These thresholds return a BAD state (call it Critical or Unreachable, it doesn’t matter) and are suitable for a host check. I have written this already: the host check is performed only to see whether the host is reachable or not. For that, Nagios doesn’t need separate Warning and Critical thresholds. If you ask why the thresholds are defined at all, the answer is so that Nagios can tell when packet loss is 100%, set the host state to down and send a notification that the host is down.
If you define an additional service, as I’ve said, you will get more precise output from the ping command, and the thresholds in the service check will make more sense. In the service check they should be like this:
-w 3000.0,80% -c 5000.0,100% -p 1
so Nagios can distinguish warning from critical.

When Nagios notices that services start going into Critical states, it checks the host to see whether it is up or down. In such a scenario you will probably get notifications for the services first, and then for the host (if it IS in a down state, of course).

Why doesn’t the check_ping plugin have down or unreachable thresholds? Because it doesn’t need them. With warning and critical thresholds everything is expressed in a way Nagios understands. Once check-host-alive is set up, it doesn’t have to be edited again, because it is only a check to determine whether the host is reachable. It isn’t meant to be a ping analysis tool. For that you can create a service with the check_ping plugin.
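To make the Down/Unreachable part concrete, here is a hedged sketch of two host definitions. The plugin itself only ever returns OK, Warning, Critical or Unknown; Nagios decides between Down and Unreachable for a failed host check by looking at the host’s parents, not at any threshold. The names `server1` and `router1`, the addresses, the `admins` contact group and the `24x7` time period are placeholders, and `check-host-alive` is assumed to be an existing command definition.

```
# Sketch only: host names, addresses, contact group and time period are placeholders.
define host {
    host_name               router1
    alias                   Example router
    address                 192.0.2.1
    check_command           check-host-alive
    max_check_attempts      3
    check_period            24x7
    contact_groups          admins
    notification_interval   60
    notification_period     24x7
    notification_options    d,u,r,f
}

define host {
    host_name               server1
    alias                   Example server
    address                 192.0.2.10
    parents                 router1     ; server1 is reached through router1
    check_command           check-host-alive
    max_check_attempts      3
    check_period            24x7
    contact_groups          admins
    notification_interval   60
    notification_period     24x7
    notification_options    d,u,r,f     ; d = Down, u = Unreachable, r = Recovery, f = Flapping
}
```

With this layout, if check-host-alive fails for server1 while router1 is still up, server1 is reported as Down; if router1 is down as well, server1 is reported as Unreachable.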

As for Centreon, personally I don’t like it, and it isn’t an official Nagios product. It is an add-on, if you ask me.

Maybe you should read the Nagios documentation to understand how it works.

Joachim · May 6, 2008, 11:31am

Aha, now I think I got it.

just to recap what you said:
When a service reaches a critical state, Nagios runs the host command to see if the host is up and running. (Does the host command run a check if it doesn’t receive a critical state from a service, or does it assume the host is up and running?)

So a decent setup would be:
Host:
check_host_alive
$USER1$/check_ping -H $HOSTADDRESS$ -w 5000.0,100% -c 5000.0,100% -p 1

Service:
check_host_alive (and other plugins, but as an example using check_ping here)
$USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 1

Sorry you have to spoon-feed me the information. Believe it or not, I’m writing a bachelor report on Nagios with no previous Nagios experience.

Albin · May 6, 2008, 11:56am

The host command runs a check when the host’s services are in a critical state. Watch it over a period of time and you will see that the last host check is much older than any of the last service checks.

Decent setup is ok.

I would name the service check_ping, because it checks ping state, it doesn’t check check_host_alive.
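Putting the recap and the naming suggestion together, the setup might be written roughly as follows. This is a sketch rather than a tested configuration: `server1`, the address, `admins` and `24x7` are placeholders, `$USER1$` is assumed to point at the plugin directory, and the interval directive names follow Nagios 3 (Nagios 2.x calls them `normal_check_interval` and `retry_check_interval`).

```
# Host check: warning and critical thresholds are identical,
# so the result is effectively just "reachable" or "not reachable".
define command {
    command_name    check_host_alive
    command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 5000.0,100% -c 5000.0,100% -p 1
}

# Service check: separate warning and critical thresholds.
define command {
    command_name    check_ping
    command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 1
}

# Placeholder host using the host check above.
define host {
    host_name               server1
    alias                   Example server
    address                 192.0.2.10
    check_command           check_host_alive
    max_check_attempts      3
    check_period            24x7
    contact_groups          admins
    notification_interval   60
    notification_period     24x7
}

# Ping service on the same host, named after what it checks.
define service {
    host_name               server1
    service_description     check_ping
    check_command           check_ping
    max_check_attempts      3
    check_interval          5
    retry_interval          1
    check_period            24x7
    contact_groups          admins
    notification_interval   60
    notification_period     24x7
}
```

Note that with `-p 1` only one packet is sent, so packet loss can only be 0% or 100%; the 80% warning level would only come into play if the service check sent more packets.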

Good luck with your bachelor report. I would suggest that you read the documentation if you’re doing what you say, because you’ll then have a much better understanding of Nagios. And then play a bit with every option, so you get practical insight into what you’ve read.

Joachim · May 6, 2008, 12:14pm

Okay, thanks for your help