Version 1.8 has the notes from setting things up "the old way", i.e. with 1.x running on oates. Herewith follows The New Way: nagios 3 on nagios.cscf.
nagios.cscf.uwaterloo.ca points at it as well now.
You can look at it here - it'll ask for UWDir authentication.
Config files are stored in /usr/local/etc/nagios. One should, of course, RTFM.
Hosts and their services are defined under the hosts directory. This directory contains separate sub-directories for the different CSCF groups (research, infrastructure, etc.) and these, in turn, contain separate sub-directories for the different machine groups. A group directory contains the group configuration file and a separate configuration file for each of the machines in the group. Each of the latter contains a host object and, optionally, several service objects.
Example: The database research group directory structure:
/usr/local/etc/nagios
|- hosts
   |- research
      |- DB
         |- db.cfg            (group configuration file)
         |- nimbus.cs.cfg     (host+services configuration file)
         |- softbase.cs.cfg   (host+services configuration file)
Note: The previous, simple structure with a single hosts.cfg file is now deprecated.
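With one file per machine, finding where a host is configured means searching the tree. A minimal sketch, assuming each machine's file is named after the host as in the example above (nimbus.cs.cfg); the function name find_host_cfg and the NAGIOS_ETC override are made up for this example:

```shell
# Sketch: print the path of the per-host config file for a given host name.
# Assumes the <host>.cfg naming shown in the DB example; NAGIOS_ETC is a
# hypothetical override so the function can be pointed at another tree.
find_host_cfg() {
    find "${NAGIOS_ETC:-/usr/local/etc/nagios}/hosts" -type f -name "$1.cfg"
}
```

For instance, find_host_cfg nimbus.cs should print the path under hosts/research/DB if the layout matches the example.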
Every host object inherits from the host template defined in misc.cfg. Similarly, every service object inherits from the service template in the same file.
See: NagiosMonitoring#Config_files
Here's an example of adding a new service to monitor, assuming that nagios already does something with the machine. (We'll talk about adding a new machine later.)
Set the LOGNAME and USER variables for RCS purposes, then change to the directory where config files are stored (/usr/local/etc/nagios).
Add the new service to services.cfg:
define service{
        use                     generic-service
        host_name               softbase.math
        service_description     SSH
        is_volatile             0
        contact_groups          cscf-rg
        check_command           check_ssh
        }
The check plugins are in /usr/local/libexec/nagios.
Run /usr/local/bin/nagios -v /usr/local/etc/nagios/nagios.cfg. It should report zero errors; if it reports any, fix them all.
Restart nagios with /usr/local/etc/rc.d/nagios restart.
You'll want to stick around long enough to make sure your changes don't cause problems.
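The verify and restart steps can be chained so that a failed check never reaches the restart. A minimal sketch, assuming the paths used on this page; the function name and the three override variables are invented for this example and are not part of nagios:

```shell
# Sketch: restart nagios only when the config check passes.
# Defaults are the paths from this page; the *_BIN/_CFG/_RC variables are
# hypothetical overrides, handy for trying the function without touching nagios.
NAGIOS_BIN="${NAGIOS_BIN:-/usr/local/bin/nagios}"
NAGIOS_CFG="${NAGIOS_CFG:-/usr/local/etc/nagios/nagios.cfg}"
NAGIOS_RC="${NAGIOS_RC:-/usr/local/etc/rc.d/nagios}"

verify_and_restart() {
    # nagios -v exits non-zero when the configuration has errors
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG"; then
        "$NAGIOS_RC" restart
    else
        echo "config check failed; not restarting" >&2
        return 1
    fi
}
```

Calling verify_and_restart after an edit leaves the running daemon alone whenever the check fails, which is the point of the "fix them all" rule.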
See: NagiosMonitoring#Config_files
check-host-alive. There may be a way around this, but we haven't worked it out yet.
Set the LOGNAME and USER variables for RCS purposes, then change to the directory where config files are stored (/usr/local/etc/nagios).
Add the new host to hosts.cfg:
define host{
        use                     generic-host
        host_name               zonker
        alias                   zonker
        address                 129.97.74.66
        contact_groups          cscf-rsg
        }
Run /usr/local/bin/nagios -v /usr/local/etc/nagios/nagios.cfg. It should report zero errors; if it reports any, fix them all.
Restart nagios with /usr/local/etc/rc.d/nagios restart.
See: NagiosMonitoring#Config_files
Set the LOGNAME and USER variables for RCS purposes, then change to the directory where config files are stored (/usr/local/etc/nagios).
Add the new host group to hosts.cfg:
# CS core servers
define hostgroup{
        hostgroup_name          cscore
        alias                   CS core servers
        contact_groups          cscf-csi
        members                 fe02.math,hopper.math,barbarus.cs,cpu102.cs,cpu104.cs,cpu106.cs,cpu108.cs,cpu110.cs,cpu112.cs,cpu114.cs
        }
Run /usr/local/bin/nagios -v /usr/local/etc/nagios/nagios.cfg. It should report zero errors; if it reports any, fix them all.
Restart nagios with /usr/local/etc/rc.d/nagios restart.
Set the LOGNAME and USER variables for RCS purposes, then change to the directory where config files are stored (/usr/local/etc/nagios).
Add the new contact to contacts.cfg:
define contact{
        contact_name                    lfolland
        alias                           Lawrence Folland
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,u,c,r
        host_notification_options       d,u,r
        service_notification_commands   notify-by-email
        host_notification_commands      host-notify-by-email
        email                           lfolland@cs.uwaterloo.ca
        }
Run /usr/local/bin/nagios -v /usr/local/etc/nagios/nagios.cfg. It should report zero errors; if it reports any, fix them all.
Restart nagios with /usr/local/etc/rc.d/nagios restart.
Set the LOGNAME and USER variables for RCS purposes, then change to the directory where config files are stored (/usr/local/etc/nagios).
Add the new contact group to contacts.cfg:
define contactgroup{
        contactgroup_name       cscf-rsg
        alias                   CSCF Research Group
        members                 mpatters,lfolland,magore,trg
        }
Run /usr/local/bin/nagios -v /usr/local/etc/nagios/nagios.cfg. It should report zero errors; if it reports any, fix them all. In particular, make sure that every contact listed in members has its own contact entry.
Restart nagios with /usr/local/etc/rc.d/nagios restart.
hosts.cfg
services.cfg
The easiest way to do this is to define the virtual host to be the same as the "real" host, but with a different name. For instance:
define host{
        use                     generic-host
        host_name               softbase.math
        alias                   softbase
        address                 softbase.math.uwaterloo.ca
        contact_groups          cscf-rsg
        }
define host{
        use                     generic-host
        host_name               db
        alias                   db
        address                 db.uwaterloo.ca
        contact_groups          cscf-rsg
        }
Then you can monitor the Apache virtual host db.uwaterloo.ca like this:
define service{
        use                     generic-service
        host_name               db
        service_description     DBWEB
        is_volatile             0
        contact_groups          cscf-rsg
        check_command           check_http
        }
This likely results in the same machine being pinged twice, once under each name. There may be a better way to do it.
Some people can't or won't pass ICMP echoes through their firewalls. One such example is zonker.
define host{
        use                     generic-host
        host_name               zonker
        alias                   zonker
        address                 zonker.cs.uwaterloo.ca
        check_command           check_none
        contact_groups          cscf-rsg
        }
Here, the key is check_command. Now, looking in checkcommands.cfg:
define command{
        command_name            check_none
        command_line            $USER1$/check_dummy 0
        }
This will always return "OK", so nagios thinks the machine is always up.
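The reason a fixed argument of 0 means "always up" is that nagios reads a plugin's state from its exit code: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. The function below is a stand-in written for this page to show that convention; it is not the source of the real check_dummy plugin, and the name dummy_check is made up:

```shell
# Illustration only: mimics the exit-code convention that nagios plugins follow.
# 0 = OK, 1 = WARNING, 2 = CRITICAL, anything else is treated here as UNKNOWN (3).
dummy_check() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN"; return 3 ;;
    esac
    return "$1"
}
```

Under that convention, a host check wired to the equivalent of dummy_check 0 can never report the host as down, so it should only be used where a real check is impossible.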
/usr/local/bin/nagios -v /usr/local/etc/nagios/nagios.cfg
ci -u hosts.cfg
ci -u services.cfg
ci -u contacts.cfg
/usr/local/etc/rc.d/nagios restart
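The three check-ins in the checklist above can be done in one pass. A rough sketch, assuming LOGNAME and USER are already set as described earlier and that the current directory is /usr/local/etc/nagios; the function name, the CI_BIN override and the use of a single log message for all files are assumptions made for this example:

```shell
# Sketch: check a list of config files back into RCS with one log message.
# CI_BIN is a hypothetical override; by default the RCS ci command is used.
CI_BIN="${CI_BIN:-ci}"

checkin_configs() {
    msg="$1"
    shift
    for f in "$@"; do
        # -u keeps an unlocked working copy in place; -m supplies the log message
        "$CI_BIN" -u -m"$msg" "$f" || return 1
    done
}
```

A call might look like checkin_configs "add SSH check" hosts.cfg services.cfg contacts.cfg, run between the config check and the restart.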
-- MikePatterson - 21 Feb 2005, 09 May 2005 (with help from LawrenceFolland), 21 April 2006