-- MikeGore - 2015-08-27

Cluster Installation and Setup Walk-through

  • Administrative view
  • This document explains which key files must be created to install the cluster tools on a new server
  • Settings and configuration are confined to just a few small files

Other References

Summary of services covered with these tools

  • What:
    • Simple set of tools useful for managing a cluster of machines
  • Where:
  • Provides:
    • Local DNS, TFTPBOOT, DHCP, Imaging tools, Firewall/NAT and more
  • TFTPboot server
    • /tftpboot/pxes - root folder
    • SAMBA server
      • /tftpboot/pxes shares for imaging
    • Image tools and Node image repository
      • himrod.cs:/himrod.node - current image of working nodes on himrod
    • Private NAT firewalled networks - 192.168.1/24 and 10.0.15/24
      • DHCP with local name server
      • PXE tftpboot functions
        • Boot Linux repair and imaging utilities or Linux network installers
      • DNS name server for NAT
      • SAMBA file shares
      • NFS file shares
      • APACHE Web service

Main configuration Setup and Installation folders overview

  • /cscf-adm/src
    • Sources for all of the cluster tools used to do the setup and configuration of the head node from asimov.uwaterloo.ca:/cscf-adm/src
  • /cscf-adm/src/dnsmasq - [[ClusterToolsDNSMASQ][DNSMASQ]] / TFTPBOOT / DNS services and configuration
    • /cscf-adm/src/dnsmasq/dnsmasq.common.
    • dnsmasq is a single package that provides PXE BOOT, DNS, DHCP services
  • /cscf-adm/src/pxe - PXE boot files - a minimal working PXE tree with boot images
  • /cscf-adm/src/hosts - host network configuration - defines the interfaces used by all scripts
    • /cscf-adm/src/hosts/common_hosts. defines interfaces and their roles (e.g., whether an interface connects to a NAT network)
  • /cscf-adm/src/syslinux - Syslinux sources - used for PXE booting
  • /cscf-adm/src/Idrac - iDrac scripts - Dell configuration and licencing scripts
  • /cscf-adm/src/cluster - Cluster scripts
  • /cscf-adm/src/twiki - Twiki documents - autogenerated
  • /cscf-adm/src/tools - IPMI configuration and IPMI management tools from SUPERMICRO
  • /cscf-adm/src/html - HTML documents - autogenerated
  • /cscf-adm/src.web - web based reporting utils
  • /etc/network/interfaces - system network configuration
    • NOTE: the setup scripts only work if all network settings are defined statically in this file
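Because the setup scripts rely on static network definitions, it can help to check the interfaces file before running them. The following is a minimal sketch, assuming the Debian-style stanza header `iface <name> <family> <method>`. The function name `dhcp_ifaces` is hypothetical and is not part of the cluster tools.

```shell
#!/bin/sh
# Sketch only: list interfaces that are configured with DHCP in a
# Debian-style interfaces file. The function name dhcp_ifaces is an
# illustrative assumption, not part of the cluster tools.

dhcp_ifaces() {
    # $1: path to an interfaces file (defaults to /etc/network/interfaces)
    file="${1:-/etc/network/interfaces}"
    # A stanza header has the form: iface <name> <family> <method>
    # Print the name of every interface whose method is "dhcp".
    awk '$1 == "iface" && $4 == "dhcp" { print $2 }' "$file"
}

# Example use (commented out so the sketch has no side effects):
# if [ -n "$(dhcp_ifaces)" ]; then
#     echo "Define these interfaces statically first: $(dhcp_ifaces)"
# fi
```

An empty result means no interface uses DHCP. Other methods such as `manual` are deliberately not flagged, since they may be legitimate in a bonded or VLAN setup.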

Initial Setup Walk-through

  • This section shows how few configuration files need to be created to set up a new cluster.
    • This is done only once, during initial setup; only a few files are needed to create a new cluster.
    • In the file names below, append the unqualified name of the head node after the trailing dot (for example, common_hosts.himrod).
  • rsync -a -x -H cscf-adm@asimov.uwaterloo.ca:/cscf-adm/src/ /cscf-adm/src/
    • Obtain a copy of the tools
  • Create /cscf-adm/src/dnsmasq/dnsmasq.common.
    • This defines subnets and host/mac IP assignments for private networks and ILOM/BMC controllers
    • Look at the other example in this directory for guidance.
  • Edit /etc/network/interfaces
    • Define ALL networks statically
  • Create /cscf-adm/src/hosts/common_hosts.
    • Define interface names and roles
  • Create /cscf-adm/src/hosts/NODES.
    • Defines NODES (the cluster nodes) and ILOM_NODES (the ILOM/BMC interfaces of the cluster)
    • Look at NODES.generic for an example - if you wish, you can hard-code the list.
  • Configuration is now complete. Move on to the installation steps.
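The walk-through above comes down to three per-cluster files plus the system network file. As a minimal sketch, the check below reports which of the three files exist for a given head node. It assumes that every file name, including NODES., ends in the head node's unqualified name. The function name `check_cluster_config` is hypothetical.

```shell
#!/bin/sh
# Sketch only: verify that the per-cluster configuration files exist.
# Assumptions (not confirmed by the source): each file name ends in the
# head node's unqualified name, including NODES.<name>; the function
# name check_cluster_config is illustrative.

check_cluster_config() {
    src="$1"    # tools root, for example /cscf-adm/src
    node="$2"   # unqualified head node name, for example himrod
    missing=0
    for f in \
        "$src/dnsmasq/dnsmasq.common.$node" \
        "$src/hosts/common_hosts.$node" \
        "$src/hosts/NODES.$node"
    do
        if [ -f "$f" ]; then
            echo "ok       $f"
        else
            echo "MISSING  $f"
            missing=1
        fi
    done
    # Return non-zero if any file is missing.
    return "$missing"
}

# Example use (commented out so the sketch has no side effects):
# check_cluster_config /cscf-adm/src himrod
```

The function returns a non-zero status when a file is missing, so it can gate the installation steps in a wrapper script.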

Initial Configuration Detail

  • Software under /cscf-adm/src provides a number of tools for setting up himrod
  • In this example we use the host himrod for the head node.
      • /etc/network/interfaces - network interfaces for the system
        • These MUST be statically defined for our setup scripts to work
      • /cscf-adm/src/hosts/common_hosts.himrod
        • The scripts expect the name common_hosts. followed by the server name, so replace himrod with the name of your server
        • The master Makefile in /cscf-adm/src copies this to /usr/local/bin/common_host
        • EXTIF="em4.529" defines the device name for the external interface
        • INTIF="bond0" main internal network
        • INTNAT="TRUE" use NAT to access the outside
        • INT2IF="em3" secondary internal network
        • INT2NAT="TRUE" use NAT to access the outside
        • Other settings include SAMBA and university network ranges
    • [[ClusterToolsDNSMASQ][DNSMASQ]]
      • /cscf-adm/src/dnsmasq contains all configuration files used for DHCP and DNS
        • Depends on /etc/network/interfaces having all interfaces defined statically
        • /cscf-adm/src/dnsmasq/dnsmasq.common.himrod
          • See [[ClusterToolsDNSMASQ][DNSMASQ]] for more details
          • The scripts expect the name dnsmasq.common. followed by the server name, so replace himrod with the name of your server
          • Defines all nodes and interfaces with simplified IP and MAC address notation
          • provides DNS and reverse DNS for all local networks
          • Example:
                           # Private Network
          • make processes /cscf-adm/src/dnsmasq/dnsmasq.common.himrod
            • Creates these files automatically:
              • /etc/hosts - defines local addresses for all himrod nodes
              • /etc/resolv.conf - generated from /cscf-adm/src/dnsmasq/resolv.template
              • /etc/dnsmasq.hosts - an include file referenced by /etc/dnsmasq.conf
                • Note: /etc/dnsmasq.conf ONLY includes dnsmasq.hosts - do not define anything else in dnsmasq.conf
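The common_hosts settings listed earlier (EXTIF, INTIF, INTNAT, INT2IF, INT2NAT) use shell assignment syntax. As a minimal sketch, assuming the file can be sourced as plain shell assignments, the function below summarises the interface roles. The function name `show_roles` and the output format are hypothetical. The real file also carries SAMBA and university network settings, which this sketch ignores.

```shell
#!/bin/sh
# Sketch only: read a common_hosts-style file and summarise interface roles.
# The variable names come from the document; the function name show_roles
# and the output format are illustrative assumptions. The file is assumed
# to contain plain shell variable assignments.

show_roles() {
    # $1: path to the file (use a path that contains a slash).
    # Run in a subshell so the sourced variables do not leak to the caller.
    (
        . "$1"
        echo "external: ${EXTIF:-unset}"
        echo "internal: ${INTIF:-unset} nat=${INTNAT:-FALSE}"
        echo "internal2: ${INT2IF:-unset} nat=${INT2NAT:-FALSE}"
    )
}

# Example use (commented out so the sketch has no side effects):
# show_roles /cscf-adm/src/hosts/common_hosts.himrod
```

With the values shown in this document, the summary would list em4.529 as the external interface and bond0 and em3 as internal interfaces that use NAT.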

Installation - assumes Initial Configuration has been done

  • Run /cscf-adm/src/install_packages
    • This installs the packages needed to run the make script.
  • Run make copy_scripts
    • Installs [[https://cs.uwaterloo.ca/twiki/pub/CF/HimrodTools/common_functions][common_functions]], [[https://cs.uwaterloo.ca/twiki/pub/CF/HimrodTools/common_host][common_host]] and [[https://cs.uwaterloo.ca/twiki/pub/CF/HimrodTools/common_vars][common_vars]] in:
  • Run make all
    • Runs the install scripts
      • dnsmasq, samba, nfs, and cscf-adm account setup
    • Runs the update scripts
      • Downloads PXE live images, etc.
    • Installs the NAT firewall
    • Run service firewall start to start the firewall
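The installation steps above can be sketched as a small wrapper that runs them in order. This is only an illustration. The DRY_RUN switch and both function names are hypothetical, and running make from /cscf-adm/src is an assumption based on the master Makefile living there. By default the sketch only prints what it would run.

```shell
#!/bin/sh
# Sketch only: run the installation steps in the order listed above.
# DRY_RUN and the function names are illustrative assumptions; the
# commands themselves are the ones named in this document.

run_step() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: print the command instead of executing it.
        echo "would run: $*"
    else
        echo "running: $*"
        "$@"
    fi
}

install_cluster_tools() {
    src="${1:-/cscf-adm/src}"
    run_step "$src/install_packages" || return 1
    run_step make -C "$src" copy_scripts || return 1
    run_step make -C "$src" all || return 1
    run_step service firewall start || return 1
}

# Example use (commented out so the sketch has no side effects):
# DRY_RUN=1 install_cluster_tools /cscf-adm/src    # preview only
# DRY_RUN=0 install_cluster_tools /cscf-adm/src    # actually run the steps
```

Stopping at the first failing step keeps a broken package install from being followed by a partial make all.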
Topic revision: r2 - 2015-08-27 - MikeGore