SaltStack within CSCF

Policies

  • Please read "Salt in 10 minutes" (linked in External Documentation) and all of this article before doing anything on a Salt master.

  • Don't change anything in states/common/ or pillars/common/, or anything else that might apply to minions you don't maintain, without asking or telling somebody first.
    • Ask either whoever is in charge of the affected machines, or Nathan Fish (nfish@uwaterloo.ca).
    • Tagging in Teams/CSCF/CSCF-Changes works too.
    • Test such changes thoroughly on a few minions, rebooting them if that might matter.
      • Watch out for LXC container vs physical differences. Some things break containers that work on real machines.

  • For the moment: once you've finished changing a State or some Pillar files, run git commit --author X and git push.
    • The first time (per repository) you need the full form, e.g. --author "Nathan Fish <nfish@uwaterloo.ca>". After that you can use --author nathan and git will figure it out.
    • Check git status to make sure you don't commit files someone else left staged.
    • In future, as we scale up, we may move to a workflow where admins commit on their workstations, push, and the masters pull.
    • Begin your commit title with the name of the state/pillar, e.g. "cscf_apache: pull in common.ldap".
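
The commit steps above could be wrapped in a small shell function. This is only a sketch: salt_commit and its argument order are made up for illustration (it is not an existing CSCF script), and it assumes you run it from inside the states or pillar repository.

```shell
# Hypothetical helper (not an existing CSCF script): commit and push only the
# named files, with a title that starts with the state/pillar name.
salt_commit() {
    author=$1 name=$2 summary=$3
    shift 3
    # Show the working tree first so files someone else left staged are visible.
    git status --short &&
    git add -- "$@" &&
    # Listing the paths after "--" commits only those files, even if
    # other files are already staged.
    git commit --author "$author" -m "$name: $summary" -- "$@" &&
    git push
}

# Example (the first commit in a repository needs the full author string):
# salt_commit "Nathan Fish <nfish@uwaterloo.ca>" cscf_apache "pull in common.ldap" cscf_apache/init.sls
```

Passing the file names explicitly is the design choice here: it keeps files that somebody else left staged out of your commit.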

Current Salt Masters

  • salt-csi-2004.cscf.uwaterloo.ca is the Ubuntu 20.04 Prod salt master for INF.

  • salt-csi-1804.cscf.uwaterloo.ca is the Ubuntu 18.04 Prod salt master for INF.

  • salt-cscf-2004.cscf.uwaterloo.ca is the Ubuntu 20.04 Prod salt master for non-INF minions.

  • salt-top-1604 and salt-rsg-1604 exist for those departments. There is very little on them yet, but they are in prod.

Salt Masters Being Retired

  • salt-204.cscf is the old Ubuntu 16.04 salt master. It is deprecated and has only a few minions left.

Setup and Administration Guidelines

Internal Documentation

External Documentation

There is excellent documentation available from the SaltStack project. It is generally good at being up-to-date, as much of it is auto-generated from code comments. We should avoid duplicating it ourselves.

Important starting points (bookmark these all in a Salt folder):

Salt in 10 minutes: https://docs.saltstack.com/en/latest/topics/tutorials/walkthrough.html

Using the salt command: https://docs.saltstack.com/en/latest/topics/execution/remote_execution.html

The top file - where states are assigned to minions: https://docs.saltstack.com/en/latest/ref/states/top.html

List of States: https://docs.saltstack.com/en/latest/ref/states/all/

Intro to YAML: https://docs.saltstack.com/en/latest/topics/yaml/index.html

Intro to Jinja templating: https://docs.saltstack.com/en/latest/topics/jinja/index.html

Salt Best Practices: https://docs.saltstack.com/en/latest/topics/best_practices.html

Example cluster using VMs

Creating a new LXC container

salt-204 has several IAAS container hosts as minions. The fastest way to get a new container running is to create it through Salt.

Prerequisites:

  • The new machine's (eth0) hostname must already resolve to the IP you want, so create the container's record in inventory first.

  • The VLAN that IP is on must be trunked to the IAAS host (ONA) and added to the IAAS host's network config. This trunking is defined by the pillar list iaas:trunk_ifaces, which is configured in, e.g., /srv/saltstack/pillar/iaas/211/networking.sls.

Once the prerequisites are in place, run, e.g.:

salt-run lxc.init containername.cs.uwaterloo.ca host=dc-3558-208.cloud.cs.uwaterloo.ca template=salt-ubuntu

In about 5 minutes you will have a new Ubuntu 16.04 container with the latest salt-minion, with its key already accepted by the salt master. Run salt containername.cs.uwaterloo.ca state.apply to bring it up to speed.

Any extra network interfaces currently need to be added manually. Filesystem mounts inside the container are controlled by the pillar iaas:lxc_mounts, configured in, e.g., /srv/saltstack/pillar/iaas/211/dc.sls. The state common.autofs is good for mounting on the host, but use lxc_mounts to bind-mount that into the container.
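
The two commands above, plus the DNS prerequisite, could be combined roughly like this. It is a sketch rather than an existing tool: the function name new_container is hypothetical, and using getent to check name resolution is an assumption about what is available on the salt master.

```shell
# Hypothetical wrapper around the commands above; run it on the salt master.
new_container() {
    name=$1 host=$2
    # Prerequisite from the text: the container's hostname must already resolve.
    if ! getent hosts "$name" >/dev/null; then
        echo "$name does not resolve yet; create its inventory record first" >&2
        return 1
    fi
    # Create the container, then bring it up to speed.
    salt-run lxc.init "$name" host="$host" template=salt-ubuntu &&
    salt "$name" state.apply
}

# new_container containername.cs.uwaterloo.ca dc-3558-208.cloud.cs.uwaterloo.ca
```

The VLAN trunking prerequisite is not checked here; that still has to be confirmed by hand.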

Adding an Existing Machine as a Minion

Ensure that hostname -f returns the FQDN. Edit /etc/hosts and /etc/hostname if needed. Then run the following, substituting the correct salt master:

   # Run as root on the new minion
   wget -O bootstrap-salt.sh https://bootstrap.saltstack.com
   sh bootstrap-salt.sh -A salt-204.cscf.uwaterloo.ca

On the salt master, run salt-key -l un; the minion id should be listed. Run salt-key -a <minion id> to accept the key.

Then run salt <fqdn> state.apply test=True and examine the changes that would be made, then run it for real.

Common method of applying states in steps:

    # Double-check that Salt itself is set up the way CSCF expects.
    # Keep this separate because it may trigger needed updates.
    salt <fqdn> state.apply common.salt.minion test=True
    salt <fqdn> state.apply common.salt.minion
    # To verify that everything worked, re-run with --state-verbose=False
    # so that clean returns are hidden, i.e.
    salt <fqdn> state.apply common.salt.minion --state-verbose=False

    # Put the common base in place (repos, networking, ...)
    salt <fqdn> state.apply common test=True
    salt <fqdn> state.apply common

    # See what highstate wants to do
    salt <fqdn> state.apply --state-verbose=False test=True
    # Then, if the changes look right:
    #   salt <fqdn> state.apply --state-verbose=False

File Layout

Salt has 2 main directories: States (code) and Pillar (variables/secrets). The usual location for these is /srv/salt and /srv/pillar respectively. It has been decided that all Salt-related directories should be under a single tree. Thus, the CSCF salt master directory layout is /srv/saltstack/states and /srv/saltstack/pillar. These directories are git repositories, with ./common being a git submodule shared between all 3 salt masters. These file paths are configured in /etc/salt/master. Keep this change in mind when reading non-CSCF documentation.

States are assigned to minions in the state top file, /srv/saltstack/states/top.sls. Pillar has a similar file, /srv/saltstack/pillar/top.sls. See External Documentation for the format of these. If you want to know what a minion is for, look for it in states/top.sls.
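
Two ways of looking a minion up, as a rough sketch. The function names and the TOP_FILE variable are hypothetical; state.show_top is a standard Salt execution function, but check its output on your own master before relying on it.

```shell
# Hypothetical helpers; the default path is the CSCF top file named above.
TOP_FILE=${TOP_FILE:-/srv/saltstack/states/top.sls}

# Show the lines of the top file that mention a minion, with line numbers.
# -F treats the name as a literal string, so dots in an FQDN are not wildcards.
minion_in_top() {
    grep -n -F -- "$1" "$TOP_FILE"
}

# Ask Salt which states the top file assigns to a minion.
minion_top_states() {
    salt "$1" state.show_top
}
```

The grep only finds minions that are named literally in top.sls. A minion matched by a glob or a nodegroup will not show up that way, which is where asking Salt directly is more useful.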

When making a new state, make a directory with a clear snake_case name (lowercase, with underscores for spaces), for example cscf_apache/, then edit cscf_apache/init.sls. init.sls is a special name in Salt: when you address the state cscf_apache, Salt looks first for cscf_apache.sls and then for cscf_apache/init.sls. Directories keep things neater. Try to minimize the number of files and directories in the top-level directory.
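
A minimal sketch of that naming convention as a shell function. new_state is a hypothetical name, and the optional second argument (the states root) exists only so the function can be tried in a scratch directory.

```shell
# Hypothetical helper: create <states_root>/<name>/init.sls for a new state.
new_state() {
    name=$1
    root=${2:-/srv/saltstack/states}
    # Accept only lowercase letters, digits and underscores (snake_case).
    case $name in
        ''|*[!a-z0-9_]*)
            echo "state name must be snake_case: '$name'" >&2
            return 1 ;;
    esac
    mkdir -p "$root/$name" &&
    # ">>" creates init.sls if it is missing but leaves an existing file untouched.
    : >> "$root/$name/init.sls" &&
    echo "$root/$name/init.sls"
}

# new_state cscf_apache    # prints the path of the init.sls to edit
```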

Configuration Layout

Configuration for the daemons themselves (the salt-master and salt-minion services) is stored in /etc/salt/. Extra config, such as defining nodegroups (groups of minions, e.g. dfs) or git remotes, is stored in files under /etc/salt/master.d/. There are many files in salt-204's /etc/salt, but the relevant ones are just master and master.d/. /etc/salt/master is managed from the state salt.master. Test any change by editing /etc/salt/master directly, then put it into Salt once it works. Changes to the master's config should be rare and cautiously tested.

Minions have a much simpler configuration, just /etc/salt/minion and minion_id. Note that salt-204 is a minion of itself, as is common. /etc/salt/minion is itself managed through Salt, so when bootstrapping a new minion all that's needed is working networking and a single line in /etc/salt/minion: master: salt-204.cscf.uwaterloo.ca. The state salt.minion will bootstrap the rest.
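
For illustration, that one-line config could be written like this. The function name is hypothetical, and the optional second argument is there only so it can be pointed at a scratch file instead of the real config.

```shell
# Hypothetical helper: write the one-line minion config described above.
# Note that this overwrites whatever is already in the target file.
write_minimal_minion_config() {
    master=$1
    conf=${2:-/etc/salt/minion}
    printf 'master: %s\n' "$master" > "$conf"
}

# write_minimal_minion_config salt-204.cscf.uwaterloo.ca
# Then restart the salt-minion service and accept the key on the master.
```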

Do's / Don'ts / Tips / Gotchas

Upgrading Salt through Salt sometimes breaks, so for one machine it's simplest to ssh to the machine and run:

   apt-get update && apt-get dist-upgrade

You can use 'at' to safely upgrade:

   salt -N non-critical cmd.run 'echo "apt update && apt install -o Dpkg::Options::="--force-confold" --force-yes -y salt-minion" | at -M now + 5 minutes'

You can't Control-C a salt command, such as a state.apply - all that does is kill the salt command line process that's listening for the returns - the minions have already been given your orders. So double-check before running things, and/or use test=True.

When repeatedly making changes to a state, you can iterate faster by applying only that state to the minion, rather than everything in top.sls. Also, you can use test mode to do a dry run. So, rather than salt 'minion' state.apply use salt 'minion' state.apply mystate test=True. This will apply the state even if it's not in top, which is good for testing, but be extra careful about applying the correct state to the correct minion.
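
One way to make the dry run a habit is a wrapper that always previews first. This is a sketch only: apply_after_preview is a hypothetical name, and the prompt is an illustration rather than an agreed CSCF procedure.

```shell
# Hypothetical helper: dry-run one state, ask, then apply it for real.
apply_after_preview() {
    target=$1 state=$2
    # Dry run first; stop here if the salt command itself fails.
    salt --state-verbose=False "$target" state.apply "$state" test=True || return 1
    printf 'Apply %s to %s for real? [y/N] ' "$state" "$target"
    read -r answer
    if [ "$answer" = y ]; then
        salt --state-verbose=False "$target" state.apply "$state"
    else
        echo "Not applied."
    fi
}

# apply_after_preview 'minion' mystate
```

Anything other than a lone "y" leaves the minion untouched, which suits the fact that a running state.apply can't be cancelled with Control-C.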

To hide the spam of 'Clean' state returns, you can add --state-verbose=False.

When running a command against all minions, or all minions in an HA cluster, please test! Use test=True, try it on a few minions first, etc. If the change might only take effect on a reboot, try it on one minion and wait for it to reboot and work correctly before going ahead.

When using a minion glob, e.g. '*student*', first run salt '*student*' test.ping to get a list of the minions it matches, before running your real command. Always put single quotes around your minion matching glob to prevent bash from doing unexpected things to it.
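
The quoting advice can be seen without touching Salt at all. The sketch below uses a scratch directory and a made-up file name purely for demonstration.

```shell
# Work in a scratch directory that contains one file matching the glob.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch old-student-list.txt

# Unquoted: the shell expands the glob against local file names first,
# so salt would receive a file name instead of the pattern.
unquoted=$(echo *student*)
# Single-quoted: the pattern reaches the command unchanged.
quoted=$(echo '*student*')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

This prints "unquoted: old-student-list.txt" and "quoted:   *student*". With the unquoted form, salt would have been asked to target a minion called old-student-list.txt.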

To add a minion: on the master, run salt-key -l un to show all unaccepted minions, then salt-key -a 'minion' to accept it.

When reinstalling / renaming minions: On the master, delete all old keys with salt-key -d 'old_minion' before starting the new minion.

Future Plans

  • We need to set up a proper workflow using git:
    • admins clone the git repos and edit locally
    • test using salt-ssh
    • git commit and git push, salt master git pulls / gitfs

  • We may create a dev sandbox salt master.

Salt Formulas

Salt States can be moved into their own git repositories and referenced as Formulas. This is good practice for states meant to be shared widely. However, generalizing a State enough to make it a Formula is frequently not worth the effort. For example:

Nextcloud configuration: https://git.uwaterloo.ca/salt_cs/nextcloud-formula

HAProxy: https://git.uwaterloo.ca/salt_cs/haproxy-formula

These states didn't really need to be separate formulas, in hindsight, but it's not worth moving them back either.

Currently they are set to private; if you need permissions, ask Nathan Fish or Lori Paniak.

CSCF How-To Documents

Add links here to any documents you create for particular tasks (e.g. adding a minion to the RSG Salt Master, how to install a masterless node, etc.).

Monitoring temperature on servers:

SaltTemperatureLoggingAndShutdown

Installing Ubuntu 18.04 on a cluster using Salt:

SaltPxebootInstallUbuntu1804

-- Nathan Fish - 2020-01-07

Topic revision: r29 - 2024-03-05 - DanielAllen
 