ClusterToolsUtils (2015-09-22, MikeGore)
-- Main.MikeGore - 2015-08-27

---++ Common management utilities for end users and admins

---++ Other resources
   * ClusterTools
      * Cluster Tools main documentation root
   * ClusterToolsSetup
      * Cluster tools setup and installation walk-through
   * ClusterToolsScripts
      * Documentation, files and scripts for the installation and operation of a new cluster, along with some end user tools

---+++ add_users
   * *What:*
      * Add users from CVS; optionally specify home directory, email address, password and groups
      * If the group is admin, the user will get all of the admin groups
   * *See:*
      * *sync-users* to sync all user changes to all of the nodes
      * *usermod*
   * *Notes:*
      * *needs root*

---+++ usermod
   * *usermod* - Linux command to add or change groups for a user
   * Options:
      * *-a* append - you normally always use this, unless you intend to replace the user's groups, which is unlikely
      * *-G* list of group names to add
   * Admin groups:
      * *admin, adm, sudo*

---+++ del-user
   * Delete a user on all of the nodes
   * *Usage:*
      * *del-user userid*
   * *Notes:*
      * *needs root perms*

---+++ sync-users
   * Sync user accounts, passwords, ssh keys and group settings to all of the nodes
   * *Usage:*
      * *sync-users*
      * This command can be run at any time, and more than once, without harm
   * *Notes:*
      * *needs root*
      * Creates SSH keys for a user, but only if they do not have any
      * Adds the user's public key to their authorized_keys2 file, so in a cluster they can log into nodes that share /home

---++ Packaging

---+++ finding packages
   * Example: apt-cache search postgres -n
      * Search for postgres in the package names only (*-n* is *--names-only*)
   * Example: apt-cache search postgres
      * Search for postgres in the package names and the entire description

---+++ example script to install packages on all of the nodes
   * */cscf-adm/src/cluster/install-mpi*
      * The script first installs the packages listed in the script on himrod and then on the nodes
      * The script is only 27 lines long and you will only have to change *2 lines!*
      * *NODES* and *common_vars* are pulled in from the search path - in this case: */usr/local/bin*
         * (i.e. they do NOT have to be in the current directory)
<verbatim>
#!/bin/bash
#
# Mike Gore, 10 Oct 2014
#
# Install openmpi on the nodes and headnode

. common_vars
. NODES

# Install on the headnode first
update_list
update_packages netpipe-openmpi openmpi-bin openmpi-checkpoint openmpi-common openmpi-doc

# Then repeat the same steps on every node that answers a ping
for i in $NODES
do
    if ping -c 1 "$i" >/dev/null 2>&1
    then
        cat <<EOF | ssh root@"$i"
. common_vars
. NODES
update_list
update_packages netpipe-openmpi openmpi-bin openmpi-checkpoint openmpi-common openmpi-doc
EOF
    else
        echo "$i is down"
    fi
done
</verbatim>

---+++ Limits
   * File */etc/security/limits.conf* controls user memory, process, file handle, signal and lock limits
   * *WARNING: the default Ubuntu install HAS NO LIMITS set*
      * *This means ANY user can CRASH an Ubuntu system by running it out of system resources!*

---+++ /etc/security/limits.conf
   * Defaults
<verbatim>
* hard cpu unlimited
* hard nproc unlimited
* hard as unlimited
* hard data unlimited
* hard sigpending unlimited
* hard nofile unlimited
* hard msgqueue unlimited
* hard locks unlimited
* hard file unlimited
</verbatim>

---+++ Checking nodes and NFS mounts
   * */cscf-adm/src/cluster/fix-mount*
      * Verifies NFS mounts are working - mounts them if not

---+++ check-nodes
   * Check whether each node is online
   * *Usage:*
      * *check-nodes*
   * *Notes:*
      * *This can be used as a common template for performing a task on all nodes*
      * Checks to see if all of the nodes are online

---++ Users

---+++ sync-users
   * Grab the settings for all of the non-system users from the headnode
   * Create a script that can be run on all of the nodes to reproduce everything
   * Run this script on each node to make the changes
   * Notes:
      * We use both *useradd* and *usermod* - if the first fails because the user already exists, then *usermod* fixes the values

---+++ check-nodes
   * Ping each node once to see if it is alive - display up/down status

---+++ shutdown-all
   * Shut down all of the nodes, then the headnode

---+++ reboot-nodes
   * Reboots all of the nodes

---++ All documentation below this section is work in progress
   * As of *27 Aug 2015*

---+++ fix-mount
   * Runs *mount -a* on all nodes

---+++ fix-resolv
   * Fix */etc/resolv.conf* using the headnode as a template
   * Note: removes 127.0.0.1 before the copy

---+++ fix-network
   * Updates
      * */etc/hosts*
      * */etc/hostname*
      * */etc/resolv.conf*
      * */etc/udev/rules.d/70-persistent-net.rules*
   * Restarts
      * *networking* service
      * *nscd* service
   * Mounts
      * *mount -a*

---+++ fix-profile
   * Updates */etc/profile* on all NODES

---++++ profile
   * Local copy of *profile* (*/etc/profile*) used by *fix-profile*

---++ Software

---+++ install-autoupdates
   * Set up automatic updates of critical files on all of the nodes

---+++ install-mpi
   * CF.OpenMPI task/cpu sharing software
   * Install OpenMPI on all of the nodes and headnode

---+++ install-scheduler
   * Updates the disk scheduler options on all the nodes and headnode

---++++ update-scheduler
   * Updates the disk scheduler options

---+++ save-configs
   * Save important system config files in */etc/config-backups*

---+++ sync-packages
   * Using the node called PACKAGE_MASTER (defined in file *NODES*), sync the packages so they are the same on all of the other nodes

---++ GRUB

---+++ fix-grub-all
   * Reinstall and configure GRUB on all of the nodes
   * Updates */etc/default/grub*

---++++ grub-fix
   * Reinstall and configure *GRUB*
   * Updates */etc/default/grub*
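
---++ Sketch: a check-nodes style loop

The *check-nodes* notes on this page describe pinging each node once, displaying an up/down status, and reusing that loop as a template for running a task on all nodes. The following is a minimal sketch of that pattern, not the actual CSCF script. The function names =probe_node= and =check_nodes= and the =PROBE= override variable are hypothetical and introduced only for illustration; the 2 second ping timeout is also an assumption.

```shell
#!/bin/bash
# Hypothetical sketch of a check-nodes style loop; NOT the real CSCF script.

# probe_node: succeed if the host answers a single ping within 2 seconds.
# PROBE is an illustrative override (e.g. PROBE=true) so the loop can be
# dry-run without touching the network.
probe_node() {
    ${PROBE:-ping -c 1 -W 2} "$1" >/dev/null 2>&1
}

# check_nodes: print "<node> up" or "<node> down" for every argument,
# and return non-zero if at least one node is down.
check_nodes() {
    local node status=0
    for node in "$@"
    do
        if probe_node "$node"
        then
            echo "$node up"
        else
            echo "$node down"
            status=1
        fi
    done
    return $status
}
```

Assuming the *NODES* file sets a space-separated =$NODES= variable, as the install-mpi script on this page suggests, usage might look like =. NODES; check_nodes $NODES=. To turn this into a "do something on every node" template, the =echo "$node up"= branch is where an =ssh root@"$node"= command would go.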
Topic revision: r3 - 2015-09-22 - MikeGore
Information in this area is meant for use by CSCF staff and is not official documentation, but anybody who is interested is welcome to use it if they find it useful.