CFI 2004 (Kevin Hare) Cluster setup and configuration

Initial Setup

Several RTs for pre-cluster setup experimentation:
  • 53551 - Setup test cluster with stored hardware
  • 54344 - Experiment with Rocks on test cluster and services114
  • 54159 - Experiment with Sun N1 Grid Engine on cluster

03 August 2006 - MikePatterson and Dave Gawley discussed the networks a bit further. Dave was fine with the 129.97.7.129-131 setup, and we reserved 129.97.7.128 as well. That way we can break off a small /22 chunk from 10.7 and map it directly to the KVMs: the ability to address any one KVM grants the ability to address them all, while the firewall rules for everybody else stay simple.
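
For a sense of scale, the block sizes in the note above can be worked out from the prefix length. This is only a sketch: `cidr_size` is a hypothetical helper, not part of any cluster tooling. It treats the four reserved addresses 129.97.7.128-131 as a /30-sized block.

```shell
#!/bin/sh
# Number of IPv4 addresses in a block with the given prefix length.
cidr_size() {
    echo $(( 1 << (32 - $1) ))
}

cidr_size 30   # size of a block covering 129.97.7.128-131
cidr_size 22   # size of the "/22 chunk" taken from 10.7
```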

Cluster "owners"

Cluster name    Primary User
vidal           Pascal Poupart
marroo          Ashraf Aboulnaga
shiraz          Ihab Ilyas

Documentation

Other information that was here has been removed as no longer relevant. Pretty documentation for the clusters may be found at http://www.cs.uwaterloo.ca/cscf/research.

Creating accounts on shiraz/marroo/vidal

Formal Documentation

See: 9.1 Accounts in the documentation.

Quirks

There are a few quirks. First create the account as described above. The steps then say to go to /var/yp and run "make all", but that alone doesn't seem to provide access to the rest of the nodes in the system. For that, follow the steps in Chapter 2 - Accounts for creating the ssh keys, then go back to /var/yp and run a series of "make" and "make all" until nothing new gets built. It should then work. (Oh ya, try throwing some salt over your left shoulder and say a few colourful magic words ...)
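
The "repeat until nothing new gets built" part can be sketched as a small loop. This is an illustration, not the documented procedure. It assumes that "nothing new" means the output has no "Updating ..." lines, as in the transcripts on this page. The helper name `rebuild_until_quiet` and the pass limit are hypothetical.

```shell
#!/bin/sh
# Sketch: rerun a rebuild command until it reports no further map updates.
# Usage: rebuild_until_quiet MAX_TRIES COMMAND [ARGS...]
rebuild_until_quiet() {
    max_tries=$1
    shift
    try=1
    while [ "$try" -le "$max_tries" ]; do
        # Capture the output so it can be both shown and inspected.
        out=$("$@" 2>&1)
        printf '%s\n' "$out"
        # Assumption: a pass that prints no "Updating ..." line built nothing new.
        if ! printf '%s\n' "$out" | grep -q '^Updating '; then
            echo "quiet after $try run(s)"
            return 0
        fi
        try=$((try + 1))
    done
    echo "still updating after $max_tries runs" >&2
    return 1
}

# Possible use on the head node, as root (the limit of 6 is arbitrary):
#   cd /var/yp && rebuild_until_quiet 6 sh -c 'make all; make'
```

The pass limit keeps the loop from spinning forever if the maps never settle. In that case the function returns non-zero and the output of the last pass is the place to look.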

Step-by-step notes

Example: we would like to add "jdoe" as a user to vidal

  • Determine uid/gid for jdoe in the CS core
@cscf[101]% idregistry request type=Group jdoe
jdoe:1411
@cscf[102]% idregistry request jdoe
jdoe:1411
  • Log in to the head node (vidal, in this case) and gain root access
% ssh cscf-adm@vidal
cscf-adm@vidal:~> sudo -s
root's password:
vidal:~ #

  • Check that the user isn't already there
vidal:~ # grep jdoe /etc/passwd

  • Create group and user
vidal:~ # groupadd -g 1411 jdoe
vidal:~ # useradd -u 1411 -g 1411 -m jdoe
  • Set a password
vidal:~ # passwd jdoe
Changing password for jdoe.
New Password:
Reenter New Password:
Password changed.
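  • Make sure the home directory exists and belongs to the user (useradd -m normally creates it already)
vidal:~ # mkdir -p /home/jdoe
vidal:~ # chown jdoe:jdoe /home/jdoe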
  • Update the NIS information across all nodes (this seems to require a series of "makes")
vidal:~ # cd /var/yp
vidal:/var/yp # make all
Updating group.byname...
Updating group.bygid...
Updating netid.byname...
Updating passwd.byname...
Updating passwd.byuid...
vidal:/var/yp # make
gmake[1]: Entering directory `/var/yp/cs.uwaterloo.ca'
Updating group.byname...
Updating group.bygid...
Updating netid.byname...
Updating passwd.byname...
Updating passwd.byuid...
gmake[1]: Leaving directory `/var/yp/cs.uwaterloo.ca'
vidal:/var/yp # make all
Updating netid.byname...
vidal:/var/yp # make
gmake[1]: Entering directory `/var/yp/cs.uwaterloo.ca'
Updating netid.byname...
gmake[1]: Leaving directory `/var/yp/cs.uwaterloo.ca'
vidal:/var/yp # 
  • Generate keys to allow ssh access to other nodes
% ssh jdoe@vidal
Password: 
jdoe@vidal:~> ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/home/jdoe/.ssh/id_dsa): *[enter]*
Created directory '/home/jdoe/.ssh'.
Enter passphrase (empty for no passphrase): *[enter]*
Enter same passphrase again: *[enter]*
Your identification has been saved in /home/jdoe/.ssh/id_dsa.
Your public key has been saved in /home/jdoe/.ssh/id_dsa.pub.
The key fingerprint is:
... jdoe@vidal
jdoe@vidal:~> 
  • Append the public key to the authorized keys file (which is shared across all nodes, so it lets you in on every node)
jdoe@vidal:~> cd .ssh
jdoe@vidal:~/.ssh> ls -al
total 16
drwx------ 2 jdoe jdoe 4096 2009-03-10 10:24 .
drwxr-xr-x 9 jdoe jdoe 4096 2009-03-10 10:24 ..
-rw------- 1 jdoe jdoe 1192 2009-03-10 10:24 id_dsa
-rw-r--r-- 1 jdoe jdoe 1113 2009-03-10 10:24 id_dsa.pub
jdoe@vidal:~/.ssh> cat id_dsa.pub >> authorized_keys2
jdoe@vidal:~/.ssh> 
jdoe@vidal:~/.ssh> ls -al
total 20
drwx------ 2 jdoe jdoe 4096 2009-03-10 10:26 .
drwxr-xr-x 9 jdoe jdoe 4096 2009-03-10 10:24 ..
-rw-r--r-- 1 jdoe jdoe 1113 2009-03-10 10:26 authorized_keys2
-rw------- 1 jdoe jdoe 1192 2009-03-10 10:24 id_dsa
-rw-r--r-- 1 jdoe jdoe 1113 2009-03-10 10:24 id_dsa.pub
jdoe@vidal:~/.ssh> 
  • Test login to other nodes
jdoe@vidal:~/.ssh> ssh vidal-03
Have a lot of fun...
jdoe@vidal-03:~> exit
logout
Connection to vidal-03 closed.
jdoe@vidal:~/.ssh> ssh vidal-09
Have a lot of fun...
jdoe@vidal-09:~> exit
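
The manual login test above can be repeated over several nodes with a short loop. This is a sketch under assumptions: `check_nodes` and the `SSH_CMD` override are hypothetical names, and the node list must be supplied by hand, since the full set of node names is not recorded here.

```shell
#!/bin/sh
# Sketch: try a non-interactive login on each named node and report the result.
# SSH_CMD is a hypothetical override so the loop can be dry-run without a cluster.
SSH_CMD=${SSH_CMD:-ssh}

check_nodes() {
    failed=0
    for node in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password,
        # so a missing key shows up as an error rather than a hang.
        if $SSH_CMD -o BatchMode=yes "$node" true </dev/null >/dev/null 2>&1; then
            echo "$node ok"
        else
            echo "$node FAILED"
            failed=1
        fi
    done
    return "$failed"
}

# Possible use as the new user on the head node:
#   check_nodes vidal-03 vidal-09
```

A node whose host key has not been accepted yet will also be reported as FAILED in batch mode, so a first interactive login may still be needed.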

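The head-node part of the steps above can be summarized as a script that prints the commands for review without running them. This is a sketch only: `id_from_registry_line` and `plan_account` are hypothetical helpers, and the "name:id" format is taken from the idregistry output shown in the example.

```shell
#!/bin/sh
# Sketch: print (not run) the head-node commands for a new cluster account.

# Extract the numeric id from an idregistry-style "name:id" line.
id_from_registry_line() {
    printf '%s\n' "$1" | cut -d: -f2
}

# Print the command sequence for user $1 with uid $2 and gid $3.
plan_account() {
    new_user=$1
    new_uid=$2
    new_gid=$3
    echo "groupadd -g $new_gid $new_user"
    echo "useradd -u $new_uid -g $new_gid -m $new_user"
    echo "passwd $new_user"
    echo "cd /var/yp && make all && make"
}

# Example with the values from this page: uid and gid are both 1411.
id=$(id_from_registry_line "jdoe:1411")
plan_account jdoe "$id" "$id"
```

Printing the commands rather than executing them leaves the root session in control of each step, which matters here because the NIS rebuild may need to be repeated by hand.
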
Console access

In general, compute nodes' consoles are accessible from a web browser on the head node. Details here
Topic attachments
Attachment                   History  Size      Date              Who            Comment
clusters_networking_01.png   r1       1292.5 K  2006-08-01 20:08  MikePatterson  First cut at a (partial) network setup
Topic revision: r19 - 2012-09-06 - BillInce
 