Tembo cluster
tembo.cs is a "paper" cluster. User documentation for that type of cluster is here:
https://cs.uwaterloo.ca/twiki/view/UW/Paper
Contacts for questions/issues with the cluster are Harsh Roghelia (hkroghel@uwaterloo.ca) and Lori Paniak (ldpaniak@uwaterloo.ca) via email or UW MS Teams chat.
To start a Teams chat with Harsh Roghelia, click HERE
To start a Teams chat with Lori Paniak, click HERE
Principal Investigators
Tamer Ozsu and Khuzaima Daudjee
Cluster hardware overview
| system | count | cpu | memory | disk | interconnect | other | OS | BIOS version | Hyperthreading |
| tem01-tem04 | 4 | 2x Intel E5-2620v2 @2.1GHz (12 physical cores total) | 32GB | 2x 1TB 7200RPM HDD | GbE + 10GbE | | Ubuntu 20.04.3/5.11.0 kernel | 3.2 | Enabled |
| tem05-tem06 | 2 | 2x Intel E5-2620v2 @2.1GHz (12 physical cores total) | 32GB | 400GB Intel S3700 SSD + 1TB 7200RPM HDD | GbE + 10GbE | | Ubuntu 20.04.3/5.11.0 kernel | 3.0 | Enabled |
| tem07-tem82 | 76 | 2x Intel E5-2620v2 @2.1GHz (12 physical cores total) | 32GB | 2x 1TB 7200RPM HDD | GbE + 10GbE | | Ubuntu 20.04.3/5.11.0 kernel | 3.0 | Enabled |
| tem83-tem130 | 48 | 2x Intel E5-2620v2 @2.1GHz (12 physical cores total) | 32GB | 2x 1TB 7200RPM HDD | GbE + 10GbE | | Ubuntu 20.04.3/5.11.0 kernel | 3.2 | Enabled |
| tem201-tem204 | 4 | 4x Intel E5-4620 v2 @ 2.60GHz (32 physical cores total) | 256GB | 400GB Intel S3700 SSD + 2x 1TB 7200RPM HDD | GbE + 10GbE + 40GbE | | Ubuntu 20.04.3/5.11.0 kernel | 3.0a | Enabled |
| tem205-tem207 | 3 | 4x Intel E5-4620 v2 @ 2.60GHz (32 physical cores total) | 256GB | 400GB Intel S3700 SSD + 2x 1TB 7200RPM HDD | GbE + 10GbE + 40GbE | | Ubuntu 20.04.3/5.11.0 kernel | 3.0b | Enabled |
| tembo.cs | 1 | 2x Intel E5-2630 v2 @ 2.60GHz (12 physical cores total) | 128GB | 2x 1TB 7200RPM HDD (OS) + 24x 3TB 7200RPM SAS HDD (ZFS) | 2x 10GbE + 40GbE | 10GbE UW uplink, ZFS storage array for user homes via 10GbE NFS, OpenVPN gateway | Ubuntu 20.04.3/5.11.0 kernel | 3.0a | Enabled |
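As a rough aid for sizing jobs, the per-group figures in the hardware overview can be added up. The sketch below copies the node counts, physical core counts and memory sizes from the table; the variable names are illustrative only, and the head node tembo.cs is deliberately left out because it is not a compute node.

```python
# (node group, node count, physical cores per node, memory per node in GB),
# copied from the hardware overview table. The head node tembo.cs is excluded.
COMPUTE_NODES = [
    ("tem01-tem04", 4, 12, 32),
    ("tem05-tem06", 2, 12, 32),
    ("tem07-tem82", 76, 12, 32),
    ("tem83-tem130", 48, 12, 32),
    ("tem201-tem204", 4, 32, 256),
    ("tem205-tem207", 3, 32, 256),
]

# Sum each column, weighting per-node figures by the number of nodes in the group.
total_nodes = sum(count for _, count, _, _ in COMPUTE_NODES)
total_cores = sum(count * cores for _, count, cores, _ in COMPUTE_NODES)
total_memory_gb = sum(count * mem for _, count, _, mem in COMPUTE_NODES)

if __name__ == "__main__":
    print(f"{total_nodes} compute nodes, {total_cores} physical cores, "
          f"{total_memory_gb} GB memory")
```

These totals describe the whole cluster; what is actually available at a given time depends on what has been reserved.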
Network configuration
| system | gigabit interface | 10 gigabit interface | 40 gigabit interface | Mellanox firmware revision |
| tem[01-130] | eno1: 192.168.152.0/24 | enp4s0: 192.168.252.0/24 | N/A | 2.40.7000 |
| tem20[1-7] | enp193s0f0: 192.168.152.0/24 | enp196s0: 192.168.252.0/24 | enp2s0: 192.168.240.0/24 | 2.40.7000 |
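When debugging connectivity it can help to check which of the three cluster networks an address belongs to. The following is a minimal sketch using Python's standard ipaddress module; the subnets are copied from the table above, while the function name and the sample addresses in the usage lines are hypothetical.

```python
import ipaddress

# Subnets copied from the network configuration table.
NETWORKS = {
    "gigabit": ipaddress.ip_network("192.168.152.0/24"),
    "10 gigabit": ipaddress.ip_network("192.168.252.0/24"),
    "40 gigabit": ipaddress.ip_network("192.168.240.0/24"),
}


def network_for(address):
    """Return the name of the cluster network containing `address`, or None."""
    ip = ipaddress.ip_address(address)
    for name, net in NETWORKS.items():
        if ip in net:
            return name
    return None


if __name__ == "__main__":
    # Hypothetical addresses, used only to exercise the lookup.
    for addr in ("192.168.152.17", "192.168.252.17", "192.168.240.17", "10.0.0.1"):
        print(addr, "->", network_for(addr))
```

Only tem201-tem207 have an interface on the 40 gigabit network, so an address in 192.168.240.0/24 cannot belong to tem01-tem130.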
Switch configuration
| switch | model | OS/firmware version | IP address |
| rack 1 gigabit | Supermicro G2252 | 2.0.0.11 | 192.168.152.2 |
| rack 2 gigabit | Supermicro G2252 | 2.0.0.11 | 192.168.152.3 |
| rack 3 gigabit | Supermicro G2252 | 2.0.0.11 | 192.168.152.4 |
| rack 1 10/40/56 gigabit | Mellanox SX1710 | 3.6.4112 | 192.168.152.8 |
| rack 2 10/40/56 gigabit | Mellanox SX1024 | 3.6.4112 | 192.168.152.6 |
| rack 3 10/40/56 gigabit | Mellanox SX1024 | 3.6.4112 | 192.168.152.7 |
| rack 1 40/56 gigabit | Mellanox SX1012 | 3.6.4112 | 192.168.152.5 |
Password changes
The cluster uses a Samba4 Active Directory system to manage users and system access. Passwords can be changed when logged into the head node of the cluster by issuing the following command:
samba-tool user password
You will be prompted for your current password and then for a new password (which must meet the complexity requirements), entered twice:
Password for [LOCAL-DOMAIN\ldpaniak]:
New Password:
Retype Password:
Reservation System
https://tembo-reserve.cs.uwaterloo.ca/Web/?
The Tembo cluster features an autonomous, user-managed graphical interface for reserving Facility resources. The Reservation System allows users to reserve multiple compute resources easily and to include other Tembo users in reservations to facilitate collaboration. Resources are reserved in 30-minute periods that start at the top and bottom of the hour. Reservations can be placed, modified and deleted until 5 minutes before the opening of the next reservation period. Once a reservation has been successfully registered, all participants will receive a notification e-mail with details of the reservation. For long-term reservations spanning several days, it is highly recommended to generate a series of repeating 1-day reservations and to include your contact information in the reservation description.
At the end of the reservation period, access to reserved resources will be revoked and reassigned to users as allocated in the Reservation System.
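The timing rules above (30-minute periods starting on the hour and half hour, with changes closing 5 minutes before a period opens) can be worked through in a few lines. This is only an illustrative sketch and not part of the Reservation System: the function names are made up, and treating a request made exactly at the 5-minute mark as too late is an assumption.

```python
from datetime import datetime, timedelta

PERIOD = timedelta(minutes=30)      # reservation periods are 30 minutes long
EDIT_CUTOFF = timedelta(minutes=5)  # changes close 5 minutes before a period opens


def next_period_start(now):
    """Return the start of the next reservation period strictly after `now`."""
    # Round down to the current hour or half hour, then step forward one period.
    floor = now.replace(minute=0 if now.minute < 30 else 30, second=0, microsecond=0)
    return floor + PERIOD


def first_bookable_period(now):
    """Return the earliest period start that can still be placed or modified at `now`."""
    start = next_period_start(now)
    # Assumption: a request made exactly at the cutoff is treated as too late.
    if now >= start - EDIT_CUTOFF:
        start += PERIOD
    return start


if __name__ == "__main__":
    for moment in (datetime(2024, 1, 1, 10, 12), datetime(2024, 1, 1, 10, 27)):
        print(moment.time(), "->", first_bookable_period(moment).time())
```

For example, at 10:12 the 10:30 period can still be changed, while at 10:27 the earliest period open to changes is the one starting at 11:00.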
Utilities
pdsh
Parallel distributed shell (pdsh) allows one to send commands to many machines in parallel. From tembo.cs you can use it with a file of hostnames, e.g.:
pdsh -F hostfile -l user "command"
Pre-defined host patterns are listed in /etc/genders, which also shows the format expected for the hostfile. For example:
pdsh -g yellow "df -h"
See man pdsh for the full list of options.
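A hostfile for a block of reserved nodes can be generated rather than typed by hand. The sketch below builds node names that follow the naming in the hardware table (tem01, tem07, tem130) and joins them one per line. The helper names and the node range are hypothetical, and the one-hostname-per-line layout is an assumption: compare it with /etc/genders on tembo.cs before relying on it.

```python
def node_names(first, last, prefix="tem"):
    """Return hostnames such as tem07 ... tem12, zero-padded to at least two digits."""
    if first > last:
        raise ValueError("first must not exceed last")
    return [f"{prefix}{n:02d}" for n in range(first, last + 1)]


def hostfile_text(names):
    """Join hostnames one per line, ending with a newline."""
    return "".join(f"{name}\n" for name in names)


if __name__ == "__main__":
    # Hypothetical reservation covering tem07 through tem10.
    print(hostfile_text(node_names(7, 10)), end="")
```

Redirecting the script's output to a file gives something that could be passed to pdsh with -F, as in the command shown earlier in this section.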
OpenVPN
An all-in-one configuration file that can be loaded by most OpenVPN clients:
https://cs.uwaterloo.ca/twiki/pub/CF/Tembo/tembo.ovpn
Troubleshooting
If you are unable to log in to a reserved machine, or if there is any other issue with one of the machines, please email Harsh Roghelia (hkroghel@uwaterloo.ca) or Lori Paniak (ldpaniak@uwaterloo.ca), or start a Teams chat with either of them:
To start a Teams chat with Harsh Roghelia, click HERE
To start a Teams chat with Lori Paniak, click HERE