-- MikeGore - 2018-12-03

tml.cs and tml2.cs hardware build notes and software configuration


  • These machines are owned by Yaoliang Yu
  • Managed by Mike Gore


  • The main design constraint for this machine was to permit up to 4 liquid-cooled GPU cards in one chassis.
  • The biggest problem we faced is that most GPU cards are air-cooled from the side, so putting 4 of them in every other slot would obstruct the fans.
  • i9 system with 64G RAM (expandable to 128G)
  • The motherboard needs at least 7 slots and the chassis must have 8 slots (GPU cards are two slots wide)
    • GPU cards plug into slots: 1,3,5,7
  • The build is only cost-effective if you plan to add all 4 GPUs; currently the CPU is 35% of the overall cost
  • Overall this system runs quietly: all of the fans are 120mm and, because of the large radiators, they do not need to run fast
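
The slot requirement above can be checked with a little arithmetic. The following is a minimal sketch, assuming each card covers its own slot plus the next one; occupied_slots is a hypothetical helper name and not part of the build scripts.

```shell
#!/usr/bin/env bash
# Sketch: list the chassis slots covered by cards of a given width.
# occupied_slots is a hypothetical helper, not part of gpu-setup.
# Usage: occupied_slots WIDTH SLOT [SLOT...]
occupied_slots() {
    local width="$1"; shift
    local slot i out=""
    for slot in "$@"; do
        i=0
        while [ "$i" -lt "$width" ]; do
            out="$out$((slot + i)) "   # card covers slot, slot+1, ...
            i=$((i + 1))
        done
    done
    echo "${out% }"                    # drop the trailing space
}

# Four two-slot cards plugged into motherboard slots 1, 3, 5 and 7.
occupied_slots 2 1 3 5 7   # prints: 1 2 3 4 5 6 7 8
```

The last card sits in slot 7 and also covers slot 8, which is why a 7-slot motherboard still needs an 8-slot chassis.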



Fri Nov 30 15:14:29 CST 2018 IMPORTANT UPDATE: tml.cs.uwaterloo.ca has cuda 9.0, cuDNN 7.05, tensorflow and torch installed
These packages all depend on the specific cuda version chosen.
We installed anaconda to permit private python environments. We then created an environment called "ml" for math learning
FYI: Tensorflow and Torch are installed in the "ml" environment.
Place the following lines in any bash script you create:
    source "/root/gpu-setup"/install_env   # sets search paths and library paths
    source activate ml                     # makes sure that you are in the ml environment
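
If a job script should stop early when the environment is missing, the two lines above can be wrapped in a guard. This is only a sketch: load_ml_env is a hypothetical helper name, and the default path and environment name are the ones given in these notes.

```shell
#!/usr/bin/env bash
# Sketch of a guarded version of the two setup lines.
# load_ml_env is a hypothetical helper, not part of gpu-setup.
# Usage: load_ml_env [ENV_FILE] [ENV_NAME]
load_ml_env() {
    local env_file="${1:-/root/gpu-setup/install_env}"
    local env_name="${2:-ml}"
    if [ ! -r "$env_file" ]; then
        echo "load_ml_env: cannot read $env_file" >&2
        return 1
    fi
    # Sets search paths and library paths.
    source "$env_file" || return 1
    # Switches to the requested anaconda environment.
    source activate "$env_name"
}
```

A script would then start with "load_ml_env || exit 1" before running any python code.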
To make it easy to share cuda and related tools between users, I created a new system group called "ml"
    I added all users to the "ml" group
    You can run the script update_ml_users as root at any time to add all users to the ml group
    Example ml group sharing:    chgrp -R ml /home/share;  chmod -R g+w /home/share
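
The sharing example can be wrapped in a small function so it is easy to repeat for other directories. This is a sketch, not one of the gpu-setup scripts: share_with_group is a hypothetical name, and the setgid step is an addition that is not in the original example.

```shell
#!/usr/bin/env bash
# Sketch: share a directory tree with a group, following the
# chgrp/chmod example in these notes.
# share_with_group is a hypothetical helper, not part of gpu-setup.
# Usage: share_with_group GROUP DIR
share_with_group() {
    local group="$1" dir="$2"
    if [ ! -d "$dir" ]; then
        echo "share_with_group: $dir is not a directory" >&2
        return 1
    fi
    chgrp -R "$group" "$dir" || return 1   # hand the tree to the group
    chmod -R g+w "$dir" || return 1        # let group members write
    # Assumption, not in the original notes: set the setgid bit on
    # directories so files created later inherit the group.
    find "$dir" -type d -exec chmod g+s {} +
}
```

For the example in these notes the call would be "share_with_group ml /home/share", run as root.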
The following directories belong to the "ml" group and all their files have group write added to them

The installation scripts I used are under "/root/gpu-setup" and were run in the following order:
  install_2nd   - reboot after this
  install_cuda  - reboot after this
  Note: install_cuda, install_cuDNN-7.05, install_tensorflow and install_torch can be rerun at any time
Testing tensorflow:
  cd "/root/gpu-setup"
Testing cuda - system GPU benchmark
  cd "/root/gpu-setup"
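
The two test sections above only show the cd step. The attachment list names the scripts test_tensorflow and benchmark_gpu, so a run might look like the sketch below. How the scripts are invoked is an assumption (executed directly from their directory), and run_gpu_setup_script is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: run one of the gpu-setup scripts from its own directory.
# run_gpu_setup_script is a hypothetical helper; invoking the scripts
# as ./NAME is an assumption, not something the notes state.
# Usage: run_gpu_setup_script DIR SCRIPT
run_gpu_setup_script() {
    local dir="$1" script="$2"
    if [ ! -x "$dir/$script" ]; then
        echo "run_gpu_setup_script: $dir/$script not found or not executable" >&2
        return 1
    fi
    # Use a subshell so the caller's working directory is unchanged.
    ( cd "$dir" && "./$script" )
}
```

For example, "run_gpu_setup_script /root/gpu-setup test_tensorflow" for the tensorflow test and "run_gpu_setup_script /root/gpu-setup benchmark_gpu" for the cuda benchmark.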


  • TML - with covers off power supply side view:

  • TML with covers off rear top view:

  • TML covers off CPU side view:

Install scripts

3 Dec 2018 - Mike Gore
  • Note: These scripts are under constant development. Please use the latest versions, which can be found at cscf-adm@asimov.uwaterloo.ca:/cscf-adm/src/gpu-setup

  • install_1st: initial install script - a few basic package installs

  • install_2nd: installs anaconda, creates the python environment "ml" for math learning, and installs support packages

  • install_cuda: installs cuda 9.0 and drivers from NVIDIA's site - removes any existing nvidia or cuda drivers

  • install_env: source this file in your shell scripts to set up the environment and library paths
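
To illustrate what a sourced environment file of this kind typically does, here is a minimal sketch. It is not the contents of the real install_env: the CUDA prefix /usr/local/cuda-9.0 is an assumed install location, and prepend_dir is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch of a sourced environment file. This is NOT the real install_env.
# Assumption: cuda 9.0 lives under /usr/local/cuda-9.0.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda-9.0}"

# prepend_dir VAR DIR: put DIR at the front of the colon-separated
# list held in VAR, unless it is already present (hypothetical helper).
prepend_dir() {
    local var="$1" dir="$2" current
    eval "current=\"\${$var:-}\""          # read the current value of VAR
    case ":$current:" in
        *":$dir:"*) ;;                     # already listed, nothing to do
        *) eval "export $var=\"\$dir\${current:+:\$current}\"" ;;
    esac
}

prepend_dir PATH "$CUDA_HOME/bin"              # search path for nvcc etc.
prepend_dir LD_LIBRARY_PATH "$CUDA_HOME/lib64" # library path for cuda libs
```

Because the helper skips directories that are already listed, sourcing the file more than once does not keep growing the paths.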

Topic attachments
| Attachment | History | Size | Date | Who | Comment |
| IMG_20181203_102802.jpg | r1 | 949.7 K | 2018-12-03 - 10:38 | MikeGore | TML - with covers off power supply side view |
| IMG_20181203_102818.jpg | r1 | 1014.8 K | 2018-12-03 - 10:39 | MikeGore | TML with covers off rear top view |
| IMG_20181203_102831.jpg | r1 | 997.0 K | 2018-12-03 - 10:40 | MikeGore | TML covers off CPU side view |
| benchmark_gpu | r1 | 0.2 K | 2018-12-03 - 11:09 | MikeGore | Cuda benchmarks |
| common_functions | r1 | 84.0 K | 2018-12-03 - 11:09 | MikeGore | support shell functions used in all scripts |
| install_1st | r1 | 8.7 K | 2018-12-03 - 11:03 | MikeGore | initial install script - a few basic package installs |
| install_2nd | r1 | 1.3 K | 2018-12-03 - 11:05 | MikeGore | install anaconda, create python environment "ml" for math learning, installed support packages |
| install_cuDNN-7.05 | r1 | 0.9 K | 2018-12-03 - 11:06 | MikeGore | install cuDNN 7.05 |
| install_cuda | r1 | 2.2 K | 2018-12-03 - 11:06 | MikeGore | install cuda 9.0 and drivers from NVIDIA's site - removes any existing nvidia or cuda drivers |
| install_env | r1 | 1.0 K | 2018-12-03 - 11:08 | MikeGore | source this file in your shell scripts to set up the environment and library paths |
| install_pytorch | r1 | 0.8 K | 2018-12-03 - 11:07 | MikeGore | install pytorch |
| install_tensorflow | r1 | 1.8 K | 2018-12-03 - 11:07 | MikeGore | install tensorflow |
| test_tensorflow | r1 | 0.2 K | 2018-12-03 - 11:10 | MikeGore | test tensorflow |
| update_ml_users | r1 | 0.1 K | 2018-12-03 - 11:11 | MikeGore | add all users to the system group called ml |
Topic revision: r5 - 2019-02-28 - MikeGore