Location of master repository
- asimov:/home/gpu-setup
  - which is symlinked to asimov.uwaterloo.ca:/cscf-adm/src/gpu-setup
- scripts push_servers contains a script that updates all of these machines
  - it currently works only from cs-tech1.cs.uwaterloo.ca

Hosts using these scripts
- ai-vector, tml, tml2, tml3, ming-gpu-1, ming-gpu-2, ming-gpu-3, ming-gpu-4, ming-gpu-p40, ming-gpu-v100, honeydew, beaker, cabernet

Installation scripts in /home/gpu-setup
- This document was created on Tue Feb 18 13:01:43 EST 2020 by /home/gpu-setup/create_documents
- We now have cuda 10, cuDNN 7.31, tensorflow, pycuda, pytorch and keras installed

Updating or Creating this document
- Note: the script /home/gpu-setup/create_documents generates this TWIKI page
- Please update /home/gpu-setup/create_documents and rerun it to update the notes

Anaconda environment ml
- I created an anaconda environment called "ml"
- "ml" stands for math learning
- FYI: tensorflow, pycuda, pytorch and keras use "ml" for their installation
- These python packages are VERY DEPENDENT ON SPECIFIC CUDA VERSIONS
- WARNING: to prevent DESTROYING the system python environment
  - You MUST ALWAYS USE the ANACONDA "ml" environment

Using the "ml" python environment
- source /home/gpu-setup/install_env
  - This sets search paths and library paths
- source activate ml
  - This makes sure that you are in the ml workspace!
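
The rule about always working inside the "ml" environment can be enforced from a user script. What follows is only a minimal sketch: it assumes that "source activate ml" sets the CONDA_DEFAULT_ENV variable, as conda activation normally does, and the function name require_ml_env is hypothetical (it is not one of the scripts in /home/gpu-setup).

```shell
# Hypothetical guard: succeed only when the active conda environment is "ml".
# Assumes conda exports CONDA_DEFAULT_ENV on activation.
require_ml_env() {
    if [ "${CONDA_DEFAULT_ENV:-}" != "ml" ]; then
        echo "ERROR: not in the 'ml' environment." >&2
        echo "Run: source /home/gpu-setup/install_env && source activate ml" >&2
        return 1
    fi
    return 0
}
```

A script could call "require_ml_env || exit 1" before any pip or python command, so that package changes never reach the system python by accident.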
Linux system group called ml
- I created a new Linux system group "ml"
- This group permits sharing code between users

Adding users to the ml system group
- Run /home/gpu-setup/update_ml_users as root any time

Manually adding the ml group to files or directories
- Example ml group sharing
  - chgrp -R ml /home/share
  - chmod -R g+w /home/share
- Directories already added to the ml group
  - /usr/local/cuda
  - /usr/local/anaconda3
  - /home/gpu-setup/cudnn_samples_v7

Installation scripts located in /home/gpu-setup
- Installed in the following order
  - install_first
    - Installs all required Ubuntu packages
    - Installs common_functions and common_vars under /usr/local/bin
      - shell functions used by all these scripts
  - create_documents
    - Creates the CUDA_README.txt file when run
    - Also updates the login message by updating /etc/motd.tail
  - install_anaconda
    - Creates the anaconda ml environment
    - reboot after this
  - Aside: the next scripts can also be run at any time to fix a broken installation
  - install_cuda
    - Installs cuda drivers
    - reboot after this!
  - install_cuDNN
    - installs cuDNN
  - install_pycuda
    - installs pycuda
    - uses ml environment
  - install_tensorflow
    - installs tensorflow
    - uses ml environment
  - install_pytorch
    - installs pytorch
    - uses ml environment
  - install_keras
    - installs keras
    - uses ml environment

Tests
- ./test_cuda
  - test cuda installation
- ./test_pycuda
  - test pycuda install
- ./test_pytorch
  - test pytorch install
- ./test_tensorflow
  - test tensorflow install
- ./benchmark_gpu
  - run gpu benchmark

Cleanup Scripts
- purge_anaconda
  - This file DELETES ALL ANACONDA USER ENVIRONMENTS
  - ONLY USE THIS IF YOU HAVE TO START OVER FROM SCRATCH
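
For the ml group sharing example (chgrp -R and chmod -R g+w), it can help to confirm afterwards that nothing in the tree was missed. The following is a minimal sketch using only standard find options; audit_group_share is a hypothetical helper name and is not part of /home/gpu-setup.

```shell
# Hypothetical helper: list every entry under a directory that is NOT owned by
# the given group or is NOT group-writable. Prints nothing when the whole tree
# is shared correctly. Symlinks are skipped, since their own mode bits and
# group do not reflect those of the target.
audit_group_share() {
    dir="$1"
    group="${2:-ml}"   # defaults to the "ml" group described in this document
    find "$dir" ! -type l \( ! -group "$group" -o ! -perm -g+w \) -print
}
```

For example, "audit_group_share /home/share" should print nothing after the two commands in the sharing example have been run; any path it does print still needs chgrp or chmod.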