-- Mike Gore - 2020-08-05
Linux CUDA, cuDNN, NVIDIA and TensorFlow GPU install
- Note: As of Ubuntu 20.04 we are switching to Docker
- Why?
- The Ubuntu packages and dependencies for the NVIDIA driver, CUDA and related utilities change and break as often as the weather, so what works today may not work in a few weeks even if you install exactly the same packages by name. Worse, older drivers are removed, so you must then re-evaluate new dependencies that eventually tie to user applications like TensorFlow. TensorFlow has very specific requirements for cuDNN and CUDA versions, which makes things even worse. Shortly after 20.04 came out only a few packages were needed and everything just worked; now it is far from simple
Overview
- This document is broken down into two sections
- 1) Administrative software and driver install
- CSCF RSG does this if you specifically request a machine with GPU setup from us
- 2) End user install, private to their profile: set up a virtual environment and install TensorFlow
References
Docker References
Administrative section software and driver installation
- Note: the dependency and file names seem to change every few weeks, so the examples below may quickly be out of date
- It is important to know in advance which versions of CUDA your tools will need. For example, TensorFlow is very sensitive to versions
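As a rough illustration of the version sensitivity mentioned above, the sketch below records the CUDA and cuDNN versions paired with the TensorFlow GPU releases that appear in this document. The values are an assumption taken from TensorFlow's published tested build configurations and should be re-checked against the current table before use; the names TESTED_CONFIGS and required_cuda are hypothetical helpers, not part of any install script.

```python
# Tested CUDA/cuDNN pairings for the TensorFlow GPU releases named in this
# document (assumption: copied from TensorFlow's published "tested build
# configurations" table; re-check before relying on them).
TESTED_CONFIGS = {
    "2.0": {"cuda": "10.0", "cudnn": "7.4"},
    "2.1": {"cuda": "10.1", "cudnn": "7.6"},
    "2.4": {"cuda": "11.0", "cudnn": "8.0"},
}


def required_cuda(tf_version):
    """Return the (CUDA, cuDNN) pair for a TensorFlow version such as '2.1.0'."""
    major_minor = ".".join(tf_version.split(".")[:2])
    config = TESTED_CONFIGS.get(major_minor)
    if config is None:
        raise KeyError(f"no tested configuration recorded for TensorFlow {tf_version}")
    return config["cuda"], config["cudnn"]


if __name__ == "__main__":
    for version in ("2.0", "2.1.0", "2.4"):
        cuda, cudnn = required_cuda(version)
        print(f"TensorFlow {version}: CUDA {cuda}, cuDNN {cudnn}")
```

Versions not in the table raise a KeyError rather than guessing, since a wrong pairing is the usual cause of a broken install.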
Ubuntu 20.04LTS easy setup with Docker install script
Updated and tested on Jan 9th 2020 Mike Gore
- Aside: The original simple package install actions no longer work. The package dependencies were altered, so this method is no longer viable
- Solution: we are using Docker install scripts that will do EVERYTHING, including nVidia drivers, CUDA and all package prerequisites
- install_nvidia_docker: Install GPU Docker container and support packages: CUDA, NVIDIA tools and Python support
- This will install the prerequisite packages and correct nVidia drivers, then install Docker with images for testing
- run_docker_tests: Run tests on GPU Docker container and support packages: CUDA, NVIDIA tools and Python support
- This will run basic tests on the containers
- Please consult this file for running applications with the installed containers
docker images
REPOSITORY              TAG             IMAGE ID       CREATED         SIZE
nvidia/cuda             10.1-base       c06d556b5f80   3 months ago    105MB
tensorflow/tensorflow   2.1.0-gpu-py3   e2a4af785bdb   12 months ago   4.11GB
hello-world             latest          bf756fb1ae65   12 months ago   13.3kB
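A minimal sketch of how the image listing above could be checked from Python. It assumes the three images shown are the ones the install script is expected to leave behind, and it is not taken from run_docker_tests, which may do something different. The helper names missing_images and gpu_smoke_test_command are hypothetical; the smoke-test command relies on the `--gpus` option of `docker run`, which needs a Docker version with NVIDIA container support installed.

```python
import shutil
import subprocess

# Images listed in the `docker images` output above.
REQUIRED_IMAGES = (
    "nvidia/cuda:10.1-base",
    "tensorflow/tensorflow:2.1.0-gpu-py3",
    "hello-world:latest",
)


def missing_images(listing, required=REQUIRED_IMAGES):
    """Return the required images absent from a 'repository:tag' listing."""
    present = {line.strip() for line in listing.splitlines() if line.strip()}
    return [image for image in required if image not in present]


def gpu_smoke_test_command(image="nvidia/cuda:10.1-base"):
    """Build a command that runs nvidia-smi inside a throwaway GPU container."""
    return ["docker", "run", "--rm", "--gpus", "all", image, "nvidia-smi"]


if __name__ == "__main__":
    if shutil.which("docker") is None:
        print("docker is not installed or not on PATH")
    else:
        # One "repository:tag" entry per line, easier to parse than the table.
        listing = subprocess.run(
            ["docker", "images", "--format", "{{.Repository}}:{{.Tag}}"],
            capture_output=True, text=True, check=False,
        ).stdout
        print("missing images:", missing_images(listing) or "none")
        print("smoke test:", " ".join(gpu_smoke_test_command()))
```

The script only prints the smoke-test command instead of running it, so it is safe to try on a machine without a GPU.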
Ubuntu 20.04LTS Docker tests
Ubuntu 20.04LTS: remove all NVIDIA, CUDA and Docker packages and Docker images to prepare for a clean install
- remove_docker_nvidia: Purge all GPU Docker container and support packages: CUDA, NVIDIA tools and Python support
- Warning: this will remove all drivers, containers and packages related to CUDA, NVIDIA and Docker
- This was used on the test machine to successfully remove the Docker images, CUDA driver and NVIDIA drivers to prepare for a clean install
- Note: this script will remove existing Docker images and programs; please update the script if that is not desired
Ubuntu 18.04LTS
- This document has been tested for installing CUDA, NVIDIA drivers and TensorFlow on Ubuntu 18.04LTS
Ubuntu 16.04 and 18.04LTS
Hardware Requirements
- nVidia GPU card
- CPU with AVX support
- Open a terminal window
- Activities -> Search -> term
- grep -i avx /proc/cpuinfo
- You need to see a line with avx in it
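The same AVX check can be sketched in Python for use inside a setup script. This assumes an x86 Linux machine, where /proc/cpuinfo carries a "flags" line per core; the function name cpu_has_flag is a hypothetical helper. Unlike the grep above, it matches the exact flag name, so a line containing only avx2 would not count as avx.

```python
from pathlib import Path


def cpu_has_flag(cpuinfo_text, flag="avx"):
    """Return True if any 'flags' line in /proc/cpuinfo text lists the flag."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags" and flag in value.split():
            return True
    return False


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        print("AVX supported:", cpu_has_flag(path.read_text()))
    else:
        print("/proc/cpuinfo not found; this check only applies to Linux")
```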
Software Requirements
- Open a terminal window
- Activities -> Search -> term
- For future quick access -> right click on the terminal icon now on your task bar and pick Add to favorites
- sudo bash
- This gives you a root shell
- It will ask you for your normal login password
- ubuntu-drivers devices
- Note the recommended driver name and install it
- Example: apt-get install nvidia-driver-440
- Example: apt-get install nvidia-driver-455
- Reboot the machine before continuing to allow the drivers to load
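Picking the recommended driver out of the `ubuntu-drivers devices` output can be automated. The sketch below is an illustration only: SAMPLE_OUTPUT is a made-up excerpt in the usual shape of that output (an assumption, your machine will list different packages), and recommended_driver is a hypothetical helper.

```python
import re

# Illustrative sample only (assumption): the real output depends on your GPU
# and on the driver packages currently in the Ubuntu repositories.
SAMPLE_OUTPUT = """\
vendor   : NVIDIA Corporation
driver   : nvidia-driver-435 - distro non-free
driver   : nvidia-driver-440 - distro non-free recommended
driver   : xserver-xorg-video-nouveau - distro free builtin
"""


def recommended_driver(output):
    """Return the package on the 'driver' line marked 'recommended', or None."""
    for line in output.splitlines():
        match = re.match(r"\s*driver\s*:\s*(\S+)\s+-\s+(.*)$", line)
        if match and "recommended" in match.group(2).split():
            return match.group(1)
    return None


if __name__ == "__main__":
    package = recommended_driver(SAMPLE_OUTPUT)
    print("apt-get install", package)
```

To use it on a real machine, pass in the captured output of `ubuntu-drivers devices` instead of the sample text.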
Useful prerequisites for coding and development
apt-get install aptitude vim gdebi linux-headers-$(uname -r) curl apt-transport-https build-essential binutils gdb coreutils dpkg-dev autoconf automake make cmake patch git rcs subversion pylint python-dev python2.7-dev python3-dev swig libcupti-dev golang python-opengl python3-msgpack python-setuptools libboost-python-dev libboost-thread-dev libboost-all-dev tmux htop unzip bzip2 gzip p7zip-full p7zip-rar zip tar cabextract
CUDA, cuDNN, NVIDIA install
Note: I assume you have 3rd party driver support enabled in the Ubuntu Software center
18.04
End user instructions for installing TensorFlow GPU in a virtual environment
- You MUST always use a Python3 virtual environment, otherwise you risk damaging the system-wide Python installation
- Note: this installation can be done as a normal user using a terminal window
- Open a terminal window
- Activities -> Search -> term
- Create a virtual environment called venv in your current directory
- python3 -m venv --system-site-packages ./venv
- source venv/bin/activate
- pip install --upgrade pip
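For scripted setups, the environment creation step above can also be done with Python's standard venv module. This is a sketch under the assumption that the same options are wanted as in the command line above (system site packages visible); create_env is a hypothetical helper name. Activation and the pip upgrade still happen in the shell as shown.

```python
import venv
from pathlib import Path


def create_env(path="venv", with_pip=True):
    """Create a virtual environment that can also see system site packages."""
    venv.create(path, system_site_packages=True, with_pip=with_pip)
    # pyvenv.cfg records the options the environment was created with.
    return Path(path) / "pyvenv.cfg"


if __name__ == "__main__":
    import tempfile

    # Demonstrate in a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_env(Path(tmp) / "venv", with_pip=False)
        print(cfg.read_text())
```

The demonstration skips pip to stay fast; call create_env() with the defaults to get an environment equivalent to the command above.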
20.04LTS
- This no longer works; use the Docker method
- pip install tensorflow-gpu==2.4
18.04LTS
- pip install tensorflow-gpu==2.0
Testing
- For 20.04LTS consult
- Make sure your virtual environment is activated
- Open a terminal window
- Activities -> Search -> term
- Activate venv
- python -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
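A slightly longer check can confirm both that the virtual environment is active and that TensorFlow actually sees a GPU, which the one-liner above does not report. This is a sketch, not part of the original test procedure; in_virtualenv and gpu_report are hypothetical helper names, and the fallback to tf.config.experimental is there because TensorFlow 2.0 does not provide the call directly under tf.config.

```python
import sys


def in_virtualenv():
    """Return True when running inside a venv (its prefix differs from the base)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def gpu_report():
    """Describe the GPUs TensorFlow can see, or explain why it cannot be checked."""
    try:
        import tensorflow as tf
    except ImportError:
        return "TensorFlow is not installed in this environment"
    # TensorFlow 2.0 only exposes this call under tf.config.experimental.
    list_devices = getattr(tf.config, "list_physical_devices", None)
    if list_devices is None:
        list_devices = tf.config.experimental.list_physical_devices
    gpus = list_devices("GPU")
    return f"TensorFlow {tf.__version__} sees {len(gpus)} GPU(s)"


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print(gpu_report())
```

A report of 0 GPU(s) with TensorFlow installed usually points to a driver, CUDA or cuDNN version mismatch rather than a problem with the virtual environment.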