TensorFlow Study Notes 01: Installing on EC2

  • Essentials
  • CUDA 8.0
  • cuDNN v5.1, for CUDA 8.0
  • TensorFlow 1.0.0

Instance type: EC2 p2.xlarge with 1 GPU (Nvidia K80), 61 GB RAM, $0.900 per hour
AMI: Ubuntu Server 16.04 LTS (HVM), SSD Volume Type - ami-a58d0dc5

How to launch the EC2 instance is skipped here; all the steps below are run directly on the instance.

Install dependencies & build tools

sudo apt-get update && sudo apt-get -y upgrade
sudo apt-get install -y build-essential git swig default-jdk zip zlib1g-dev
# Confirm that gcc is installed
gcc --version
# Check whether an NVIDIA GPU is present
lspci | grep -i nvidia
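The tool check above can be generalized into a small helper that reports which build tools are still missing from PATH. This is only a sketch: the function name `missing_tools` and the list of commands in the example call are illustrative assumptions, not part of the original steps.

```shell
# Sketch: print every command from the argument list that is not on PATH.
# `missing_tools` is a hypothetical helper name.
missing_tools() {
    local tool
    for tool in "$@"; do
        # `command -v` succeeds only if the shell can resolve the name.
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example call (names correspond to some of the packages installed above):
# missing_tools gcc make git swig zip javac
```

An empty output means every listed tool was found.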

On the p2.xlarge, the lspci output should list the instance's NVIDIA K80 GPU.

Next, blacklist Nouveau, the open-source driver that conflicts with the NVIDIA driver.

echo -e "blacklist nouveau\nblacklist lbm-nouveau\noptions nouveau modeset=0\nalias nouveau off\nalias lbm-nouveau off\n" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
sudo update-initramfs -u
sudo reboot
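After the reboot, it can be worth confirming that Nouveau is really gone. The sketch below checks a /proc/modules-style listing for a module name; `module_loaded` is a hypothetical helper, and the optional second argument exists only so the function can be tried against a sample file.

```shell
# Sketch: succeed if a kernel module name appears in a /proc/modules-style listing.
# `module_loaded` is a hypothetical helper; the listing defaults to /proc/modules.
module_loaded() {
    local name="$1" listing="${2:-/proc/modules}"
    # The first whitespace-separated field of each line is the module name.
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$listing"
}

# Example, run after the reboot:
# module_loaded nouveau && echo "nouveau is still loaded" || echo "nouveau is not loaded"
```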

Install kernel headers

sudo apt-get install -y linux-image-extra-virtual
sudo reboot
sudo apt-get install -y linux-source linux-headers-$(uname -r)

Install CUDA 8.0

wget https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
rm cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
sudo apt-get update
sudo apt-get install -y cuda

Configure environment variables

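# Open ~/.profile in an editor and append the four export lines below
vim ~/.profile
export CUDA_HOME=/usr/local/cuda
export CUDA_ROOT=/usr/local/cuda
export PATH=$PATH:$CUDA_ROOT/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_ROOT/lib64
# Reboot so that the new variables are picked up
sudo reboot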
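For a non-interactive setup, the same lines can be appended from a script instead of an editor. The sketch below adds a line only when it is not already present, so re-running it does not duplicate entries; `append_once` is a hypothetical helper name, not something the original steps define.

```shell
# Sketch: append a line to a file only if that exact line is not already there.
# `append_once` is a hypothetical helper name.
append_once() {
    local line="$1" file="$2"
    touch "$file"
    # -F: fixed string, -x: match the whole line, -q: quiet.
    grep -Fxq -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example: add the CUDA variables to ~/.profile (single quotes keep $PATH literal).
# append_once 'export CUDA_HOME=/usr/local/cuda' ~/.profile
# append_once 'export CUDA_ROOT=/usr/local/cuda' ~/.profile
# append_once 'export PATH=$PATH:$CUDA_ROOT/bin' ~/.profile
# append_once 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_ROOT/lib64' ~/.profile
```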

Verify the CUDA installation

nvcc --version
# verify the driver is installed
nvidia-smi
cd /usr/local/cuda/
cd samples
sudo make
cd ./1_Utilities/deviceQuery
./deviceQuery
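On success, deviceQuery ends its output with a "Result = PASS" line. The check can be scripted as sketched below; `device_query_passed` is a hypothetical helper that reads the tool's output on stdin.

```shell
# Sketch: read deviceQuery output on stdin and succeed only if it reports a pass.
# `device_query_passed` is a hypothetical helper name.
device_query_passed() {
    grep -q 'Result = PASS'
}

# Example, run from the deviceQuery directory:
# ./deviceQuery | device_query_passed && echo "CUDA device OK"
```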



Install cuDNN v5.1

https://developer.nvidia.com/rdp/cudnn-download
cuDNN v5.1 Runtime Library for Ubuntu14.04 (Deb)
cuDNN v5.1 Developer Library for Ubuntu14.04 (Deb)

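You first need to join the Accelerated Computing Developer Program, then download the packages to your local machine, upload them to the EC2 instance, and install them.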

sudo dpkg -i libcudnn5_5.1.5-1+cuda8.0_amd64.deb
sudo dpkg -i libcudnn5-dev_5.1.5-1+cuda8.0_amd64.deb
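To confirm which cuDNN version ended up installed, the version macros in the cuDNN header can be read back. This is a sketch under assumptions: `cudnn_version` is a hypothetical helper, and the default header path /usr/include/cudnn.h is a guess about where the developer package places the header, so pass the real path if it differs.

```shell
# Sketch: print MAJOR.MINOR.PATCHLEVEL from a cuDNN header.
# `cudnn_version` is a hypothetical helper; the default path is an assumption.
cudnn_version() {
    local header="${1:-/usr/include/cudnn.h}"
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$header"
}

# Example:
# cudnn_version
# cudnn_version /path/to/cudnn.h
```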

Also install the libcupti-dev library, the NVIDIA CUDA Profiling Tools Interface (CUPTI), which provides advanced profiling support.

sudo apt-get install libcupti-dev

Install TensorFlow

wget https://repo.continuum.io/archive/Anaconda2-4.3.0-Linux-x86_64.sh
bash Anaconda2-4.3.0-Linux-x86_64.sh
# Pin Python 2.7 so the environment has its own pip matching the cp27 wheel below
conda create -n tensorflow python=2.7
source activate tensorflow
pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.0.0-cp27-none-linux_x86_64.whl

TensorFlow now runs successfully with GPU support.
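One way to confirm that TensorFlow actually sees the GPU is to list its local devices and look for a GPU entry. The sketch below wraps that in a shell function; `tf_sees_gpu` is a hypothetical helper, the interpreter argument defaults to `python`, and matching on the substring "gpu" in the device name is an assumption about how devices are named.

```shell
# Sketch: succeed if the given Python interpreter's TensorFlow lists a GPU device.
# `tf_sees_gpu` is a hypothetical helper; the interpreter defaults to `python`.
tf_sees_gpu() {
    local py="${1:-python}"
    # Print one device name per line, then look for "gpu" case-insensitively.
    "$py" -c 'from tensorflow.python.client import device_lib
for d in device_lib.list_local_devices():
    print(d.name)' 2>/dev/null | grep -qi 'gpu'
}

# Example, inside the activated environment:
# tf_sees_gpu && echo "GPU visible to TensorFlow"
```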

References:
https://www.tensorflow.org/install/install_linux
https://gist.github.com/erikbern/78ba519b97b440e10640
http://expressionflow.com/2016/10/09/installing-tensorflow-on-an-aws-ec2-p2-gpu-instance/
https://medium.com/@giltamari/tensorflow-getting-started-gpu-installation-on-ec2-9b9915d95d6f#.ef96jc7a4
https://eatcodeplay.com/installing-tensorflow-with-python-3-on-ec2-gpu-instances-f9fa199eb3cc#.142acv4zq
