Preface
The previous article set up the base environment of CUDA 9.0 + cuDNN 7.0.
Let me first briefly outline the steps ahead and the reasons for them.
1. Build TensorFlow from source so the CUDA version can be specified
After installing tensorflow-gpu with pip, importing tensorflow failed with "libcudart.so.8.0: cannot open shared object file: No such file or directory". Looking under the CUDA install path, the libraries there were all named libcudart.so.9.0.x. Further research showed that the pip packages did not yet support CUDA 9, but building TensorFlow from source lets you specify the CUDA version.
So the GPU version of TensorFlow is installed here by building it from source.
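A quick way to check which CUDA runtime libraries are actually present (assuming the default install location /usr/local/cuda):
ls -l /usr/local/cuda/lib64/libcudart.so*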
2. Building TensorFlow requires Bazel (version 0.5.4 or later)
Configuring TensorFlow complains that Bazel is required, so Bazel has to be installed first.
This part also cost me some tears.
The official site offers two ways to install Bazel. To save effort I originally went with the second one, the .sh installer. The download from the official site was painfully slow, so I fetched version 0.5.2 from another site and installed it following the official steps without a hitch. But configuring TensorFlow then demanded version 0.5.4 or later, so I had to uninstall and reinstall, since after much searching I still could not find a clear way to upgrade an installer-based Bazel. The uninstall steps were as follows (a command sketch comes after the list):
(1) Delete the ~/bin folder.
(2) Edit the environment variables and remove the entries added earlier.
(3) Delete the ~/.bazel folder.
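A minimal command sketch of the removal, assuming the .sh installer placed its launcher under ~/bin and its files under ~/.bazel, and that the PATH entry was added to ~/.bashrc:
rm -rf ~/bin ~/.bazel
Then manually remove the PATH/completion lines the installer added to ~/.bashrc.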
After uninstalling, I installed Bazel and the related packages using the first method from the official site, i.e. "Using Bazel custom APT repository".
Now for the main steps.
I. Installing Bazel
Using Bazel custom APT repository (recommended)
1. Install JDK 8 by using:
sudo apt-get install openjdk-8-jdk
On Ubuntu 14.04 LTS you'll have to use a PPA:
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update && sudo apt-get install oracle-java8-installer
2. Add Bazel distribution URI as a package source (one time setup)
echo"deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8"|sudo tee /etc/apt/sources.list.d/bazel.listcurl https://bazel.build/bazel-release.pub.gpg|sudo apt-key add -
If you want to install the testing version of Bazel, replace stable with testing.
3. Install and update Bazel
sudo apt-get update && sudo apt-get install bazel
Once installed, you can upgrade to a newer version of Bazel with:
sudo apt-get upgrade bazel
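After installation, a quick check confirms that the 0.5.4-or-later requirement is met:
bazel version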
These are simply the official installation steps; I can confirm from experience that they work.
II. Building and Installing TensorFlow from Source
1. Clone the TensorFlow source
git clone --recurse-submodules https://github.com/tensorflow/tensorflow
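The clone checks out the master branch by default. Since the wheel built later in this article is 1.4.0, you can optionally switch to the matching release branch first (assuming TensorFlow's usual rX.Y branch naming):
cd tensorflow
git checkout r1.4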
2. Configure TensorFlow
Inside the tensorflow directory there is a configure script.
Run:
./configure
3. Configuration options
Configure TensorFlow as follows; my answer to each prompt is shown after it ([Enter] means the default was accepted):
You have bazel 0.6.1 installed.
Please specify the location of python. [Default is /usr/bin/python]:/usr/bin/python3
Found possible Python library paths:
/home/shengchun/tensorflow/models/
/bin
/usr/local/cuda/bin
/sbin
/home/shengchun/tensorflow/models/slim
/usr/bin
/usr/local/sbin
/usr/local/games
/usr/lib/python3/dist-packages
/usr/games
/home/shengchun/mxnet-ssd/mxnet/python
/usr/sbin
/usr/local/bin
/usr/local/lib/python3.5/dist-packages
Please input the desired Python library path to use. Default is [/home/shengchun/tensorflow/models/]/usr/local/lib/python3.5/dist-packages
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]:[Enter]
jemalloc as malloc support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]:[Enter]
Google Cloud Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]:[Enter]
Hadoop File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with XLA JIT support? [y/N]:[Enter]
No XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with GDR support? [y/N]:[Enter]
No GDR support will be enabled for TensorFlow.
Do you wish to build TensorFlow with VERBS support? [y/N]:[Enter]
No VERBS support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL support? [y/N]:[Enter]
No OpenCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]:y
CUDA support will be enabled for TensorFlow.
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]:9.0
Please specify the location where CUDA 9.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:[Enter]
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 6.0]:7.0.3
Please specify the location where cuDNN 7.0.3 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:[Enter]
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 5.0]:[Enter]
Do you want to use clang as CUDA compiler? [y/N]:[Enter]
nvcc will be used as CUDA compiler.
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:[Enter]
Do you wish to build TensorFlow with MPI support? [y/N]:[Enter]
No MPI support will be enabled for TensorFlow.
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:[Enter]
Add "--config=mkl" to your bazel command to build with MKL support.
Please note that MKL on MacOS or Windows is still not supported.
If you would like to use a local MKL instead of downloading, please set the environment variable "TF_MKL_ROOT" every time before build.
Configuration finished
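As an aside, this generation of the configure script also reads its answers from environment variables when they are already set, so the interactive session above can in principle be scripted. The variable names below are from memory and should be treated as assumptions, not a documented interface:
export PYTHON_BIN_PATH=/usr/bin/python3
export TF_NEED_CUDA=1
export TF_CUDA_VERSION=9.0
export TF_CUDNN_VERSION=7.0.3
./configure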
4. Build TensorFlow with GPU support enabled:
bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
This takes a long time.
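On a machine with limited RAM the build can also run out of memory; bazel's standard --jobs flag caps the number of parallel compile jobs, which lowers peak memory use (the value 4 is only an example):
bazel build --config=opt --config=cuda --jobs=4 //tensorflow/tools/pip_package:build_pip_package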
5. Generate the .whl file
The bazel build step creates a script named build_pip_package. Running the command below produces a .whl file under /tmp/tensorflow_pkg:
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
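Before installing, you can list the output directory to get the exact wheel filename, since it depends on your TensorFlow and Python versions:
ls /tmp/tensorflow_pkg/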
6. Install TensorFlow with pip
sudo pip3 install /tmp/tensorflow_pkg/tensorflow-1.4.0-cp35-cp35m-linux_x86_64.whl
7. Verify the installation
Open a new terminal (make sure you are not inside the tensorflow source directory) and run
python3
Then enter the following code:
import tensorflow as tf
hello = tf.constant('Hello,TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
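To also confirm that the GPU is visible to TensorFlow, a common quick check is the one-liner below (device_lib is an internal module, so treat it as a convenience rather than a stable API):
python3 -c "from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())"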
With that, the source build and installation of TensorFlow is complete.