Problem 1:
During installation, pip cannot find a matching version and reports "No matching distribution found":
c:\>pip install tensorflow-gpu
Collecting tensorflow-gpu
  Could not find a version that satisfies the requirement tensorflow-gpu (from versions: )
No matching distribution found for tensorflow-gpu
Answer: Try "python -m pip install tensorflow-gpu", or install from a direct link to a specific version with "pip install <version link>".
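The notes do not state why pip finds no candidates. One common cause is an interpreter that TensorFlow does not publish wheels for: the official Windows wheels were built for 64-bit Python only, and the python35-32 directory that appears in the paths in these notes suggests a 32-bit install. The sketch below is only a quick check of the running interpreter; the function names are made up for illustration.

```python
import struct
import sys


def interpreter_bits():
    """Return the pointer width of the running interpreter (32 or 64)."""
    return struct.calcsize("P") * 8


def describe_interpreter():
    """Return a short description such as 'Python 3.6 (64-bit)'."""
    return "Python {}.{} ({}-bit)".format(
        sys.version_info[0], sys.version_info[1], interpreter_bits()
    )


# A 32-bit result would explain an empty "(from versions: )" list.
print(describe_interpreter())
```

If this reports a 32-bit interpreter, installing a 64-bit Python build is likely a prerequisite before either command in the answer can succeed.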
Problem 2:
FileNotFoundError: [WinError 3] The system cannot find the path specified: 'c:\\users\\administrator.chenbo-ovr097b6\\appdata\\local\\programs\\python\\python35-32\\lib\\site-packages\\pip\\_vendor\\requests\\packages\\urllib3\\packages\\ssl_match_hostname\\__pycache__\\__init__.cpython-35.pyc' -> 'C:\\Users\\ADMINI~1.CHE\\AppData\\Local\\Temp\\2\\pip-xsio8aj7-uninstall\\users\\administrator.chenbo-ovr097b6\\appdata\\local\\programs\\python\\python35-32\\lib\\site-packages\\pip\\_vendor\\requests\\packages\\urllib3\\packages\\ssl_match_hostname\\__pycache__\\__init__.cpython-35.pyc'
Answer: This is an environment variable problem; reconfigure the environment variables.
Problem 3:
After installing Python, pip cannot be executed and has to be reinstalled:
C:\Users\Administrator.chenbo-ovr097b6\AppData\Local\Programs\Python\Python35-32\python.exe: No module named pip.__main__; 'pip' is a package and cannot be directly executed
Answer: python -m ensurepip
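The ensurepip module in the standard library reinstalls pip from the copy bundled with the interpreter. As a rough sketch, the check below tests whether "python -m pip" would be able to find pip's __main__ module and builds the repair command from the answer above; both function names are hypothetical, and the command is only constructed, not run.

```python
import importlib.util
import sys


def pip_available():
    """Return True if the pip.__main__ module can be located."""
    try:
        return importlib.util.find_spec("pip.__main__") is not None
    except ImportError:
        # Raised when the parent package "pip" is missing or is not a package.
        return False


def ensurepip_command():
    """Build the repair command for the interpreter running this script."""
    return [sys.executable, "-m", "ensurepip"]


if not pip_available():
    print("pip is not runnable; try:", " ".join(ensurepip_command()))
```

Using sys.executable ties the command to the same interpreter that reported the error, which matters when several Python versions are installed side by side.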
Problem 4:
The specified module cannot be found:
>>> import tensorflow as tf
Traceback (most recent call last):
  File "C:\Users\Administrator.chenbo-ovr097b6\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 18, in swig_import_helper
    return importlib.import_module(mname)
  File "C:\Users\Administrator.chenbo-ovr097b6\AppData\Local\Programs\Python\Python36\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 978, in _gcd_import
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load
  File "<frozen importlib._bootstrap>", line 950, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 648, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 560, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 922, in create_module
  File "<frozen importlib._bootstrap>", line 205, in _call_with_frames_removed
ImportError: DLL load failed: The specified module could not be found.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\Users\Administrator.chenbo-ovr097b6\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 41, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "C:\Users\Administrator.chenbo-ovr097b6\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 21, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "C:\Users\Administrator.chenbo-ovr097b6\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 20, in swig_import_helper
    return importlib.import_module('_pywrap_tensorflow_internal')
  File "C:\Users\Administrator.chenbo-ovr097b6\AppData\Local\Programs\Python\Python36\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named '_pywrap_tensorflow_internal'
During handling of the above exception, another exception occurred:
Answer:
MSVCP140.dll is missing. Install the Microsoft Visual C++ 2015 Redistributable from https://www.microsoft.com/en-us/download/details.aspx?id=53587
Reference: https://github.com/tensorflow/tensorflow/issues/5949
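To confirm that the DLL is the culprit before or after installing the redistributable, one option is to ask the loader directly. This is a minimal sketch using the standard ctypes module; dll_loadable is a hypothetical helper, and a False result only means the loader could not load the library from its search path.

```python
import ctypes


def dll_loadable(name):
    """Return True if the system loader can load the named library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # The library is missing, or one of its own dependencies is.
        return False
    return True


print("MSVCP140.dll loadable:", dll_loadable("MSVCP140.dll"))
```

If this still prints False after the install, it may help to open a new command prompt so that the updated system state is picked up.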
Problem 5:
The GPU is not used for acceleration:
>>> import tensorflow as tf
>>> sess = tf.Session()
2017-09-18 14:57:45.014544: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-09-18 14:57:45.015422: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
>>>
>>> sess = tf.Session()
>>> a = tf.constant(1)
>>> b = tf.constant(2)
>>> print(sess.run(a+b))
3
>>>
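Answer:
1. The CPU may simply perform better for this workload; run other code to check.
2. The framework installation may be faulty; reinstall it using a different method.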
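One detail in the log may help narrow this down: the build path here contains windows\PY\36, while the log in Problem 6 contains windows-gpu\PY\36, which hints (but does not prove) that the CPU-only package is the one installed. A more direct check is to ask TensorFlow which devices it sees. The sketch below assumes the TF 1.x device_lib module; gpu_names and list_local_gpus are hypothetical helper names.

```python
def gpu_names(devices):
    """Return the names of the entries whose device_type is "GPU"."""
    return [d.name for d in devices if d.device_type == "GPU"]


def list_local_gpus():
    """Return the GPU device names that TensorFlow can see."""
    # Imported lazily so this file loads even without TensorFlow installed.
    from tensorflow.python.client import device_lib

    return gpu_names(device_lib.list_local_devices())


# Usage (requires TensorFlow):
#   print(list_local_gpus())
# An empty list means TensorFlow sees no GPU at all.
```

An empty result would point to answer 2 (the installation) rather than answer 1.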
Problem 6:
Not enough GPU memory:
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
2017-09-18 18:47:48.550964: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-09-18 18:47:48.551931: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-18 18:47:49.117177: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:955] Found device 0 with properties:
name: Tesla M60
major: 5 minor: 2 memoryClockRate (GHz) 1.1775
pciBusID 0000:00:15.0
Total memory: 8.00GiB
Free memory: 7.64GiB
2017-09-18 18:47:49.117837: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:976] DMA: 0
2017-09-18 18:47:49.121139: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:986] 0: Y
2017-09-18 18:47:49.122430: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla M60, pci bus id: 0000:00:15.0)
2017-09-18 18:47:49.265265: E C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_driver.cc:924] failed to allocate 7.26G (7792089088 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2017-09-18 18:47:49.401091: E C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_driver.cc:924] failed to allocate 6.53G (7012879872 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2017-09-18 18:47:49.537186: E C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_driver.cc:924] failed to allocate 5.88G (6311591936 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2017-09-18 18:47:49.674310: E C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_driver.cc:924] failed to allocate 5.29G (5680432640 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2017-09-18 18:47:49.813375: E C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_driver.cc:924] failed to allocate 4.76G (5112389120 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2017-09-18 18:47:49.949057: E C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_driver.cc:924] failed to allocate 4.28G (4601149952 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2017-09-18 18:47:49.963002: E C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_driver.cc:924] failed to allocate 3.86G (4141034752 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2017-09-18 18:47:49.975810: E C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_driver.cc:924] failed to allocate 3.47G (3726931200 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
>>> print(sess.run(hello))
b'Hello, TensorFlow!'
>>>
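Solution:
TensorFlow allocates GPU memory per device. When a session is first created, it tries to reserve as much of the GPU's memory as it can for the framework. If several tasks run in parallel, they compete for GPU memory, which causes CUDA out-of-memory (OOM) errors.
If you run several tasks at the same time, use the following settings to change how much GPU memory each session allocates: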
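import tensorflow as tf
# Grow GPU memory use on demand instead of reserving nearly all of it up front.
config = tf.ConfigProto(log_device_placement=False, allow_soft_placement=True)
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)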
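An alternative that the original notes do not mention is to cap each process at a fixed share of GPU memory through per_process_gpu_memory_fraction in the TF 1.x GPUOptions. The following is a sketch under that assumption; the helper names and the 10% headroom are hypothetical choices, not recommended values.

```python
def memory_fraction_per_task(num_tasks, headroom=0.1):
    """Split GPU memory evenly across tasks, leaving `headroom` unallocated."""
    if num_tasks < 1:
        raise ValueError("num_tasks must be at least 1")
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0, 1)")
    return (1.0 - headroom) / num_tasks


def make_session(num_tasks):
    """Create a TF 1.x session limited to its share of GPU memory."""
    # Imported lazily so the helper above works without TensorFlow installed.
    import tensorflow as tf

    fraction = memory_fraction_per_task(num_tasks)
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=fraction)
    config = tf.ConfigProto(gpu_options=gpu_options, allow_soft_placement=True)
    return tf.Session(config=config)


# Usage (requires TensorFlow 1.x): sess = make_session(num_tasks=3)
```

A fixed fraction gives each task a predictable budget, whereas allow_growth lets a task take whatever it needs, so the two suit different sharing arrangements.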
If only one task is running, the driver version may be wrong; try updating the driver.
NVIDIA driver for Windows: http://www.nvidia.cn/content/DriverDownload-March2009/confirmation.php?url=/Windows/Quadro_Certified/385.08/385.08-tesla-desktop-winserver2008-2012r2-64bit-international-whql.exe&lang=cn&type=Tesla
http://cn.download.nvidia.com/Windows/Quadro_Certified/385.08/385.08-tesla-desktop-winserver2008-2012r2-64bit-international-whql.exe