GPU Linux Virtual Host (GN7) Installation and Configuration Guide (Final)

  The virtual host I have configured so far lacks a GPU, so some GPU-based algorithms cannot be demonstrated online, which is a pity. After searching around, I found that Tencent Cloud is running a promotion: for a small fee you can set up a GPU instance for experiments. It is fairly cheap, whereas the offerings from some other vendors are beyond an individual's budget, so I bought an instance on Tencent Cloud with a one-month trial to complete the configuration and testing.

  I have never used Ubuntu before, so it will probably take many reinstalls to get everything right. Progress is hard to estimate, so I will start with the one-month trial and write down every step in detail to make reinstalling easy. I will install TensorFlow 2.6, PyTorch 1.11.0 and HanLP 2.1; their versions do not conflict. The matching versions are CUDA 11.2 and cuDNN 8.5 (cuDNN 8.5 supports CUDA 11.x; in the end I switched back to cuDNN 8.1), with Python 3.9. They will then be called from RStudio through the reticulate, tensorflow and keras packages. If time permits, I will also test the R torch package, which provides PyTorch-like functionality by calling libtorch directly.
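  Once the stack is installed, a quick cross-check (a sketch, not part of the original steps) can print the CUDA and cuDNN versions the TensorFlow wheel was built against and compare them with the versions above; tf.sysconfig.get_build_info() is a public TensorFlow 2.x API.

import tensorflow as tf

# Print the TensorFlow version plus the CUDA/cuDNN versions the wheel was built against.
info = tf.sysconfig.get_build_info()
print(tf.__version__, info.get('cuda_version'), info.get('cudnn_version'))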

  Tencent Cloud GPU compute instance GN7, equipped with an NVIDIA T4 GPU: 8 vCPUs + 32 GB RAM + 100 GB SSD + 1× T4, 5 Mbps bandwidth, ¥80 for a one-month trial under the GPU Lab trial plan (see the getting-started tutorial).

I. Install the operating system from an image

  Different GPU driver versions support different CUDA versions; select driver version 460.106.00.

Public image: Ubuntu Server 18.04.1 LTS 64-bit

GPU driver installed automatically in the background

GPU driver version: 460.106.00

CUDA version: 11.2.2

cuDNN version: 8.2.1

Username: ubuntu

Addresses: 172.16.XX.XX (private), 106.52.XX.XX (public)

  After the installation finishes, connect with SecureCRT or PuTTY. The SSH server enables newer key-exchange algorithms, so SecureCRT needs to be upgraded to version 9.0 or later.

1. After logging in to the machine, first enable the root account (see the reference). Set the root password:

$sudo passwd root

Switching accounts:

$su root
#su ubuntu

To allow root to log in over SSH, see reference 1 and reference 2:

# vi /etc/ssh/sshd_config

Find this section:

# Authentication:
#LoginGraceTime 2m
#PermitRootLogin prohibit-password
#StrictModes yes
#MaxAuthTries 6
#MaxSessions 10

Change it to:

# Authentication:
#LoginGraceTime 2m
#PermitRootLogin prohibit-password
PermitRootLogin yes
StrictModes yes
#MaxAuthTries 6
#MaxSessions 10

Restart the SSH service:

# systemctl restart sshd.service

To make installing software later easier, disable sudo's secure_path restriction (see the reference); since the file is read-only, save with wq!:

# vi /etc/sudoers
Defaults        env_reset
Defaults        mail_badpass
# Defaults      secure_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin"

Then running sudo with the -E option preserves the current user's environment variables, so software can be installed without logging in as root, for example when installing Python packages later with conda:

(gpu) ubuntu@VM-0-14-ubuntu:~$ sudo -E conda list hanlp
# packages in environment at /usr/local/anaconda3/envs/gpu:
#
# Name                    Version                   Build  Channel
hanlp                     2.1.0b42                 pypi_0    pypi
hanlp-common              0.0.18                   pypi_0    pypi
hanlp-downloader          0.0.25                   pypi_0    pypi
hanlp-trie                0.0.5                    pypi_0    pypi

2. The automatic installation takes about 10-15 minutes; the current installer processes can be checked with:

root@VM-0-14-ubuntu:~# ps aux | grep -i install
root      8158  0.0  0.0  13776  1156 pts/0    S+   08:50   0:00 grep --color=auto -i install

As shown above, if neither nv_driver_install.sh nor nv_cuda_install.sh appears in the list, the driver installation has completed.

3. Verify that the GPU driver installed successfully.

root@VM-0-14-ubuntu:~# nvidia-smi
Sat Oct 29 08:52:11 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.106.00   Driver Version: 460.106.00   CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:08.0 Off |                    0 |
| N/A   28C    P8     8W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

4. Verify that CUDA installed successfully. The check in the getting-started tutorial mentioned above does not apply to this configuration combination; /usr/local/cuda is a symlink to /usr/local/cuda-11.2.

root@VM-0-14-ubuntu:~# cat  /usr/local/cuda/version.txt
cat: /usr/local/cuda/version.txt: No such file or directory
root@VM-0-14-ubuntu:~# find / -name cuda
/usr/local/cuda-11.2/targets/x86_64-linux/include/cuda
/usr/local/cuda-11.2/targets/x86_64-linux/include/thrust/system/cuda
/usr/local/cuda
root@VM-0-14-ubuntu:~# cd /usr/local/cuda
root@VM-0-14-ubuntu:/usr/local/cuda# ls
bin                DOCS      extras   lib64    nsight-compute-2020.3.1  nsight-systems-2020.4.3  nvvm       README   share  targets  version.json
compute-sanitizer  EULA.txt  include  libnvvp  nsightee_plugins         nvml                     nvvm-prev  samples  src    tools
root@VM-0-14-ubuntu:/usr/local/cuda# cd bin
root@VM-0-14-ubuntu:/usr/local/cuda/bin# ./nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0

5. Verify the cuDNN installation. The tutorial's check does not apply here either; installing cuDNN from the image did not succeed.

root@VM-0-14-ubuntu:/usr/local/cuda/bin# cat /usr/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
cat: /usr/include/cudnn_version.h: No such file or directory

II. Install cuDNN manually (see the reference)

Downloading cuDNN requires logging in to NVIDIA's website, so the following command does not work:

wget https://developer.nvidia.com/compute/cudnn/secure/8.5.0/local_installers/11.7/cudnn-linux-x86_64-8.5.0.96_cuda11-archive.tar.xz

1. Download it on a laptop, transfer it to the server over SSH with SecureFX, then extract and install it. For the CUDA/cuDNN combinations verified on Linux, see this reference:

TensorFlow / CUDA / cuDNN / Python version compatibility table
# tar -xvf cudnn-linux-x86_64-8.5.0.96_cuda11-archive.tar.xz
# cd cudnn-linux-x86_64-8.5.0.96_cuda11-archive
# cp lib/* /usr/local/cuda/lib64/
# cp include/* /usr/local/cuda/include/
# chmod a+r /usr/local/cuda/lib64/*
# chmod a+r /usr/local/cuda/include/*

2. Add the CUDA directories to the global environment variables:

# vi /etc/profile
export PATH=/usr/local/cuda-11.2/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda-11.2

3. Run source /etc/profile to apply the changes (or log out and log back in), then verify the cuDNN installation (a runtime check via ctypes is sketched after the output below):

root@VM-0-14-ubuntu:/usr/local/cuda/bin# source /etc/profile
root@VM-0-14-ubuntu:/usr/local/cuda/bin# echo $PATH
/usr/local/cuda-11.2/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
root@VM-0-14-ubuntu:/usr/local/cuda/bin# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0
root@VM-0-14-ubuntu:/usr/local/cuda/bin# cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 5
#define CUDNN_PATCHLEVEL 0
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#endif /* CUDNN_VERSION_H */
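  As an extra runtime check (a sketch, assuming libcudnn.so.8 is on the LD_LIBRARY_PATH configured above), the library itself can report which version the dynamic loader actually resolves:

import ctypes

# cudnnGetVersion() returns an integer such as 8500 for cuDNN 8.5.0.
libcudnn = ctypes.CDLL('libcudnn.so.8')
print(libcudnn.cudnnGetVersion())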

III. Install Anaconda

1. Download and install Anaconda into /usr/local/anaconda3.

$ wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/Anaconda3-2022.10-Linux-x86_64.sh
$ sudo bash Anaconda3-2022.10-Linux-x86_64.sh

When the installer finishes, choose to run conda init:

done
installation finished.
Do you wish the installer to initialize Anaconda3
by running conda init? [yes|no]
[no] >>> yes
modified      /usr/local/anaconda3/condabin/conda
modified      /usr/local/anaconda3/bin/conda
modified      /usr/local/anaconda3/bin/conda-env
no change     /usr/local/anaconda3/bin/activate
no change     /usr/local/anaconda3/bin/deactivate
no change     /usr/local/anaconda3/etc/profile.d/conda.sh
no change     /usr/local/anaconda3/etc/fish/conf.d/conda.fish
no change     /usr/local/anaconda3/shell/condabin/Conda.psm1
no change     /usr/local/anaconda3/shell/condabin/conda-hook.ps1
no change     /usr/local/anaconda3/lib/python3.9/site-packages/xontrib/conda.xsh
no change     /usr/local/anaconda3/etc/profile.d/conda.csh
modified      /root/.bashrc

==> For changes to take effect, close and re-open your current shell. <==

If you'd prefer that conda's base environment not be activated on startup, 
   set the auto_activate_base parameter to false: 

conda config --set auto_activate_base false

Thank you for installing Anaconda3!

===========================================================================

Working with Python and Jupyter is a breeze in DataSpell. It is an IDE
designed for exploratory data analysis and ML. Get better data insights
with DataSpell.

DataSpell for Anaconda is available at: https://www.anaconda.com/dataspell

Edit the global profile script and append the conda initialization block to the end so that it is available to all users.

# vi /etc/profile
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/usr/local/anaconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/usr/local/anaconda3/etc/profile.d/conda.sh" ]; then
        . "/usr/local/anaconda3/etc/profile.d/conda.sh"
    else
        export PATH="/usr/local/anaconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<

Source ~/.bashrc to activate the conda base environment, or log out and log back in.

# source ~/.bashrc

2. Install tensorflow-gpu 2.6 as root.

# conda create --name gpu python=3.9
# pip install ipykernel
# python -m ipykernel install --user --name gpu
# conda activate gpu
# pip install tensorflow-gpu==2.6

3. Test the installation as the ubuntu user (a slightly fuller check is sketched after the transcript).

(base) ubuntu@VM-0-14-ubuntu:~$ conda activate gpu
(gpu) ubuntu@VM-0-14-ubuntu:~$ python
Python 3.9.13 (main, Oct 13 2022, 21:15:33) 
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.test.is_built_with_cuda() 
True
>>> a = tf.constant(1.)
2022-10-29 18:14:29.577429: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-29 18:14:29.585025: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-29 18:14:29.585898: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-29 18:14:29.587034: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-29 18:14:29.587744: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-29 18:14:29.588624: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-29 18:14:29.589442: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-29 18:14:30.245462: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-29 18:14:30.246301: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-29 18:14:30.247122: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-29 18:14:30.247901: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13803 MB memory:  -> device: 0, name: Tesla T4, pci bus id: 0000:00:08.0, compute capability: 7.5
>>> b = tf.constant(2.)
>>> print(a+b)
tf.Tensor(3.0, shape=(), dtype=float32)
>>> 
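  A slightly fuller check can be run in the same session; this sketch lists the physical GPUs and runs a matrix multiplication, which is placed on the T4 when the CUDA libraries load correctly:

import tensorflow as tf

# List visible GPUs and run a small matmul to confirm GPU execution.
print(tf.config.list_physical_devices('GPU'))
x = tf.random.normal([1024, 1024])
print(tf.reduce_sum(tf.matmul(x, x)).numpy())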

IV. Configure Jupyter Notebook

  Jupyter Notebook is simpler to install and configure, so set it up first to verify the GPU environment (see the reference).

1. The base environment created by the Anaconda3 installer already contains Jupyter Notebook, but the "gpu" virtual environment created above does not, so install it there; activate the environment with conda activate first.

(base) root@VM-0-14-ubuntu:~# conda activate gpu
(gpu) root@VM-0-14-ubuntu:~# conda list jupyter
# packages in environment at /usr/local/anaconda3/envs/gpu:
#
# Name                    Version                   Build  Channel
(gpu) root@VM-0-14-ubuntu:~# conda install  jupyter notebook
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /usr/local/anaconda3/envs/gpu

  added / updated specs:
    - jupyter
    - notebook


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    asttokens-2.0.5            |     pyhd3eb1b0_0          20 KB
......
Proceed ([y]/n)? y


Downloading and Extracting Packages
soupsieve-2.3.2.post | 65 KB     | ################################################################################################################################################## | 100% 
......
asttokens-2.0.5      | 20 KB     | ################################################################################################################################################## | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Retrieving notices: ...working... done
 

2. Configure Jupyter Notebook for the ubuntu user.

1) Generate the configuration file.

(base) ubuntu@VM-0-14-ubuntu:~$ jupyter notebook --generate-config
Writing default config to: /home/ubuntu/.jupyter/jupyter_notebook_config.py

2) Generate the hash of the login password.

(base) ubuntu@VM-0-14-ubuntu:~$ python
Python 3.9.13 (main, Aug 25 2022, 23:26:10) 
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from notebook.auth import passwd
>>> passwd()
Enter password: 
Verify password: 
'argon2:$argon2id$v=19$m=10240,t=10,p=xxxxxxxxxxxxxxxxxxx'
>>> 

3. Edit the configuration file and paste in the password hash generated above.

$ vi ~/.jupyter/jupyter_notebook_config.py
c.NotebookApp.ip='*'                     # allow access from any IP
c.NotebookApp.password = 'argon2:$argon2id$v=19$m=10240,t=10,p=xxxxxxxxxxxxxxxxxxx'  # the hash copied above
c.NotebookApp.open_browser = False       # do not open a browser automatically
c.NotebookApp.port =8888                 # port to listen on
c.NotebookApp.notebook_dir = '/home/ubuntu/jupyternotebook'  # directory Notebook starts in

4. Start Jupyter Notebook. Note that the "gpu" environment must be activated first, since that is the one being used.

(base) ubuntu@VM-0-14-ubuntu:~$ conda activate gpu
(gpu) ubuntu@VM-0-14-ubuntu:~$ conda list jupyter
# packages in environment at /usr/local/anaconda3/envs/gpu:
#
# Name                    Version                   Build  Channel
jupyter                   1.0.0            py39h06a4308_8  
jupyter_client            7.3.5            py39h06a4308_0  
jupyter_console           6.4.3              pyhd3eb1b0_0  
jupyter_core              4.11.1           py39h06a4308_0  
jupyter_server            1.18.1           py39h06a4308_0  
jupyterlab                3.4.4            py39h06a4308_0  
jupyterlab_pygments       0.1.2                      py_0  
jupyterlab_server         2.15.2           py39h06a4308_0  
jupyterlab_widgets        1.0.0              pyhd3eb1b0_1  
(gpu) ubuntu@VM-0-14-ubuntu:~$ jupyter notebook &
[1] 16510
(gpu) ubuntu@VM-0-14-ubuntu:~$ [W 07:53:21.094 NotebookApp] WARNING: The notebook server is listening on all IP addresses and not using encryption. This is not recommended.
[W 2022-10-30 07:53:21.326 LabApp] 'ip' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2022-10-30 07:53:21.326 LabApp] 'password' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2022-10-30 07:53:21.326 LabApp] 'password' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2022-10-30 07:53:21.326 LabApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2022-10-30 07:53:21.326 LabApp] 'notebook_dir' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2022-10-30 07:53:21.326 LabApp] 'notebook_dir' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[I 2022-10-30 07:53:21.333 LabApp] JupyterLab extension loaded from /usr/local/anaconda3/envs/gpu/lib/python3.9/site-packages/jupyterlab
[I 2022-10-30 07:53:21.333 LabApp] JupyterLab application directory is /usr/local/anaconda3/envs/gpu/share/jupyter/lab
[I 07:53:21.337 NotebookApp] Serving notebooks from local directory: /home/ubuntu/jupyternotebook
[I 07:53:21.337 NotebookApp] Jupyter Notebook 6.4.12 is running at:
[I 07:53:21.337 NotebookApp] http://VM-0-14-ubuntu:8888/
[I 07:53:21.337 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

5. Access it from a browser, log in with the password set above, then create a new test notebook to verify the GPU environment.

import tensorflow as tf
tf.test.is_built_with_cuda() 
a = tf.constant(1.)
b = tf.constant(2.)
print(a+b)
Screenshot: testing the tensorflow-gpu installation in Jupyter Notebook

6. Create another test notebook to test Keras and cuDNN.

import os
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers,optimizers, datasets
from tensorflow.keras.models import load_model
from matplotlib import pyplot as plt
import numpy as np

# 1. Dataset preparation
 
# Load the MNIST dataset
(x_train_raw, y_train_raw),(x_test_raw,y_test_raw) = datasets.mnist.load_data()
print(y_train_raw[0])                                         # 5
print(x_train_raw.shape, y_train_raw.shape)                   # (60000,28,28): 60,000 training images
print(x_test_raw.shape, y_test_raw.shape)                     # (10000,28,28): 10,000 test images
 
num_classes = 10
y_train= keras.utils.to_categorical(y_train_raw,num_classes)  # convert class labels to one-hot vectors
y_test = keras.utils.to_categorical(y_test_raw,num_classes)
print(y_train[0])                                             # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
 
# Visualize a few of the raw images
plt.figure()
for i in range(9):
    plt.subplot(3,3,i+1)
    plt.imshow(x_train_raw[i])
    plt.axis('off')
plt.show()

# 2. Build and compile the fully connected (DNN) network
 
# Prepare the input for the fully connected layers
x_train = x_train_raw.reshape(60000,784)                     # flatten each 28*28 image into a 784-dim vector
x_test = x_test_raw.reshape(10000,784)                       # pixel values are normalized to 0~1 below
x_train= x_train.astype('float32')/255
x_test = x_test.astype('float32')/255                        
    
model = keras.Sequential([                                   # create the model: three ReLU hidden layers and a softmax output
    layers.Dense(512,activation='relu', input_dim = 784),    # first hidden layer (784 -> 512)
    layers.Dense(256,activation='relu'),
    layers.Dense(124,activation='relu'),
    layers.Dense(num_classes,activation='softmax')
])

# 3. Train the network
 
Optimizer = optimizers.Adam(0.001)
model.compile(loss=keras.losses.categorical_crossentropy,
    optimizer=Optimizer,                                     # Adam optimizer
    metrics=['accuracy']
)
model.fit(x_train,y_train,                                   # training data and labels
    batch_size=128,                                          # batch size
    epochs=10,                                               # number of epochs
    verbose=1)                                               # print progress logs

# 4. Evaluate the model
 
score = model.evaluate(x_test,y_test,verbose=0)
print('Test loss:', score[0])                                # loss: 0.0853068439
print('Test accuracy:', score[1])                             # accuracy: 0.9767
 
test_loss,test_acc = model.evaluate(x=x_test,y=y_test)
print("Test Accuracy %.2f"%test_acc)                         # accuracy: 0.9

# 5. Save the model
 
model.save('./final_DNN_mode1.h5')                 # save the DNN model

# 6. Load the saved model
new_model = load_model('./final_DNN_mode1.h5')
new_model.summary()

# 7. CNN model test -----------------------------------------------------------------------------------------------------

# Add a channel dimension so the data fits the CNN model
X_train=x_train.reshape(60000,28,28,1)
X_test=x_test.reshape(10000,28,28,1)

# Define the convolutional neural network
model=keras.Sequential([                                   # create the layer sequence
    layers.Conv2D(filters=32,kernel_size = 5,strides = (1,1), padding ='same',activation = tf.nn.relu,input_shape = (28,28,1)),
                                                             # first convolution + pooling layer
    layers.MaxPool2D(pool_size=(2,2),strides = (2,2),padding = 'valid'),
                                                             # second convolution + pooling layer
    layers.Conv2D(filters=64, kernel_size = 3, strides=(1, 1),padding='same', activation = tf.nn.relu),
    layers.MaxPool2D(pool_size=(2,2),strides = (2,2),padding = 'valid'),
                                                             # dropout layer to reduce overfitting
    layers.Dropout(0.25),                     # fraction of neurons randomly dropped
    layers.Flatten(),
                                                             # two fully connected layers
    layers.Dense(units=128,activation = tf.nn.relu),
    layers.Dropout(0.5),
    layers.Dense(units=10,activation = tf.nn.softmax)
])  

# Compile and train the model
Optimizer = optimizers.Adam(0.001)
model.compile(Optimizer,loss="categorical_crossentropy",metrics=['accuracy'])
model.fit(x=X_train,y=y_train,epochs=5,batch_size=128)       # 5 epochs

# Save the CNN model
model.save('./final_CNN_model.h5')                  
# Load the saved model
new_model = load_model('./final_CNN_model.h5')

# 8. Visualize predictions on the test data
 
# @matplotlib.inline
def res_Visual(n):
    # See https://blog.csdn.net/yiyihuazi/article/details/122323349
    # keras 2.6 removed the predict_classes() function
    # final_opt_a=new_model.predict_classes(X_test[0:n])        # predict the test set with the model
    # use the statement below instead
    predicts = new_model.predict(X_test[0:n])
    final_opt_a = np.argmax(predicts, axis=1)
    
    fig, ax = plt.subplots(nrows=int(n/5), ncols=5)
    ax = ax.flatten()
    print('Predictions for the first {} images:'.format(n))
    for i in range(n): 
        print(final_opt_a[i],end='.')
        if int((i+1)%5)==0:
            print('\t')
 
        # display each image
        img = X_test[i].reshape((28,28))                       # take each sample as a 28x28 ndarray
        plt.axis("off")
        ax[i].imshow(img,cmap='Greys',interpolation='nearest') # render
        ax[i].axis("off")
    print('The first {} test images:'.format(n))
    
    
res_Visual(20) 

Keras must be downgraded to 2.6.0, otherwise the following error occurs (see the reference); a version cross-check is sketched after the commands below:

ImportError: cannot import name 'dtensor' from 'tensorflow.compat.v2.experimental' 

(gpu) root@VM-0-14-ubuntu:~# conda list keras
# packages in environment at /usr/local/anaconda3/envs/gpu:
#
# Name                    Version                   Build  Channel
keras                     2.10.0                   pypi_0    pypi
keras-preprocessing       1.1.2                    pypi_0    pypi
(gpu) root@VM-0-14-ubuntu:~# pip install keras==2.6
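  After the downgrade, a quick sketch to confirm that the standalone keras version matches the TensorFlow 2.6.x release (the mismatch is what triggers the dtensor import error above):

import tensorflow as tf
import keras

# Both should report 2.6.x after the downgrade.
print(tf.__version__, keras.__version__)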

  The DNN (fully connected) part of the test program passed, but the CNN part, which uses cuDNN, did not. cuDNN 8.5 is probably too new (see the reference); it needs to be downgraded to the tested and verified 8.1. The error was:

OP_REQUIRES failed at conv_ops.cc:1276 : Not found: No algorithm worked!

7. Downgrade cuDNN to 8.1. Download it on a laptop, transfer it to the server over SSH with SecureFX, then copy the files over, replacing the cuDNN 8.5 ones.

# tar -xvf cudnn-11.2-linux-x64-v8.1.1.33.tgz
# cd cuda
# cp -f lib64/* /usr/local/cuda/lib64/
# cp -f include/* /usr/local/cuda/include/
# chmod a+r /usr/local/cuda/lib64/*
# chmod a+r /usr/local/cuda/include/*

  Add the following setting to the global environment variables, otherwise the CNN test may fail with an error about requesting too much memory (a per-script alternative is sketched after it):

# vi /etc/profile
export TF_GPU_ALLOCATOR=cuda_malloc_async
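  A complementary, per-script alternative (a sketch using TensorFlow's public API, not a replacement for the setting above) is to let TensorFlow grow GPU memory on demand instead of reserving it all up front:

import tensorflow as tf

# Enable on-demand GPU memory growth; must run before any GPU op executes.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)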

  Refresh the dynamic linker cache (otherwise the wrong libraries get linked) and reboot the system:

# ldconfig -X
# reboot now

  Log in as the ubuntu user, activate the "gpu" environment and start Jupyter Notebook:

$ conda activate gpu
$ jupyter notebook &

8. Re-run the earlier notebook to test the GPU environment; it now passes.

1. Load the TensorFlow handwritten-digit (MNIST) example dataset
2. Build and compile the DNN network
3. Train the network
4. Evaluate the model
5. Test the CNN model
6. Visualize the test data

V. Install PyTorch and HanLP

  I install TensorFlow, PyTorch and HanLP in the same virtual environment "gpu", because the goal is to run HanLP 2.1, which supports both backends.
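  A minimal sketch to confirm that both backends in the shared "gpu" environment can see the T4:

import tensorflow as tf
import torch

# Each framework reports the GPU through its own API.
print('TensorFlow GPUs:', tf.config.list_physical_devices('GPU'))
print('PyTorch CUDA   :', torch.cuda.is_available())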

1. Install PyTorch.

(gpu) root@VM-0-14-ubuntu:~# conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /usr/local/anaconda3/envs/gpu

  added / updated specs:
    - cudatoolkit=11.3
    - pytorch==1.11.0
    - torchaudio==0.11.0
    - torchvision==0.12.0


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    cudatoolkit-11.3.1         |       h2bc3f7f_2       549.3 MB
......
  torchvision        pytorch/linux-64::torchvision-0.12.0-py39_cu113 None


Proceed ([y]/n)? y


Downloading and Extracting Packages
lame-3.100           | 323 KB    | ################################################################################################################################################## | 100% 
......
######################################################################## | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: | By downloading and using the CUDA Toolkit conda packages, you accept the terms and conditions of the CUDA End User License Agreement (EULA): https://docs.nvidia.com/cuda/eula/index.html

done
Retrieving notices: ...working... done

2. Install HanLP.

(gpu) root@VM-0-14-ubuntu:~# pip install hanlp
Looking in indexes: http://mirrors.tencentyun.com/pypi/simple
Collecting hanlp
......
Successfully built hanlp-common hanlp-trie hanlp-downloader phrasetree
Installing collected packages: toposort, tokenizers, phrasetree, tqdm, regex, pyyaml, pynvml, hanlp-common, filelock, huggingface-hub, hanlp-trie, hanlp-downloader, transformers, hanlp
Successfully installed filelock-3.8.0 hanlp-2.1.0b42 hanlp-common-0.0.18 hanlp-downloader-0.0.25 hanlp-trie-0.0.5 huggingface-hub-0.10.1 phrasetree-0.0.8 pynvml-11.4.1 pyyaml-6.0 regex-2022.9.13 tokenizers-0.11.6 toposort-1.5 tqdm-4.64.1 transformers-4.23.1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

Install fasttext, which some of HanLP's TensorFlow pretrained models require:

(gpu) root@VM-0-14-ubuntu:~# pip install fasttext
Looking in indexes: http://mirrors.tencentyun.com/pypi/simple
Collecting fasttext
  Downloading http://mirrors.tencentyun.com/pypi/packages/f8/85/e2b368ab6d3528827b147fdb814f8189acc981a4bc2f99ab894650e05c40/fasttext-0.9.2.tar.gz (68 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 68.8/68.8 kB 332.3 kB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Collecting pybind11>=2.2
  Using cached http://mirrors.tencentyun.com/pypi/packages/1d/53/e6b27f3596278f9dd1d28ef1ddb344fd0cd5db98ef2179d69a2044e11897/pybind11-2.10.1-py3-none-any.whl (216 kB)
Requirement already satisfied: setuptools>=0.7.0 in /usr/local/anaconda3/envs/gpu/lib/python3.9/site-packages (from fasttext) (65.5.0)
Requirement already satisfied: numpy in /usr/local/anaconda3/envs/gpu/lib/python3.9/site-packages (from fasttext) (1.23.3)
Building wheels for collected packages: fasttext
  Building wheel for fasttext (setup.py) ... done
  Created wheel for fasttext: filename=fasttext-0.9.2-cp39-cp39-linux_x86_64.whl size=299146 sha256=4dee6f6dc5fb53404fb5cbb69c2cc3a2faef7f3af0500567ad49dc01f26d89d7
  Stored in directory: /root/.cache/pip/wheels/ca/08/ee/d0dd871c6c089c4c3971722067bd577f8827c9b4d5d6f2477a
Successfully built fasttext
Installing collected packages: pybind11, fasttext

3. Test PyTorch and HanLP.

  A quick test for now; more thorough testing will follow.

import torch

print(torch.__version__)
print(torch.cuda.is_available())
Screenshot: PyTorch detects the GPU
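Beyond checking availability, a small sketch that actually places work on the card:

import torch

# Run a matrix multiplication on the GPU and report the device used.
if torch.cuda.is_available():
    x = torch.rand(1024, 1024, device='cuda')
    y = x @ x
    print(y.device, torch.cuda.get_device_name(0))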
# Running the TensorFlow model first and then the PyTorch models works; if a PyTorch model has already been run first, this step fails.
import hanlp
tokenizer = hanlp.load(hanlp.pretrained.tok.LARGE_ALBERT_BASE)
text = 'NLP统计模型没有加规则,聪明人知道自己加。英文、数字、自定义词典统统都是规则。'
print(tokenizer(text))

# The tests below are not affected by execution order

import hanlp
HanLP = hanlp.load(hanlp.pretrained.mtl.CLOSE_TOK_POS_NER_SRL_DEP_SDP_CON_ELECTRA_SMALL_ZH) # trained on the world's largest Chinese corpus
HanLP(['2021年HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。', '阿婆主来到北京立方庭参观自然语义科技公司。'])

import hanlp
HanLP = hanlp.pipeline() \
    .append(hanlp.utils.rules.split_sentence, output_key='sentences') \
    .append(hanlp.load('FINE_ELECTRA_SMALL_ZH'), output_key='tok') \
    .append(hanlp.load('CTB9_POS_ELECTRA_SMALL'), output_key='pos') \
    .append(hanlp.load('MSRA_NER_ELECTRA_SMALL_ZH'), output_key='ner', input_key='tok') \
    .append(hanlp.load('CTB9_DEP_ELECTRA_SMALL', conll=0), output_key='dep', input_key='tok')\
    .append(hanlp.load('CTB9_CON_ELECTRA_SMALL'), output_key='con', input_key='tok')
HanLP('2021年HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。阿婆主来到北京立方庭参观自然语义科技公司。')

HanLP('2021年HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。').pretty_print()

import hanlp

tok = hanlp.load(hanlp.pretrained.tok.COARSE_ELECTRA_SMALL_ZH)
tok(['商品和服务。', '阿婆主来到北京立方庭参观自然语义科技公司。'])

tok_fine = hanlp.load(hanlp.pretrained.tok.FINE_ELECTRA_SMALL_ZH)
tok_fine('阿婆主来到北京立方庭参观自然语义科技公司')

pos = hanlp.load(hanlp.pretrained.pos.CTB9_POS_ELECTRA_SMALL)
pos(["我", "的", "希望", "是", "希望", "张晚霞", "的", "背影", "被", "晚霞", "映红", "。"])
Screenshots: tokenization and POS tagging; pipeline operation; printing the parse tree; tokenization with various pretrained models
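  The multi-task model can also be restricted to a subset of tasks; the snippet below is a sketch that assumes the tasks parameter of the MTL model's call and simply prints the returned Document:

import hanlp

# Load the same multi-task model and run only tokenization and POS tagging.
HanLP = hanlp.load(hanlp.pretrained.mtl.CLOSE_TOK_POS_NER_SRL_DEP_SDP_CON_ELECTRA_SMALL_ZH)
doc = HanLP('2021年HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。', tasks=['tok', 'pos'])
print(doc)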

VI. Install and configure JupyterHub

  A Linux GPU host used for research, development, testing or production naturally serves multiple users. Jupyter Notebook is single-user; JupyterHub adds a multi-user proxy layer on top, letting everyone log in through it and use their own Jupyter Notebook or JupyterLab (the latter being the next-generation version of the former).

Diagram: JupyterHub proxies each user's JupyterLab to provide multi-user service

  According to that post, if Jupyter Notebook has ever been run, its configuration files under $HOME/.jupyter conflict with the per-user Jupyter Lab / Jupyter Notebook server that JupyterHub spawns, so the server process fails to start and the proxy forwarding fails. Is this a bug? In any case, if Jupyter Notebook has been run before (as above), delete that directory first. This problem cost me two days and nearly drove me crazy; Stack Overflow came to the rescue.

  See reference 1, reference 2, reference 3 and reference 4.

1. Install and upgrade Node.js and npm.

# # Refresh the package lists and upgrade installed packages
# apt-get update 
# apt-get upgrade
# # Install the dependencies
# apt install -y npm nodejs

Upgrade Node.js. Do not install the latest version 18; it has compatibility problems and throws errors (see the reference). JupyterHub requires version 10 or later, while Ubuntu 18.04 ships version 8.

##----- first clear the npm cache
# npm cache clean -f 
##----- install the n module
# npm install -g n

Upgrade Node.js:

root@VM-0-14-ubuntu:~# n 16.18.0    # pin version 16.18.0
  installing : node-v16.18.0
       mkdir : /usr/local/n/versions/node/16.18.0
       fetch : https://nodejs.org/dist/v16.18.0/node-v16.18.0-linux-x64.tar.xz
     copying : node/16.18.0
   installed : v16.18.0 (with npm 8.19.2)

Note: the node command changed location and the old location may be remembered in your current shell.
         old : /usr/bin/node
         new : /usr/local/bin/node
If "node --version" shows the old version then start a new shell, or reset the location hash with:
hash -r  (for bash, zsh, ash, dash, and ksh)
rehash   (for csh and tcsh)

root@VM-0-14-ubuntu:~# hash -r
root@VM-0-14-ubuntu:~# node -v
v16.18.0
root@VM-0-14-ubuntu:~# npm -v
8.19.2

2. Install configurable-http-proxy.

It can be installed with npm:

npm install -g configurable-http-proxy

However, installing with conda is recommended, since it pulls in the other dependencies as well; it also installs its own Node.js (version 11), which works too. Be sure to switch to and install into the appropriate virtual environment, here "gpu".

(gpu) root@VM-0-14-ubuntu:~# conda install configurable-http-proxy
(gpu) root@VM-0-14-ubuntu:~# conda list configurable-http-proxy
# packages in environment at /usr/local/anaconda3/envs/gpu:
#
# Name                    Version                   Build  Channel
configurable-http-proxy   4.0.1                   node6_0  
(gpu) root@VM-0-14-ubuntu:~# configurable-http-proxy -V
4.0.1
(gpu) root@VM-0-14-ubuntu:~# 

3. Install JupyterHub and related packages in the virtual environment.

(gpu) root@VM-0-14-ubuntu:~# conda install jupyter jupyterlab jupyterhub
(gpu) root@VM-0-14-ubuntu:~# conda list jupyter
# packages in environment at /usr/local/anaconda3/envs/gpu:
#
# Name                    Version                   Build  Channel
jupyter                   1.0.0            py39h06a4308_8  
jupyter_client            7.3.5            py39h06a4308_0  
jupyter_console           6.4.3              pyhd3eb1b0_0  
jupyter_core              4.11.1           py39h06a4308_0  
jupyter_server            1.18.1           py39h06a4308_0  
jupyter_telemetry         0.1.0                      py_0  
jupyterhub                2.0.0              pyhd3eb1b0_0  
jupyterlab                3.4.4            py39h06a4308_0  
jupyterlab_pygments       0.1.2                      py_0  
jupyterlab_server         2.15.2           py39h06a4308_0  
jupyterlab_widgets        1.0.0              pyhd3eb1b0_1  

4. Configure JupyterHub.

Create the directory /etc/jupyterhub, generate a configuration file in it, then edit the file.

(gpu) root@VM-0-14-ubuntu:~#  mkdir /etc/jupyterhub
(gpu) root@VM-0-14-ubuntu:~# cd /etc/jupyterhub
(gpu) root@VM-0-14-ubuntu:/etc/jupyterhub# jupyterhub --generate-config
Writing default config to: jupyterhub_config.py
(gpu) root@VM-0-14-ubuntu:/etc/jupyterhub# vi  jupyterhub_config.py

The contents are as follows:

# Added by Jean 2022/10/31
c.Authenticator.whitelist = {'ubuntu'}   # users allowed to use JupyterHub, comma-separated
c.Authenticator.admin_users = {'ubuntu'}  # JupyterHub administrator users
c.Spawner.notebook_dir = '/home/{username}'  # land in the user's home directory after logging in from the browser
c.Spawner.default_url = '/lab'    # use JupyterLab instead of Notebook
c.JupyterHub.extra_log_file = '/var/log/jupyterhub.log'

5. Start JupyterHub in the background as root.

(gpu) root@VM-0-14-ubuntu:/etc/jupyterhub# jupyterhub  -f /etc/jupyterhub/jupyterhub_config.py  &

6. Access it in a browser at http://ip:8000 and log in with an existing Linux username; SSL encryption is configured later.

Screenshot: Jupyter Lab running inside JupyterHub

A terminal window can be opened inside JupyterHub to run arbitrary commands under the identity of the logged-in user. If the SSH port is blocked, this effectively provides a tunnel over the HTTP port. Running su then gives root access.

(base) ubuntu@VM-0-14-ubuntu:~$ su --help
Usage: su [options] [LOGIN]

Options:
  -c, --command COMMAND         pass COMMAND to the invoked shell
  -h, --help                    display this help message and exit
  -, -l, --login                make the shell a login shell
  -m, -p,
  --preserve-environment        do not reset environment variables, and
                                keep the same shell
  -s, --shell SHELL             use SHELL instead of the default in passwd

(base) ubuntu@VM-0-14-ubuntu:~$ su --preserve-environment
Password: 
(base) root@VM-0-14-ubuntu:~# 
Screenshot: opening a terminal window in JupyterHub

7. Configure SSL encryption.

  This is a screenshot of logging in over the SSL-encrypted connection once it is configured; clicking the lock icon next to the URL shows the certificate chain. As the earlier screenshots show, an unencrypted connection is marked "Not secure" next to the URL. The self-signed certificate here is issued to an IP address, because no domain name has been registered for this host yet.

Screenshot: an SSL-encrypted channel to JupyterHub using a self-signed certificate

1) First, the JupyterHub configuration. Simply add two lines to the configuration file pointing to the server key file and certificate file; building a private CA with openssl and issuing the certificate is covered afterwards. Since this runs as root, server.key is not protected with a passphrase.

# Added by Jean for SSL 2022/03/19
c.JupyterHub.ssl_key = '/root/cert/server.key'
c.JupyterHub.ssl_cert = '/root/cert/server.crt'

After restarting JupyterHub, copy out the private CA's root certificate and import it into the browser (described below), then access https://ip:8000, as shown in the screenshot above.

2) Build a private CA and issue a self-signed server certificate.

See the reference.

(gpu) root@VM-0-14-ubuntu:~# cd /root
(gpu) root@VM-0-14-ubuntu:~# mkdir cert
(gpu) root@VM-0-14-ubuntu:~# cd cert
(gpu) root@VM-0-14-ubuntu:~/cert# mkdir demoCA && cd demoCA
(gpu) root@VM-0-14-ubuntu:~/cert/demoCA# mkdir private newcerts
(gpu) root@VM-0-14-ubuntu:~/cert/demoCA# touch index.txt
(gpu) root@VM-0-14-ubuntu:~/cert/demoCA# echo '01' > serial
(gpu) root@VM-0-14-ubuntu:~/cert/demoCA# cd private
(gpu) root@VM-0-14-ubuntu:~/cert/demoCA/private# openssl genrsa -out cakey.pem 2048
Generating RSA private key, 2048 bit long modulus (2 primes)
...............................................................................+++++
....................+++++
e is 65537 (0x010001)
(gpu) root@VM-0-14-ubuntu:~/cert/demoCA/private# openssl req -sha256 -new -x509 -days 3650 -key cakey.pem -out cacert.pem \
>              -subj "/C=CN/ST=GD/L=ZhuHai/O=Jean/OU=Study/CN=RootCA"
(gpu) root@VM-0-14-ubuntu:~/cert/demoCA/private# ls
cacert.pem  cakey.pem
(gpu) root@VM-0-14-ubuntu:~/cert/demoCA/private# cd .. && mv ./private/cacert.pem ./
(gpu) root@VM-0-14-ubuntu:~/cert/demoCA# ls
cacert.pem  index.txt  newcerts  private  serial

The commands above perform a series of steps:

A. Create the /root/cert directory under root's home directory /root.

B. Under it, build the private CA's directory structure ./demoCA; in openssl's default configuration the CA lives in ./demoCA relative to the current directory.

C. Generate the CA's private key cakey.pem.

D. Issue the CA's self-signed certificate cacert.pem and move it into ./demoCA; when the private CA later signs the server certificate, openssl looks for the CA root certificate there by default.

E. Finally, list the demoCA directory structure.

You can locate openssl's default configuration file and check that the private CA is expected in ./demoCA under the current directory:

(gpu) root@VM-0-14-ubuntu:~# find / -name openssl.cnf
/usr/lib/ssl/openssl.cnf
/usr/local/anaconda3/pkgs/openssl-1.1.1q-h7f8727e_0/ssl/openssl.cnf
/usr/local/anaconda3/ssl/openssl.cnf
/usr/local/anaconda3/envs/gpu/ssl/openssl.cnf
/usr/local/anaconda3/envs/hub/ssl/openssl.cnf
/etc/ssl/openssl.cnf
(gpu) root@VM-0-14-ubuntu:~# vi /usr/lib/ssl/openssl.cnf
####################################################################
[ ca ]
default_ca      = CA_default            # The default ca section

####################################################################
[ CA_default ]

dir             = ./demoCA              # Where everything is kept
certs           = $dir/certs            # Where the issued certs are kept
crl_dir         = $dir/crl              # Where the issued crl are kept
database        = $dir/index.txt        # database index file.
#unique_subject = no                    # Set to 'no' to allow creation of
                                        # several certs with same subject.
new_certs_dir   = $dir/newcerts         # default place for new certs.

certificate     = $dir/cacert.pem       # The CA certificate
serial          = $dir/serial           # The current serial number
crlnumber       = $dir/crlnumber        # the current crl number
                                        # must be commented out to leave a V1 CRL
crl             = $dir/crl.pem          # The current CRL
private_key     = $dir/private/cakey.pem# The private key
RANDFILE        = $dir/private/.rand    # private random number file

x509_extensions = usr_cert              # The extensions to add to the cert

F. Generate the server key and certificate signing request.

See post 1 and post 2: first run the following command to create /root/.rnd, otherwise generating the server key fails.

openssl rand -out /root/.rnd -hex 256

  Change to /root/cert, the parent directory of ./demoCA, then run the commands below to generate the server key and certificate signing request. The request uses the configuration file /usr/lib/ssl/openssl.cnf plus an extra Subject Alternative Name (SAN); Chrome uses the SAN to check that the certificate matches the URL. Since access is via https://ip, the SAN here is IP.1:106.52.33.185, i.e. the first IP certified by this certificate; there could also be IP.2 and so on. To certify a domain name instead, use DNS.1 = jeanye.cn and so forth. This produces the request file server.csr.

(gpu) root@VM-0-14-ubuntu:~/cert# openssl genrsa -out server.key 2048
(gpu) root@VM-0-14-ubuntu:~/cert# openssl req -new \
>     -sha256 \
>     -key server.key \
>     -subj "/C=CN/ST=GD/L=ZhuHai/O=Jean/OU=Study/CN=106.52.33.185" \
>     -reqexts SAN \
>     -config <(cat /usr/lib/ssl/openssl.cnf \
>         <(printf "[SAN]\nsubjectAltName=IP.1:106.52.33.185")) \
>     -out server.csr

G. Sign the server certificate.

  openssl finds cakey.pem and cacert.pem in the default ./demoCA subdirectory and signs the certificate according to the request in server.csr, using /usr/lib/ssl/openssl.cnf and the same certificate extensions (the SAN) as the request, writing the result to server.crt.

(gpu) root@VM-0-14-ubuntu:~/cert# openssl ca -in server.csr \
>         -md sha256 \
>     -extensions SAN \
>     -config <(cat /usr/lib/ssl/openssl.cnf \
>         <(printf "[SAN]\nsubjectAltName=IP.1:106.52.33.185")) \
>      -out server.crt
Using configuration from /dev/fd/63
Check that the request matches the signature
Signature ok
Certificate Details:
        Serial Number: 1 (0x1)
        Validity
            Not Before: Nov  2 09:47:58 2022 GMT
            Not After : Nov  2 09:47:58 2023 GMT
        Subject:
            countryName               = CN
            stateOrProvinceName       = GD
            organizationName          = Jean
            organizationalUnitName    = Study
            commonName                = 106.52.33.185
        X509v3 extensions:
            X509v3 Subject Alternative Name: 
                IP Address:106.52.33.185
Certificate is to be certified until Nov  2 09:47:58 2023 GMT (365 days)
Sign the certificate? [y/n]:y


1 out of 1 certificate requests certified, commit? [y/n]y
Write out database with 1 new entries
Data Base Updated
(gpu) root@VM-0-14-ubuntu:~/cert# ls
demoCA  server.crt  server.csr  server.key

H. Import the private CA's root certificate into the browser.

  Download the private CA's root certificate /root/cert/demoCA/cacert.pem to the client (for example Windows 10) and import it into the browser's (for example Chrome's) Trusted Root Certification Authorities store.

In Chrome:

  Settings -> Privacy and security -> Security -> Advanced -> Manage certificates -> Trusted Root Certification Authorities -> Import -> Next -> Browse -> All Files (*.*)

Screenshot: importing the private CA root certificate into the browser's trusted root CA list

I. Open https://106.52.33.185:8000 in the browser and log in with a username and password.

Screenshot: logging in with a username/password and starting one's own Jupyter Lab instance
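  Besides the browser, the encrypted endpoint can also be checked from the server itself; this is a sketch using the requests library (bundled with Anaconda), with the IP and certificate path from the setup above:

import requests

# Request the JupyterHub login page over HTTPS, validating against the private CA root certificate.
r = requests.get('https://106.52.33.185:8000', verify='/root/cert/demoCA/cacert.pem')
print(r.status_code)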

8. Configure JupyterHub as a service that starts at boot.

1) Create the service unit file.

First look at the PATH of the conda virtual environment "gpu":

(gpu) root@VM-0-14-ubuntu:~# echo $PATH
/usr/local/anaconda3/envs/gpu/bin:/usr/local/anaconda3/condabin:/usr/local/cuda-11.2/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
(gpu) root@VM-0-14-ubuntu:~# 

Then create a new systemd unit file for the daemon:

(gpu) root@VM-0-14-ubuntu:~# vi /etc/systemd/system/jupyterhub.service

The contents are as follows; a few key points:

A. Run as root.

B. Set PATH explicitly: a process started at boot does not go through a login, so /etc/profile and the like are never sourced; copy in the PATH shown above.

C. Invoke jupyterhub by its full path.

[Unit]
Description=Jupyterhub service
After=syslog.target network.target

[Service]
User=root
Environment="PATH=/usr/local/anaconda3/envs/gpu/bin:/usr/local/anaconda3/condabin:/usr/local/cuda-11.2/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
ExecStart=/usr/local/anaconda3/envs/gpu/bin/jupyterhub -f /etc/jupyterhub/jupyterhub_config.py

[Install]
WantedBy=multi-user.target

Then enable the service:

(gpu) root@VM-0-14-ubuntu:~# systemctl enable jupyterhub.service

The service can then be managed with these commands:

# systemctl status jupyterhub.service
# systemctl start jupyterhub.service
# systemctl stop jupyterhub.service

View the service log with:

(gpu) root@VM-0-14-ubuntu:~# journalctl -u jupyterhub.service -f

In the JupyterHub configuration above, the log is also written to this file:

c.JupyterHub.extra_log_file = '/var/log/jupyterhub.log'

So the log file can be viewed directly as well.

With this in place, JupyterHub starts automatically whenever the server reboots.

That concludes this article: the parts of the Linux GPU virtual host related to the GPU and the Python deep learning runtime and development environment are now configured. RStudio, Shiny and the other pieces will be covered in separate articles.
