We recommend following the "跟李沐学AI" (Learn AI with Li Mu) series on Bilibili as a companion, though the Datawhale course is also excellently taught.
Code demonstration: to be used together with this chapter's learning materials
Part 1: Tensor operations
This part demonstrates some basic operations on Tensors.
In[1]:
import torch
In[2]:
?torch.tensor
In[3]:
# Create tensors, specifying the data type with dtype. Note that the value should match the type
a = torch.tensor(1.0, dtype=torch.float)
b = torch.tensor(1, dtype=torch.long)
c = torch.tensor(1.0, dtype=torch.int8)
print(a, b, c)
tensor(1.) tensor(1) tensor(1, dtype=torch.int8)
<ipython-input-3-da5fc7804e0f>:4: DeprecationWarning: an integer is required (got type float). Implicit conversion to integers using __int__ is deprecated, and may be removed in a future version of Python.
c = torch.tensor(1.0, dtype=torch.int8)
In[4]:
# Type-specific constructors create a tensor of the given size; the values are uninitialized, hence arbitrary
d = torch.FloatTensor(2, 3)
e = torch.IntTensor(2)
f = torch.IntTensor([1, 2, 3, 4])  # existing Python data structures (e.g. lists) can be converted directly
print(d, '\n', e, '\n', f)
tensor([[0.0000e+00, 8.4490e-39, 5.2003e+22],
[1.0692e-05, 1.3237e+22, 2.7448e-06]])
tensor([ 0, 1083129856], dtype=torch.int32)
tensor([1, 2, 3, 4], dtype=torch.int32)
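A side note not in the original notebook: the type-specific constructors above are a legacy API; an equivalent modern idiom is torch.empty (or torch.tensor for existing data) with an explicit dtype. A minimal sketch:
In[ ]:
d2 = torch.empty(2, 3, dtype=torch.float)   # uninitialized, like torch.FloatTensor(2, 3)
e2 = torch.empty(2, dtype=torch.int32)      # uninitialized, like torch.IntTensor(2)
f2 = torch.tensor([1, 2, 3, 4], dtype=torch.int32)
print(d2, '\n', e2, '\n', f2)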
In[6]:
# Converting between tensors and numpy arrays
import numpy as np
g = np.array([[1, 2, 3], [4, 5, 6]])
h = torch.tensor(g)      # copies the data
print(h)
i = torch.from_numpy(g)  # shares memory with g
print(i)
j = h.numpy()
print(j)
tensor([[1, 2, 3],
[4, 5, 6]])
tensor([[1, 2, 3],
[4, 5, 6]])
[[1 2 3]
[4 5 6]]
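The conversion paths above differ in how they handle memory: torch.tensor always copies, while torch.from_numpy and Tensor.numpy() share storage with the array on CPU, so in-place changes propagate. A minimal sketch (not part of the original notebook) to illustrate:
In[ ]:
g2 = np.array([[1, 2, 3], [4, 5, 6]])
copied = torch.tensor(g2)       # independent copy
shared = torch.from_numpy(g2)   # shares g2's memory

g2[0, 0] = 100                  # modify the numpy array in place
print(copied[0, 0])             # tensor(1): the copy is unaffected
print(shared[0, 0])             # tensor(100): the shared view reflects the change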
In[10]:
# Common tensor constructors
k = torch.rand(2, 3)
l = torch.ones(2, 3)
m = torch.zeros(2, 3)
n = torch.arange(0, 10, 2)
print(k, '\n', l, '\n', m, '\n', n)
tensor([[0.2652, 0.0650, 0.5593],
[0.7864, 0.0015, 0.4458]])
tensor([[1., 1., 1.],
[1., 1., 1.]])
tensor([[0., 0., 0.],
[0., 0., 0.]])
tensor([0, 2, 4, 6, 8])
In[11]:
# Inspect a tensor's shape (two equivalent ways)
print(k.shape)
print(k.size())
torch.Size([2, 3])
torch.Size([2, 3])
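Beyond its shape, a tensor's other metadata can be inspected in the same spirit (a small addition, not in the original notebook):
In[ ]:
print(k.dtype)    # torch.float32
print(k.device)   # cpu (or cuda:0 if the tensor lives on a GPU)
print(k.dim())    # 2: number of dimensions
print(k.numel())  # 6: total number of elements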
In[12]:
# Tensor arithmetic
o = torch.add(k, l)
print(o)
tensor([[1.2652, 1.0650, 1.5593],
[1.7864, 1.0015, 1.4458]])
In[13]:
# Tensor indexing works much like numpy's
print(o[:, 1])
print(o[0, :])
tensor([1.0650, 1.0015])
tensor([1.2652, 1.0650, 1.5593])
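As in numpy, basic slicing returns a view that shares memory with the original tensor, so writing through the slice changes the source. A minimal sketch (not from the original notebook):
In[ ]:
base = torch.ones(2, 3)
row = base[0, :]     # basic slicing returns a view
row[0] = 5.0
print(base[0, 0])    # tensor(5.): the original tensor was modified through the view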
In[16]:
# view: a handy way to reshape a tensor
print(o.view((3, 2)))
print(o.view(-1, 2))  # -1 lets PyTorch infer that dimension
tensor([[1.2652, 1.0650],
[1.5593, 1.7864],
[1.0015, 1.4458]])
tensor([[1.2652, 1.0650],
[1.5593, 1.7864],
[1.0015, 1.4458]])
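Two caveats worth knowing about view, noted here as an aside (not part of the original notebook): the result shares storage with the original tensor, and view only works on contiguous data, whereas reshape copies when it has to. A minimal sketch:
In[ ]:
w = torch.ones(2, 3)
v = w.view(3, 2)
v[0, 0] = 0.0
print(w[0, 0])        # 0.0: w changed too, because view shares memory

wt = w.t()            # a transpose is non-contiguous
print(wt.reshape(6))  # works; wt.view(6) would raise a RuntimeError here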
In[17]:
# Tensor broadcasting (keep this behaviour in mind when shapes differ)
p = torch.arange(1, 3).view(1, 2)
print(p)
q = torch.arange(1, 4).view(3, 1)
print(q)
print(p + q)
tensor([[1, 2]])
tensor([[1],
[2],
[3]])
tensor([[2, 3],
[3, 4],
[4, 5]])
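Broadcasting aligns shapes from the trailing dimension and virtually stretches size-1 dimensions, so the (1, 2) and (3, 1) operands above are both treated as (3, 2). A minimal sketch (not in the original notebook) making that expansion explicit:
In[ ]:
p_exp = p.expand(3, 2)                     # (1, 2) -> (3, 2)
q_exp = q.expand(3, 2)                     # (3, 1) -> (3, 2)
print(torch.equal(p + q, p_exp + q_exp))   # True: same result as the broadcast addition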
In[18]:
# Adding & removing dimensions of size 1: unsqueeze / squeeze
print(o)
r = o.unsqueeze(1)  # insert a new dimension of size 1 at position 1
print(r)
print(r.shape)
tensor([[1.2652, 1.0650, 1.5593],
[1.7864, 1.0015, 1.4458]])
tensor([[[1.2652, 1.0650, 1.5593]],
[[1.7864, 1.0015, 1.4458]]])
torch.Size([2, 1, 3])
In[19]:
s = r.squeeze(0)  # no effect: dimension 0 has size 2, and squeeze only removes size-1 dimensions
print(s)
print(s.shape)
tensor([[[1.2652, 1.0650, 1.5593]],
[[1.7864, 1.0015, 1.4458]]])
torch.Size([2, 1, 3])
In[20]:
t = r.squeeze(1)  # dimension 1 has size 1, so it is removed
print(t)
print(t.shape)
tensor([[1.2652, 1.0650, 1.5593],
[1.7864, 1.0015, 1.4458]])
torch.Size([2, 3])
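A common practical use of unsqueeze, sketched here as an aside (not part of the original notebook), is adding a batch dimension before feeding a single sample to a model:
In[ ]:
img = torch.rand(3, 28, 28)      # one image: channels × height × width
batch = img.unsqueeze(0)         # shape (1, 3, 28, 28): a batch containing one image
print(batch.shape)
print(batch.squeeze(0).shape)    # back to torch.Size([3, 28, 28])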
Part 2: Autograd
This part uses a simple function, y = x1 + 2*x2, to illustrate PyTorch's automatic differentiation. Since ∂y/∂x1 = 1 and ∂y/∂x2 = 2, those are the gradient values we expect to see after backpropagation.
In[22]:
import torch
x1 = torch.tensor(1.0, requires_grad=True)
x2 = torch.tensor(2.0, requires_grad=True)
y = x1 + 2*x2
print(y)
tensor(5., grad_fn=<AddBackward0>)
In[23]:
# First, check whether each variable requires gradients
print(x1.requires_grad)
print(x2.requires_grad)
print(y.requires_grad)
True
True
True
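The grad_fn shown in y's output above records the operation that produced it in the computation graph; user-created leaf tensors such as x1 have no grad_fn. A small illustrative addition (not in the original notebook):
In[ ]:
print(y.grad_fn)               # <AddBackward0 ...>: y was produced by an addition
print(x1.grad_fn)              # None: x1 is a leaf tensor created directly by the user
print(x1.is_leaf, y.is_leaf)   # True False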
In[24]:
# Check each variable's gradient. No backward pass has run yet, so .grad is still None and the code below raises an error
print(x1.grad.data)
print(x2.grad.data)
print(y.grad.data)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_11770/1707027577.py in <module>
      1 # Check each variable's gradient. No backward pass has run yet, so .grad is still None and the code below raises an error
----> 2 print(x1.grad.data)
      3 print(x2.grad.data)
      4 print(y.grad.data)

AttributeError: 'NoneType' object has no attribute 'data'
In[25]:
x1
Out[25]:
tensor(1., requires_grad=True)
In[26]:
## Look at the gradients after a backward pass
y = x1 + 2*x2
y.backward()
print(x1.grad.data)
print(x2.grad.data)
tensor(1.)
tensor(2.)
In[30]:
# Gradients accumulate: rerunning the same commands makes grad grow
y = x1 + 2*x2
y.backward()
print(x1.grad.data)
print(x2.grad.data)
tensor(5.)
tensor(10.)
In[ ]:
# Gradients therefore have to be cleared before each new computation to avoid accumulation; in practice this is done through PyTorch's optimizer, covered later
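A minimal sketch (not executed in the original notebook) of the two usual ways to clear gradients: zeroing the .grad of the leaf tensors directly, or calling zero_grad() on an optimizer that manages them.
In[ ]:
# Clear gradients manually on the leaf tensors
x1.grad.zero_()
x2.grad.zero_()

y = x1 + 2*x2
y.backward()
print(x1.grad)   # tensor(1.): no longer accumulated

# Equivalently, when the tensors are managed by an optimizer:
opt = torch.optim.SGD([x1, x2], lr=0.1)
opt.zero_grad()  # resets the .grad of every managed parameter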
In[31]:
# What happens if gradient tracking is disabled?
x1 = torch.tensor(1.0, requires_grad=False)
x2 = torch.tensor(2.0, requires_grad=False)
y = x1 + 2*x2
y.backward()
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_11770/4087792071.py in <module>
      3 x2 = torch.tensor(2.0, requires_grad=False)
      4 y = x1 + 2*x2
----> 5 y.backward()

/data1/ljq/anaconda3/envs/smp/lib/python3.8/site-packages/torch/_tensor.py in backward(self, gradient, retain_graph, create_graph, inputs)
    253             create_graph=create_graph,
    254             inputs=inputs)
--> 255         torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
    256
    257     def register_hook(self, hook):

/data1/ljq/anaconda3/envs/smp/lib/python3.8/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
    145             retain_graph = create_graph
    146
--> 147     Variable._execution_engine.run_backward(
    148         tensors, grad_tensors_, retain_graph, create_graph, inputs,
    149         allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
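Conversely, gradient tracking can also be switched off on purpose, e.g. during inference, with torch.no_grad(); a brief sketch not in the original notebook:
In[ ]:
x = torch.tensor(1.0, requires_grad=True)
with torch.no_grad():
    z = x * 2           # computed without building a computation graph
print(z.requires_grad)  # False: calling z.backward() would raise the same RuntimeError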