来源:https://hyunhp.tistory.com/448
1. Intuition for the RNN cell and the RNN
RNN ----> Recurrent Neural Network
You can think of the recurrent neural network as the repeated use of a single cell, where each cell performs the computations for a single time step.
2. Dimensions of the input x
2.1 Input with n_x number of units
➢ For a single time step of a single input example, x^(i)<t> is a one-dimensional input vector
➢ Using language as an example, a language with a 5000-word vocabulary could be one-hot encoded into a vector that has 5000 units, so x^(i)<t> would have the shape (5000,) (see the sketch below)
➢ The notation n_x is used here to denote the number of units in a single time step of a single training example
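For example, one time step of one training example could be one-hot encoded like this (a minimal sketch; the vocabulary size matches the example above, and the word index is made up for illustration):
import numpy as np

vocab_size = 5000
word_index = 1234                 # hypothetical position of a word in the vocabulary
xt_single = np.zeros(vocab_size)  # one time step of a single example, shape (5000,)
xt_single[word_index] = 1.0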
2.2 Time steps of size T_x
➢ A recurrent neural network has multiple time steps, which you'll index with t.
➢ In the lessons, you saw a single training example x^(i) consisting of multiple time steps T_x. In this notebook, T_x will denote the number of time steps in the longest sequence.
2.3 Batches of size m
➢ Let's say we have mini-batches, each with 20 training examples
➢ To benefit from vectorization, you'll stack 20 columns of x^(i) examples
➢ For example, this tensor has the shape (5000, 20, 10)
➢ You'll use m to denote the number of training examples
➢ So, the shape of a mini-batch is (n_x, m, T_x)
2.4 3D tensor of shape (n_x, m, T_x)
➢ The 3-dimensional tensor x of shape (n_x, m, T_x) represents the input x that is fed into the RNN
2.5 Take a 2D slice for each time step: x^<t>
➢ At each time step, you'll use a mini-batch of training examples (not just a single example)
➢ So, for each time step t, you'll use a 2D slice of shape (n_x, m)
➢ This 2D slice is referred to as x^<t>. The variable name in the code is xt. (See the sketch below.)
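Putting this section together, a minimal shape sketch using the example numbers 5000, 20 and 10 from above:
import numpy as np

n_x, m, T_x = 5000, 20, 10       # units per time step, mini-batch size, number of time steps
x = np.zeros((n_x, m, T_x))      # 3D input tensor fed into the RNN
xt = x[:, :, 3]                  # 2D slice x^<t> for time step t = 3
print(xt.shape)                  # (5000, 20)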
3. Dimensions of the hidden state a
The activation a^<t> that is passed to the RNN from one time step to another is called a "hidden state".
3.1 Dimensions of hidden state a
➢ Similar to the input tensor x, the hidden state for a single training example is a vector of length n_a
➢ If you include a mini-batch of m training examples, the shape of a mini-batch is (n_a, m)
➢ When you include the time step dimension, the shape of the hidden state is (n_a, m, T_x)
➢ You'll loop through the time steps with index t, and work with a 2D slice of the 3D tensor
➢ This 2D slice is referred to as a^<t>
➢ In the code, the variable names used are either a_prev or a_next, depending on the function being implemented
➢ The shape of this 2D slice is (n_a, m), as sketched below
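A corresponding sketch for the hidden state (n_a = 5 is an arbitrary illustrative choice):
import numpy as np

n_a, m, T_x = 5, 20, 10
a = np.zeros((n_a, m, T_x))        # hidden states for every time step
a_prev = a[:, :, 2]                # 2D slice used as a^<t-1> at step t = 3
a_next = a[:, :, 3]                # 2D slice a^<t> produced at step t = 3
print(a_prev.shape, a_next.shape)  # (5, 20) (5, 20)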
4. Dimensions of the prediction ŷ
➢ Similar to the inputs and hidden states, ŷ is a 3D tensor of shape (n_y, m, T_y)
■ n_y: number of units in the vector representing the prediction
■ m: number of examples in a mini-batch
■ T_y: number of time steps in the prediction
➢ For a single time step t, a 2D slice ŷ^<t> has shape (n_y, m)
➢ In the code, the variable names are:
● y_pred: ŷ
● yt_pred: ŷ^<t> (see the sketch below)
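And for the prediction tensor (n_y = 2 is again an arbitrary illustrative choice):
import numpy as np

n_y, m, T_y = 2, 20, 10
y_pred = np.zeros((n_y, m, T_y))  # predictions for every time step
yt_pred = y_pred[:, :, 3]         # 2D slice ŷ^<t> at time step t = 3
print(yt_pred.shape)              # (2, 20)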
5. Building the RNN
➢ Here is how you can implement an RNN:
Steps:
● Implement the calculations needed for one time step of the RNN.
● Implement a loop over time steps in order to process all the inputs, one at a time.
➢ About the RNN cell
You can think of the recurrent neural network as the repeated use of a single cell. First, you'll implement the computations for a single time step.
➢ RNN cell versus rnn_cell_forward:
● Note that an RNN cell outputs the hidden state a^<t>
■ The RNN cell is shown in the figure as the inner box with solid lines
● The function that you'll implement, rnn_cell_forward, also calculates the prediction ŷ^<t>
■ rnn_cell_forward is shown in the figure as the outer box with dashed lines
➢ The following figure describes the operations for a single time step of an RNN cell:
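(The figure is not reproduced here; the two operations it depicts, and which the code below implements, are:)
a^<t> = tanh(Waa · a^<t-1> + Wax · x^<t> + ba)
ŷ^<t> = softmax(Wya · a^<t> + by)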
The code is as follows:
# UNQ_C1 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: rnn_cell_forward
def rnn_cell_forward(xt, a_prev, parameters):
    """
    Implements a single forward step of the RNN-cell as described in Figure (2)

    Arguments:
    xt -- your input data at timestep "t", numpy array of shape (n_x, m).
    a_prev -- Hidden state at timestep "t-1", numpy array of shape (n_a, m)
    parameters -- python dictionary containing:
        Wax -- Weight matrix multiplying the input, numpy array of shape (n_a, n_x)
        Waa -- Weight matrix multiplying the hidden state, numpy array of shape (n_a, n_a)
        Wya -- Weight matrix relating the hidden-state to the output, numpy array of shape (n_y, n_a)
        ba -- Bias, numpy array of shape (n_a, 1)
        by -- Bias relating the hidden-state to the output, numpy array of shape (n_y, 1)

    Returns:
    a_next -- next hidden state, of shape (n_a, m)
    yt_pred -- prediction at timestep "t", numpy array of shape (n_y, m)
    cache -- tuple of values needed for the backward pass, contains (a_next, a_prev, xt, parameters)
    """
    # Retrieve parameters from "parameters"
    Wax = parameters["Wax"]
    Waa = parameters["Waa"]
    Wya = parameters["Wya"]
    ba = parameters["ba"]
    by = parameters["by"]

    ### START CODE HERE ### (≈2 lines)
    # compute next activation state using the formula given above
    a_next = np.tanh(np.dot(Wax, xt) + np.dot(Waa, a_prev) + ba)
    # compute output of the current cell using the formula given above
    yt_pred = softmax(np.dot(Wya, a_next) + by)
    ### END CODE HERE ###

    # store values you need for backward propagation in cache
    cache = (a_next, a_prev, xt, parameters)

    return a_next, yt_pred, cache
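Note that rnn_cell_forward relies on numpy and a softmax helper that are not defined in this excerpt (the original notebook imports them separately). A minimal sketch, assuming a column-wise softmax:
import numpy as np

def softmax(x):
    # subtract the per-column max for numerical stability, then normalize each column to sum to 1
    e_x = np.exp(x - np.max(x, axis=0, keepdims=True))
    return e_x / np.sum(e_x, axis=0, keepdims=True)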
Run the code above:
def rnn_cell_forward_tests(rnn_cell_forward):
    np.random.seed(1)
    xt_tmp = np.random.randn(3, 10)
    a_prev_tmp = np.random.randn(5, 10)
    parameters_tmp = {}
    parameters_tmp['Waa'] = np.random.randn(5, 5)
    parameters_tmp['Wax'] = np.random.randn(5, 3)
    parameters_tmp['Wya'] = np.random.randn(2, 5)
    parameters_tmp['ba'] = np.random.randn(5, 1)
    parameters_tmp['by'] = np.random.randn(2, 1)
    a_next_tmp, yt_pred_tmp, cache_tmp = rnn_cell_forward(xt_tmp, a_prev_tmp, parameters_tmp)
    print("a_next[4] = \n", a_next_tmp[4])
    print("a_next.shape = \n", a_next_tmp.shape)
    print("yt_pred[1] =\n", yt_pred_tmp[1])
    print("yt_pred.shape = \n", yt_pred_tmp.shape)

# UNIT TESTS
rnn_cell_forward_tests(rnn_cell_forward)
6. RNN forward pass
➢ A recurrent neural network (RNN) is a repetition of the RNN cell that you've just built.
● If your input sequence of data is 10 time steps long, then you will re-use the RNN cell 10 times
➢ Each cell takes two inputs at each time step:
● a^<t-1>: the hidden state from the previous cell
● x^<t>: the current time step's input data
➢ It has two outputs at each time step:
● A hidden state a^<t>
● A prediction ŷ^<t>
➢ The weights and biases are reused at each time step
● They are maintained between calls to rnn_cell_forward in the 'parameters' dictionary
(Note: this reuse is not shown explicitly in the code above; it happens because the same parameters dictionary is passed into rnn_cell_forward at every time step.)
# UNQ_C2 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: rnn_forward
def rnn_forward(x, a0, parameters):
    """
    Implement the forward propagation of the recurrent neural network described in Figure (3).

    Arguments:
    x -- Input data for every time-step, of shape (n_x, m, T_x).
    a0 -- Initial hidden state, of shape (n_a, m)
    parameters -- python dictionary containing:
        Waa -- Weight matrix multiplying the hidden state, numpy array of shape (n_a, n_a)
        Wax -- Weight matrix multiplying the input, numpy array of shape (n_a, n_x)
        Wya -- Weight matrix relating the hidden-state to the output, numpy array of shape (n_y, n_a)
        ba -- Bias, numpy array of shape (n_a, 1)
        by -- Bias relating the hidden-state to the output, numpy array of shape (n_y, 1)

    Returns:
    a -- Hidden states for every time-step, numpy array of shape (n_a, m, T_x)
    y_pred -- Predictions for every time-step, numpy array of shape (n_y, m, T_x)
    caches -- tuple of values needed for the backward pass, contains (list of caches, x)
    """
    # Initialize "caches" which will contain the list of all caches
    caches = []

    # Retrieve dimensions from shapes of x and parameters["Wya"]
    n_x, m, T_x = x.shape
    n_y, n_a = parameters["Wya"].shape

    ### START CODE HERE ###
    # initialize "a" and "y_pred" with zeros (≈2 lines)
    a = np.zeros((n_a, m, T_x))
    y_pred = np.zeros((n_y, m, T_x))
    # Initialize a_next (≈1 line)
    a_next = a0
    # loop over all time-steps
    for t in range(T_x):
        # Update next hidden state, compute the prediction, get the cache (≈1 line)
        a_next, yt_pred, cache = rnn_cell_forward(x[:, :, t], a_next, parameters)
        # Save the value of the new "next" hidden state in a (≈1 line)
        a[:, :, t] = a_next
        # Save the value of the prediction in y (≈1 line)
        y_pred[:, :, t] = yt_pred
        # Append "cache" to "caches" (≈1 line)
        caches.append(cache)
    ### END CODE HERE ###

    # store values needed for backward propagation in cache
    caches = (caches, x)

    return a, y_pred, caches
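Based on how cache and caches are built above, the tuple returned by rnn_forward can be unpacked as follows (an illustrative sketch, assuming caches is the value returned by a call to rnn_forward):
caches_list, x_input = caches                        # caches = (list of per-time-step caches, x)
a_next_0, a_prev_0, xt_0, params = caches_list[0]    # tuple stored by rnn_cell_forward at t = 0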
Run the code above:
def rnn_forward_test(rnn_forward):
    np.random.seed(1)
    x_tmp = np.random.randn(3, 10, 4)
    a0_tmp = np.random.randn(5, 10)
    parameters_tmp = {}
    parameters_tmp['Waa'] = np.random.randn(5, 5)
    parameters_tmp['Wax'] = np.random.randn(5, 3)
    parameters_tmp['Wya'] = np.random.randn(2, 5)
    parameters_tmp['ba'] = np.random.randn(5, 1)
    parameters_tmp['by'] = np.random.randn(2, 1)
    a_tmp, y_pred_tmp, caches_tmp = rnn_forward(x_tmp, a0_tmp, parameters_tmp)
    print("a[4][1] = \n", a_tmp[4][1])
    print("a.shape = \n", a_tmp.shape)
    print("y_pred[1][3] =\n", y_pred_tmp[1][3])
    print("y_pred.shape = \n", y_pred_tmp.shape)
    print("caches[1][1][3] =\n", caches_tmp[1][1][3])
    print("len(caches) = \n", len(caches_tmp))

# UNIT TEST
rnn_forward_test(rnn_forward)
7. Summary
You've successfully built the forward propagation of a recurrent network from scratch.
➢ Situations when this RNN will perform better:
● This will work well enough for some applications, but it suffers from vanishing gradients.
● The RNN works best when each output ŷ^<t> can be estimated using "local" context.
● "Local" context refers to information that is close to the prediction's time step t.
● More formally, local context refers to inputs x^<t'> and predictions ŷ^<t> where t' is close to t.
➢ What you should remember:
● The recurrent neural network, or RNN, is essentially the repeated use of a single cell.
● A basic RNN reads inputs one at a time, and remembers information through the hidden layer activations (hidden states) that are passed from one time step to the next.
■ The timestep dimension determines how many times to re-use the RNN cell
● Each cell takes two inputs at each time step:
■ The hidden state from the previous cell
■ The current time step's input data
● Each cell has two outputs at each time step:
■ A hidden state
■ A prediction