Decode "Residual Network"???

"When I study Residual Network, it has made enormous confusion for me, therefore it is needed to make a memo post for later review as well as beginners' skip connection."

  • Basic idea

    • (Wikipedia quote) A residual neural network (ResNet) is an artificial neural network of a kind that builds on constructs known from pyramidal cells in the cerebral cortex.
      Residual neural networks do this by utilizing so-called skip connections, or shortcuts, to jump over some layers, in order to avoid the problems of vanishing gradients and training degradation.

    • Broadly, there are two kinds of networks to consider here: plain networks and deeper networks.

      • Plain networks usually contain fewer than about 25 layers, and their accuracy is roughly as good as it should be: the results are not bad, and adding layers still brings improvement.

      • Deeper networks contain more than about 25 layers. People often assume that the deeper the network, the better the accuracy; in reality this is often wrong, because the deeper the network, the higher the risk of vanishing or exploding gradients. Even if you add regularization to save the network from that, there is still a problem called degradation.

        • Vanishing or exploding gradients (this is simple to explain, so we omit it here).

        • Degradation - This problem has been observed while training deeper neural networks. As we increase the depth, accuracy first gets saturated, which is expected, since it takes more complex layers to model all the intricacies of the data. But if we keep adding layers past the saturation region, the accuracy of the network drops. We might suspect that this happens due to overfitting, but it actually does not: the additional layers in a deep model lead to higher training errors (training, not testing).

          [Figure: training error rising as plain-network depth increases past saturation]

          This is hard to grasp at first, since the common intuition when training a neural network is to make it deeper in order to achieve higher accuracy; we often assume more layers mean better results.

    • So, in order to avoid all those problems, researchers came up with the residual network, which has proven to be a good solution for deeper neural networks (25+ layers).

      [Figure: comparison between a plain network and a ResNet]

  • How

    • Intuition behind ResNet
      • What is a residual - A residual is the error in a result. For example, suppose the task is to find someone's age: if the actual age is 20 and you guessed 18, you are off by 2, and that 2 is the residual. In essence, the residual is what you should have added to your prediction to match the actual data. It is important to realize that when the residual is 0, we do nothing, since the prediction already matches the actual data.
      [Figure: prediction x, the residual() correction, and the Identity path to the Actual]

      In the diagram, x is our prediction and we want it to be equal to the Actual. However, if it is off by a margin, our residual function residual() kicks in and produces the residual of the operation, so as to correct our prediction to match the actual. If x == Actual, residual(x) is 0. The Identity function just copies x.
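
      As a toy sketch of the age example above (the numbers and the residual() helper are purely illustrative):

      # Toy residual correction: residual() returns what must be added
      # to the prediction to match the actual value.
      actual = 20
      prediction = 18

      def residual(x):
          return actual - x          # 0 when the prediction already matches

      corrected = prediction + residual(prediction)   # 18 + 2 == 20
      assert corrected == actual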

    • How ResNet works
      • We want to go deeper without degradation in accuracy or error rate. We can do this by injecting identity mappings.

      • We want to be able to learn the residuals so that our predictions are close to the actuals.

      • Shortcut connections are those skipping one or more layers. In our case, the shortcut connections simply perform identity mapping, and their outputs are added to the outputs of the stacked layers. Identity shortcut connections add neither extra parameter nor computational complexity. The entire network can still be trained end-to-end by SGD with backpropagation, and can be easily implemented using common libraries without modifying the solvers.


        [Figure: a residual building block - stacked layers F(x) plus an identity shortcut x]

        H(x) = F(x) + x, where F(x) = W2 * relu(W1 * x + b1) + b2

        During training, the residual network learns the weights of its layers such that, if the identity mapping were optimal, all the weights would get set to 0. In effect F(x) becomes 0, so x gets directly mapped to H(x) and no correction needs to be made. These become your identity mappings, which help grow the network deep. And if there is a deviation from the optimal identity mapping, the weights and biases of F(x) are learned to adjust for it. Think of F(x) as learning how to adjust our predictions to match the actuals.
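
        As a minimal Keras sketch of H(x) = F(x) + x above (the 64-unit width is an arbitrary choice for illustration):

        from keras.layers import Input, Dense, add
        from keras.models import Model

        x = Input(shape=(64,))
        f = Dense(64, activation="relu")(x)   # relu(W1 * x + b1)
        f = Dense(64)(f)                      # W2 * (...) + b2, i.e. F(x)
        h = add([f, x])                       # H(x) = F(x) + x
        block = Model(inputs=x, outputs=h)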


Conclusion

Deep residual networks work well due to the flow of information from the very first layer to the last layer of the network. By formulating residual functions around identity mappings, information is able to flow unimpeded throughout the entire network. This allows any layer to be represented as a function of the original input. Pre-activation ResNets place batch normalization and ReLU before the convolution, so that the output of the addition becomes the output of the layer; this achieves the identity effect we desire (the Unit function in the PS-3 code below is exactly such a pre-activation block).
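
One way to see the "unimpeded flow" concretely: since H(x) = F(x) + x, differentiating gives dH/dx = dF/dx + 1, so even when dF/dx is tiny, the shortcut still contributes a gradient of 1 and the backpropagated signal never vanishes.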

[Figure: pre-activation residual unit, with batch normalization and ReLU before the convolution]


PS-2

  • Take a close look at the residual block
    [Figures: a residual block and the derivation of a[l+2] from a[l]]

    The main takeaway here is to make a[l+2] == relu(a[l]), so that the gradient at every single layer can be computed with the original input taken into consideration. Given the above equation, when G and H are identity functions, information always flows unimpeded and gradients never vanish, no matter how deep we go.
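
    Spelling this out in the same notation (a sketch of the standard argument; W[l+2] and b[l+2] denote the weights and biases of the second stacked layer):

    a[l+2] = g(z[l+2] + a[l]), where z[l+2] = W[l+2] * a[l+1] + b[l+2]

    If regularization drives W[l+2] and b[l+2] toward 0, then a[l+2] = g(a[l]) = relu(a[l]) = a[l], since a[l] is itself the output of a ReLU and hence non-negative. And because a[l] enters the sum directly, the gradient of a[l+2] with respect to a[l] contains an additive identity term, which is why it cannot vanish.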

PS-3 ResNet code example

#import needed classes
import keras
from keras.datasets import cifar10
from keras.layers import Dense,Conv2D,MaxPooling2D,Flatten,AveragePooling2D,Dropout,BatchNormalization,Activation,Input
from keras.models import Model
from keras.optimizers import Adam
from keras.callbacks import LearningRateScheduler
from keras.callbacks import ModelCheckpoint
from math import ceil
import os
from keras.preprocessing.image import ImageDataGenerator


def Unit(x,filters,pool=False):
    """A pre-activation residual unit: (BN -> ReLU -> Conv) twice, plus a shortcut."""
    res = x
    if pool:
        # Downsample the main path; use a strided 1x1 conv so the shortcut
        # matches the new spatial size and channel count before the addition.
        x = MaxPooling2D(pool_size=(2, 2))(x)
        res = Conv2D(filters=filters,kernel_size=[1,1],strides=(2,2),padding="same")(res)
    out = BatchNormalization()(x)
    out = Activation("relu")(out)
    out = Conv2D(filters=filters, kernel_size=[3, 3], strides=[1, 1], padding="same")(out)

    out = BatchNormalization()(out)
    out = Activation("relu")(out)
    out = Conv2D(filters=filters, kernel_size=[3, 3], strides=[1, 1], padding="same")(out)

    # H(x) = F(x) + x: merge the shortcut with the residual branch.
    out = keras.layers.add([res,out])

    return out

#Define the model


def MiniModel(input_shape):
    images = Input(input_shape)
    net = Conv2D(filters=32, kernel_size=[3, 3], strides=[1, 1], padding="same")(images)
    net = Unit(net,32)
    net = Unit(net,32)
    net = Unit(net,32)

    net = Unit(net,64,pool=True)
    net = Unit(net,64)
    net = Unit(net,64)

    net = Unit(net,128,pool=True)
    net = Unit(net,128)
    net = Unit(net,128)

    net = Unit(net, 256,pool=True)
    net = Unit(net, 256)
    net = Unit(net, 256)

    net = BatchNormalization()(net)
    net = Activation("relu")(net)
    net = Dropout(0.25)(net)

    net = AveragePooling2D(pool_size=(4,4))(net)
    net = Flatten()(net)
    net = Dense(units=10,activation="softmax")(net)

    model = Model(inputs=images,outputs=net)

    return model

#load the cifar10 dataset
(train_x, train_y) , (test_x, test_y) = cifar10.load_data()

#normalize the data
train_x = train_x.astype('float32') / 255
test_x = test_x.astype('float32') / 255

#Subtract the (global) mean from both train and test sets
train_x = train_x - train_x.mean()
test_x = test_x - test_x.mean()

#Divide by the per-pixel standard deviation
train_x = train_x / train_x.std(axis=0)
test_x = test_x / test_x.std(axis=0)

# Generate batches of tensor image data with real-time data augmentation. 
# The data will be looped over (in batches).
datagen = ImageDataGenerator(rotation_range=10,
                             width_shift_range=5. / 32,
                             height_shift_range=5. / 32,
                             horizontal_flip=True)

# Compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied).
# Note: only augmentation options are enabled above, so this fit is not strictly needed here.
datagen.fit(train_x)



#Encode the labels to vectors
train_y = keras.utils.to_categorical(train_y,10)
test_y = keras.utils.to_categorical(test_y,10)

#Build the model for 32x32 RGB CIFAR-10 images


input_shape = (32,32,3)
model = MiniModel(input_shape)

#Print a Summary of the model

model.summary()
#Specify the training components
model.compile(optimizer=Adam(0.001),loss="categorical_crossentropy",metrics=["accuracy"])



epochs = 50
steps_per_epoch = ceil(50000/128)

# Fit the model on the batches generated by datagen.flow().
model.fit_generator(datagen.flow(train_x, train_y, batch_size=128),
                    validation_data=(test_x, test_y),
                    epochs=epochs,steps_per_epoch=steps_per_epoch, verbose=1, workers=4)


#Evaluate on the test set (evaluate returns [loss, accuracy]) and save the model
scores = model.evaluate(x=test_x,y=test_y,batch_size=128)
model.save("cifar10model.h5")

Running result

<pre style="box-sizing: border-box; overflow: auto; font-family: monospace; font-size: inherit; display: block; padding: 1px 0px; margin: 0px; line-height: inherit; word-break: break-all; overflow-wrap: break-word; color: black; background-color: transparent; border: 0px; border-radius: 0px; white-space: pre-wrap; vertical-align: baseline;">Using TensorFlow backend.
</pre>

<pre style="box-sizing: border-box; overflow: auto; font-family: monospace; font-size: inherit; display: block; padding: 1px 0px; margin: 0px; line-height: inherit; word-break: break-all; overflow-wrap: break-word; color: black; background-color: transparent; border: 0px; border-radius: 0px; white-space: pre-wrap; vertical-align: baseline;">Downloading data from [https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz](https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz)
170500096/170498071 [==============================] - 42s 0us/step
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 32, 32, 3)    0                                            
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 32, 32, 32)   896         input_1[0][0]                    
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 32, 32, 32)   128         conv2d_1[0][0]                   
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 32, 32, 32)   0           batch_normalization_1[0][0]      
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 32, 32, 32)   9248        activation_1[0][0]               
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 32, 32, 32)   128         conv2d_2[0][0]                   
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 32, 32, 32)   0           batch_normalization_2[0][0]      
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 32, 32, 32)   9248        activation_2[0][0]               
__________________________________________________________________________________________________
add_1 (Add)                     (None, 32, 32, 32)   0           conv2d_1[0][0]                   
                                                                 conv2d_3[0][0]                   
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 32, 32, 32)   128         add_1[0][0]                      
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 32, 32, 32)   0           batch_normalization_3[0][0]      
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 32, 32, 32)   9248        activation_3[0][0]               
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 32, 32, 32)   128         conv2d_4[0][0]                   
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 32, 32, 32)   0           batch_normalization_4[0][0]      
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 32, 32, 32)   9248        activation_4[0][0]               
__________________________________________________________________________________________________
add_2 (Add)                     (None, 32, 32, 32)   0           add_1[0][0]                      
                                                                 conv2d_5[0][0]                   
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 32, 32, 32)   128         add_2[0][0]                      
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 32, 32, 32)   0           batch_normalization_5[0][0]      
__________________________________________________________________________________________________
conv2d_6 (Conv2D)               (None, 32, 32, 32)   9248        activation_5[0][0]               
__________________________________________________________________________________________________
batch_normalization_6 (BatchNor (None, 32, 32, 32)   128         conv2d_6[0][0]                   
__________________________________________________________________________________________________
activation_6 (Activation)       (None, 32, 32, 32)   0           batch_normalization_6[0][0]      
__________________________________________________________________________________________________
conv2d_7 (Conv2D)               (None, 32, 32, 32)   9248        activation_6[0][0]               
__________________________________________________________________________________________________
add_3 (Add)                     (None, 32, 32, 32)   0           add_2[0][0]                      
                                                                 conv2d_7[0][0]                   
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 16, 16, 32)   0           add_3[0][0]                      
__________________________________________________________________________________________________
batch_normalization_7 (BatchNor (None, 16, 16, 32)   128         max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
activation_7 (Activation)       (None, 16, 16, 32)   0           batch_normalization_7[0][0]      
__________________________________________________________________________________________________
conv2d_9 (Conv2D)               (None, 16, 16, 64)   18496       activation_7[0][0]               
__________________________________________________________________________________________________
batch_normalization_8 (BatchNor (None, 16, 16, 64)   256         conv2d_9[0][0]                   
__________________________________________________________________________________________________
activation_8 (Activation)       (None, 16, 16, 64)   0           batch_normalization_8[0][0]      
__________________________________________________________________________________________________
conv2d_8 (Conv2D)               (None, 16, 16, 64)   2112        add_3[0][0]                      
__________________________________________________________________________________________________
conv2d_10 (Conv2D)              (None, 16, 16, 64)   36928       activation_8[0][0]               
__________________________________________________________________________________________________
add_4 (Add)                     (None, 16, 16, 64)   0           conv2d_8[0][0]                   
                                                                 conv2d_10[0][0]                  
__________________________________________________________________________________________________
batch_normalization_9 (BatchNor (None, 16, 16, 64)   256         add_4[0][0]                      
__________________________________________________________________________________________________
activation_9 (Activation)       (None, 16, 16, 64)   0           batch_normalization_9[0][0]      
__________________________________________________________________________________________________
conv2d_11 (Conv2D)              (None, 16, 16, 64)   36928       activation_9[0][0]               
__________________________________________________________________________________________________
batch_normalization_10 (BatchNo (None, 16, 16, 64)   256         conv2d_11[0][0]                  
__________________________________________________________________________________________________
activation_10 (Activation)      (None, 16, 16, 64)   0           batch_normalization_10[0][0]     
__________________________________________________________________________________________________
conv2d_12 (Conv2D)              (None, 16, 16, 64)   36928       activation_10[0][0]              
__________________________________________________________________________________________________
add_5 (Add)                     (None, 16, 16, 64)   0           add_4[0][0]                      
                                                                 conv2d_12[0][0]                  
__________________________________________________________________________________________________
batch_normalization_11 (BatchNo (None, 16, 16, 64)   256         add_5[0][0]                      
__________________________________________________________________________________________________
activation_11 (Activation)      (None, 16, 16, 64)   0           batch_normalization_11[0][0]     
__________________________________________________________________________________________________
conv2d_13 (Conv2D)              (None, 16, 16, 64)   36928       activation_11[0][0]              
__________________________________________________________________________________________________
batch_normalization_12 (BatchNo (None, 16, 16, 64)   256         conv2d_13[0][0]                  
__________________________________________________________________________________________________
activation_12 (Activation)      (None, 16, 16, 64)   0           batch_normalization_12[0][0]     
__________________________________________________________________________________________________
conv2d_14 (Conv2D)              (None, 16, 16, 64)   36928       activation_12[0][0]              
__________________________________________________________________________________________________
add_6 (Add)                     (None, 16, 16, 64)   0           add_5[0][0]                      
                                                                 conv2d_14[0][0]                  
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D)  (None, 8, 8, 64)     0           add_6[0][0]                      
__________________________________________________________________________________________________
batch_normalization_13 (BatchNo (None, 8, 8, 64)     256         max_pooling2d_2[0][0]            
__________________________________________________________________________________________________
activation_13 (Activation)      (None, 8, 8, 64)     0           batch_normalization_13[0][0]     
__________________________________________________________________________________________________
conv2d_16 (Conv2D)              (None, 8, 8, 128)    73856       activation_13[0][0]              
__________________________________________________________________________________________________
batch_normalization_14 (BatchNo (None, 8, 8, 128)    512         conv2d_16[0][0]                  
__________________________________________________________________________________________________
activation_14 (Activation)      (None, 8, 8, 128)    0           batch_normalization_14[0][0]     
__________________________________________________________________________________________________
conv2d_15 (Conv2D)              (None, 8, 8, 128)    8320        add_6[0][0]                      
__________________________________________________________________________________________________
conv2d_17 (Conv2D)              (None, 8, 8, 128)    147584      activation_14[0][0]              
__________________________________________________________________________________________________
add_7 (Add)                     (None, 8, 8, 128)    0           conv2d_15[0][0]                  
                                                                 conv2d_17[0][0]                  
__________________________________________________________________________________________________
batch_normalization_15 (BatchNo (None, 8, 8, 128)    512         add_7[0][0]                      
__________________________________________________________________________________________________
activation_15 (Activation)      (None, 8, 8, 128)    0           batch_normalization_15[0][0]     
__________________________________________________________________________________________________
conv2d_18 (Conv2D)              (None, 8, 8, 128)    147584      activation_15[0][0]              
__________________________________________________________________________________________________
batch_normalization_16 (BatchNo (None, 8, 8, 128)    512         conv2d_18[0][0]                  
__________________________________________________________________________________________________
activation_16 (Activation)      (None, 8, 8, 128)    0           batch_normalization_16[0][0]     
__________________________________________________________________________________________________
conv2d_19 (Conv2D)              (None, 8, 8, 128)    147584      activation_16[0][0]              
__________________________________________________________________________________________________
add_8 (Add)                     (None, 8, 8, 128)    0           add_7[0][0]                      
                                                                 conv2d_19[0][0]                  
__________________________________________________________________________________________________
batch_normalization_17 (BatchNo (None, 8, 8, 128)    512         add_8[0][0]                      
__________________________________________________________________________________________________
activation_17 (Activation)      (None, 8, 8, 128)    0           batch_normalization_17[0][0]     
__________________________________________________________________________________________________
conv2d_20 (Conv2D)              (None, 8, 8, 128)    147584      activation_17[0][0]              
__________________________________________________________________________________________________
batch_normalization_18 (BatchNo (None, 8, 8, 128)    512         conv2d_20[0][0]                  
__________________________________________________________________________________________________
activation_18 (Activation)      (None, 8, 8, 128)    0           batch_normalization_18[0][0]     
__________________________________________________________________________________________________
conv2d_21 (Conv2D)              (None, 8, 8, 128)    147584      activation_18[0][0]              
__________________________________________________________________________________________________
add_9 (Add)                     (None, 8, 8, 128)    0           add_8[0][0]                      
                                                                 conv2d_21[0][0]                  
__________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D)  (None, 4, 4, 128)    0           add_9[0][0]                      
__________________________________________________________________________________________________
batch_normalization_19 (BatchNo (None, 4, 4, 128)    512         max_pooling2d_3[0][0]            
__________________________________________________________________________________________________
activation_19 (Activation)      (None, 4, 4, 128)    0           batch_normalization_19[0][0]     
__________________________________________________________________________________________________
conv2d_23 (Conv2D)              (None, 4, 4, 256)    295168      activation_19[0][0]              
__________________________________________________________________________________________________
batch_normalization_20 (BatchNo (None, 4, 4, 256)    1024        conv2d_23[0][0]                  
__________________________________________________________________________________________________
activation_20 (Activation)      (None, 4, 4, 256)    0           batch_normalization_20[0][0]     
__________________________________________________________________________________________________
conv2d_22 (Conv2D)              (None, 4, 4, 256)    33024       add_9[0][0]                      
__________________________________________________________________________________________________
conv2d_24 (Conv2D)              (None, 4, 4, 256)    590080      activation_20[0][0]              
__________________________________________________________________________________________________
add_10 (Add)                    (None, 4, 4, 256)    0           conv2d_22[0][0]                  
                                                                 conv2d_24[0][0]                  
__________________________________________________________________________________________________
batch_normalization_21 (BatchNo (None, 4, 4, 256)    1024        add_10[0][0]                     
__________________________________________________________________________________________________
activation_21 (Activation)      (None, 4, 4, 256)    0           batch_normalization_21[0][0]     
__________________________________________________________________________________________________
conv2d_25 (Conv2D)              (None, 4, 4, 256)    590080      activation_21[0][0]              
__________________________________________________________________________________________________
batch_normalization_22 (BatchNo (None, 4, 4, 256)    1024        conv2d_25[0][0]                  
__________________________________________________________________________________________________
activation_22 (Activation)      (None, 4, 4, 256)    0           batch_normalization_22[0][0]     
__________________________________________________________________________________________________
conv2d_26 (Conv2D)              (None, 4, 4, 256)    590080      activation_22[0][0]              
__________________________________________________________________________________________________
add_11 (Add)                    (None, 4, 4, 256)    0           add_10[0][0]                     
                                                                 conv2d_26[0][0]                  
__________________________________________________________________________________________________
batch_normalization_23 (BatchNo (None, 4, 4, 256)    1024        add_11[0][0]                     
__________________________________________________________________________________________________
activation_23 (Activation)      (None, 4, 4, 256)    0           batch_normalization_23[0][0]     
__________________________________________________________________________________________________
conv2d_27 (Conv2D)              (None, 4, 4, 256)    590080      activation_23[0][0]              
__________________________________________________________________________________________________
batch_normalization_24 (BatchNo (None, 4, 4, 256)    1024        conv2d_27[0][0]                  
__________________________________________________________________________________________________
activation_24 (Activation)      (None, 4, 4, 256)    0           batch_normalization_24[0][0]     
__________________________________________________________________________________________________
conv2d_28 (Conv2D)              (None, 4, 4, 256)    590080      activation_24[0][0]              
__________________________________________________________________________________________________
add_12 (Add)                    (None, 4, 4, 256)    0           add_11[0][0]                     
                                                                 conv2d_28[0][0]                  
__________________________________________________________________________________________________
batch_normalization_25 (BatchNo (None, 4, 4, 256)    1024        add_12[0][0]                     
__________________________________________________________________________________________________
activation_25 (Activation)      (None, 4, 4, 256)    0           batch_normalization_25[0][0]     
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, 4, 4, 256)    0           activation_25[0][0]              
__________________________________________________________________________________________________
average_pooling2d_1 (AveragePoo (None, 1, 1, 256)    0           dropout_1[0][0]                  
__________________________________________________________________________________________________
flatten_1 (Flatten)             (None, 256)          0           average_pooling2d_1[0][0]        
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 10)           2570        flatten_1[0][0]                  
==================================================================================================
Total params: 4,374,538
Trainable params: 4,368,714
Non-trainable params: 5,824
__________________________________________________________________________________________________

<pre style="box-sizing: border-box; overflow: auto; font-family: monospace; font-size: inherit; display: block; padding: 1px 0px; margin: 0px; line-height: inherit; word-break: break-all; overflow-wrap: break-word; color: black; background-color: transparent; border: 0px; border-radius: 0px; white-space: pre-wrap; vertical-align: baseline;">Epoch 1/50
391/391 [==============================] - 27s 68ms/step - loss: 1.2885 - acc: 0.5326 - val_loss: 1.6630 - val_acc: 0.4961
Epoch 2/50
391/391 [==============================] - 21s 53ms/step - loss: 0.8541 - acc: 0.7001 - val_loss: 1.0465 - val_acc: 0.6674
Epoch 3/50
391/391 [==============================] - 21s 54ms/step - loss: 0.6907 - acc: 0.7593 - val_loss: 0.9077 - val_acc: 0.7053
Epoch 4/50
391/391 [==============================] - 22s 56ms/step - loss: 0.6064 - acc: 0.7902 - val_loss: 0.6870 - val_acc: 0.7732
Epoch 5/50
391/391 [==============================] - 21s 53ms/step - loss: 0.5409 - acc: 0.8119 - val_loss: 0.6286 - val_acc: 0.7820
Epoch 6/50
391/391 [==============================] - 20s 52ms/step - loss: 0.4976 - acc: 0.8276 - val_loss: 0.6467 - val_acc: 0.7915
Epoch 7/50
391/391 [==============================] - 21s 53ms/step - loss: 0.4554 - acc: 0.8428 - val_loss: 0.7318 - val_acc: 0.7812
Epoch 8/50
391/391 [==============================] - 21s 54ms/step - loss: 0.4276 - acc: 0.8515 - val_loss: 0.5955 - val_acc: 0.8024
Epoch 9/50
391/391 [==============================] - 20s 51ms/step - loss: 0.4037 - acc: 0.8592 - val_loss: 0.7164 - val_acc: 0.7742
Epoch 10/50
391/391 [==============================] - 20s 52ms/step - loss: 0.3785 - acc: 0.8691 - val_loss: 0.5306 - val_acc: 0.8272
Epoch 11/50
391/391 [==============================] - 20s 51ms/step - loss: 0.3606 - acc: 0.8747 - val_loss: 0.6534 - val_acc: 0.8090
Epoch 12/50
391/391 [==============================] - 20s 51ms/step - loss: 0.3378 - acc: 0.8816 - val_loss: 0.4706 - val_acc: 0.8475
Epoch 13/50
391/391 [==============================] - 20s 51ms/step - loss: 0.3182 - acc: 0.8888 - val_loss: 0.4721 - val_acc: 0.8438
Epoch 14/50
391/391 [==============================] - 21s 54ms/step - loss: 0.3070 - acc: 0.8941 - val_loss: 0.5304 - val_acc: 0.8327
Epoch 15/50
391/391 [==============================] - 20s 52ms/step - loss: 0.2959 - acc: 0.8972 - val_loss: 0.5714 - val_acc: 0.8310
Epoch 16/50
391/391 [==============================] - 22s 56ms/step - loss: 0.2757 - acc: 0.9032 - val_loss: 0.5431 - val_acc: 0.8413
Epoch 17/50
391/391 [==============================] - 21s 53ms/step - loss: 0.2722 - acc: 0.9045 - val_loss: 0.5690 - val_acc: 0.8257
Epoch 18/50
391/391 [==============================] - 21s 54ms/step - loss: 0.2542 - acc: 0.9105 - val_loss: 0.5157 - val_acc: 0.8502
Epoch 19/50
391/391 [==============================] - 20s 52ms/step - loss: 0.2447 - acc: 0.9150 - val_loss: 0.4588 - val_acc: 0.8625
Epoch 20/50
391/391 [==============================] - 21s 53ms/step - loss: 0.2299 - acc: 0.9180 - val_loss: 0.5702 - val_acc: 0.8410
Epoch 21/50
391/391 [==============================] - 20s 51ms/step - loss: 0.2238 - acc: 0.9207 - val_loss: 0.5116 - val_acc: 0.8418
Epoch 22/50
391/391 [==============================] - 20s 52ms/step - loss: 0.2201 - acc: 0.9242 - val_loss: 0.4404 - val_acc: 0.8655
Epoch 23/50
391/391 [==============================] - 21s 53ms/step - loss: 0.2071 - acc: 0.9270 - val_loss: 0.3913 - val_acc: 0.8784
Epoch 24/50
391/391 [==============================] - 21s 55ms/step - loss: 0.2007 - acc: 0.9300 - val_loss: 0.4831 - val_acc: 0.8581
Epoch 25/50
391/391 [==============================] - 20s 52ms/step - loss: 0.1993 - acc: 0.9298 - val_loss: 0.4367 - val_acc: 0.8684
Epoch 26/50
391/391 [==============================] - 24s 61ms/step - loss: 0.1902 - acc: 0.9327 - val_loss: 0.3972 - val_acc: 0.8818
Epoch 27/50
391/391 [==============================] - 25s 64ms/step - loss: 0.1804 - acc: 0.9355 - val_loss: 0.4377 - val_acc: 0.8714
Epoch 28/50
391/391 [==============================] - 24s 62ms/step - loss: 0.1751 - acc: 0.9396 - val_loss: 0.4713 - val_acc: 0.8644
Epoch 29/50
391/391 [==============================] - 23s 60ms/step - loss: 0.1686 - acc: 0.9399 - val_loss: 0.4441 - val_acc: 0.8689
Epoch 30/50
391/391 [==============================] - 22s 57ms/step - loss: 0.1619 - acc: 0.9436 - val_loss: 0.5143 - val_acc: 0.8729
Epoch 31/50
391/391 [==============================] - 21s 55ms/step - loss: 0.1562 - acc: 0.9439 - val_loss: 0.4043 - val_acc: 0.8834
Epoch 32/50
391/391 [==============================] - 22s 56ms/step - loss: 0.1512 - acc: 0.9463 - val_loss: 0.3830 - val_acc: 0.8895
Epoch 33/50
391/391 [==============================] - 21s 54ms/step - loss: 0.1456 - acc: 0.9482 - val_loss: 0.3707 - val_acc: 0.8900
Epoch 34/50
391/391 [==============================] - 23s 58ms/step - loss: 0.1415 - acc: 0.9498 - val_loss: 0.4362 - val_acc: 0.8788
Epoch 35/50
391/391 [==============================] - 22s 56ms/step - loss: 0.1423 - acc: 0.9501 - val_loss: 0.4081 - val_acc: 0.8881
Epoch 36/50
391/391 [==============================] - 22s 56ms/step - loss: 0.1350 - acc: 0.9523 - val_loss: 0.4355 - val_acc: 0.8809
Epoch 37/50
391/391 [==============================] - 22s 55ms/step - loss: 0.1343 - acc: 0.9526 - val_loss: 0.4465 - val_acc: 0.8825
Epoch 38/50
391/391 [==============================] - 22s 57ms/step - loss: 0.1314 - acc: 0.9526 - val_loss: 0.3857 - val_acc: 0.8941
Epoch 39/50
391/391 [==============================] - 22s 57ms/step - loss: 0.1207 - acc: 0.9574 - val_loss: 0.5319 - val_acc: 0.8636
Epoch 40/50
391/391 [==============================] - 21s 55ms/step - loss: 0.1206 - acc: 0.9569 - val_loss: 0.4038 - val_acc: 0.8907
Epoch 41/50
391/391 [==============================] - 22s 55ms/step - loss: 0.1191 - acc: 0.9578 - val_loss: 0.3672 - val_acc: 0.8963
Epoch 42/50
391/391 [==============================] - 21s 54ms/step - loss: 0.1148 - acc: 0.9596 - val_loss: 0.4449 - val_acc: 0.8819
Epoch 43/50
391/391 [==============================] - 21s 54ms/step - loss: 0.1116 - acc: 0.9591 - val_loss: 0.4252 - val_acc: 0.8844
Epoch 44/50
391/391 [==============================] - 22s 55ms/step - loss: 0.1097 - acc: 0.9612 - val_loss: 0.5019 - val_acc: 0.8774
Epoch 45/50
391/391 [==============================] - 22s 55ms/step - loss: 0.1066 - acc: 0.9619 - val_loss: 0.4458 - val_acc: 0.8822
Epoch 46/50
391/391 [==============================] - 22s 56ms/step - loss: 0.1032 - acc: 0.9634 - val_loss: 0.4647 - val_acc: 0.8833
Epoch 47/50
391/391 [==============================] - 22s 56ms/step - loss: 0.1027 - acc: 0.9634 - val_loss: 0.4329 - val_acc: 0.8845
Epoch 48/50
391/391 [==============================] - 22s 56ms/step - loss: 0.0990 - acc: 0.9644 - val_loss: 0.4254 - val_acc: 0.8880
Epoch 49/50
391/391 [==============================] - 22s 57ms/step - loss: 0.0935 - acc: 0.9676 - val_loss: 0.4516 - val_acc: 0.8850
Epoch 50/50
391/391 [==============================] - 22s 55ms/step - loss: 0.0969 - acc: 0.9660 - val_loss: 0.3984 - val_acc: 0.8995
10000/10000 [==============================] - 1s 143us/step
