Installing and Using Keras and TFLearn: First Steps with Jetson Nano, Part 4


Published: 2019-06-09


1. TFLearn

1.1 Installing TFLearn

Install TFLearn with pip3:

pip3 install tflearn --user
Installing collected packages: tflearn
Successfully installed tflearn-0.3.2

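1.2 Official example

In this example we predict the survival probability of passengers on the Titanic.

1.2.1 Loading the dataset

For each passenger, the dataset contains the following fields: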

VARIABLE DESCRIPTIONS:
survived        Survived                           (0 = No; 1 = Yes)
pclass          Passenger Class                    (1 = 1st; 2 = 2nd; 3 = 3rd)
name            Name
sex             Sex
age             Age
sibsp           Number of Siblings/Spouses Aboard
parch           Number of Parents/Children Aboard
ticket          Ticket Number
fare            Passenger Fare

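There are 9 fields in total, which we split into the label and the input data. The label is whether the passenger survived (1 = survived), which leaves 8 input fields. We assume that the name and the ticket number contribute nothing useful to predicting survival (the fare already reflects the ticket), so preprocessing discards them and 6 input features remain.

The dataset is stored as a CSV file. CSV stands for Comma-Separated Values: tabular data stored as plain text, which can be opened directly in a text editor or Excel. First, load the data into memory.

The load_csv() function reads the data from the CSV file into a Python list. The target_column parameter gives the index of the label column, and the function returns a tuple (data, labels). Then, as described above, we drop the name and ticket-number fields from the input and convert the sex field to a number: 0 for male, 1 for female.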
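The preprocessing step can be sketched in plain Python, without TFLearn or NumPy. This is only an illustration of the idea: the function below is a simplified stand-in for the one in the full script, and it assumes that column 1 (name) is among the dropped columns, so that the sex field ends up at index 1.

```python
def preprocess(rows, columns_to_ignore):
    """Drop the given column indices and encode 'sex' as 0.0 (male) / 1.0 (female)."""
    cleaned = []
    for row in rows:
        kept = [v for i, v in enumerate(row) if i not in columns_to_ignore]
        # Assumption: 'name' (index 1) was dropped, so 'sex' now sits at index 1.
        kept[1] = 1.0 if kept[1] == 'female' else 0.0
        cleaned.append([float(v) for v in kept])
    return cleaned

# Each input row holds: pclass, name, sex, age, sibsp, parch, ticket, fare
rows = [
    [3, 'Jack Dawson', 'male', 19, 0, 0, 'N/A', 5.0],
    [1, 'Rose DeWitt Bukater', 'female', 17, 1, 2, 'N/A', 100.0],
]
features = preprocess(rows, columns_to_ignore={1, 6})
print(features)
# [[3.0, 0.0, 19.0, 0.0, 0.0, 5.0], [1.0, 1.0, 17.0, 1.0, 2.0, 100.0]]
```

Each row ends up with 6 numeric features, which is why the network's input layer later uses the shape [None, 6].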

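1.2.2 Building the neural network

TFLearn operates on Tensors, so every net here is a Tensor, just as in TensorFlow. This means we can write parts of the network ourselves with TensorFlow functions to implement features that the TFLearn library lacks. The weights W (weights_init) and bias b (bias_init) of a fully connected layer can be specified; the defaults are 'truncated_normal' for W and 'zeros' for b. The activation parameter defaults to 'linear'.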
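To make the defaults concrete, here is a small pure-Python sketch of what a fully connected layer computes for one input vector: x·W + b, followed by an optional activation. It is not TFLearn code; the function names and the hand-picked weights are illustrative assumptions (a real layer would draw W from a truncated normal distribution).

```python
import math

def fully_connected(x, weights, bias, activation=None):
    """Compute activation(x @ W + b) for a single input vector x."""
    out = [sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
           for j in range(len(bias))]
    # activation=None plays the role of the 'linear' default: output unchanged.
    return activation(out) if activation else out

def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(values)                      # subtract max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

x = [1.0, 2.0]
W = [[1.0, 0.0], [0.0, 1.0]]   # hand-picked for illustration, not truncated-normal
b = [0.0, 0.0]                 # corresponds to the 'zeros' default
linear_out = fully_connected(x, W, b)
probs = fully_connected(x, W, b, activation=softmax)
print(linear_out, probs)
```

With the linear default the layer output is just the affine transform; the last layer of the example network uses softmax so that the two outputs can be read as class probabilities.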

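1.2.3 Training

tflearn.DNN is a model wrapper provided by TFLearn that bundles many features together: we pass it a net structure to get a model object, then call that object's training, prediction, and saving methods. The DNN class has three attributes (member variables): trainer, predictor, and session. In fit(), n_epoch=10 means the whole training set is iterated over 10 times, and batch_size=16 means each parameter update is computed from 16 samples.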
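The relationship between n_epoch, batch_size, and the step counter in the training log (section 1.2.4) can be checked with a little arithmetic. The sketch below assumes the last, smaller batch of each epoch still counts as one step, which is consistent with the step numbers the log reports.

```python
import math

n_samples = 1309   # "Training samples: 1309" in the log
batch_size = 16
n_epoch = 10

steps_per_epoch = math.ceil(n_samples / batch_size)           # updates per epoch
total_steps = steps_per_epoch * n_epoch                        # updates overall
last_batch = n_samples - (steps_per_epoch - 1) * batch_size   # size of final batch
print(steps_per_epoch, total_steps, last_batch)  # 82 820 13
```

This matches the log: the step counter reads 82 at the end of epoch 1 and 820 at the end of epoch 10.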

Finally, the trained model is used to make predictions. The complete script:

import numpy as np
import tflearn

# Download the Titanic dataset
from tflearn.datasets import titanic
titanic.download_dataset('titanic_dataset.csv')

# Load CSV file, indicate that the first column represents labels
from tflearn.data_utils import load_csv
data, labels = load_csv('titanic_dataset.csv', target_column=0,
                        categorical_labels=True, n_classes=2)

# Preprocessing function
def preprocess(data, columns_to_ignore):
    # Sort by descending id and delete columns
    for id in sorted(columns_to_ignore, reverse=True):
        [r.pop(id) for r in data]
    for i in range(len(data)):
        # Converting 'sex' field to float (id is 1 after removing labels column)
        data[i][1] = 1. if data[i][1] == 'female' else 0.
    return np.array(data, dtype=np.float32)

# Ignore 'name' and 'ticket' columns (id 1 & 6 of data array)
to_ignore = [1, 6]

# Preprocess data
data = preprocess(data, to_ignore)

# Build neural network
net = tflearn.input_data(shape=[None, 6])
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 2, activation='softmax')
net = tflearn.regression(net)

# Define model
model = tflearn.DNN(net)
# Start training (apply gradient descent algorithm)
model.fit(data, labels, n_epoch=10, batch_size=16, show_metric=True)

# Let's create some data for DiCaprio and Winslet
dicaprio = [3, 'Jack Dawson', 'male', 19, 0, 0, 'N/A', 5.0000]
winslet = [1, 'Rose DeWitt Bukater', 'female', 17, 1, 2, 'N/A', 100.0000]
# Preprocess data
dicaprio, winslet = preprocess([dicaprio, winslet], to_ignore)
# Predict surviving chances (class 1 results)
pred = model.predict([dicaprio, winslet])
print("DiCaprio Surviving Rate:", pred[0][1])
print("Winslet Surviving Rate:", pred[1][1])

1.2.4 Test results

Training samples: 1309
Validation samples: 0
--
successfully opened CUDA library libcublas.so.10.0 locally
Training Step: 82  | total loss: 0.65318 | time: 3.584s
| Adam | epoch: 001 | loss: 0.65318 - acc: 0.6781 -- iter: 1309/1309
--
Training Step: 164  | total loss: 0.63713 | time: 1.298s
| Adam | epoch: 002 | loss: 0.63713 - acc: 0.6687 -- iter: 1309/1309
--
Training Step: 246  | total loss: 0.55357 | time: 1.354s
| Adam | epoch: 003 | loss: 0.55357 - acc: 0.7219 -- iter: 1309/1309
--
Training Step: 328  | total loss: 0.56566 | time: 1.312s
| Adam | epoch: 004 | loss: 0.56566 - acc: 0.7091 -- iter: 1309/1309
--
Training Step: 410  | total loss: 0.48417 | time: 1.311s
| Adam | epoch: 005 | loss: 0.48417 - acc: 0.7854 -- iter: 1309/1309
--
Training Step: 492  | total loss: 0.56114 | time: 1.300s
| Adam | epoch: 006 | loss: 0.56114 - acc: 0.7463 -- iter: 1309/1309
--
Training Step: 574  | total loss: 0.51057 | time: 1.289s
| Adam | epoch: 007 | loss: 0.51057 - acc: 0.7988 -- iter: 1309/1309
--
Training Step: 656  | total loss: 0.56562 | time: 1.312s
| Adam | epoch: 008 | loss: 0.56562 - acc: 0.7551 -- iter: 1309/1309
--
Training Step: 738  | total loss: 0.52883 | time: 1.324s
| Adam | epoch: 009 | loss: 0.52883 - acc: 0.7654 -- iter: 1309/1309
--
Training Step: 820  | total loss: 0.50510 | time: 1.340s
| Adam | epoch: 010 | loss: 0.50510 - acc: 0.7687 -- iter: 1309/1309
--
DiCaprio Surviving Rate: 0.17452878
Winslet Surviving Rate: 0.938663

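Training finishes with an overall accuracy of 76.87%, meaning the model predicts the correct outcome (survived or not) for roughly 77% of passengers. Since the log shows 0 validation samples, this figure is measured on the training data itself.

DiCaprio plays the male lead and Winslet the female lead. The predicted survival rates (about 17% and 94%) agree with what happens to their characters in the film, so the predictions look fairly reasonable.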

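2. Keras

Mastering Keras can greatly improve development efficiency and deepen your understanding of network structure. Its advantages:

Modularity

Minimalism

Easy extensibility

2.1 Installing Keras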

pip3 install keras --user
Successfully installed keras-2.2.4

After installation, start python3 and check the result. If import keras prints "Using TensorFlow backend.", Keras is installed and is using TensorFlow as its backend.

import keras
Using TensorFlow backend.
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'
ImportError: numpy.core.multiarray failed to import

There is a small problem here: the numpy package needs to be upgraded.

pip3 install -i https://pypi.tuna.tsinghua.edu.cn/simple --upgrade numpy --user
Successfully installed numpy-1.16.3

After the upgrade, Keras imports successfully:

import keras
Using TensorFlow backend.
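Before starting the interpreter by hand, a quick presence check for the packages used in this article can be scripted with the standard library. This is a minimal sketch; the helper name is_installed is hypothetical. It only checks that a package can be located, so a version mismatch such as the numpy error above would still only show up on the actual import.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None

for name in ("numpy", "tensorflow", "keras", "tflearn"):
    print(name, "found" if is_installed(name) else "missing")
```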

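2.2 Keras models

The core data structure of Keras is the model, a way of organizing network layers. There are two kinds: the Sequential model and the Model class (used with the functional API). A Sequential model is a stack of layers arranged in order. It has a single input and a single output, and each layer connects only to its neighbors, which makes it the simplest kind of model.

Keras is a high-level neural network API written in Python that can run on top of TensorFlow, CNTK, or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.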

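Use Keras if you need a deep learning library that:

Allows for easy and fast prototyping (through user friendliness, modularity, and extensibility).

Supports both convolutional networks and recurrent networks, as well as combinations of the two.

Runs seamlessly on CPU and GPU.

Read the documentation at Keras.io.

Keras is compatible with Python 2.7-3.6.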

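Guiding principles

User friendliness. Keras is an API designed for human beings, not machines. It puts user experience front and center. Keras follows best practices for reducing cognitive load: it offers consistent and simple APIs, it minimizes the number of user actions required for common use cases, and it provides clear and actionable feedback upon user error.

Modularity. A model is understood as a sequence or a graph of standalone, fully configurable modules that can be plugged together with as few restrictions as possible. In particular, neural layers, loss functions, optimizers, initialization schemes, activation functions, and regularization schemes are all modules that you can combine to create new models.

Easy extensibility. New modules are simple to add (as new classes and functions), and existing modules provide ample examples. Being able to easily create new modules that increase expressiveness makes Keras well suited to advanced research.

Work with Python. Keras has no separate model configuration files in a special format. Models are described in Python code, which is compact, easy to debug, and easy to extend.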

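2.3 Official example: classifying the CIFAR-10 dataset with a ResNet model

In the Keras examples directory, you can find example models for real datasets:

CIFAR10 small image classification: Convolutional Neural Network (CNN) with real-time data augmentation

IMDB movie review sentiment classification: LSTM over sequences of words

Reuters newswire topic classification: Multilayer Perceptron (MLP)

MNIST handwritten digit classification: MLP & CNN

Character-level text generation with LSTM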

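2.3.1 Keras dataset and pretrained-model directories

Datasets downloaded by Keras are stored in the following directories:

Windows 10: C:\Users\user_name\.keras\datasets
where user_name is typically Administrator

On Linux, the datasets are in .keras/datasets under the user's home directory. For a regular user the home directory is /home/user_name; for the root user it is /root.

Pretrained models downloaded by Keras are stored in the following directories:

root user: /root/.keras/models

regular user: /home/user_name/.keras/models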
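The same locations can be computed on any platform from the current user's home directory. The sketch below assumes the default layout described above; keras_cache_dir is a hypothetical helper name, not part of Keras.

```python
import os

def keras_cache_dir(kind):
    """Return the default per-user Keras cache directory for 'datasets' or 'models'."""
    if kind not in ("datasets", "models"):
        raise ValueError("kind must be 'datasets' or 'models'")
    # expanduser("~") resolves to /home/user_name, /root, or C:\Users\user_name
    return os.path.join(os.path.expanduser("~"), ".keras", kind)

for kind in ("datasets", "models"):
    path = keras_cache_dir(kind)
    print(path, "exists" if os.path.isdir(path) else "not created yet")
```

This is handy on the Jetson Nano for checking how much of the limited storage the downloaded datasets occupy.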

2.3.2 Example source code

# https://github.com/keras-team/keras/tree/master/examples/cifar10_cnn.py
"""
#Trains a ResNet on the CIFAR10 dataset.

ResNet v1:
[Deep Residual Learning for Image Recognition](https://arxiv.org/pdf/1512.03385.pdf)

ResNet v2:
[Identity Mappings in Deep Residual Networks](https://arxiv.org/pdf/1603.05027.pdf)

Model|n|200-epoch accuracy|Original paper accuracy |sec/epoch GTX1080Ti
:------------|--:|-------:|-----------------------:|---:
ResNet20   v1|  3| 92.16 %|                 91.25 %|35
ResNet32   v1|  5| 92.46 %|                 92.49 %|50
ResNet44   v1|  7| 92.50 %|                 92.83 %|70
ResNet56   v1|  9| 92.71 %|                 93.03 %|90
ResNet110  v1| 18| 92.65 %|            93.39+-.16 %|165
ResNet164  v1| 27|     - %|                 94.07 %|  -
ResNet1001 v1|N/A|     - %|                 92.39 %|  -

Model|n|200-epoch accuracy|Original paper accuracy |sec/epoch GTX1080Ti
:------------|--:|-------:|-----------------------:|---:
ResNet20   v2|  2|     - %|                     - %|---
ResNet32   v2|N/A| NA    %|            NA         %| NA
ResNet44   v2|N/A| NA    %|            NA         %| NA
ResNet56   v2|  6| 93.01 %|            NA         %|100
ResNet110  v2| 12| 93.15 %|            93.63      %|180
ResNet164  v2| 18|     - %|            94.54      %|  -
ResNet1001 v2|111|     - %|            95.08+-.14 %|  -
"""

# %matplotlib inline
# %config InlineBackend.figure_format = 'svg'

# Start a timer so the total run time can be reported at the end
import timeit
start = timeit.default_timer()

import keras
from keras.layers import Dense, Conv2D, BatchNormalization, Activation
from keras.layers import AveragePooling2D, Input, Flatten
from keras.optimizers import Adam
from keras.callbacks import ModelCheckpoint, LearningRateScheduler
from keras.callbacks import ReduceLROnPlateau
from keras.preprocessing.image import ImageDataGenerator
from keras.regularizers import l2
from keras import backend as K
from keras.models import Model
from keras.datasets import cifar10
import numpy as np
import os

# Training parameters
batch_size = 32  # orig paper trained all networks with batch_size=128
epochs = 10
data_augmentation = True
num_classes = 10

# Subtracting pixel mean improves accuracy
subtract_pixel_mean = True

# Model parameter
# ----------------------------------------------------------------------------
#           |      | 200-epoch | Orig Paper| 200-epoch | Orig Paper| sec/epoch
# Model     |  n   | ResNet v1 | ResNet v1 | ResNet v2 | ResNet v2 | GTX1080Ti
#           |v1(v2)| %Accuracy | %Accuracy | %Accuracy | %Accuracy | v1 (v2)
# ----------------------------------------------------------------------------
# ResNet20  | 3 (2)| 92.16     | 91.25     | -----     | -----     | 35 (---)
# ResNet32  | 5(NA)| 92.46     | 92.49     | NA        | NA        | 50 ( NA)
# ResNet44  | 7(NA)| 92.50     | 92.83     | NA        | NA        | 70 ( NA)
# ResNet56  | 9 (6)| 92.71     | 93.03     | 93.01     | NA        | 90 (100)
# ResNet110 |18(12)| 92.65     | 93.39+-.16| 93.15     | 93.63     | 165(180)
# ResNet164 |27(18)| -----     | 94.07     | -----     | 94.54     | ---(---)
# ResNet1001| (111)| -----     | 92.39     | -----     | 95.08+-.14| ---(---)
# ---------------------------------------------------------------------------
n = 3

# Model version
# Orig paper: version = 1 (ResNet v1), Improved ResNet: version = 2 (ResNet v2)
version = 1

# Computed depth from supplied model parameter n
if version == 1:
    depth = n * 6 + 2
elif version == 2:
    depth = n * 9 + 2

# Model name, depth and version
model_type = 'ResNet%dv%d' % (depth, version)

# Load the CIFAR10 data.
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# Input image dimensions.
input_shape = x_train.shape[1:]

# Normalize data.
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255

# If subtract pixel mean is enabled
if subtract_pixel_mean:
    x_train_mean = np.mean(x_train, axis=0)
    x_train -= x_train_mean
    x_test -= x_train_mean

print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
print('y_train shape:', y_train.shape)

# Convert class vectors to binary class matrices.
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)


def lr_schedule(epoch):
    """Learning Rate Schedule

    Learning rate is scheduled to be reduced after 80, 120, 160, 180 epochs.
    Called automatically every epoch as part of callbacks during training.

    # Arguments
        epoch (int): The number of epochs

    # Returns
        lr (float32): learning rate
    """
    lr = 1e-3
    if epoch > 180:
        lr *= 0.5e-3
    elif epoch > 160:
        lr *= 1e-3
    elif epoch > 120:
        lr *= 1e-2
    elif epoch > 80:
        lr *= 1e-1
    print('Learning rate: ', lr)
    return lr


def resnet_layer(inputs,
                 num_filters=16,
                 kernel_size=3,
                 strides=1,
                 activation='relu',
                 batch_normalization=True,
                 conv_first=True):
    """2D Convolution-Batch Normalization-Activation stack builder

    # Arguments
        inputs (tensor): input tensor from input image or previous layer
        num_filters (int): Conv2D number of filters
        kernel_size (int): Conv2D square kernel dimensions
        strides (int): Conv2D square stride dimensions
        activation (string): activation name
        batch_normalization (bool): whether to include batch normalization
        conv_first (bool): conv-bn-activation (True) or
            bn-activation-conv (False)

    # Returns
        x (tensor): tensor as input to the next layer
    """
    conv = Conv2D(num_filters,
                  kernel_size=kernel_size,
                  strides=strides,
                  padding='same',
                  kernel_initializer='he_normal',
                  kernel_regularizer=l2(1e-4))

    x = inputs
    if conv_first:
        x = conv(x)
        if batch_normalization:
            x = BatchNormalization()(x)
        if activation is not None:
            x = Activation(activation)(x)
    else:
        if batch_normalization:
            x = BatchNormalization()(x)
        if activation is not None:
            x = Activation(activation)(x)
        x = conv(x)
    return x


def resnet_v1(input_shape, depth, num_classes=10):
    """ResNet Version 1 Model builder [a]

    Stacks of 2 x (3 x 3) Conv2D-BN-ReLU
    Last ReLU is after the shortcut connection.
    At the beginning of each stage, the feature map size is halved (downsampled)
    by a convolutional layer with strides=2, while the number of filters is
    doubled. Within each stage, the layers have the same number of filters and
    the same feature map sizes.
    Features maps sizes:
    stage 0: 32x32, 16
    stage 1: 16x16, 32
    stage 2:  8x8,  64
    The Number of parameters is approx the same as Table 6 of [a]:
    ResNet20 0.27M
    ResNet32 0.46M
    ResNet44 0.66M
    ResNet56 0.85M
    ResNet110 1.7M

    # Arguments
        input_shape (tensor): shape of input image tensor
        depth (int): number of core convolutional layers
        num_classes (int): number of classes (CIFAR10 has 10)

    # Returns
        model (Model): Keras model instance
    """
    if (depth - 2) % 6 != 0:
        raise ValueError('depth should be 6n+2 (eg 20, 32, 44 in [a])')
    # Start model definition.
    num_filters = 16
    num_res_blocks = int((depth - 2) / 6)

    inputs = Input(shape=input_shape)
    x = resnet_layer(inputs=inputs)
    # Instantiate the stack of residual units
    for stack in range(3):
        for res_block in range(num_res_blocks):
            strides = 1
            if stack > 0 and res_block == 0:  # first layer but not first stack
                strides = 2  # downsample
            y = resnet_layer(inputs=x,
                             num_filters=num_filters,
                             strides=strides)
            y = resnet_layer(inputs=y,
                             num_filters=num_filters,
                             activation=None)
            if stack > 0 and res_block == 0:  # first layer but not first stack
                # linear projection residual shortcut connection to match
                # changed dims
                x = resnet_layer(inputs=x,
                                 num_filters=num_filters,
                                 kernel_size=1,
                                 strides=strides,
                                 activation=None,
                                 batch_normalization=False)
            x = keras.layers.add([x, y])
            x = Activation('relu')(x)
        num_filters *= 2

    # Add classifier on top.
    # v1 does not use BN after last shortcut connection-ReLU
    x = AveragePooling2D(pool_size=8)(x)
    y = Flatten()(x)
    outputs = Dense(num_classes,
                    activation='softmax',
                    kernel_initializer='he_normal')(y)

    # Instantiate model.
    model = Model(inputs=inputs, outputs=outputs)
    return model


def resnet_v2(input_shape, depth, num_classes=10):
    """ResNet Version 2 Model builder [b]

    Stacks of (1 x 1)-(3 x 3)-(1 x 1) BN-ReLU-Conv2D or also known as
    bottleneck layer
    First shortcut connection per layer is 1 x 1 Conv2D.
    Second and onwards shortcut connection is identity.
    At the beginning of each stage, the feature map size is halved (downsampled)
    by a convolutional layer with strides=2, while the number of filter maps is
    doubled. Within each stage, the layers have the same number filters and the
    same filter map sizes.
    Features maps sizes:
    conv1  : 32x32,  16
    stage 0: 32x32,  64
    stage 1: 16x16, 128
    stage 2:  8x8,  256

    # Arguments
        input_shape (tensor): shape of input image tensor
        depth (int): number of core convolutional layers
        num_classes (int): number of classes (CIFAR10 has 10)

    # Returns
        model (Model): Keras model instance
    """
    if (depth - 2) % 9 != 0:
        raise ValueError('depth should be 9n+2 (eg 56 or 110 in [b])')
    # Start model definition.
    num_filters_in = 16
    num_res_blocks = int((depth - 2) / 9)

    inputs = Input(shape=input_shape)
    # v2 performs Conv2D with BN-ReLU on input before splitting into 2 paths
    x = resnet_layer(inputs=inputs,
                     num_filters=num_filters_in,
                     conv_first=True)

    # Instantiate the stack of residual units
    for stage in range(3):
        for res_block in range(num_res_blocks):
            activation = 'relu'
            batch_normalization = True
            strides = 1
            if stage == 0:
                num_filters_out = num_filters_in * 4
                if res_block == 0:  # first layer and first stage
                    activation = None
                    batch_normalization = False
            else:
                num_filters_out = num_filters_in * 2
                if res_block == 0:  # first layer but not first stage
                    strides = 2    # downsample

            # bottleneck residual unit
            y = resnet_layer(inputs=x,
                             num_filters=num_filters_in,
                             kernel_size=1,
                             strides=strides,
                             activation=activation,
                             batch_normalization=batch_normalization,
                             conv_first=False)
            y = resnet_layer(inputs=y,
                             num_filters=num_filters_in,
                             conv_first=False)
            y = resnet_layer(inputs=y,
                             num_filters=num_filters_out,
                             kernel_size=1,
                             conv_first=False)
            if res_block == 0:
                # linear projection residual shortcut connection to match
                # changed dims
                x = resnet_layer(inputs=x,
                                 num_filters=num_filters_out,
                                 kernel_size=1,
                                 strides=strides,
                                 activation=None,
                                 batch_normalization=False)
            x = keras.layers.add([x, y])

        num_filters_in = num_filters_out

    # Add classifier on top.
    # v2 has BN-ReLU before Pooling
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = AveragePooling2D(pool_size=8)(x)
    y = Flatten()(x)
    outputs = Dense(num_classes,
                    activation='softmax',
                    kernel_initializer='he_normal')(y)

    # Instantiate model.
    model = Model(inputs=inputs, outputs=outputs)
    return model


if version == 2:
    model = resnet_v2(input_shape=input_shape, depth=depth)
else:
    model = resnet_v1(input_shape=input_shape, depth=depth)

model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=lr_schedule(0)),
              metrics=['accuracy'])
model.summary()
print(model_type)

# Prepare model saving directory.
save_dir = os.path.join(os.getcwd(), 'saved_models')
model_name = 'cifar10_%s_model.{epoch:03d}.h5' % model_type
if not os.path.isdir(save_dir):
    os.makedirs(save_dir)
filepath = os.path.join(save_dir, model_name)

# Prepare callbacks for model saving and for learning rate adjustment.
checkpoint = ModelCheckpoint(filepath=filepath,
                             monitor='val_acc',
                             verbose=1,
                             save_best_only=True)

lr_scheduler = LearningRateScheduler(lr_schedule)

lr_reducer = ReduceLROnPlateau(factor=np.sqrt(0.1),
                               cooldown=0,
                               patience=5,
                               min_lr=0.5e-6)

callbacks = [checkpoint, lr_reducer, lr_scheduler]

# Run training, with or without data augmentation.
if not data_augmentation:
    print('Not using data augmentation.')
    model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=epochs,
              validation_data=(x_test, y_test),
              shuffle=True,
              callbacks=callbacks)
else:
    print('Using real-time data augmentation.')
    # This will do preprocessing and realtime data augmentation:
    datagen = ImageDataGenerator(
        # set input mean to 0 over the dataset
        featurewise_center=False,
        # set each sample mean to 0
        samplewise_center=False,
        # divide inputs by std of dataset
        featurewise_std_normalization=False,
        # divide each input by its std
        samplewise_std_normalization=False,
        # apply ZCA whitening
        zca_whitening=False,
        # epsilon for ZCA whitening
        zca_epsilon=1e-06,
        # randomly rotate images in the range (deg 0 to 180)
        rotation_range=0,
        # randomly shift images horizontally
        width_shift_range=0.1,
        # randomly shift images vertically
        height_shift_range=0.1,
        # set range for random shear
        shear_range=0.,
        # set range for random zoom
        zoom_range=0.,
        # set range for random channel shifts
        channel_shift_range=0.,
        # set mode for filling points outside the input boundaries
        fill_mode='nearest',
        # value used for fill_mode = "constant"
        cval=0.,
        # randomly flip images horizontally
        horizontal_flip=True,
        # randomly flip images vertically
        vertical_flip=False,
        # set rescaling factor (applied before any other transformation)
        rescale=None,
        # set function that will be applied on each input
        preprocessing_function=None,
        # image data format, either "channels_first" or "channels_last"
        data_format=None,
        # fraction of images reserved for validation (strictly between 0 and 1)
        validation_split=0.0)

    # Compute quantities required for featurewise normalization
    # (std, mean, and principal components if ZCA whitening is applied).
    datagen.fit(x_train)

    # Fit the model on the batches generated by datagen.flow().
    # steps_per_epoch is the number of batches per epoch, so that one epoch
    # is one pass over the training set (see section 2.3.3).
    model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                        steps_per_epoch=x_train.shape[0] // batch_size,
                        validation_data=(x_test, y_test),
                        epochs=epochs, verbose=1, workers=4,
                        callbacks=callbacks)

# Score trained model.
scores = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])

# Report the elapsed time as hours, minutes, and seconds
end = timeit.default_timer()
tdf = end - start
timeh = tdf // 3600
timem = (tdf % 3600) // 60
times = tdf % 60
print("use time: ", int(timeh), "h", int(timem), "m", times, "s")
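The way the listing derives the network depth and model name from n and version can be isolated into a few lines. This is a small sketch for illustration; the helper names resnet_depth and model_type_name are hypothetical and do not appear in the script.

```python
def resnet_depth(n, version):
    """Depth from n: 6n+2 for ResNet v1, 9n+2 for ResNet v2."""
    if version == 1:
        return n * 6 + 2
    if version == 2:
        return n * 9 + 2
    raise ValueError("version must be 1 or 2")

def model_type_name(n, version):
    """Build the same model name string the listing uses."""
    return 'ResNet%dv%d' % (resnet_depth(n, version), version)

print(model_type_name(3, 1))   # ResNet20v1, the configuration used in the listing
print(model_type_name(6, 2))   # ResNet56v2
```

With n = 3 and version = 1, the script therefore trains the smallest network in the table, which is a sensible choice for the Jetson Nano's limited compute.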

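2.3.3 Training

Running the unmodified example script produces an error:

python3 cifar10_cnn.py
ValueError: steps_per_epoch=None is only valid for a generator based on the keras.utils.Sequence class. Please specify steps_per_epoch or use the keras.utils.Sequence class.

This is caused by version changes: the parameters of some functions have been modified.

To fix it, in the cifar10_resnet.py file (run above as cifar10_cnn.py), change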

model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                    validation_data=(x_test, y_test),
                    epochs=epochs, verbose=1, workers=4,
                    callbacks=callbacks)

to

model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                    steps_per_epoch=x_train.shape[0] // batch_size,
                    validation_data=(x_test, y_test),
                    epochs=epochs, verbose=1, workers=4,
                    callbacks=callbacks)
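For the values used in this example, the steps_per_epoch expression works out as follows. The sketch also shows the rounded-up alternative, which is an assumption on my part about how one might cover the final partial batch, not something the original fix does.

```python
import math

n_train = 50000     # "50000 train samples" in the program output
batch_size = 32

steps_floor = n_train // batch_size           # value used in the fix above
steps_ceil = math.ceil(n_train / batch_size)  # alternative: include the partial batch
leftover = n_train - steps_floor * batch_size
print(steps_floor, steps_ceil, leftover)  # 1562 1563 16
```

With integer division, each epoch runs 1562 batches and leaves 16 samples' worth of data uncounted, which is negligible here.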

2.3.4 Test results

Using TensorFlow backend.
x_train shape: (50000, 32, 32, 3)
50000 train samples
10000 test samples
y_train shape: (50000, 1)
Learning rate:  0.001
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            (None, 32, 32, 3)    0
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 32, 32, 16)   448         input_1[0][0]
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 32, 32, 16)   64          conv2d_1[0][0]
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 32, 32, 16)   0           batch_normalization_1[0][0]
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 32, 32, 16)   2320        activation_1[0][0]
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 32, 32, 16)   64          conv2d_2[0][0]
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 32, 32, 16)   0           batch_normalization_2[0][0]
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 32, 32, 16)   2320        activation_2[0][0]
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 32, 32, 16)   64          conv2d_3[0][0]
__________________________________________________________________________________________________
add_1 (Add)                     (None, 32, 32, 16)   0           activation_1[0][0]
                                                                 batch_normalization_3[0][0]
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 32, 32, 16)   0           add_1[0][0]
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 32, 32, 16)   2320        activation_3[0][0]
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 32, 32, 16)   64          conv2d_4[0][0]
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 32, 32, 16)   0           batch_normalization_4[0][0]
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 32, 32, 16)   2320        activation_4[0][0]
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 32, 32, 16)   64          conv2d_5[0][0]
__________________________________________________________________________________________________
add_2 (Add)                     (None, 32, 32, 16)   0           activation_3[0][0]
                                                                 batch_normalization_5[0][0]
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 32, 32, 16)   0           add_2[0][0]
__________________________________________________________________________________________________
conv2d_6 (Conv2D)               (None, 32, 32, 16)   2320        activation_5[0][0]
__________________________________________________________________________________________________
batch_normalization_6 (BatchNor (None, 32, 32, 16)   64          conv2d_6[0][0]
__________________________________________________________________________________________________
activation_6 (Activation)       (None, 32, 32, 16)   0           batch_normalization_6[0][0]
__________________________________________________________________________________________________
conv2d_7 (Conv2D)               (None, 32, 32, 16)   2320        activation_6[0][0]
__________________________________________________________________________________________________
batch_normalization_7 (BatchNor (None, 32, 32, 16)   64          conv2d_7[0][0]
__________________________________________________________________________________________________
add_3 (Add)                     (None, 32, 32, 16)   0           activation_5[0][0]
                                                                 batch_normalization_7[0][0]
__________________________________________________________________________________________________
activation_7 (Activation)       (None, 32, 32, 16)   0           add_3[0][0]
__________________________________________________________________________________________________
conv2d_8 (Conv2D)               (None, 16, 16, 32)   4640        activation_7[0][0]
__________________________________________________________________________________________________
batch_normalization_8 (BatchNor (None, 16, 16, 32)   128         conv2d_8[0][0]
__________________________________________________________________________________________________
activation_8 (Activation)       (None, 16, 16, 32)   0           batch_normalization_8[0][0]
__________________________________________________________________________________________________
conv2d_9 (Conv2D)               (None, 16, 16, 32)   9248        activation_8[0][0]
__________________________________________________________________________________________________
conv2d_10 (Conv2D)              (None, 16, 16, 32)   544         activation_7[0][0]
__________________________________________________________________________________________________
batch_normalization_9 (BatchNor (None, 16, 16, 32)   128         conv2d_9[0][0]
__________________________________________________________________________________________________
add_4 (Add)                     (None, 16, 16, 32)   0           conv2d_10[0][0]
                                                                 batch_normalization_9[0][0]
__________________________________________________________________________________________________
activation_9 (Activation)       (None, 16, 16, 32)   0           add_4[0][0]
__________________________________________________________________________________________________
conv2d_11 (Conv2D)              (None, 16, 16, 32)   9248        activation_9[0][0]
__________________________________________________________________________________________________
batch_normalization_10 (BatchNo (None, 16, 16, 32)   128         conv2d_11[0][0]
__________________________________________________________________________________________________
activation_10 (Activation)      (None, 16, 16, 32)   0           batch_normalization_10[0][0]
__________________________________________________________________________________________________
conv2d_12 (Conv2D)              (None, 16, 16, 32)   9248        activation_10[0][0]
__________________________________________________________________________________________________
batch_normalization_11 (BatchNo (None, 16, 16, 32)   128         conv2d_12[0][0]
__________________________________________________________________________________________________
add_5 (Add)                     (None, 16, 16, 32)   0           activation_9[0][0]
                                                                 batch_normalization_11[0][0]
__________________________________________________________________________________________________
activation_11 (Activation)      (None, 16, 16, 32)   0           add_5[0][0]
__________________________________________________________________________________________________
conv2d_13 (Conv2D)              (None, 16, 16, 32)   9248        activation_11[0][0]
__________________________________________________________________________________________________
batch_normalization_12 (BatchNo (None, 16, 16, 32)   128         conv2d_13[0][0]
__________________________________________________________________________________________________
activation_12 (Activation)      (None, 16, 16, 32)   0           batch_normalization_12[0][0]
__________________________________________________________________________________________________
conv2d_14 (Conv2D)              (None, 16, 16, 32)   9248        activation_12[0][0]
__________________________________________________________________________________________________
batch_normalization_13 (BatchNo (None, 16, 16, 32)   128         conv2d_14[0][0]
__________________________________________________________________________________________________
add_6 (Add)                     (None, 16, 16, 32)   0           activation_11[0][0]
                                                                 batch_normalization_13[0][0]
__________________________________________________________________________________________________
activation_13 (Activation)      (None, 16, 16, 32)   0           add_6[0][0]
__________________________________________________________________________________________________
conv2d_15 (Conv2D)              (None, 8, 8, 64)     18496       activation_13[0][0]
__________________________________________________________________________________________________
batch_normalization_14 (BatchNo (None, 8, 8, 64)     256         conv2d_15[0][0]
__________________________________________________________________________________________________
activation_14 (Activation)      (None, 8, 8, 64)     0           batch_normalization_14[0][0]
__________________________________________________________________________________________________
conv2d_16 (Conv2D)              (None, 8, 8, 64)     36928       activation_14[0][0]
__________________________________________________________________________________________________
conv2d_17 (Conv2D)              (None, 8, 8, 64)     2112        activation_13[0][0]
__________________________________________________________________________________________________
batch_normalization_15 (BatchNo (None, 8, 8, 64)     256         conv2d_16[0][0]
__________________________________________________________________________________________________
add_7 (Add)                     (None, 8, 8, 64)     0           conv2d_17[0][0]
                                                                 batch_normalization_15[0][0]
__________________________________________________________________________________________________
activation_15 (Activation)      (None, 8, 8, 64)     0           add_7[0][0]
__________________________________________________________________________________________________
conv2d_18 (Conv2D)              (None, 8, 8, 64)     36928       activation_15[0][0]
__________________________________________________________________________________________________
batch_normalization_16 (BatchNo (None, 8, 8, 64)     256         conv2d_18[0][0]
__________________________________________________________________________________________________
activation_16 (Activation)      (None, 8, 8, 64)     0           batch_normalization_16[0][0]
__________________________________________________________________________________________________
conv2d_19 (Conv2D)              (None, 8, 8, 64)     36928       activation_16[0][0]
__________________________________________________________________________________________________
batch_normalization_17 (BatchNo (None, 8, 8, 64)     256         conv2d_19[0][0]
__________________________________________________________________________________________________
add_8 (Add)                     (None, 8, 8, 64)     0           activation_15[0][0]
                                                                 batch_normalization_17[0][0]
__________________________________________________________________________________________________
activation_17 (Activation)      (None, 8, 8, 64)     0           add_8[0][0]
__________________________________________________________________________________________________
conv2d_20 (Conv2D)              (None, 8, 8, 64)     36928       activation_17[0][0]
__________________________________________________________________________________________________
batch_normalization_18 (BatchNo (None, 8, 8, 64)     256         conv2d_20[0][0]
__________________________________________________________________________________________________
activation_18 (Activation)      (None, 8, 8, 64)     0           batch_normalization_18[0][0]
__________________________________________________________________________________________________
conv2d_21 (Conv2D)              (None, 8, 8, 64)     36928       activation_18[0][0]
__________________________________________________________________________________________________
batch_normalization_19 (BatchNo (None, 8, 8, 64)     256         conv2d_21[0][0]
__________________________________________________________________________________________________
add_9 (Add)                     (None, 8, 8, 64)     0           activation_17[0][0]
                                                                 batch_normalization_19[0][0]
__________________________________________________________________________________________________
activation_19 (Activation)      (None, 8, 8, 64)     0           
add_9[0][0]__________________________________________________________________________________________________average_pooling2d_1 (AveragePoo (None, 1, 1, 64)     0           activation_19[0][0]__________________________________________________________________________________________________flatten_1 (Flatten)             (None, 64)           0           average_pooling2d_1[0][0]__________________________________________________________________________________________________dense_1 (Dense)                 (None, 10)           650         flatten_1[0][0]==================================================================================================Total params: 274,442Trainable params: 273,066Non-trainable params: 1,376__________________________________________________________________________________________________ResNet20v1Using real-time data augmentation.Epoch 1/10Learning rate:  0.001successfully opened CUDA library libcublas.so.10.0 locally50000/50000 [==============================] - 11286s 226ms/step - loss: 0.7185 - acc: 0.8164 - val_loss: 0.7312 - val_acc: 0.8302
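As a sanity check on the summary above, the per-layer parameter counts can be reproduced by hand. The sketch below is not part of the original example: the helper names are made up for illustration, and the kernel sizes and input channel counts (3x3 convolutions in the main path, 1x1 convolutions on the shortcut, 16 input channels for conv2d_10) are assumptions inferred from the reported counts, since the summary does not print them.

```python
# Hypothetical helpers for checking the "Param #" column of the summary.

def conv2d_params(kernel, in_ch, out_ch):
    """Kernel weights (kernel * kernel * in_ch * out_ch) plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch


def batchnorm_params(channels):
    """gamma, beta, moving mean and moving variance: four values per channel."""
    return 4 * channels


def dense_params(in_units, out_units):
    """Weight matrix plus one bias per output unit."""
    return in_units * out_units + out_units


# (computed value, value reported in the summary); shapes are assumed, see lead-in.
checks = {
    "conv2d_10 (1x1, 16->32)": (conv2d_params(1, 16, 32), 544),
    "conv2d_11 (3x3, 32->32)": (conv2d_params(3, 32, 32), 9248),
    "conv2d_15 (3x3, 32->64)": (conv2d_params(3, 32, 64), 18496),
    "conv2d_16 (3x3, 64->64)": (conv2d_params(3, 64, 64), 36928),
    "conv2d_17 (1x1, 32->64)": (conv2d_params(1, 32, 64), 2112),
    "batch_normalization_9 (32 ch)": (batchnorm_params(32), 128),
    "batch_normalization_14 (64 ch)": (batchnorm_params(64), 256),
    "dense_1 (64->10)": (dense_params(64, 10), 650),
}

for name, (computed, reported) in checks.items():
    print(f"{name}: computed={computed}, reported={reported}")
```

Under these assumptions every computed value matches the summary. Half of each batch normalization layer's parameters (the moving mean and variance) are not updated by gradient descent, which is why the summary reports a non-zero "Non-trainable params" total.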

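Training a single epoch takes about 3 hours (11286 s in the log above), after which the test-set accuracy is 83.03%. The example calls for 200 epochs. With roughly 0.5 TFLOPS of compute, the Jetson Nano is too weak for this workload and is not suited to training models, so I abandoned the run.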
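To put the decision in numbers, here is a back-of-the-envelope estimate of the full run. It assumes that every epoch takes as long as the first one in the log, which is only an approximation, and the variable names are illustrative.

```python
# Rough estimate of the total training time on the Jetson Nano.
# Assumption: all epochs take as long as epoch 1 in the log (11286 s).
SECONDS_PER_EPOCH = 11286  # from the training log
EPOCHS = 200               # number of epochs the example calls for

total_seconds = SECONDS_PER_EPOCH * EPOCHS
hours_per_epoch = SECONDS_PER_EPOCH / 3600
total_hours = total_seconds / 3600
total_days = total_hours / 24

print(f"Per epoch: {hours_per_epoch:.2f} h")
print(f"Full run:  {total_hours:.0f} h (about {total_days:.1f} days)")
```

Under that assumption the full 200-epoch run would need about 627 hours, or roughly 26 days of continuous training.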