Operating system: CentOS 7.4

1. Preparation

1.1. Install protobuf

Download protobuf-2.6.1

cd protobuf-2.6.1/

./autogen.sh (gtest-1.6.0 must first be downloaded into the current directory and renamed to gtest; download link: https://pan.baidu.com/s/1kU7ac4J)

./configure

make

make install

protoc --version

1.2. Install tensorflow

For CPU

pip install tensorflow

For GPU

pip install tensorflow-gpu

1.3. Set up tensorflow/models

Reference: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md

The Tensorflow Object Detection API depends on the following libraries:

Protobuf 2.6

Pillow 1.0

lxml

tf Slim (which is included in the "tensorflow/models/research/" checkout)

Jupyter notebook

Matplotlib

Tensorflow

The steps are as follows:

Download TensorFlow Models

git clone https://github.com/tensorflow/models.git

Compile the protobuf definitions (this generates a number of .py files under object_detection/protos/)

#From tensorflow/models/research
cd models/research
protoc object_detection/protos/*.proto --python_out=.

Add to PYTHONPATH

#From tensorflow/models/research
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
Alternatively, add this line to /root/.bashrc: export PYTHONPATH=$PYTHONPATH:/home/tensorflow_install/models/research:/home/tensorflow_install/models/research/slim
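
For scripts that cannot rely on the shell environment, the same two directories can be added at runtime instead. This is only a minimal sketch: it assumes the checkout lives at /home/tensorflow_install/models/research, as in the .bashrc line above, and the helper name research_paths is hypothetical.

```python
import os
import sys


def research_paths(research_dir):
    """Return the two directories the Object Detection API expects on the import path."""
    research_dir = os.path.abspath(research_dir)
    return [research_dir, os.path.join(research_dir, "slim")]


# Assumed checkout location; adjust to wherever models/research actually lives.
for path in research_paths("/home/tensorflow_install/models/research"):
    if path not in sys.path:
        sys.path.append(path)
```

Unlike the export line, this only affects the current Python process, so it has to run before any object_detection import.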

Verify

#From tensorflow/models/research
python object_detection/builders/model_builder_test.py

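Before verifying, make sure that setup.py has been built and installed and that the dependencies are in place. I ran into many errors during verification and resolved them by installing the following: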

python setup.py build

python setup.py install

pip install matplotlib

yum install -y tkinter

pip install image

pip install pillow
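
Rather than finding missing dependencies one error at a time, it may help to check them all up front. The sketch below is an assumption-laden convenience, not part of the API: missing_modules is a hypothetical helper, and the REQUIRED list holds the import names I believe correspond to the libraries listed earlier (PIL is what pillow installs).

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except Exception:
            # Broken installs can raise errors other than ImportError.
            missing.append(name)
    return missing


# Assumed import names for the dependencies (PIL is provided by pillow).
REQUIRED = ["tensorflow", "PIL", "lxml", "matplotlib", "google.protobuf"]
print("missing modules: {}".format(missing_modules(REQUIRED)))
```

An empty list means every module imported; anything else names what still needs to be installed before running model_builder_test.py.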

1.4. Prepare the data

Reference: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/preparing_inputs.md

Using PASCAL VOC 2012 as an example:

Download and extract

#From tensorflow/models/research/object_detection
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
tar -xvf VOCtrainval_11-May-2012.tar

Generate the TFRecord files (this produces pascal_train.record and pascal_val.record)

#From tensorflow/models/research
mkdir object_detection/VOC2012
python object_detection/dataset_tools/create_pascal_tf_record.py \
    --label_map_path=object_detection/data/pascal_label_map.pbtxt \
    --data_dir=VOCdevkit --year=VOC2012 --set=train \
    --output_path=object_detection/VOC2012/pascal_train.record
python object_detection/dataset_tools/create_pascal_tf_record.py \
    --label_map_path=object_detection/data/pascal_label_map.pbtxt \
    --data_dir=VOCdevkit --year=VOC2012 --set=val \
    --output_path=object_detection/VOC2012/pascal_val.record

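To use your own data, write a script modeled on create_pascal_tf_record.py that converts the data into TFRecord files (reference: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/using_your_own_dataset.md). This will be covered in the next article.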
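
As a rough idea of what such a script has to do before writing any TFRecord, the sketch below parses one PASCAL VOC style annotation and scales the box coordinates to the [0, 1] range, which is my understanding of how create_pascal_tf_record.py stores them. The function name parse_voc_annotation and the sample annotation are made up for illustration; writing the tf.train.Example itself is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in PASCAL VOC layout, used only for illustration.
SAMPLE_XML = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>500</width><height>400</height><depth>3</depth></size>
  <object>
    <name>person</name>
    <bndbox><xmin>100</xmin><ymin>80</ymin><xmax>250</xmax><ymax>200</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc_annotation(xml_text):
    """Return (filename, boxes); each box is (class_name, xmin, ymin, xmax, ymax) scaled to [0, 1]."""
    root = ET.fromstring(xml_text.strip())
    width = float(root.find("size/width").text)
    height = float(root.find("size/height").text)
    boxes = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        boxes.append((
            obj.find("name").text,
            float(box.find("xmin").text) / width,
            float(box.find("ymin").text) / height,
            float(box.find("xmax").text) / width,
            float(box.find("ymax").text) / height,
        ))
    return root.find("filename").text, boxes


filename, boxes = parse_voc_annotation(SAMPLE_XML)
print(filename, boxes)
```

A custom dataset in another format would need its own parser, but the output of this step (file name, class names, normalized boxes) stays the same.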

1.5. (Optional) Download a model

The project provides quite a few pretrained models (https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md); ssd_mobilenet_v1_coco is used here as an example:

#From tensorflow/models/research/object_detection
wget http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_2017_11_17.tar.gz
tar -xzvf ssd_mobilenet_v1_coco_2017_11_17.tar.gz

2. Training

Training is not needed if you only want to run prediction with an existing model.
File layout:

models
├── research
│   ├── object_detection
│   │   ├── VOC2012
│   │   │   ├── ssd_mobilenet_train_logs
│   │   │   ├── ssd_mobilenet_val_logs
│   │   │   ├── ssd_mobilenet_v1_voc2012.config
│   │   │   ├── pascal_label_map.pbtxt
│   │   │   ├── pascal_train.record
│   │   │   └── pascal_val.record
│   │   ├── infer.py
│   │   └── create_pascal_tf_record.py
│   ├── eval_voc2012.sh
│   └── train_voc2012.sh

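2.1. Configuration

Reference: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/configuring_jobs.md

The configuration is divided into five parts:

model config: defines what type of model will be trained (e.g. meta-architecture, feature extractor)
train_config: decides which parameters are used to train the model (e.g. SGD parameters, input preprocessing and feature extractor initialization values)
eval_config: determines which set of metrics is reported for evaluation
train_input_config: defines the dataset the model is trained on
eval_input_config: defines the dataset the model is evaluated on

Sample model configurations are provided in the object_detection/samples/configs folder. The contents of these configuration files can be pasted into the model section of the skeleton configuration. Note that the num_classes field should be changed to a value that suits the dataset being trained on.

Here we use ssd_mobilenet: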

#From tensorflow/models/research

cp object_detection/samples/configs/ssd_mobilenet_v1_pets.config object_detection/VOC2012/ssd_mobilenet_v1_voc2012.config

Change line 9 to:
num_classes: 20

Change line 158 to:
fine_tune_checkpoint: "object_detection/ssd_mobilenet_v1_coco_2017_11_17/model.ckpt"

Change line 177 to:
input_path: "object_detection/VOC2012/pascal_train.record"

Change lines 179 and 193 to:
label_map_path: "object_detection/data/pascal_label_map.pbtxt"

Change line 191 to:
input_path: "object_detection/VOC2012/pascal_val.record"
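
Line numbers can shift between versions of the sample config, so patching by field name may be more robust. The following is a sketch under assumptions rather than an official tool: patch_pipeline_config is a hypothetical helper, and it assumes that the first input_path in the file belongs to train_input_reader and the second to eval_input_reader.

```python
import re


def patch_pipeline_config(text, num_classes, checkpoint, train_record, val_record, label_map):
    """Return the config text with the dataset-specific fields replaced."""
    text = re.sub(r'num_classes:\s*\d+',
                  lambda m: 'num_classes: %d' % num_classes, text)
    text = re.sub(r'fine_tune_checkpoint:\s*"[^"]*"',
                  lambda m: 'fine_tune_checkpoint: "%s"' % checkpoint, text)
    text = re.sub(r'label_map_path:\s*"[^"]*"',
                  lambda m: 'label_map_path: "%s"' % label_map, text)
    # Assumption: the first input_path belongs to train_input_reader and the
    # second to eval_input_reader; any further matches are left unchanged.
    records = iter(['input_path: "%s"' % train_record,
                    'input_path: "%s"' % val_record])
    return re.sub(r'input_path:\s*"[^"]*"',
                  lambda m: next(records, m.group(0)), text)
```

To use it, read ssd_mobilenet_v1_voc2012.config into a string, pass it through the function with the same values as the manual edits, and write the result back. It is worth diffing the output against the original once to confirm that only the intended fields changed.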

2.2. Training

Create research/train_voc2012.sh with the following content:

python object_detection/train.py \
    --logtostderr \
    --pipeline_config_path=object_detection/VOC2012/ssd_mobilenet_v1_voc2012.config \
    --train_dir=object_detection/VOC2012/ssd_mobilenet_train_logs \
    2>&1 | tee object_detection/VOC2012/ssd_mobilenet_train_logs.txt &

Run the following command to start training:

./train_voc2012.sh

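2.3. Evaluation

Evaluation can run alongside training; take care to use a different GPU or to allocate GPU memory sensibly.
Create research/eval_voc2012.sh with the following content:

python object_detection/eval.py \
    --logtostderr \
    --pipeline_config_path=object_detection/VOC2012/ssd_mobilenet_v1_voc2012.config \
    --checkpoint_dir=object_detection/VOC2012/ssd_mobilenet_train_logs \
    --eval_dir=object_detection/VOC2012/ssd_mobilenet_val_logs &

Go to tensorflow/models/research and run CUDA_VISIBLE_DEVICES="1" ./eval_voc2012.sh to evaluate. (The CUDA_VISIBLE_DEVICES setting is not needed here, because we use the CPU rather than a GPU.)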
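
When the scripts are launched from Python rather than from a shell, the device selection can be passed through the environment of the child process. A minimal sketch, with env_for_device as a hypothetical helper name:

```python
import os


def env_for_device(gpu_id=None):
    """Return a copy of the environment that exposes one GPU, or none when gpu_id is None."""
    env = dict(os.environ)
    # An empty value hides every GPU from CUDA, so TensorFlow falls back to the CPU.
    env["CUDA_VISIBLE_DEVICES"] = "" if gpu_id is None else str(gpu_id)
    return env
```

Passing env=env_for_device(1) to subprocess.call when starting the evaluation script should have the same effect as the CUDA_VISIBLE_DEVICES="1" prefix, while env_for_device() forces a CPU-only run.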

3. Testing

3.1. Export the model

After training, ssd_mobilenet_train_logs contains a number of checkpoint files, for example:

graph.pbtxt
model.ckpt-200000.data-00000-of-00001
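model.ckpt-200000.index
model.ckpt-200000.meta
The .meta file stores the graph and metadata, and the ckpt files store the network weights.
Prediction needs only the model and its weights, not the metadata, so the official script can be used to generate the inference graph: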

python object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path object_detection/VOC2012/ssd_mobilenet_v1_voc2012.config \
    --trained_checkpoint_prefix object_detection/VOC2012/ssd_mobilenet_train_logs/model.ckpt-200000 \
    --output_directory object_detection/VOC2012
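
The step number in --trained_checkpoint_prefix depends on how far training got, so it can be convenient to look it up instead of typing it. The sketch below does this from a plain list of file names; latest_checkpoint_prefix is a hypothetical helper, and TensorFlow 1.x also offers tf.train.latest_checkpoint for the same purpose.

```python
import re


def latest_checkpoint_prefix(filenames):
    """Return the 'model.ckpt-N' prefix with the largest step N, or None if there is none."""
    steps = []
    for name in filenames:
        match = re.match(r"model\.ckpt-(\d+)\.meta$", name)
        if match:
            steps.append(int(match.group(1)))
    return "model.ckpt-%d" % max(steps) if steps else None
```

Calling it with os.listdir("object_detection/VOC2012/ssd_mobilenet_train_logs") and joining the result onto that directory gives the value to pass to the export script.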

3.2. Test on images

Run object_detection_tutorial.ipynb after adjusting the paths in it.
Alternatively, write your own inference script, such as tensorflow/models/research/object_detection/infer.py

import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

## This is needed to display the images.
#%matplotlib inline

# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")

from utils import label_map_util

from utils import visualization_utils as vis_util

# What model to download.
MODEL_NAME = 'ssd_mobilenet_v1_coco_2017_11_17'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'

# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'mscoco_label_map.pbtxt')

NUM_CLASSES = 90

#download model
opener = urllib.request.URLopener()
opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
#tar_file = tarfile.open(MODEL_FILE)
#for file in tar_file.getmembers():
#  file_name = os.path.basename(file.name)
#  if 'frozen_inference_graph.pb' in file_name:
#    tar_file.extract(file, os.getcwd())

#Load a (frozen) Tensorflow model into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.GraphDef()
  with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')
#Loading label map
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)
#Helper code
def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)


# For the sake of simplicity we will use only 2 images:
# image1.jpg
# image2.jpg
# If you want to test the code with your images, just add path to the images to the TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = 'test_images'
#TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 3) ]
TEST_IMAGE = sys.argv[1]
print('the test image is: {}'.format(TEST_IMAGE))

# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)

with detection_graph.as_default():
  with tf.Session(graph=detection_graph) as sess:
    # Input and output tensors of detection_graph.
    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
    # Each box represents a part of the image where a particular object was detected.
    boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
    # Each score represents the level of confidence for the corresponding object.
    # The score is shown on the result image, together with the class label.
    scores = detection_graph.get_tensor_by_name('detection_scores:0')
    classes = detection_graph.get_tensor_by_name('detection_classes:0')
    num_detections = detection_graph.get_tensor_by_name('num_detections:0')
    image = Image.open(TEST_IMAGE)
    # The array-based representation of the image is used later to prepare the
    # result image with boxes and labels on it.
    image_np = load_image_into_numpy_array(image)
    # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
    image_np_expanded = np.expand_dims(image_np, axis=0)
    # Actual detection.
    (boxes, scores, classes, num_detections) = sess.run(
        [boxes, scores, classes, num_detections],
        feed_dict={image_tensor: image_np_expanded})
    # Visualization of the results of a detection.
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        np.squeeze(boxes),
        np.squeeze(classes).astype(np.int32),
        np.squeeze(scores),
        category_index,
        use_normalized_coordinates=True,
        line_thickness=8)

    print(scores)  
    print(classes)  
    print(category_index) 

    # Count the detections whose confidence exceeds 0.5.
    final_score = np.squeeze(scores)
    count = 0
    for i in range(len(final_score)):
        if final_score[i] > 0.5:
            count = count + 1
    print('the count of objects is: {}'.format(count))

    plt.figure(figsize=IMAGE_SIZE)
    plt.imshow(image_np)
    plt.show()

The following part of the code counts the detected objects:

    final_score = np.squeeze(scores)
    count = 0
    for i in range(len(final_score)):
        if final_score[i] > 0.5:
            count = count + 1
    print('the count of objects is: {}'.format(count))
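
A count alone does not say what was found. As a possible extension, the sketch below pairs each score above the threshold with its label. summarize_detections is a hypothetical helper; it assumes scores and classes have the batched shape returned by sess.run (one row per image) and that category_index maps class ids to dictionaries with a "name" key, as built by label_map_util.create_category_index.

```python
def summarize_detections(scores, classes, category_index, threshold=0.5):
    """Return (label, score) pairs for detections in the first image scoring above threshold."""
    results = []
    # sess.run returns batched outputs; index 0 selects the single input image.
    for score, class_id in zip(scores[0], classes[0]):
        if score > threshold:
            label = category_index.get(int(class_id), {}).get("name", "unknown")
            results.append((label, float(score)))
    return results
```

In infer.py it could be called as summarize_detections(scores, classes, category_index) right after sess.run; the length of the returned list equals the count computed above.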

Run infer.py test_images/image2.jpg; the result is shown in the figure:

Installing tensorflow from source
git clone --recurse-submodules https://github.com/tensorflow/tensorflow
wget https://copr.fedorainfracloud.org/coprs/vbatts/bazel/repo/epel-7/vbatts-bazel-epel-7.repo
cp vbatts-bazel-epel-7.repo /etc/yum.repos.d/
yum install -y bazel
yum install -y python-numpy swig python-devel python-wheel

cd tensorflow
./configure

bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

[Tensorflow] Learning the Object Detection API