Building a VGG Network from Scratch with Keras

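Keras makes creating deep learning models fast and easy, and in many cases copying examples from the official site is enough to produce surprisingly good results. But when a problem calls for adjusting and tuning a model, or for building the network architecture from a new paper, it is easy to get stuck.

In this tutorial you will read the original VGG paper and use Keras to build, from scratch, the VGG (Visual Geometry Group, University of Oxford) network architecture, which took first place in the localization task and second place in the classification task of ILSVRC-2014 (the ImageNet competition).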

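So what is the point of rebuilding something that others have already built? The point is to learn. By completing this exercise you will:

  • Learn more about the VGG architecture
  • Learn more about convolutional neural networks
  • Learn how to implement a given network architecture in Keras
  • Gain a better understanding of the underlying principles and original ideas by reading the paper and implementing parts of it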

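Learning Keras by Building VGG from Scratch

Why start with VGG?

  • It is easy to implement
  • It achieved excellent results in ILSVRC-2014 (the ImageNet competition)
  • It is still widely used today
  • Its paper is simple and easy to follow
  • Keras already ships a VGG implementation in its distribution, so you can use it for reference and comparison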

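Mining the Paper

According to the paper's test results, configurations D (VGG16) and E (VGG19) perform best. Because the two are built with nearly identical methods and techniques, we choose to build configuration D (VGG16).

To summarize what the paper says about building the network:

  • Input image size: 224 x 224
  • Receptive field (filter) size: 3 x 3
  • Convolution stride: 1 pixel
  • Padding: 1 pixel (for the 3 x 3 filters)
  • Max-pooling layers use a 2 x 2 window with a stride of 2 pixels
  • Two fully connected layers with 4096 units each
  • A final softmax classification layer with 1000 units (one per ImageNet class)
  • ReLU as the activation function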
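
One way to see why these settings fit together is the usual output-size formula for a sliding window, out = floor((n + 2p - k) / s) + 1. The sketch below is illustrative arithmetic only; the helper name `output_size` is made up for this example and is not part of Keras.

```python
def output_size(n, kernel, stride, padding):
    """Spatial size after sliding a `kernel`-wide window with `stride`
    over an input of size `n` padded by `padding` pixels on each side."""
    return (n + 2 * padding - kernel) // stride + 1

# 3 x 3 convolution, stride 1, padding 1: the size is preserved
after_conv = output_size(224, kernel=3, stride=1, padding=1)
# 2 x 2 max pooling, stride 2, no padding: the size is halved
after_pool = output_size(after_conv, kernel=2, stride=2, padding=0)
print(after_conv, after_pool)  # 224 112
```

In Keras, a padding of 1 pixel for a 3 x 3, stride-1 convolution corresponds to padding='same', which is what the model code uses.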

# Environment for this Jupyter Notebook
import platform
import tensorflow
import keras
print("Platform: {}".format(platform.platform()))
print("Tensorflow version: {}".format(tensorflow.__version__))
print("Keras version: {}".format(keras.__version__))

%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
from IPython.display import Image

Using TensorFlow backend.
Platform: Windows-7-6.1.7601-SP1
Tensorflow version: 1.4.0
Keras version: 2.1.1

Creating the Model (Sequential)

import keras
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, Flatten
from keras.layers import Conv2D, MaxPool2D
from keras.utils import plot_model

# Define the input
input_shape = (224, 224, 3)  # 224x224 RGB image (height, width, channel)

# Define the network with a Sequential model
model = Sequential(name='vgg16-sequential')

# Convolution block 1 (block1)
model.add(Conv2D(64, (3, 3), padding='same', activation='relu', input_shape=input_shape, name='block1_conv1'))
model.add(Conv2D(64, (3, 3), padding='same', activation='relu', name='block1_conv2'))
model.add(MaxPool2D((2, 2), strides=(2, 2), name='block1_pool'))

# Convolution block 2 (block2)
model.add(Conv2D(128, (3, 3), padding='same', activation='relu', name='block2_conv1'))
model.add(Conv2D(128, (3, 3), padding='same', activation='relu', name='block2_conv2'))
model.add(MaxPool2D((2, 2), strides=(2, 2), name='block2_pool'))

# Convolution block 3 (block3)
model.add(Conv2D(256, (3, 3), padding='same', activation='relu', name='block3_conv1'))
model.add(Conv2D(256, (3, 3), padding='same', activation='relu', name='block3_conv2'))
model.add(Conv2D(256, (3, 3), padding='same', activation='relu', name='block3_conv3'))
model.add(MaxPool2D((2, 2), strides=(2, 2), name='block3_pool'))

# Convolution block 4 (block4)
model.add(Conv2D(512, (3, 3), padding='same', activation='relu', name='block4_conv1'))
model.add(Conv2D(512, (3, 3), padding='same', activation='relu', name='block4_conv2'))
model.add(Conv2D(512, (3, 3), padding='same', activation='relu', name='block4_conv3'))
model.add(MaxPool2D((2, 2), strides=(2, 2), name='block4_pool'))

# Convolution block 5 (block5)
model.add(Conv2D(512, (3, 3), padding='same', activation='relu', name='block5_conv1'))
model.add(Conv2D(512, (3, 3), padding='same', activation='relu', name='block5_conv2'))
model.add(Conv2D(512, (3, 3), padding='same', activation='relu', name='block5_conv3'))
model.add(MaxPool2D((2, 2), strides=(2, 2), name='block5_pool'))

# Fully connected (feed-forward) block
model.add(Flatten(name='flatten'))
model.add(Dense(4096, activation='relu', name='fc1'))
model.add(Dense(4096, activation='relu', name='fc2'))
model.add(Dense(1000, activation='softmax', name='predictions'))

# Print the network structure
model.summary()

Layer (type)                 Output Shape              Param #
=================================================================
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0
flatten (Flatten)            (None, 25088)             0
fc1 (Dense)                  (None, 4096)              102764544
fc2 (Dense)                  (None, 4096)              16781312
predictions (Dense)          (None, 1000)              4097000
=================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0


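Checking the Total Number of Trainable Parameters

Comparing our model's parameter count with the figures reported in the paper (Table 2 in Section 2), the 138,357,544 parameters of the model we built do match the 138 million parameters the paper gives for configuration D.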
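
The total can also be checked by hand, without Keras. The sketch below is plain arithmetic under the layer settings listed earlier (3 x 3 kernels, one bias per filter or unit); the helper names `conv_params` and `dense_params` are made up for this example.

```python
def conv_params(kernel, in_channels, filters):
    # One kernel x kernel x in_channels weight tensor plus one bias per filter
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(in_units, units):
    # One weight per input plus one bias, for each output unit
    return (in_units + 1) * units

# Filters of the 13 convolution layers, block by block
conv_filters = [64, 64, 128, 128, 256, 256, 256,
                512, 512, 512, 512, 512, 512]

total = 0
in_channels = 3  # RGB input
for filters in conv_filters:
    total += conv_params(3, in_channels, filters)
    in_channels = filters

# After five 2 x 2 poolings the 224 x 224 map is 7 x 7 with 512 channels
in_units = 7 * 7 * 512
for units in [4096, 4096, 1000]:
    total += dense_params(in_units, units)
    in_units = units

print("{:,}".format(total))  # 138,357,544
```

Most of the parameters sit in the first fully connected layer (fc1), which alone accounts for 102,764,544 of them.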

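Creating the Model (Functional API)

Define the same network with the Keras functional API. The functional API guide in the Keras documentation has a detailed explanation.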

import keras
from keras.models import Model
from keras.layers import Input, Dense, Activation, Dropout, Flatten
from keras.layers import Conv2D, MaxPool2D

# Define the input
input_shape = (224, 224, 3)  # 224x224 RGB image (height, width, channel)

# Input layer
img_input = Input(shape=input_shape, name='img_input')

# Convolution block 1 (block1)
x = Conv2D(64, (3, 3), padding='same', activation='relu', name='block1_conv1')(img_input)
x = Conv2D(64, (3, 3), padding='same', activation='relu', name='block1_conv2')(x)
x = MaxPool2D((2, 2), strides=(2, 2), name='block1_pool')(x)

# Convolution block 2 (block2)
x = Conv2D(128, (3, 3), padding='same', activation='relu', name='block2_conv1')(x)
x = Conv2D(128, (3, 3), padding='same', activation='relu', name='block2_conv2')(x)
x = MaxPool2D((2, 2), strides=(2, 2), name='block2_pool')(x)

# Convolution block 3 (block3)
x = Conv2D(256, (3, 3), padding='same', activation='relu', name='block3_conv1')(x)
x = Conv2D(256, (3, 3), padding='same', activation='relu', name='block3_conv2')(x)
x = Conv2D(256, (3, 3), padding='same', activation='relu', name='block3_conv3')(x)
x = MaxPool2D((2, 2), strides=(2, 2), name='block3_pool')(x)

# Convolution block 4 (block4)
x = Conv2D(512, (3, 3), padding='same', activation='relu', name='block4_conv1')(x)
x = Conv2D(512, (3, 3), padding='same', activation='relu', name='block4_conv2')(x)
x = Conv2D(512, (3, 3), padding='same', activation='relu', name='block4_conv3')(x)
x = MaxPool2D((2, 2), strides=(2, 2), name='block4_pool')(x)

# Convolution block 5 (block5)
x = Conv2D(512, (3, 3), padding='same', activation='relu', name='block5_conv1')(x)
x = Conv2D(512, (3, 3), padding='same', activation='relu', name='block5_conv2')(x)
x = Conv2D(512, (3, 3), padding='same', activation='relu', name='block5_conv3')(x)
x = MaxPool2D((2, 2), strides=(2, 2), name='block5_pool')(x)

# Fully connected (feed-forward) block
x = Flatten(name='flatten')(x)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)  # fixed: the activation string had a stray leading space
x = Dense(1000, activation='softmax', name='predictions')(x)

# Create the model
model2 = Model(inputs=img_input, outputs=x, name='vgg16-funcapi')

# Print the network structure
model2.summary()

Layer (type)                 Output Shape              Param #
=================================================================
img_input (InputLayer)       (None, 224, 224, 3)       0
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0
flatten (Flatten)            (None, 25088)             0
fc1 (Dense)                  (None, 4096)              102764544
fc2 (Dense)                  (None, 4096)              16781312
predictions (Dense)          (None, 1000)              4097000
=================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
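
Both listings repeat the same pattern: a few 3 x 3 convolutions followed by one pooling layer. As a rough sketch, that pattern can be written as a compact configuration list and expanded into layer names. The names `VGG16_CONFIG` and `expand_config` are hypothetical helpers for this example, not part of Keras, and the code only generates the layer plan, it does not build a model.

```python
# Filters per convolution layer; "M" marks a 2 x 2 max-pooling layer
VGG16_CONFIG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                512, 512, 512, "M", 512, 512, 512, "M"]

def expand_config(config):
    """Turn the compact config into (layer_name, kind, filters) tuples,
    using the same blockN_convM / blockN_pool names as the listings."""
    layers = []
    block, conv = 1, 1
    for item in config:
        if item == "M":
            layers.append(("block{}_pool".format(block), "pool", None))
            block, conv = block + 1, 1
        else:
            layers.append(("block{}_conv{}".format(block, conv), "conv", item))
            conv += 1
    return layers

for name, kind, filters in expand_config(VGG16_CONFIG):
    print(name, kind, filters)
```

Each "conv" tuple maps to Conv2D(filters, (3, 3), padding='same', activation='relu', name=name) and each "pool" tuple to MaxPool2D((2, 2), strides=(2, 2), name=name), with either the Sequential or the functional API.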


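Training the Model

Training a VGG16 model on the ImageNet data is no small task.

The VGG paper states:

On a system equipped with four NVIDIA Titan Black GPUs, training a single net took 2–3 weeks depending on the architecture.

In other words, even with four NVIDIA Titan GPUs, training VGG16 on the ImageNet image set could take 2 to 3 weeks. Even if you can afford that hardware, you would still spend a good deal of time training the model.

Fortunately, Keras not only includes the VGG16 and VGG19 model definitions in its modules, it also provides pre-trained weights for both.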
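
As a minimal sketch of how the bundled model might be loaded, the helper below wraps the `VGG16` constructor from `keras.applications`. The wrapper name `load_pretrained_vgg16` is made up for this example, and the import is deferred into the function body only so that the definition itself has no side effects. Calling it requires Keras and, on first use, network access to download the weight file.

```python
def load_pretrained_vgg16(include_top=True):
    """Return the VGG16 model bundled with Keras, with ImageNet weights.

    include_top=True keeps the three fully connected layers
    (fc1, fc2, predictions); False returns only the convolution blocks.
    """
    from keras.applications.vgg16 import VGG16
    return VGG16(weights="imagenet", include_top=include_top)

# Usage (downloads the weights on the first call):
# pretrained = load_pretrained_vgg16()
# pretrained.summary()
```

The summary of the pre-trained model can then be compared with the two models built above.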

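Conclusion

A few interesting points I took away from this article:

  • Building a network in Keras is not hard, but understanding the principles behind the architecture takes a little more patience
  • VGG16 is simple to build yet performs well, which is remarkable!
  • In VGG16's convolutional layers, the deeper you go, the smaller the feature maps become and the more filters (kernels) there are, growing from 64 to 512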
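
The last point can be traced with a few lines of arithmetic. This is only an illustrative sketch that follows the block structure used in this article, not Keras code.

```python
size, channels = 224, 3
filters_per_block = [64, 128, 256, 512, 512]

shapes = []
for filters in filters_per_block:
    channels = filters   # 'same'-padded 3 x 3 convolutions keep the size
    size = size // 2     # 2 x 2 pooling with stride 2 halves it
    shapes.append((size, size, channels))

print(shapes)
# [(112, 112, 64), (56, 56, 128), (28, 28, 256), (14, 14, 512), (7, 7, 512)]
```

These are the same shapes that model.summary() reports for the five pooling layers.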
