Implementing Different Layers · TensorFlow Machine Learning Cookbook, Second Edition (Chinese Edition)

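# Implementing Different Layers

It is important to know how to implement different kinds of layers. In the preceding recipe we implemented fully connected layers. In this recipe we extend that knowledge to further layer types.

## Getting ready

We have explored how to connect data inputs to a fully connected hidden layer, but TensorFlow provides many more layer types as built-in functions. The most commonly used are convolutional layers and maxpool layers. We will show how to create and use such layers with input data and with fully connected data. We first look at how to use these layers on one-dimensional data, and then on two-dimensional data.

Although neural networks can be layered in any fashion, one of the most common designs uses convolutional layers and fully connected layers to create features first. If there are too many features, a maxpool layer usually follows. After these layers, a nonlinear layer is commonly introduced as an activation function. Convolutional neural networks (CNNs), which we consider in [Chapter 8](../Text/61.html), Convolutional Neural Networks, typically take the form convolution, maxpool, activation, convolution, maxpool, activation.

## How to do it...

We will first look at one-dimensional data. We need to generate a random array of data for this task using the following steps:

1. We start by loading the libraries we need and starting a graph session, as follows:

```py
import tensorflow as tf
import numpy as np
sess = tf.Session()
```

2. Now we can initialize our data (a NumPy array of length `25`) and create the placeholder through which we will feed it, using the following code:

```py
data_size = 25
data_1d = np.random.normal(size=data_size)
x_input_1d = tf.placeholder(dtype=tf.float32, shape=[data_size])
```

3. Next, we define a function that makes a convolutional layer. Then we declare a random filter and create the convolutional layer, as follows:

> Note that many of TensorFlow's layer functions are designed to work with 4D data (`4D = [batch size, height, width, channels]`). We need to modify the input and output data to add or remove the extra dimensions. For our example data, we have a batch size of 1, a height of 1, a width of 25, and a channel size of 1. To add dimensions we use the `expand_dims()` function, and to remove them we use the `squeeze()` function. Also note that we can calculate the output size of a convolutional layer with the formula `output_size=(W-F+2P)/S+1`, where `W` is the input size, `F` is the filter size, `P` is the padding size, and `S` is the stride size.

```py
def conv_layer_1d(input_1d, my_filter):
    # Make 1d input into 4d: [batch, height, width, channels] = [1, 1, 25, 1]
    input_2d = tf.expand_dims(input_1d, 0)
    input_3d = tf.expand_dims(input_2d, 0)
    input_4d = tf.expand_dims(input_3d, 3)
    # Perform convolution
    convolution_output = tf.nn.conv2d(input_4d, filter=my_filter,
                                      strides=[1,1,1,1], padding="VALID")
    # Now drop extra dimensions
    conv_output_1d = tf.squeeze(convolution_output)
    return conv_output_1d

my_filter = tf.Variable(tf.random_normal(shape=[1,5,1,1]))
my_convolution_output = conv_layer_1d(x_input_1d, my_filter)
```

4. By default, TensorFlow's activation functions act element-wise. This means we only need to call the activation function on the layer of interest. We do this by creating an activation function and then adding it to the graph, as follows:

```py
def activation(input_1d):
    return tf.nn.relu(input_1d)

my_activation_output = activation(my_convolution_output)
```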
5. Now we declare a maxpool layer function. This function creates a maxpool over a moving window across our one-dimensional vector. For this example, we initialize it with a width of 5, as follows:

> TensorFlow's maxpool arguments are very similar to those of the convolutional layer. Although maxpool has no filter, it does have size, stride, and padding options. Because we use a window of 5 with valid padding (no zero padding), our output array has 4 fewer entries than its input.

```py
def max_pool(input_1d, width):
    # First we make the 1d input into 4d.
    input_2d = tf.expand_dims(input_1d, 0)
    input_3d = tf.expand_dims(input_2d, 0)
    input_4d = tf.expand_dims(input_3d, 3)
    # Perform the max pool operation
    pool_output = tf.nn.max_pool(input_4d, ksize=[1, 1, width, 1],
                                 strides=[1, 1, 1, 1], padding='VALID')
    # Drop extra dimensions
    pool_output_1d = tf.squeeze(pool_output)
    return pool_output_1d

my_maxpool_output = max_pool(my_activation_output, width=5)
```

6. The last layer we connect is the fully connected layer. Here we want a general-purpose function that takes a 1D array and outputs the requested number of values. Also remember that to do matrix multiplication with a 1D array, we must expand it to 2D, as shown in the following code block:

```py
def fully_connected(input_layer, num_outputs):
    # Create weights with shape [input_length, num_outputs]
    weight_shape = tf.squeeze(tf.stack([tf.shape(input_layer), [num_outputs]]))
    weight = tf.random_normal(weight_shape, stddev=0.1)
    bias = tf.random_normal(shape=[num_outputs])
    # Make input into 2d
    input_layer_2d = tf.expand_dims(input_layer, 0)
    # Perform fully connected operations
    full_output = tf.add(tf.matmul(input_layer_2d, weight), bias)
    # Drop extra dimensions
    full_output_1d = tf.squeeze(full_output)
    return full_output_1d

my_full_output = fully_connected(my_maxpool_output, 5)
```
7. Now we initialize all the variables, run the graph, and print the output of each layer, as follows:

```py
init = tf.global_variables_initializer()
sess.run(init)
feed_dict = {x_input_1d: data_1d}

# Convolution Output
print('Input = array of length 25')
print('Convolution w/filter, length = 5, stride size = 1, results in an array of length 21:')
print(sess.run(my_convolution_output, feed_dict=feed_dict))

# Activation Output
print('Input = the above array of length 21')
print('ReLU element wise returns the array of length 21:')
print(sess.run(my_activation_output, feed_dict=feed_dict))

# Maxpool Output
print('Input = the above array of length 21')
print('MaxPool, window length = 5, stride size = 1, results in the array of length 17:')
print(sess.run(my_maxpool_output, feed_dict=feed_dict))

# Fully Connected Output
print('Input = the above array of length 17')
print('Fully connected layer on all four rows with five outputs:')
print(sess.run(my_full_output, feed_dict=feed_dict))
```

8. The preceding step should produce output like the following (the exact values differ from run to run because the data, filter, and weights are random):

```py
Input = array of length 25
Convolution w/filter, length = 5, stride size = 1, results in an array of length 21:
[-0.91608119  1.53731811 -0.7954089   0.5041104   1.88933098 -1.81099761
  0.56695032  1.17945457 -0.66252393 -1.90287709  0.87184119  0.84611893
 -5.25024986 -0.05473572  2.19293165 -4.47577858 -1.71364677  3.96857905
 -2.0452652  -1.86647367 -0.12697852]
Input = the above array of length 21
ReLU element wise returns the array of length 21:
[ 0.          1.53731811  0.          0.5041104   1.88933098  0.
  0.          1.17945457  0.          0.          0.87184119  0.84611893
  0.          0.          2.19293165  0.          0.          3.96857905
  0.          0.          0.        ]
Input = the above array of length 21
MaxPool, window length = 5, stride size = 1, results in the array of length 17:
[ 1.88933098  1.88933098  1.88933098  1.88933098  1.88933098  1.17945457
  1.17945457  1.17945457  0.87184119  0.87184119  2.19293165  2.19293165
  2.19293165  3.96857905  3.96857905  3.96857905  3.96857905]
Input = the above array of length 17
Fully connected layer on all four rows with five outputs:
[ 1.23588216 -0.42116445  1.44521213  1.40348077 -0.79607368]
```

> One-dimensional data is very important for neural networks. Time series, signal processing, and some text embeddings are considered one-dimensional and are frequently used in neural networks.

We will now consider the same types of layers in the same order, but for two-dimensional data:

1. We start by clearing and resetting the computational graph, as follows:

```py
from tensorflow.python.framework import ops  # provides reset_default_graph()
ops.reset_default_graph()
sess = tf.Session()
```

2. Then we initialize our input array as a 10x10 matrix and create a placeholder of the same shape for the graph, as follows:

```py
data_size = [10,10]
data_2d = np.random.normal(size=data_size)
x_input_2d = tf.placeholder(dtype=tf.float32, shape=data_size)
```

3. Just as in the one-dimensional example, we now need to declare a convolutional layer function. Because our data already has a height and a width, we only need to add two dimensions (a batch size of 1 and a channel size of 1) so that we can operate on it with the `conv2d()` function. For the filter, we use a random 2x2 filter, a stride of 2 in both directions, and valid padding (in other words, no zero padding). Because our input matrix is 10x10, the convolution output will be 5x5, as follows:

```py
def conv_layer_2d(input_2d, my_filter):
    # First, change 2d input to 4d
    input_3d = tf.expand_dims(input_2d, 0)
    input_4d = tf.expand_dims(input_3d, 3)
    # Perform convolution
    convolution_output = tf.nn.conv2d(input_4d, filter=my_filter,
                                      strides=[1,2,2,1], padding="VALID")
    # Drop extra dimensions
    conv_output_2d = tf.squeeze(convolution_output)
    return conv_output_2d

my_filter = tf.Variable(tf.random_normal(shape=[2,2,1,1]))
my_convolution_output = conv_layer_2d(x_input_2d, my_filter)
```

4. The activation function works element by element, so we can now create the activation operation and add it to the graph with the following code:

```py
def activation(input_2d):
    return tf.nn.relu(input_2d)

my_activation_output = activation(my_convolution_output)
```
5. Our maxpool layer is very similar to the one-dimensional case, except that we must declare both a width and a height for the maxpool window. Just as with the 2D convolutional layer, we only need to add two dimensions, as follows:

```py
def max_pool(input_2d, width, height):
    # Make 2d input into 4d
    input_3d = tf.expand_dims(input_2d, 0)
    input_4d = tf.expand_dims(input_3d, 3)
    # Perform max pool
    pool_output = tf.nn.max_pool(input_4d, ksize=[1, height, width, 1],
                                 strides=[1, 1, 1, 1], padding='VALID')
    # Drop extra dimensions
    pool_output_2d = tf.squeeze(pool_output)
    return pool_output_2d

my_maxpool_output = max_pool(my_activation_output, width=2, height=2)
```

6. Our fully connected layer is very similar to the one-dimensional version. Note that the 2D input to this layer is treated as a single object, so we want every entry connected to every output. To achieve this, we flatten the two-dimensional matrix completely and then expand it again for the matrix multiplication, as follows:

```py
def fully_connected(input_layer, num_outputs):
    # Flatten into 1d
    flat_input = tf.reshape(input_layer, [-1])
    # Create weights with shape [flat_length, num_outputs]
    weight_shape = tf.squeeze(tf.stack([tf.shape(flat_input), [num_outputs]]))
    weight = tf.random_normal(weight_shape, stddev=0.1)
    bias = tf.random_normal(shape=[num_outputs])
    # Change into 2d
    input_2d = tf.expand_dims(flat_input, 0)
    # Perform fully connected operations
    full_output = tf.add(tf.matmul(input_2d, weight), bias)
    # Drop extra dimensions
    full_output_2d = tf.squeeze(full_output)
    return full_output_2d

my_full_output = fully_connected(my_maxpool_output, 5)
```

7. Now we need to initialize the variables and create a feed dictionary for our operations, using the following code:

```py
init = tf.global_variables_initializer()
sess.run(init)
feed_dict = {x_input_2d: data_2d}
```
8. We can print the output of each layer as follows:

```py
# Convolution Output
print('Input = [10 X 10] array')
print('2x2 Convolution, stride size = [2x2], results in the [5x5] array:')
print(sess.run(my_convolution_output, feed_dict=feed_dict))

# Activation Output
print('Input = the above [5x5] array')
print('ReLU element wise returns the [5x5] array:')
print(sess.run(my_activation_output, feed_dict=feed_dict))

# Max Pool Output
print('Input = the above [5x5] array')
print('MaxPool, stride size = [1x1], results in the [4x4] array:')
print(sess.run(my_maxpool_output, feed_dict=feed_dict))

# Fully Connected Output
print('Input = the above [4x4] array')
print('Fully connected layer on all four rows with five outputs:')
print(sess.run(my_full_output, feed_dict=feed_dict))
```

9. The preceding step should produce output like the following (again, the exact values vary between runs):

```py
Input = [10 X 10] array
2x2 Convolution, stride size = [2x2], results in the [5x5] array:
[[ 0.37630892 -1.41018617 -2.58821273 -0.32302785  1.18970704]
 [-4.33685207  1.97415686  1.0844903  -1.18965471  0.84643292]
 [ 5.23706436  2.46556497 -0.95119286  1.17715418  4.1117816 ]
 [ 5.86972761  1.2213701   1.59536231  2.66231227  2.28650784]
 [-0.88964868 -2.75502229  4.3449688   2.67776585 -2.23714781]]
Input = the above [5x5] array
ReLU element wise returns the [5x5] array:
[[ 0.37630892  0.          0.          0.          1.18970704]
 [ 0.          1.97415686  1.0844903   0.          0.84643292]
 [ 5.23706436  2.46556497  0.          1.17715418  4.1117816 ]
 [ 5.86972761  1.2213701   1.59536231  2.66231227  2.28650784]
 [ 0.          0.          4.3449688   2.67776585  0.        ]]
Input = the above [5x5] array
MaxPool, stride size = [1x1], results in the [4x4] array:
[[ 1.97415686  1.97415686  1.0844903   1.18970704]
 [ 5.23706436  2.46556497  1.17715418  4.1117816 ]
 [ 5.86972761  2.46556497  2.66231227  4.1117816 ]
 [ 5.86972761  4.3449688   4.3449688   2.67776585]]
Input = the above [4x4] array
Fully connected layer on all four rows with five outputs:
[-0.6154139  -1.96987963 -1.88811922  0.20010889  0.32519674]
```

## How it works...

We should now know how to use convolutional and maxpool layers in TensorFlow with both one-dimensional and two-dimensional data. Regardless of the shape of the input, we ended up with an output of the same size: in both cases the final fully connected layer returns five values. This illustrates the flexibility of neural network layers. This recipe should also impress upon us once more how important shapes and sizes are in neural network operations.
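Since shapes matter this much, it can help to check the expected sizes by hand before building a graph. The following is a minimal sketch in plain Python (no TensorFlow required) that applies the `output_size=(W-F+2P)/S+1` formula from this recipe to each convolution and maxpool step. The helper name `output_size` and its use of floor division are assumptions made for illustration and are not part of the recipe's code.

```python
def output_size(w, f, p=0, s=1):
    """Output length along one dimension: (W - F + 2P) / S + 1.

    w: input size, f: filter/window size, p: padding, s: stride.
    Floor division is an assumption here; it covers strides that do not
    divide evenly when no zero padding is used.
    """
    return (w - f + 2 * p) // s + 1

# 1D pipeline: length 25 -> convolution (filter 5, stride 1) -> maxpool (window 5, stride 1)
conv_1d = output_size(25, 5)
pool_1d = output_size(conv_1d, 5)

# 2D pipeline (per side): 10 -> convolution (2x2, stride 2) -> maxpool (2x2, stride 1)
conv_2d = output_size(10, 2, s=2)
pool_2d = output_size(conv_2d, 2, s=1)

print("1D sizes:", 25, "->", conv_1d, "->", pool_1d)
print("2D sizes per side:", 10, "->", conv_2d, "->", pool_2d)
```

These sizes match the lengths quoted in the print statements above: 21 and 17 for the one-dimensional case, and 5x5 and 4x4 for the two-dimensional case.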