An Example of Productionalizing TensorFlow · TensorFlow Machine Learning Cookbook, Second Edition (Chinese translation)

# An Example of Productionalizing TensorFlow

A good practice for productionalizing machine learning models is to separate the training and evaluation programs. In this section, we illustrate an evaluation script that has been expanded to include unit tests, model saving and loading, and evaluation.

## Getting ready

In this recipe, we will show you how to implement an evaluation script using the preceding criteria. The code actually consists of a training script and an evaluation script, but for this recipe we will only show the evaluation script. As a reminder, both scripts can be found in the online GitHub repository at [https://github.com/nfmcclure/tensorflow_cookbook/](https://github.com/nfmcclure/tensorflow_cookbook/) and in the official Packt repository: [https://github.com/PacktPublishing/TensorFlow-Machine-Learning-Cookbook-Second-Edition](https://github.com/PacktPublishing/TensorFlow-Machine-Learning-Cookbook-Second-Edition).

For the upcoming example, we will implement the first RNN example from [Chapter 9](../Text/68.html), Recurrent Neural Networks, which attempts to predict whether a text message is spam or ham. We will assume that the RNN model has already been trained and saved, along with the vocabulary.

## How to do it...

1. We start by loading the necessary libraries and declaring the TensorFlow application flags, as follows:

```py
# This script uses the TensorFlow 1.x API (tf.flags, tf.contrib, tf.Session).
import os
import re
import numpy as np
import tensorflow as tf
from tensorflow.python.framework import ops
ops.reset_default_graph()

# Define App Flags
tf.flags.DEFINE_string("storage_folder", "temp", "Where to store model and data.")
tf.flags.DEFINE_float('learning_rate', 0.0005, 'Initial learning rate.')
tf.flags.DEFINE_float('dropout_prob', 0.5, 'Keep probability for dropout.')
tf.flags.DEFINE_integer('epochs', 20, 'Number of epochs for training.')
tf.flags.DEFINE_integer('batch_size', 250, 'Batch Size for training.')
tf.flags.DEFINE_integer('rnn_size', 15, 'RNN feature size.')
tf.flags.DEFINE_integer('embedding_size', 25, 'Word embedding size.')
tf.flags.DEFINE_integer('min_word_frequency', 20, 'Word frequency cutoff.')
tf.flags.DEFINE_boolean('run_unit_tests', False, 'If true, run tests.')

FLAGS = tf.flags.FLAGS
```

2. Next, we declare a text-cleaning function. This is the same cleaning function used in the training script, as follows:

```py
def clean_text(text_string):
    # Remove punctuation, underscores, and digits
    text_string = re.sub(r'([^\s\w]|_|[0-9])+', '', text_string)
    # Collapse runs of whitespace into single spaces
    text_string = " ".join(text_string.split())
    text_string = text_string.lower()
    return text_string
```

3. Now, we need to load the following vocabulary processing function:

```py
def load_vocab():
    vocab_path = os.path.join(FLAGS.storage_folder, "vocab")
    vocab_processor = tf.contrib.learn.preprocessing.VocabularyProcessor.restore(vocab_path)
    return vocab_processor
```

4. Now that we have a way to clean the text and a vocabulary processor, we can combine these functions to create a data processing pipeline for a given text, as follows:

```py
def process_data(input_data, vocab_processor):
    input_data = clean_text(input_data)
    input_data = input_data.split()
    processed_input = np.array(list(vocab_processor.transform(input_data)))
    return processed_input
```

5. Next, we need a way to get the data to evaluate. For this purpose, we ask the user to type text at the prompt. We then process the text and return the processed result:

```py
def get_input_data():
    input_text = input("Please enter a text message to evaluate: ")
    vocab_processor = load_vocab()
    return process_data(input_text, vocab_processor)
```

> For this example, we create the evaluation data by asking the user to type it in. Many applications will instead get their data from a supplied file or an API request, in which case we can change this input function accordingly.

6. For unit testing, we need to make sure that our text-cleaning function behaves correctly, using the following code:

```py
class clean_test(tf.test.TestCase):
    # Make sure cleaning function behaves correctly.
    # The method name must start with "test" for the test runner to discover it.
    def test_clean_string(self):
        with self.test_session():
            test_input = "--Tensorflow's so Great! Dont you think so? "
            test_expected = 'tensorflows so great dont you think so'
            test_out = clean_text(test_input)
            self.assertEqual(test_expected, test_out)
```

7. Now that we have our model and data, we can run the `main` function. The `main` function gets the data, sets up the graph, loads the variables, feeds in the processed data, and then prints the output, as shown in the following code snippet:

```py
def main(args):
    # Get flags
    storage_folder = FLAGS.storage_folder

    # Get user input text
    x_data = get_input_data()

    # Load model
    graph = tf.Graph()
    with graph.as_default():
        sess = tf.Session()
        with sess.as_default():
            # Load the saved meta graph and restore variables
            saver = tf.train.import_meta_graph("{}.meta".format(os.path.join(storage_folder, "model.ckpt")))
            saver.restore(sess, os.path.join(storage_folder, "model.ckpt"))

            # Get the placeholders from the graph by name
            x_data_ph = graph.get_operation_by_name("x_data_ph").outputs[0]
            dropout_keep_prob = graph.get_operation_by_name("dropout_keep_prob").outputs[0]
            probability_outputs = graph.get_operation_by_name("probability_outputs").outputs[0]

            # Make the prediction (dropout is disabled by keeping every unit)
            eval_feed_dict = {x_data_ph: x_data, dropout_keep_prob: 1.0}
            probability_prediction = sess.run(tf.reduce_mean(probability_outputs, 0), eval_feed_dict)

            # Print output (Or save to file or DB connection?)
            print('Probability of Spam: {:.4}'.format(probability_prediction[1]))
```

8. Finally, to run either the `main()` function or the unit tests, use the following code:

```py
if __name__ == "__main__":
    if FLAGS.run_unit_tests:
        # Perform unit tests
        tf.test.main()
    else:
        # Run evaluation
        tf.app.run()
```

## How it works...

To evaluate the model, we load command-line arguments with TensorFlow's app flags, load the model and the vocabulary processor, and then run the processed data through the model to make a prediction.

Remember to run this script from the command line, and make sure the training script has been run first so that the saved model and the vocabulary dictionary exist.
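
The text-cleaning step and its unit test can be tried on their own, without a trained model or TensorFlow installed. The following is a minimal sketch under stated assumptions, not the recipe's actual test harness: it assumes the cleaning pattern is meant to be `[^\s\w]` (any character that is neither whitespace nor a word character), and it swaps `tf.test.TestCase` for the standard library's `unittest.TestCase`. The class name `CleanTextTest`, the helper `run_tests`, and the second test case are hypothetical and exist only for illustration.

```python
import re
import unittest


def clean_text(text_string):
    # Same steps as the recipe's cleaning function:
    # drop punctuation, underscores and digits, collapse whitespace, lowercase.
    text_string = re.sub(r'([^\s\w]|_|[0-9])+', '', text_string)
    text_string = " ".join(text_string.split())
    return text_string.lower()


class CleanTextTest(unittest.TestCase):
    # Hypothetical stand-in for the recipe's tf.test.TestCase subclass.
    # Method names start with "test" so the loader can discover them.

    def test_punctuation_and_case(self):
        cleaned = clean_text("--Tensorflow's so Great! Dont you think so? ")
        self.assertEqual(cleaned, "tensorflows so great dont you think so")

    def test_digits_and_underscores(self):
        # Illustrative extra case: digits and underscores are removed too.
        self.assertEqual(clean_text("Call 0800_123 NOW"), "call now")


def run_tests():
    # Hypothetical helper: run the tests and return the result object
    # instead of exiting the interpreter, as unittest.main() would.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CleanTextTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Keeping the cleaning function free of TensorFlow dependencies means that the same checks can run quickly in a continuous integration job before the heavier model-loading path is exercised.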