Fetch Tasks · JAVA

[TOC]

# Introduction

Fetch conversion means that Hive can answer certain queries without running a MapReduce job. For example:

~~~
select * from employees;
~~~

In this case, Hive simply reads the files in the storage directory of the employees table and prints the query result to the console.

# Configuration

In hive-default.xml.template, hive.fetch.task.conversion defaults to more. Older Hive versions defaulted to minimal. With the property set to more, full-table selects, column selects, and limit queries all skip MapReduce.

~~~
<property>
  <name>hive.fetch.task.conversion</name>
  <value>more</value>
  <description>
    Expects one of [none, minimal, more].
    Some select queries can be converted to single FETCH task minimizing latency.
    Currently the query should be single sourced not having any subquery and should not have
    any aggregations or distincts (which incurs RS), lateral views and joins.
    0. none : disable hive.fetch.task.conversion
    1. minimal : SELECT STAR, FILTER on partition columns, LIMIT only
    2. more : SELECT, FILTER, LIMIT only (support TABLESAMPLE and virtual columns)
  </description>
</property>
~~~

# Examples

1. Set hive.fetch.task.conversion to none and run the queries below. Every one of them launches a MapReduce job.

~~~
hive> set hive.fetch.task.conversion=none;
~~~

Then run:

~~~
hive> select * from emp;
hive> select ename from emp;
hive> select ename from emp limit 3;
~~~

2. Set hive.fetch.task.conversion to more and run the same queries. None of them launches a MapReduce job.

~~~
hive> set hive.fetch.task.conversion=more;
~~~

Then run:

~~~
hive> select * from emp;
hive> select ename from emp;
hive> select ename from emp limit 3;
~~~
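
The three modes in the property description can be read as a small decision rule. The Java sketch below is only a simplified model of that description, not Hive's actual planner code. The class, enums, and the canUseFetch method are hypothetical names introduced for illustration, and TABLESAMPLE and virtual columns are left out for brevity.

```java
import java.util.EnumSet;
import java.util.Set;

// Simplified model of the hive.fetch.task.conversion description; NOT Hive's planner code.
// All class, enum, and method names here are hypothetical.
class FetchConversionSketch {

    // The three values accepted by hive.fetch.task.conversion.
    enum Mode { NONE, MINIMAL, MORE }

    // Coarse query features used only for this illustration.
    enum Feature {
        SELECT_STAR, SELECT_COLUMNS,
        FILTER_ON_PARTITION_COLUMNS, FILTER_ON_OTHER_COLUMNS,
        LIMIT, AGGREGATION, DISTINCT, JOIN, SUBQUERY, LATERAL_VIEW
    }

    // minimal: SELECT STAR, FILTER on partition columns, LIMIT only.
    static final Set<Feature> MINIMAL_ALLOWED = EnumSet.of(
            Feature.SELECT_STAR, Feature.FILTER_ON_PARTITION_COLUMNS, Feature.LIMIT);

    // more: SELECT, FILTER, LIMIT only.
    static final Set<Feature> MORE_ALLOWED = EnumSet.of(
            Feature.SELECT_STAR, Feature.SELECT_COLUMNS,
            Feature.FILTER_ON_PARTITION_COLUMNS, Feature.FILTER_ON_OTHER_COLUMNS,
            Feature.LIMIT);

    // Returns true when every feature of the query is allowed in the given mode.
    // Aggregations, distincts, joins, subqueries, and lateral views are in neither
    // allowed set, so they always fall back to a regular job in this model.
    static boolean canUseFetch(Mode mode, Set<Feature> query) {
        switch (mode) {
            case NONE:    return false;
            case MINIMAL: return MINIMAL_ALLOWED.containsAll(query);
            case MORE:    return MORE_ALLOWED.containsAll(query);
            default:      throw new IllegalArgumentException("unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        // Feature sets standing in for the queries used in the examples.
        Set<Feature> selectStar = EnumSet.of(Feature.SELECT_STAR);
        Set<Feature> selectColumnLimit = EnumSet.of(Feature.SELECT_COLUMNS, Feature.LIMIT);
        Set<Feature> aggregate = EnumSet.of(Feature.AGGREGATION);

        for (Mode mode : Mode.values()) {
            System.out.println(mode
                    + " | select *: " + canUseFetch(mode, selectStar)
                    + " | select ename ... limit 3: " + canUseFetch(mode, selectColumnLimit)
                    + " | aggregate: " + canUseFetch(mode, aggregate));
        }
    }
}
```

In this model, a plain column select qualifies only under more, which matches the note that older versions defaulting to minimal still ran MapReduce for such queries.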