Importing a Subset of Table Data · Big Data

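[TOC]

# Importing a Subset of Table Data

With the Sqoop import tool, the `--where` clause lets us import only a subset of a table. Sqoop runs the corresponding SQL query on the database server and stores the result in the target directory on HDFS.

The syntax of the where clause is as follows.

~~~
--where <condition>
~~~

The following command imports a subset of the emp_add table. The subset query retrieves the ID and address of employees who live in Secunderabad (stored as `sec-bad`).

~~~
sqoop import \
--connect jdbc:mysql://localhost:3306/userdb \
--username root \
--password root123 \
--where "city = 'sec-bad'" \
--target-dir /wherequery \
--table emp_add -m 1
~~~

To import according to custom requirements, use a free-form query instead:

~~~
sqoop import \
--connect jdbc:mysql://localhost:3306/userdb \
--username root \
--password root123 \
--target-dir /wherequery2 \
--query 'select id,name,deg from emp WHERE id>1207 and $CONDITIONS' \
--split-by id \
--fields-terminated-by '\t' \
-m 1
~~~

When `--query` is used, the query must contain `$CONDITIONS`; Sqoop replaces this token with the split condition for each mapper. The query is wrapped in single quotes here so that the shell does not expand `$CONDITIONS` itself. With a single mapper (`-m 1`), `--split-by` has no practical effect, but it is needed once more mappers are used.

`--fields-terminated-by` sets the delimiter that separates fields when the data is stored.

The following command verifies the data imported from the emp_add table into the `/wherequery` directory.

~~~
$HADOOP_HOME/bin/hadoop fs -cat /wherequery/part-m-*
~~~

It shows the emp_add table data with fields separated by commas (,).

~~~
1202, 108I, aoc, sec-bad
1204, 78B, oldcity, sec-bad
1205, 720C, hitech, sec-bad
~~~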
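Beyond reading the output by eye, the `--where` filter can be checked mechanically. The following is a minimal sketch, not part of Sqoop itself: it assumes the comma-separated layout shown for emp_add, with the city in the fourth field, and it uses the three sample rows from this page in place of a live cluster. The variable names `rows`, `total`, and `mismatches` are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Sample rows copied from the emp_add output on this page (an assumption
# standing in for real data). On a real cluster, one could instead use:
#   rows=$("$HADOOP_HOME"/bin/hadoop fs -cat /wherequery/part-m-*)
rows='1202, 108I, aoc, sec-bad
1204, 78B, oldcity, sec-bad
1205, 720C, hitech, sec-bad'

# Count the non-empty rows.
total=$(printf '%s\n' "$rows" | awk 'NF { n++ } END { print n + 0 }')

# Count rows whose 4th comma-separated field (city) is not "sec-bad".
mismatches=$(printf '%s\n' "$rows" | awk -F',' '
  NF {
    city = $4
    gsub(/^[ \t]+|[ \t]+$/, "", city)   # trim surrounding whitespace
    if (city != "sec-bad") bad++
  }
  END { print bad + 0 }')

echo "rows checked: $total, rows violating the filter: $mismatches"
```

If the import honored the filter, the second count should be zero. For the `--query` import, which was written with `--fields-terminated-by '\t'`, the same idea would apply with a tab as the awk field separator.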