动态分区 · Hadoop2.x

**1. 动态分区特点** * 往分区表装载数据时，只需要指定分区字段即可； * 只支持`insert` 一种插入数据方式； **2. 创建动态分区表的前提** 开启动态分区：低于hive2.3.4版本的默认关闭，hive2.3.4版本之后默认开启。严格模式：严格模式必须至少指定一个静态分区列，这是为了避免意外覆盖分区，非严格模式则不必指定静态分区。（1）在当前会话开启动态分区和设置非严格模式 ```sql # true为开启动态分区，false为关闭 0: jdbc:hive2://hadoop101:10000> SET hive.exec.dynamic.partition=true; # nonstrict为非严格模式，strict为严格模式 0: jdbc:hive2://hadoop101:10000> SET hive.exec.dynamic.partition.mode=nonstrict; ``` 或者在 {hive_home}/conf/hive-site.xml中配置 ```xml <property> <name>hive.exec.dynamic.partition</name> <value>true</value> <description>Whether or not to allow dynamic partitions in DML/DDL.</description> </property> <property> <name>hive.exec.dynamic.partition.mode</name> <value>nonstrict</value> <description> In strict mode, the user must specify at least one static partition in case the user accidentally overwrites all partitions. In nonstrict mode all partitions are allowed to be dynamic. </description> </property> ``` <br/> **3. 示例：将people内部表的数据复制到 dynamic_people 动态分区表中** (1) 准备数据`people.txt` ```xml 001,tom,23,2019-03-16 002,jack,12,2019-03-13 003,robin,14,2018-08-13 004,justin,34,2018-10-12 005,jarry,24,2017-11-11 006,jasper,24,2017-12-12 ``` (2) 创建people内部表和dynamic_people分区表 ```sql create table if not exists people( people_id int, people_name string, people_age int, people_start_date date ) row format delimited fields terminated by "," ; create table if not exists dynamic_people( dypeople_id int, dypeople_name string, dypeople_age int, dypeople_start_date date ) -- year、month字段名必须是建表字段中没有的字段 partitioned by(year string, month string) row format delimited fields terminated by "," ; ``` (3) 向people内部表插入数据，然后再向dynamic_people分区表动态插入数据 ```sql load data local inpath "/hdatas/people.txt" into table people; 0: jdbc:hive2://hadoop101:10000> select * from people; +-------------------+---------------------+--------------------+---------------------------+--+ | people.people_id | people.people_name | people.people_age | people.people_start_date | +-------------------+---------------------+--------------------+---------------------------+--+ | 1 | tom | 23 | 2019-03-16 | | 2 | jack | 12 | 2019-03-13 | | 3 | robin | 14 | 2018-08-13 | | 4 | justin | 34 | 2018-10-12 | | 5 | jarry | 24 | 2017-11-11 | | 6 | jasper | 24 | 2017-12-12 | +-------------------+---------------------+--------------------+---------------------------+--+ #### 往分区表中插入数据 #### insert into table dynamic_people partition(year, month) select people_id as dypeople_id, people_name as dypeople_name, people_age as dypople_age, people_start_date as dypeople_start_date, year(people_start_date) as year, month(people_start_date) as month from people; ``` 在hdfs中存储路径如下，使用的数据库为hivebook： ```xml /hivebook.db/dynamic_people/year=2017/month=11/000000_0 /hivebook.db/dynamic_people/year=2017/month=12/000000_0 /hivebook.db/dynamic_people/year=2018/month=8/000000_0 /hivebook.db/dynamic_people/year=2018/month=10/000000_0 /hivebook.db/dynamic_people/year=2019/month=3/000000_0 ``` ```sql 0: jdbc:hive2://hadoop101:10000> select * from dynamic_people where year="2019" and month="3"; +-----------------------------+-------------------------------+------------------------------+-------------------------------------+----------------------+-----------------------+--+ | dynamic_people.dypeople_id | dynamic_people.dypeople_name | dynamic_people.dypeople_age | dynamic_people.dypeople_start_date | dynamic_people.year | dynamic_people.month | +-----------------------------+-------------------------------+------------------------------+-------------------------------------+----------------------+-----------------------+--+ | 1 | tom | 23 | 2019-03-16 | 2019 | 3 | | 2 | jack | 12 | 2019-03-13 | 2019 | 3 | +-----------------------------+-------------------------------+------------------------------+-------------------------------------+--- ```