**1. Characteristics of static partitions**

* When loading data into a partitioned table, the partition column and partition value must be specified explicitly.
* Both `load` and `insert` can be used to add data (see the `insert` sketch after section 3).
* Best suited for data with few partitions, whose names can be stated explicitly.

**2. Creating a static partition**

```sql
#### 1. Create the partitioned table ####
create table if not exists dept_part(
    dept_id int,
    dept_name string,
    dept_salary double
)
-- partitioned by must come before row format
-- the partition column month must not be one of the table's existing columns
partitioned by (month string)
row format delimited fields terminated by '\t';

#### 2. Load data into the partitioned table ####
load data local inpath "/hdatas/dept.txt" into table dept_part partition(month='201908');
load data local inpath "/hdatas/dept.txt" into table dept_part partition(month='201909');
load data local inpath "/hdatas/dept.txt" into table dept_part partition(month='201910');
```

The partitioned table is stored under the following HDFS paths; the database I am using here is hivebook.

```
/hivebook.db/dept_part/month=201908/dept.txt
/hivebook.db/dept_part/month=201909/dept.txt
/hivebook.db/dept_part/month=201910/dept.txt
```

![](https://img.kancloud.cn/57/df/57dfb8dfd58197ba73bf1cf060efcf62_1461x267.png)

<br/>

**3. Common operations on partitioned tables**

```sql
#### Query all partitions ####
0: jdbc:hive2://hadoop101:10000> select * from dept_part;
+--------------------+----------------------+------------------------+------------------+--+
| dept_part.dept_id  | dept_part.dept_name  | dept_part.dept_salary  | dept_part.month  |
+--------------------+----------------------+------------------------+------------------+--+
| 10                 | ACCOUNTING           | 1700.0                 | 201908           |
| 20                 | RESEARCH             | 1800.0                 | 201908           |
| 30                 | SALES                | 1900.0                 | 201908           |
| 10                 | ACCOUNTING           | 1700.0                 | 201909           |
| 20                 | RESEARCH             | 1800.0                 | 201909           |
| 30                 | SALES                | 1900.0                 | 201909           |
| 10                 | ACCOUNTING           | 1700.0                 | 201910           |
| 20                 | RESEARCH             | 1800.0                 | 201910           |
| 30                 | SALES                | 1900.0                 | 201910           |
+--------------------+----------------------+------------------------+------------------+--+

#### Query a single partition ####
0: jdbc:hive2://hadoop101:10000> select * from dept_part where month="201908";
+--------------------+----------------------+------------------------+------------------+--+
| dept_part.dept_id  | dept_part.dept_name  | dept_part.dept_salary  | dept_part.month  |
+--------------------+----------------------+------------------------+------------------+--+
| 10                 | ACCOUNTING           | 1700.0                 | 201908           |
| 20                 | RESEARCH             | 1800.0                 | 201908           |
| 30                 | SALES                | 1900.0                 | 201908           |
+--------------------+----------------------+------------------------+------------------+--+

#### List the partitions of a table ####
0: jdbc:hive2://hadoop101:10000> show partitions dept_2;
+---------------+--+
|   partition   |
+---------------+--+
| month=201910  |
| month=201911  |
| month=201912  |
+---------------+--+

#### Drop a single partition ####
alter table dept_2 drop partition(month="201910");
#### Drop multiple partitions ####
alter table dept_2 drop partition(month="201910"), partition(month="201912");

#### Add a single partition ####
alter table dept_2 add partition(month="202010");
#### Add multiple partitions ####
alter table dept_2 add partition(month="202012"), partition(month="202013");
```

<br/>
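As noted in section 1, static partitions also accept `insert`, which the examples above do not show. A minimal sketch of that route, assuming a hypothetical non-partitioned staging table `dept_staging` (not from the original examples) with the same three columns:

```sql
-- Hypothetical staging table holding raw, un-partitioned rows
create table if not exists dept_staging(
    dept_id int,
    dept_name string,
    dept_salary double
)
row format delimited fields terminated by '\t';

-- Static-partition insert: the partition value is fixed in the statement,
-- so the select list supplies only the three data columns
insert into table dept_part partition(month='201911')
select dept_id, dept_name, dept_salary from dept_staging;

-- insert overwrite replaces the contents of that single partition only;
-- all other partitions of dept_part are untouched
insert overwrite table dept_part partition(month='201911')
select dept_id, dept_name, dept_salary from dept_staging;
```

Either statement creates the `month=201911` directory under dept_part if it does not already exist.

<br/>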
**4. Creating a multi-level partitioned table**

```sql
#### Create a two-level partitioned table ####
create table if not exists dept_3(
    dept_id int,
    dept_name string,
    dept_salary double
)
-- two partition levels
partitioned by (month string, day string)
row format delimited fields terminated by '\t';

#### Load data into the two-level partitions ####
load data local inpath "/hdatas/dept.txt" into table dept_3 partition(month='201910', day='23');
load data local inpath "/hdatas/dept.txt" into table dept_3 partition(month='201910', day='24');
load data local inpath "/hdatas/dept.txt" into table dept_3 partition(month='201910', day='25');

0: jdbc:hive2://hadoop101:10000> select * from dept_3;
+-----------------+-------------------+---------------------+---------------+-------------+--+
| dept_3.dept_id  | dept_3.dept_name  | dept_3.dept_salary  | dept_3.month  | dept_3.day  |
+-----------------+-------------------+---------------------+---------------+-------------+--+
| 10              | ACCOUNTING        | 1700.0              | 201910        | 23          |
| 20              | RESEARCH          | 1800.0              | 201910        | 23          |
| 30              | SALES             | 1900.0              | 201910        | 23          |
| 10              | ACCOUNTING        | 1700.0              | 201910        | 24          |
| 20              | RESEARCH          | 1800.0              | 201910        | 24          |
| 30              | SALES             | 1900.0              | 201910        | 24          |
| 10              | ACCOUNTING        | 1700.0              | 201910        | 25          |
| 20              | RESEARCH          | 1800.0              | 201910        | 25          |
| 30              | SALES             | 1900.0              | 201910        | 25          |
+-----------------+-------------------+---------------------+---------------+-------------+--+
```

The HDFS paths now have one directory level per partition column; the database I am using is hivebook.

```
/hivebook.db/dept_3/month=201910/day=23/dept.txt
/hivebook.db/dept_3/month=201910/day=24/dept.txt
/hivebook.db/dept_3/month=201910/day=25/dept.txt
```

![](https://img.kancloud.cn/a9/91/a9917f8b63c9ac0b7174924f544eb53d_1497x458.png)

```sql
#### Query a single partition ####
0: jdbc:hive2://hadoop101:10000> select * from dept_3 where month="201910" and day="23";
+-----------------+-------------------+---------------------+---------------+-------------+--+
| dept_3.dept_id  | dept_3.dept_name  | dept_3.dept_salary  | dept_3.month  | dept_3.day  |
+-----------------+-------------------+---------------------+---------------+-------------+--+
| 10              | ACCOUNTING        | 1700.0              | 201910        | 23          |
| 20              | RESEARCH          | 1800.0              | 201910        | 23          |
| 30              | SALES             | 1900.0              | 201910        | 23          |
+-----------------+-------------------+---------------------+---------------+-------------+--+
```
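<br/>

With a two-level table you can also filter on the first level alone, and Hive prunes down to every `day` directory under that `month`. A small sketch against the same dept_3 table (`dept_staging` is the hypothetical staging table from the earlier sketch):

```sql
-- Filter on the first partition level only: all day=* directories under
-- month=201910 are read; other months are pruned entirely
select * from dept_3 where month="201910";

-- Static insert into a two-level partition: both levels need fixed values
insert into table dept_3 partition(month='201910', day='26')
select dept_id, dept_name, dept_salary from dept_staging;

-- show partitions lists each leaf partition as month=.../day=...
show partitions dept_3;
```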