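Bucketing is not enforced by default and has to be turned on. (This applies to Hive 1.x; from Hive 2.0 onward the `hive.enforce.bucketing` property was removed and bucketing is always enforced.) Check the current value first: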
```bash
set hive.enforce.bucketing;
```
```
+-------------------------------+--+
|              set              |
+-------------------------------+--+
| hive.enforce.bucketing=false  |
+-------------------------------+--+
```
Enable it:
```bash
set hive.enforce.bucketing = true;
```
```
+------------------------------+--+
|             set              |
+------------------------------+--+
| hive.enforce.bucketing=true  |
+------------------------------+--+
```
The insert is ultimately executed as a MapReduce job, so set the number of reducers to 4, one per bucket:
```bash
set mapreduce.job.reduces = 4;
```
```
+--------------------------+--+
|           set            |
+--------------------------+--+
| mapreduce.job.reduces=4  |
+--------------------------+--+
```
Create the bucketed table:
```mysql
create table t1(id int, name string, age int, dept string)
clustered by(id)
into 4 buckets
row format delimited
fields terminated by ',';
```
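The load step reads from a plain (non-bucketed) source table `t2`, which this section does not define. A minimal sketch of such a staging table, assuming it has the same columns as `t1` and is filled from a comma-delimited local file (the path `/tmp/t2.txt` is a placeholder, not something from the original notes):

```mysql
-- Hypothetical staging table: same columns as t1, but no buckets
create table t2(id int, name string, age int, dept string)
row format delimited
fields terminated by ',';

-- Placeholder path; point it at your own comma-delimited file
load data local inpath '/tmp/t2.txt' into table t2;
```

Loading through a staging table matters because `load data` only moves files into place. It does not hash rows into buckets, so the bucketed table has to be filled by a query.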
Load the data with insert + select:
```mysql
insert overwrite table t1 select * from t2 cluster by(id);
```
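One way to check that the rows really were split into buckets is to sample a single bucket and to list the table's files. This is a sketch under assumptions: the warehouse path below is Hive's common default for the `default` database and may differ in your installation.

```mysql
-- Read only the first of the 4 buckets, bucketed on id
select * from t1 tablesample(bucket 1 out of 4 on id);

-- Assumed default warehouse location; adjust to your hive.metastore.warehouse.dir
dfs -ls /user/hive/warehouse/t1;
```

If bucketing was applied, the directory listing should show four files, one per bucket.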