Data Definition · Hadoop

~~~
-- Show the current database name in the CLI prompt
set hive.cli.print.current.db=true;
~~~

Create a table

~~~
CREATE TABLE sales(
  name STRING,
  amount INT,
  region STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
~~~

Insert statements

~~~
INSERT INTO TABLE sales VALUES("ljs",100,"beijing");
INSERT INTO TABLE sales VALUES("zhangs",10,"shanghai");
INSERT INTO TABLE sales VALUES("zhoug",8,"liaoning");
~~~

After these statements run, the data is stored in HDFS under /hive/warehouse.

Create a table with collection types

~~~
CREATE TABLE employees(
  name STRING,
  salary FLOAT,
  subordinates ARRAY<STRING>,
  deductions MAP<STRING,FLOAT>,
  address STRUCT<street:STRING,city:STRING,state:STRING,zip:INT>)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\001'            -- separates columns
  COLLECTION ITEMS TERMINATED BY '\002'  -- separates array elements, map entries, struct fields
  MAP KEYS TERMINATED BY '\003'          -- separates a map key from its value
  LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
~~~

Bucketed tables: an introduction

~~~
CREATE TABLE bucketed_users(
  UserID INT,
  Gender STRING,
  Age INT,
  Occupation STRING,
  Zipcode STRING)
CLUSTERED BY (UserID) INTO 4 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
~~~

To populate a bucketed table, set the hive.enforce.bucketing property to true. Hive then creates the number of buckets declared in the table definition.

~~~
hive> set hive.enforce.bucketing = true;
~~~

Insert data

~~~
INSERT OVERWRITE TABLE bucketed_users
SELECT UserID, Gender, Age, Occupation, Zipcode
FROM users;
~~~

Each bucket corresponds to one file on disk.
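
One way to check the bucket layout, and to read a single bucket, is sketched below. It is only an illustration: it assumes the table directory sits under the /hive/warehouse location mentioned earlier (many installations use a different warehouse path) and that bucketed_users has already been populated from users.

~~~
-- List the files under the table directory from the Hive CLI.
-- With 4 buckets, 4 data files would be expected here.
-- The path is an assumption based on the warehouse location noted above.
dfs -ls /hive/warehouse/bucketed_users;

-- Read only the first of the 4 buckets by sampling on the bucketing column.
SELECT * FROM bucketed_users TABLESAMPLE(BUCKET 1 OUT OF 4 ON UserID);
~~~

Because the sample is taken on the same column the table is clustered by, Hive can read just the matching bucket file instead of scanning the whole table.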