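[TOC]

Hive's `TRANSFORM` keyword lets you call a script of your own from within SQL. It is a good fit when you need functionality that Hive does not provide but do not want to write a UDF. Hive passes each input row to the script on standard input as tab-separated fields and reads the result rows back from the script's standard output.

Example 1: the SQL below uses `weekday_mapper.py` to process the data.

First, create the target table:

~~~
CREATE TABLE u_data_new (
  movieid INT,
  rating INT,
  weekday INT,
  userid INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t';
~~~

Register the script so that it is shipped to the cluster along with the job:

~~~
ADD FILE weekday_mapper.py;
~~~

Then run the source rows through the script and write the result into the new table:

~~~
INSERT OVERWRITE TABLE u_data_new
SELECT
  TRANSFORM (movieid, rate, timestring, uid)
  USING 'python weekday_mapper.py'
  AS (movieid, rating, weekday, userid)
FROM t_rating;
~~~

The contents of `weekday_mapper.py` are as follows:

~~~
#!/usr/bin/env python
import sys
import datetime

# Hive sends one row per line on stdin, with fields separated by tabs.
for line in sys.stdin:
    line = line.strip()
    movieid, rating, unixtime, userid = line.split('\t')
    # Convert the Unix timestamp to an ISO weekday (1 = Monday ... 7 = Sunday).
    weekday = datetime.datetime.fromtimestamp(float(unixtime)).isoweekday()
    # Emit the row in the column order declared in the AS clause.
    print('\t'.join([movieid, rating, str(weekday), userid]))
~~~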
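
Before running the query on a cluster, it can help to check the mapper logic locally. The following is a minimal sketch rather than part of the original script: the helper names `transform_line` and `run_mapper` and the sample rows are hypothetical, chosen only for illustration. It assumes the same tab-separated input layout (`movieid`, `rate`, `timestring`, `uid`) that the `TRANSFORM` clause sends to the script.

```python
import datetime
import io


def transform_line(line):
    """Convert one tab-separated input row into the mapper's output row."""
    movieid, rating, unixtime, userid = line.strip().split("\t")
    # isoweekday() returns 1 (Monday) through 7 (Sunday); fromtimestamp()
    # interprets the timestamp in the machine's local time zone.
    weekday = datetime.datetime.fromtimestamp(float(unixtime)).isoweekday()
    return "\t".join([movieid, rating, str(weekday), userid])


def run_mapper(stream):
    """Apply transform_line to every non-empty row of a file-like object."""
    return [transform_line(line) for line in stream if line.strip()]


if __name__ == "__main__":
    # Hypothetical sample rows standing in for what Hive would send on stdin.
    sample = io.StringIO("242\t3\t881250949\t196\n302\t3\t891717742\t186\n")
    for row in run_mapper(sample):
        print(row)
```

Because `fromtimestamp` uses the local time zone, the weekday computed for a timestamp close to midnight can differ between machines, so the cluster nodes' time zone settings are worth checking if results look off by one day.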