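[TOC]

# ChainMapper

ChainMapper lets you chain several Mapper classes so that they run inside a single map task. The output of each mapper becomes the input of the next one.

~~~
// addMapper is a static method of org.apache.hadoop.mapreduce.lib.chain.ChainMapper,
// so no ChainMapper instance needs to be created.
// readWordsMapper: (LongWritable, Text) -> (Text, NullWritable)
ChainMapper.addMapper(job, readWordsMapper.class, LongWritable.class, Text.class, Text.class, NullWritable.class, conf);
// addPreMapper consumes the output of readWordsMapper: (Text, NullWritable) -> (Text, NullWritable)
ChainMapper.addMapper(job, addPreMapper.class, Text.class, NullWritable.class, Text.class, NullWritable.class, conf);
~~~

# JobControl

A moderately complex processing pipeline usually needs several MapReduce programs run one after another. Multiple jobs can be chained in either of two ways:

1. Use a shell script that checks each job's return status to decide whether the next step should run.
2. Use the MapReduce framework's JobControl and declare the dependencies between the jobs.

~~~
// ControlledJob and JobControl come from org.apache.hadoop.mapreduce.lib.jobcontrol
ControlledJob cJob1 = new ControlledJob(job1.getConfiguration());
ControlledJob cJob2 = new ControlledJob(job2.getConfiguration());
ControlledJob cJob3 = new ControlledJob(job3.getConfiguration());

cJob1.setJob(job1);
cJob2.setJob(job2);
cJob3.setJob(job3);

// Declare the dependencies: job2 depends on job1, job3 depends on job2
cJob2.addDependingJob(cJob1);
cJob3.addDependingJob(cJob2);

// Create the JobControl; the argument is the group name
JobControl jobControl = new JobControl("RecommendationJob");
jobControl.addJob(cJob1);
jobControl.addJob(cJob2);
jobControl.addJob(cJob3);

// Run the jobs added to the JobControl in a new thread
Thread jobControlThread = new Thread(jobControl);
jobControlThread.start();

// Poll until every job has finished
while (!jobControl.allFinished()) {
    Thread.sleep(500);
}
jobControl.stop();
return 0;
~~~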
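The dependency rule declared above (a job may start only after every job it depends on has completed) can be illustrated without a Hadoop cluster. The following is a minimal plain-Java sketch of that rule only. It is not Hadoop's JobControl implementation, and the class name `DependencyOrderSketch` and method `runOrder` are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java model of dependency-ordered execution; this is NOT Hadoop's JobControl.
final class DependencyOrderSketch {

    // deps maps each job name to the names of the jobs it depends on.
    // Returns the order in which the jobs become runnable.
    static List<String> runOrder(Map<String, List<String>> deps) {
        Set<String> done = new LinkedHashSet<>();
        while (done.size() < deps.size()) {
            boolean progressed = false;
            for (Map.Entry<String, List<String>> e : deps.entrySet()) {
                // A job is ready once it has not run yet and all of its dependencies are done.
                if (!done.contains(e.getKey()) && done.containsAll(e.getValue())) {
                    done.add(e.getKey());
                    progressed = true;
                }
            }
            if (!progressed) {
                // No job became ready in a full pass: a cycle or an unknown dependency.
                throw new IllegalStateException("Cyclic or missing dependency among: " + deps.keySet());
            }
        }
        return new ArrayList<>(done);
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("job3", List.of("job2")); // registered out of order on purpose
        deps.put("job2", List.of("job1"));
        deps.put("job1", List.of());
        System.out.println(runOrder(deps)); // prints [job1, job2, job3]
    }
}
```

Registration order does not matter in this sketch: `job3` is added first but still runs last, because readiness is decided by the declared dependencies alone. The sketch runs jobs strictly one at a time and does not model job failure, both of which a real scheduler has to handle.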