Home > Database > Mysql Tutorial > MRUnit使用技巧

MRUnit使用技巧

WBOY
Release: 2016-06-07 16:33:00
Original
1059 people have browsed it

导读 为了能测试编写的hadoop组件和MapReduce程序,一般有下面三种思路: 一、使用hadoop-eclipse插件来调试MapReduce程序,不过这在hadoop比较新的版本里已经不再提供了; 二、是配置jvm参数远程调试hadoop组件。这种方式用于读hadoop源代码比较适合,而如

导读

为了能测试编写的hadoop组件和MapReduce程序,一般有下面三种思路:

一、使用hadoop-eclipse插件来调试MapReduce程序,不过这在hadoop比较新的版本里已经不再提供了;

二、是配置jvm参数远程调试hadoop组件。这种方式用于读hadoop源代码比较适合,而如果用于远程调试MapReduce还是有点麻烦的;

详细参考的文档有:

http://blog.javachen.com/hadoop/2013/08/01/remote-debug-hadoop/

http://zhangjie.me/eclipse-debug-hadoop/

三、最后我选择了MRuinit来用于主要开发调试MapReduce应用程序。

MRunit简介

MRunit是用于做MapReduce单元测试的java库。使用apache发布,下载地址是:http://mrunit.apache.org/general/downloads.html

MRUnit测试框架是基于JUnit的。我们可以方便的测试Map ?Reduce程序。它适用于?0.20 , 0.23.x , 1.0.x , 2.x 等 Hadoop版本。

下面我们来做些MRunit的使用官方例子(SMS CDR (call details record) analysis):

使用记录如下

CDRID;CDRType;Phone1;Phone2;SMS Status Code
655209;1;796764372490213;804422938115889;6
353415;0;356857119806206;287572231184798;4
835699;1;252280313968413;889717902341635;0
Copy after login

需要做的事情是查找所有CDRType 为1的记录和它相关的状态码(SMS Status Code)
Map输出应该是:
6, 1
0, 1

代码如下:

public class SMSCDRMapper extends Mapper {
  private Text status = new Text();
  private final static IntWritable addOne = new IntWritable(1);
  /**
   * Returns the SMS status code and its count
   */
  protected void map(LongWritable key, Text value, Context context)
      throws java.io.IOException, InterruptedException {
    //655209;1;796764372490213;804422938115889;6 is the Sample record format
    String[] line = value.toString().split(";");
    // If record is of SMS CDR
    if (Integer.parseInt(line[1]) == 1) {
      status.set(line[4]);
      context.write(status, addOne);
    }
  }
}
Copy after login

Reduce 程序把最后的结果相加,程序如下:

public class SMSCDRReducer extends
  Reducer {
  protected void reduce(Text key, Iterable values, Context context) throws java.io.IOException, InterruptedException {
    int sum = 0;
    for (IntWritable value : values) {
      sum += value.get();
    }
    context.write(key, new IntWritable(sum));
  }
}
Copy after login

MRunit的测试程序如下:

import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.apache.hadoop.mrunit.mapreduce.MapReduceDriver;
import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
import org.junit.Before;
import org.junit.Test;
public class SMSCDRMapperReducerTest {
  MapDriver mapDriver;
  ReduceDriver reduceDriver;
  MapReduceDriver mapReduceDriver;
  @Before
  public void setUp() {
    SMSCDRMapper mapper = new SMSCDRMapper();
    SMSCDRReducer reducer = new SMSCDRReducer();
    mapDriver = MapDriver.newMapDriver(mapper);;
    reduceDriver = ReduceDriver.newReduceDriver(reducer);
    mapReduceDriver = MapReduceDriver.newMapReduceDriver(mapper, reducer);
  }
  @Test
  public void testMapper() {
    mapDriver.withInput(new LongWritable(), new Text(
        "655209;1;796764372490213;804422938115889;6"));
    mapDriver.withOutput(new Text("6"), new IntWritable(1));
    mapDriver.runTest();
  }
  @Test
  public void testReducer() {
    List values = new ArrayList();
    values.add(new IntWritable(1));
    values.add(new IntWritable(1));
    reduceDriver.withInput(new Text("6"), values);
    reduceDriver.withOutput(new Text("6"), new IntWritable(2));
    reduceDriver.runTest();
  }
}
Copy after login

使用过JUnit的就应该知道怎么运行上面的代码了,这里就不重复了。

MRUint可以测试单个Map,单个Reduce和一个MapReduce或者多个MapReduce程序。
详细的可以参考官网文档:MRUnit Tutorial

参考:http://www.cnblogs.com/gpcuster/archive/2009/10/04/1577921.html

Related labels:
source:php.cn
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template