MRUnit
MRUnit is a testing library for unit-testing map-reduce jobs.
For an ordinary Java program we have JUnit, which feeds input to a piece of
code and checks whether that code produces the expected output. Likewise,
with MRUnit we supply the inputs and expected outputs for the mapper and
reducer classes and verify that they emit what we expect.
Advantages
1. We can test our mapper and reducer from within the IDE (say, Eclipse)
instead of packaging them as a jar and running it with the hadoop jar command.
2. It saves a lot of time and resources, because mistakes are caught before a
faulty map-reduce job is submitted to the Hadoop cluster.
Jars needed
Download the latest versions of
1. The MRUnit jar from the Apache website.
2. The Mockito and JUnit jars from the link given below.
Implementation
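Sample input data
CDRID   CDRType  Phone            StatusCode
655209  1        796764372490213  6
353415  0        356857119806206  4
835699  1        252280313968413  0
Note: the records fed to the mapper are semicolon-delimited and carry one more
phone-number field than the table above, for example
655209;1;796764372490213;804422938115889;6, so the status code is the fifth field.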
Requirement
Fetch the records whose CDR type is 1 and count how often each status code
occurs among them.
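To make the record layout concrete before bringing in Hadoop, here is a small plain-Java sketch that parses one line and answers the two questions the requirement asks: is it a type 1 (SMS) record, and what is its status code? The class name CdrRecord and its field names are hypothetical, and the sketch assumes the five-field, semicolon-delimited layout of the sample record used by the mapper.

```java
// Hypothetical value class for one CDR line; not part of Hadoop or MRUnit.
final class CdrRecord {
    final String cdrId;
    final int cdrType;
    final String statusCode;

    private CdrRecord(String cdrId, int cdrType, String statusCode) {
        this.cdrId = cdrId;
        this.cdrType = cdrType;
        this.statusCode = statusCode;
    }

    // Parses a semicolon-delimited line. Assumption: five fields, with the
    // CDR type at index 1 and the status code at index 4.
    static CdrRecord parse(String line) {
        String[] fields = line.split(";");
        if (fields.length < 5) {
            throw new IllegalArgumentException("expected 5 fields: " + line);
        }
        return new CdrRecord(fields[0], Integer.parseInt(fields[1]), fields[4]);
    }

    // CDR type 1 marks an SMS record.
    boolean isSms() {
        return cdrType == 1;
    }

    public static void main(String[] args) {
        CdrRecord record = CdrRecord.parse("655209;1;796764372490213;804422938115889;6");
        System.out.println(record.isSms() + " " + record.statusCode); // prints: true 6
    }
}
```

Only records for which isSms() returns true should contribute to the status code counts.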
Mapper Class Implementation
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MyMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final Text status = new Text();
    private static final IntWritable addOne = new IntWritable(1);

    /**
     * Emits (status code, 1) for every SMS CDR record.
     */
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Sample record format: 655209;1;796764372490213;804422938115889;6
        String[] line = value.toString().split(";");
        // CDR type 1 marks an SMS CDR
        if (Integer.parseInt(line[1]) == 1) {
            status.set(line[4]);
            context.write(status, addOne);
        }
    }
}
Reducer Class Implementation
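import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    /**
     * Sums the counts emitted by the mapper for one status code.
     */
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        context.write(key, new IntWritable(sum));
    }
}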
MRUnit class implementation
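import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.apache.hadoop.mrunit.mapreduce.MapReduceDriver;
import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
import org.junit.Before;
import org.junit.Test;

public class MyMapReduceTest {
    MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;
    ReduceDriver<Text, IntWritable, Text, IntWritable> reduceDriver;
    MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable> mapReduceDriver;

    @Before
    public void setUp() {
        MyMapper mapper = new MyMapper();
        MyReducer reducer = new MyReducer();
        mapDriver = MapDriver.newMapDriver(mapper);
        reduceDriver = ReduceDriver.newReduceDriver(reducer);
        mapReduceDriver = MapReduceDriver.newMapReduceDriver(mapper, reducer);
    }

    @Test
    public void testMapper() throws IOException {
        // One SMS record in, one (status code, 1) pair expected out
        mapDriver.withInput(new LongWritable(), new Text(
                "655209;1;796764372490213;804422938115889;6"));
        mapDriver.withOutput(new Text("6"), new IntWritable(1));
        mapDriver.runTest();
    }

    @Test
    public void testReducer() throws IOException {
        // Two counts of 1 for status code 6 should be summed to 2
        List<IntWritable> values = new ArrayList<IntWritable>();
        values.add(new IntWritable(1));
        values.add(new IntWritable(1));
        reduceDriver.withInput(new Text("6"), values);
        reduceDriver.withOutput(new Text("6"), new IntWritable(2));
        reduceDriver.runTest();
    }
}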
MapDriver Class
Functionality: This class lets you test a Mapper instance. You provide the
input key and value that should be sent to the Mapper, and the outputs you
expect the Mapper to send to the collector for those inputs. When you call
runTest(), the harness delivers the input to the Mapper and checks its outputs
against the expected results.
Input: the input key and value for the mapper.
Output: the expected <Key, Value> output from the mapper.
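The withInput / withOutput / runTest flow is easier to picture with a toy harness. The sketch below is a plain-Java stand-in written for this post, not MRUnit's actual implementation: MiniMapDriver and smsStatus are hypothetical names, and the mapper is modelled as a simple function from a record to a list of key/value pairs.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the withInput / withOutput / runTest flow.
// This is NOT MRUnit's implementation, only an illustration of the idea.
final class MiniMapDriver {
    private final Function<String, List<Map.Entry<String, Integer>>> mapper;
    private final List<Map.Entry<String, Integer>> expected = new ArrayList<>();
    private String input;

    MiniMapDriver(Function<String, List<Map.Entry<String, Integer>>> mapper) {
        this.mapper = mapper;
    }

    // Remembers the value that will be delivered to the mapper.
    MiniMapDriver withInput(String value) {
        this.input = value;
        return this;
    }

    // Registers one expected output pair.
    MiniMapDriver withOutput(String key, int value) {
        expected.add(new SimpleImmutableEntry<>(key, value));
        return this;
    }

    // Runs the mapper on the stored input and compares actual with expected output.
    void runTest() {
        List<Map.Entry<String, Integer>> actual = mapper.apply(input);
        if (!expected.equals(actual)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }

    // Plain-Java equivalent of the mapper logic: emit (status, 1) for CDR type 1.
    static List<Map.Entry<String, Integer>> smsStatus(String record) {
        String[] fields = record.split(";");
        if (Integer.parseInt(fields[1]) != 1) {
            return Collections.emptyList();
        }
        return Collections.singletonList(new SimpleImmutableEntry<>(fields[4], 1));
    }

    public static void main(String[] args) {
        new MiniMapDriver(MiniMapDriver::smsStatus)
                .withInput("655209;1;796764372490213;804422938115889;6")
                .withOutput("6", 1)
                .runTest();
        System.out.println("mapper test passed");
    }
}
```

If the expected pair were ("4", 1) instead, runTest() in this sketch would throw an AssertionError listing both the expected and the actual output, which is the kind of feedback a driver-style test gives you.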
ReduceDriver Class
Functionality: This class lets you test a Reducer instance. You provide a key
and a set of intermediate values for that key, representing the inputs that
should be sent to the Reducer (as if they came from a Mapper), and the outputs
you expect the Reducer to send to the collector. When you call runTest(), the
harness delivers the input to the Reducer and checks its outputs against the
expected results.
Input: the input key and its list of values for the reducer.
Output: the expected <Key, Value> output from the reducer.
MapReduceDriver Class
Functionality: This class lets you test a Mapper and a Reducer instance
together. You provide the input key and value that should be sent to the
Mapper, and the outputs you expect the Reducer to send to the collector for
those inputs. When you call runTest(), the harness delivers the input to the
Mapper, feeds the intermediate results to the Reducer (without checking them),
and checks the Reducer's outputs against the expected results.
Input: the key and value sent to the mapper.
Output: the expected result the reducer sends to the collector.
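The mapper-to-reducer hand-off described above can be sketched in plain Java. This is only an illustration of the three steps (map, group the intermediate values by key, reduce) under the assumption of the five-field sample record format; MiniMapReduce and its method names are hypothetical and do not come from MRUnit or Hadoop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of map -> group by key -> reduce; not MRUnit code.
final class MiniMapReduce {
    // Map phase plus grouping: emit 1 under the status code of each
    // CDR type 1 record and collect the emitted values per key.
    static Map<String, List<Integer>> mapAndGroup(List<String> records) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String record : records) {
            String[] fields = record.split(";");
            if (Integer.parseInt(fields[1]) == 1) {
                grouped.computeIfAbsent(fields[4], k -> new ArrayList<>()).add(1);
            }
        }
        return grouped;
    }

    // Reduce phase: sum the intermediate values collected for each key.
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int value : entry.getValue()) {
                sum += value;
            }
            result.put(entry.getKey(), sum);
        }
        return result;
    }

    public static void main(String[] args) {
        // The sample record from the mapper test, fed in twice.
        List<String> records = Arrays.asList(
                "655209;1;796764372490213;804422938115889;6",
                "655209;1;796764372490213;804422938115889;6");
        System.out.println(reduce(mapAndGroup(records))); // prints {6=2}
    }
}
```

Only the final map is compared with the expectation here; the grouped intermediate values are passed along unchecked, mirroring the behaviour described for MapReduceDriver.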
Conclusion
Calling runTest() exercises the mapper, the reducer, or the whole map-reduce
flow with the supplied input and compares the actual output with the expected
output. If the two match, your MRUnit test passes.