Posted to hdfs-user@hadoop.apache.org by "Rusia, Devansh" <dr...@paypal.com> on 2013/03/20 11:11:08 UTC
TupleWritable value in mapper not getting cleaned up (using CompositeInputFormat)
Hi,
I am trying to do an outer join on two input files.
During the join, however, the TupleWritable value passed to the mapper is not cleared between keys, so it still carries values from a previous, different key.
The code I used is below ('plist' contains the set of paths to be taken as input):
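// Read the inputs through the join framework's composite input format.
jobConf.setInputFormat(CompositeInputFormat.class);
// 'op' is the join operation (an outer join in this case); 'plist' holds the input paths.
jobConf.set("mapred.join.expr", CompositeInputFormat.compose(op, inputFormatClass, plist.toArray(new Path[0])));
jobConf.setOutputFormat(outputFormatClass);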
inp1:
anil1 10
anil2 20
anil3 30
dev1 40
dev2 50
inp2:
anil1 100
dev1 400
dev2 500
dev3 600
outer join output:
anil1 10,100
anil2 20,100
anil3 30,100
dev1 40,400
dev2 50,500
dev3 50,600
Shouldn't it actually be the following?
anil1 10,100
anil2 20
anil3 30
dev1 40,400
dev2 50,500
dev3 600
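To make the expected result concrete, here is a minimal sketch in plain Java (no Hadoop classes) of the outer-join semantics I have in mind. The class name OuterJoinSketch and its helper methods are hypothetical and exist only for this illustration; this is not how CompositeInputFormat is implemented, only what I would expect its output to look like for inp1 and inp2.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone illustration; it does not use or model Hadoop internals.
class OuterJoinSketch {

    // Parses "key value" lines into a map sorted by key.
    static TreeMap<String, String> parse(String... lines) {
        TreeMap<String, String> parsed = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+", 2);
            parsed.put(parts[0], parts[1]);
        }
        return parsed;
    }

    // Full outer join: every key from either input appears exactly once.
    // An input that lacks the key contributes nothing for that key, so no
    // value is carried over from a previously seen key.
    static List<String> outerJoin(Map<String, String> left, Map<String, String> right) {
        TreeMap<String, List<String>> joined = new TreeMap<>();
        for (Map.Entry<String, String> e : left.entrySet()) {
            joined.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        for (Map.Entry<String, String> e : right.entrySet()) {
            joined.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : joined.entrySet()) {
            out.add(e.getKey() + " " + String.join(",", e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> inp1 = parse("anil1 10", "anil2 20", "anil3 30", "dev1 40", "dev2 50");
        Map<String, String> inp2 = parse("anil1 100", "dev1 400", "dev2 500", "dev3 600");
        // Prints one joined line per key, in key order.
        outerJoin(inp1, inp2).forEach(System.out::println);
    }
}
```

Running main on the two inputs above gives the six lines I listed as the expected output: keys present in only one input (anil2, anil3, dev3) come out with a single value.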
Regards,
Devansh Rusia