Posted to common-user@hadoop.apache.org by dhana sekaran <dh...@gmail.com> on 2010/04/28 08:52:37 UTC

Map-Side Join not giving output as expected

Hi,

I am using a map-side join to join three data sets, but I am not getting
the expected output.  Please guide me.  Hadoop version: 0.20.1

a.txt
====
9000000000,Dhana
9000000001,Sridhar
9000000002,Mani

b.txt
====
9000000000,Chennai
9000000001,Bangalore
9000000002,Madurai

c.txt
====
9000000000,Dev
9000000001,Mgr
9000000002,Lead

part-00000
========
9000000000      [Chennai]
9000000000      [Dhana]
9000000000      [Dev]
9000000001      [Mgr]
9000000001      [Bangalore]
9000000001      [Sridhar]
9000000002      [Mani]
9000000002      [Lead]
9000000002      [Madurai]

Expected Output
=============
9000000000      [Dhana,Chennai,Dev]
9000000001      [Sridhar,Bangalore,Mgr]
9000000002      [Mani,Madurai,Lead]
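
For illustration, the expected output above corresponds to grouping the
values from all three files under their shared key, in file order.  The
following is a minimal plain-Java sketch of that grouping only.  It does not
use Hadoop's join classes; the class name JoinSketch and the method
joinByKey are hypothetical, every line is assumed to be a well-formed
"key,value" pair, and it does not model the empty slots an outer join would
leave for a key that is missing from one source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of Hadoop.
class JoinSketch {

    // Groups the values of several "key,value" sources under their key.
    // Values appear in the order the sources are passed in.
    // Assumes every line contains at least one comma.
    static Map<String, List<String>> joinByKey(List<List<String>> sources) {
        Map<String, List<String>> joined = new TreeMap<>();
        for (List<String> source : sources) {
            for (String line : source) {
                String[] kv = line.split(",", 2); // split at the first comma only
                joined.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        // Contents of a.txt, b.txt and c.txt from this post.
        List<String> a = Arrays.asList(
                "9000000000,Dhana", "9000000001,Sridhar", "9000000002,Mani");
        List<String> b = Arrays.asList(
                "9000000000,Chennai", "9000000001,Bangalore", "9000000002,Madurai");
        List<String> c = Arrays.asList(
                "9000000000,Dev", "9000000001,Mgr", "9000000002,Lead");

        // Print one line per key with all of its values.
        joinByKey(Arrays.asList(a, b, c))
                .forEach((key, values) -> System.out.println(key + "\t" + values));
    }
}
```

Each key ends up with one list holding three values, one per file, which is
the shape shown under "Expected Output" rather than the one-value-per-line
shape in part-00000.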

This is the command I ran.  Am I missing something?

# hadoop jar hadoop-*-examples.jar join \
    -D key.value.separator.in.input.line=',' \
    -inFormat org.apache.hadoop.mapred.KeyValueTextInputFormat \
    -outKey org.apache.hadoop.io.Text \
    -joinOp outer \
    mapred/join/ joinout


Also, if I give unsorted input, I get the same output.  Isn't it mandatory
for the input data to be sorted?


Please guide.

Thanks in advance.
Dhana




-- 
There are only 10 types of people in the world: Those who understand binary,
and those who don't.