Posted to common-user@hadoop.apache.org by jingguo yao <ya...@gmail.com> on 2013/04/05 02:46:22 UTC
What is the output format of org.apache.hadoop.examples.Join?
I am reading the following mail:
http://www.mail-archive.com/core-user@hadoop.apache.org/msg04066.html
I ran the following command (I am using Hadoop 1.0.4):
bin/hadoop jar hadoop-examples-1.0.4.jar join \
-inFormat org.apache.hadoop.mapred.KeyValueTextInputFormat \
-outKey org.apache.hadoop.io.Text \
-joinOp outer \
join/a.txt join/b.txt join/c.txt joinout
Then I ran "bin/hadoop fs -text joinout/part-00000" and saw the following
result:
AAAAAAAA a0 [,,]
AAAAAAAA b0 [,,]
AAAAAAAA c0 [,,]
BBBBBBBB a1 [,,]
BBBBBBBB b1 [,,]
BBBBBBBB b2 [,,]
BBBBBBBB b3 [,,]
BBBBBBBB c1 [,,]
CCCCCCCC a2 [,,]
CCCCCCCC a3 [,,]
DDDDDDDD c2 [,,]
DDDDDDDD c3 [,,]
But Chris said that the result should be:
AAAAAAAA [a0,b0,c0]
BBBBBBBB [a1,b1,c1]
BBBBBBBB [a1,b2,c1]
BBBBBBBB [a1,b3,c1]
CCCCCCCC [a2,,]
CCCCCCCC [a3,,]
DDDDDDDD [,,c2]
DDDDDDDD [,,c3]
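The expected result above reads like a per-key cross product of the values from each source, with an empty slot wherever a source has no record for the key. Below is a minimal Python sketch of that reading. It is not Hadoop's implementation; the function name outer_join is hypothetical, and the contents of the three sources are an assumption inferred from the expected output, since the input files are not shown here.

```python
from itertools import product

# Assumed (key, value) records for the three sources, inferred from the
# expected output; the real a.txt, b.txt and c.txt are not shown in this post.
a = [("AAAAAAAA", "a0"), ("BBBBBBBB", "a1"), ("CCCCCCCC", "a2"), ("CCCCCCCC", "a3")]
b = [("AAAAAAAA", "b0"), ("BBBBBBBB", "b1"), ("BBBBBBBB", "b2"), ("BBBBBBBB", "b3")]
c = [("AAAAAAAA", "c0"), ("BBBBBBBB", "c1"), ("DDDDDDDD", "c2"), ("DDDDDDDD", "c3")]


def outer_join(*sources):
    """Hypothetical helper: full outer join of several (key, value) lists."""
    # Group each source's values by key, preserving input order.
    grouped = []
    for source in sources:
        by_key = {}
        for key, value in source:
            by_key.setdefault(key, []).append(value)
        grouped.append(by_key)

    rows = []
    # Visit every key seen in any source, in sorted order.
    for key in sorted(set().union(*grouped)):
        # A source without this key contributes a single empty slot.
        slots = [by_key.get(key, [""]) for by_key in grouped]
        # One output row per combination of values across the sources.
        for combo in product(*slots):
            rows.append(f"{key} [{','.join(combo)}]")
    return rows


for row in outer_join(a, b, c):
    print(row)
```

Under these assumed inputs the sketch produces the eight lines Chris described, one tuple per combination of values that share a key.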
Has Join's output format changed in Hadoop 1.0.4?
--
Jingguo