Posted to common-user@hadoop.apache.org by Jason Venner <ja...@attributor.com> on 2008/06/30 18:11:26 UTC

Re: joins in map reduce

I have just started using the join operators.

This is the join I am trying, as printed by my driver:
join is outer(tbl(org.apache.hadoop.mapred.SequenceFileInputFormat,"Input1"),tbl(org.apache.hadoop.mapred.SequenceFileInputFormat,"IndexedTry1"))

but I get this error:
08/06/30 08:55:13 INFO mapred.FileInputFormat: Total input paths to process : 10
Exception in thread "main" java.io.IOException: No input paths specified in input
    at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputFormat.java:115)
    at org.apache.hadoop.mapred.join.Parser$WNode.getSplits(Parser.java:304)
    at org.apache.hadoop.mapred.join.Parser$CNode.getSplits(Parser.java:375)
    at org.apache.hadoop.mapred.join.CompositeInputFormat.getSplits(CompositeInputFormat.java:131)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:544)

I am clearly missing something basic. Here is the job setup:

        conf.setInputFormat(CompositeInputFormat.class);
        conf.setOutputPath( outputDirectory );
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        conf.setOutputFormat(MapFileOutputFormat.class);
        conf.setMapperClass( LeftHandJoinMapper.class );
        conf.setReducerClass( IdentityReducer.class );
        conf.setNumReduceTasks(0);

        System.err.println( "join is " +
            CompositeInputFormat.compose("outer", SequenceFileInputFormat.class, allTables ) );
        conf.set("mapred.join.expr",
            CompositeInputFormat.compose("outer", SequenceFileInputFormat.class, allTables ));
       
        JobClient client = new JobClient();
       
        client.setConf( conf );

        RunningJob job = JobClient.runJob( conf );
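
For reference, the expression printed as "join is ..." has the shape op(tbl(inputFormatClass,"path"),tbl(inputFormatClass,"path"),...). Below is a minimal sketch of how a string of that shape can be assembled, in plain Java with no Hadoop dependency. JoinExprSketch and its compose method are hypothetical names used only for illustration; this is not Hadoop's CompositeInputFormat.compose, just a stand-in that reproduces the format shown in the output above.

```java
// Hypothetical stand-in that mirrors the join expression format printed above.
// It is NOT Hadoop's CompositeInputFormat.compose; it only shows the string shape.
final class JoinExprSketch {

    // Builds op(tbl(inputFormat,"path1"),tbl(inputFormat,"path2"),...)
    static String compose(String op, String inputFormatClassName, String... paths) {
        StringBuilder expr = new StringBuilder(op).append('(');
        for (int i = 0; i < paths.length; i++) {
            if (i > 0) {
                expr.append(',');   // tables are separated by commas, no spaces
            }
            expr.append("tbl(")
                .append(inputFormatClassName)
                .append(",\"")
                .append(paths[i])   // each path is wrapped in double quotes
                .append("\")");
        }
        return expr.append(')').toString();
    }

    public static void main(String[] args) {
        // Same operator, input format, and paths as in the job above.
        System.out.println(compose(
            "outer",
            "org.apache.hadoop.mapred.SequenceFileInputFormat",
            "Input1", "IndexedTry1"));
    }
}
```

Printing the expression this way before submitting the job makes it easy to see exactly which path each tbl(...) entry names, since every table in the expression needs its own input path.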



Shirley Cohen wrote:
> Hi,
>
> How does one do a join operation in map reduce? Is there more than one 
> way to do a join? Which way works better and why?
>
> Thanks,
>
> Shirley
-- 
Jason Venner