Posted to common-user@hadoop.apache.org by kumudu harshani <ku...@gmail.com> on 2012/09/23 05:40:05 UTC

How to set 2mappers on 1 job

Hi,
Could someone help me with the following scenario?

I want to implement a job that takes the output of 2 mappers and sends it to 1
reducer. The attached image shows the flow I want.




The normal flow is like this:

JobConf conf2 = new JobConf(WordCount.class);
Job job2 = new Job(conf2);
conf2.setOutputKeyClass(IntWritable.class);
conf2.setOutputValueClass(Text.class);

conf2.setMapperClass(Map1.class);
conf2.setReducerClass(Reduce1.class);

--- where it takes 1 mapper and 1 reducer. What I want is to set 2
mappers (mapper1a, mapper1b) and 1 reducer.
Is that possible? If so, could someone please help?

thanks
kumudu
-- 

*Kumudu Samarappuli* | Creative Search Technologies, a Microsoft IEG
Partner | Software Engineer I | m: +94 719 258 242 |
www.microsoft.com/enterprisesearch

Re: How to set 2mappers on 1 job

Posted by Bertrand Dechoux <de...@gmail.com>.
Harsh's solution is indeed cleaner and must be what you were looking for
(and there is a version for both mapred and mapreduce packages).

If you are curious, see:
https://svn.apache.org/repos/asf/hadoop/common/tags/release-1.0.3/src/mapred/org/apache/hadoop/mapred/lib/MultipleInputs.java
https://svn.apache.org/repos/asf/hadoop/common/tags/release-1.0.3/src/mapred/org/apache/hadoop/mapred/lib/DelegatingMapper.java

https://svn.apache.org/repos/asf/hadoop/common/tags/release-1.0.3/src/mapred/org/apache/hadoop/mapreduce/lib/input/MultipleInputs.java
https://svn.apache.org/repos/asf/hadoop/common/tags/release-1.0.3/src/mapred/org/apache/hadoop/mapreduce/lib/input/DelegatingMapper.java

Regards

Bertrand


On Sun, Sep 23, 2012 at 7:37 AM, Harsh J <ha...@cloudera.com> wrote:

> Hi,
>
> There's an easier way to do what Bertrand has suggested. Look at
> MultipleInputs class:
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/lib/input/MultipleInputs.html
> and see this blog post for an example on how to use it:
> http://kickstarthadoop.blogspot.in/2011/09/joins-with-plain-map-reduce.html
>
> Note though that the reducer input key and value types are singular, and
> you need to ensure that. There's no easy way around that aside of using
> generic containers.
>
>
> On Sun, Sep 23, 2012 at 9:34 AM, kumudu harshani <kumuduharshani@gmail.com
> > wrote:
>
>> I am sorry.. I didn't get you.. shouldn't i handle that with jobconf code.
>>
>> The confusion i have is, if i put like:
>>
>> JobConf conf2 = new JobConf(WordCount.class);
>> Job job2 = new Job(conf2);
>> conf2.setOutputKeyClass(IntWritable.class);
>> conf2.setOutputValueClass(Text.class);
>>
>> conf2.setMapperClass(Map1.class);
>> conf2.setReducerClass(Reduce1.class);
>>
>>  ---it will execute Map1.class and then Reduce1.class.
>>
>> so if i have Mapper1a.class and Mapper2a.class, how should i write the
>> code of job to execute both and then execute Reducer.class such that,
>> Reducer will take both mappers (1a, 1b) emit outputs...
>>
>> thanks
>> kumudu
>>
>> On Sun, Sep 23, 2012 at 9:23 AM, Bertrand Dechoux <de...@gmail.com> wrote:
>>
>>> You can use the map.input.file property to decide which logic should
>>> your mapper apply.
>>> Regards
>>> Bertrand
>>>
>>>
>>> On Sun, Sep 23, 2012 at 5:40 AM, kumudu harshani <
>>> kumuduharshani@gmail.com> wrote:
>>>
>>>> Hi...
>>>> Could someone help me with following scenario..
>>>>
>>>> I want implement a job which should get 2 mapper outputs and send them
>>>> to 1 reducer. Attached image show the flow I wanted....
>>>>
>>>>
>>>>
>>>>
>>>> Normal flow is like:
>>>>
>>>> JobConf conf2 = new JobConf(WordCount.class);
>>>> Job job2 = new Job(conf2);
>>>> conf2.setOutputKeyClass(IntWritable.class);
>>>> conf2.setOutputValueClass(Text.class);
>>>>
>>>> conf2.setMapperClass(Map1.class);
>>>> conf2.setReducerClass(Reduce1.class);
>>>>
>>>> --- where it takes 1 mapper and 1 reducer. What i want is to set 2
>>>> maps(mapper1a, mapper1b) and 1 reducer...
>>>> Is that possible, if so could someone please help..
>>>>
>>>> thanks
>>>> kumudu
>>>> --
>>>>
>>>> *Kumudu Samarappuli* | Creative Search Technologies, a Microsoft IEG
>>>> Partner | Software Engineer I | m: +94 719 258 242 |
>>>> www.microsoft.com/enterprisesearch
>>>>
>>>>
>>>
>>>
>>> --
>>> Bertrand Dechoux
>>>
>>
>>
>>
>> --
>>
>> *Kumudu Samarappuli* | Creative Search Technologies, a Microsoft IEG
>> Partner | Software Engineer I | m: +94 719 258 242 |
>> www.microsoft.com/enterprisesearch
>>
>>
>
>
> --
> Harsh J
>



-- 
Bertrand Dechoux

Re: How to set 2mappers on 1 job

Posted by Harsh J <ha...@cloudera.com>.
Hi,

There's an easier way to do what Bertrand has suggested. Look at the
MultipleInputs class:
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/lib/input/MultipleInputs.html
and see this blog post for an example of how to use it:
http://kickstarthadoop.blogspot.in/2011/09/joins-with-plain-map-reduce.html
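
As a rough sketch of how the job setup could look with the new-API
MultipleInputs class (this is not taken from the linked blog post): each
input path is registered together with its own mapper, and a single reducer
class is set on the job. The class names TwoMapperJob, Mapper1a, Mapper1b and
Reduce1, the mapper logic, and the three command-line paths are hypothetical.
It assumes plain-text input and Hadoop 2.x or later on the classpath; on 1.x,
new Job(conf, name) would stand in for Job.getInstance.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TwoMapperJob {

  // Hypothetical mapper for the first input: keys each line by its byte length.
  public static class Mapper1a extends Mapper<LongWritable, Text, IntWritable, Text> {
    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      context.write(new IntWritable(line.getLength()), new Text("A:" + line));
    }
  }

  // Hypothetical mapper for the second input. It must emit the same
  // key and value types as Mapper1a, because one reducer consumes both.
  public static class Mapper1b extends Mapper<LongWritable, Text, IntWritable, Text> {
    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      context.write(new IntWritable(line.getLength()), new Text("B:" + line));
    }
  }

  // The single reducer: receives, per key, the values emitted by both mappers.
  public static class Reduce1 extends Reducer<IntWritable, Text, IntWritable, Text> {
    @Override
    protected void reduce(IntWritable key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
      for (Text value : values) {
        context.write(key, value);
      }
    }
  }

  public static void main(String[] args) throws Exception {
    // args: <input for Mapper1a> <input for Mapper1b> <output dir> (assumed layout)
    Job job = Job.getInstance(new Configuration(), "two mappers, one reducer");
    job.setJarByClass(TwoMapperJob.class);

    // One addInputPath call per input; there is no setMapperClass call.
    MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, Mapper1a.class);
    MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, Mapper1b.class);

    job.setReducerClass(Reduce1.class);
    job.setOutputKeyClass(IntWritable.class);
    job.setOutputValueClass(Text.class);
    FileOutputFormat.setOutputPath(job, new Path(args[2]));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

The old mapred package has a MultipleInputs class with an equivalent
addInputPath method that takes a JobConf instead of a Job.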

Note, though, that the reducer has a single input key type and a single input
value type, so you need to ensure that every mapper emits exactly those types.
There's no easy way around that aside from using generic containers.
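
To illustrate the point about a single value type, here is a small plain-Java
simulation with no Hadoop dependency. Both "mappers" wrap their output in one
shared container type that records where the value came from, and the single
"reducer" separates the values again by that tag. Everything in it
(TaggedValueDemo, Tagged, the "id,name" and "colour:id" line formats) is made
up for illustration. In a real job the container would presumably be a custom
Writable or something similar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TaggedValueDemo {

  /** The single value type both mappers emit; source says which mapper produced it. */
  static final class Tagged {
    final String source;
    final String payload;

    Tagged(String source, String payload) {
      this.source = source;
      this.payload = payload;
    }
  }

  /** Stand-in for the shuffle: groups emitted values by key. */
  static void emit(Map<Integer, List<Tagged>> shuffle, int key, Tagged value) {
    shuffle.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
  }

  /** Stand-in for mapper 1a: parses "id,name" lines. */
  static void mapA(String line, Map<Integer, List<Tagged>> shuffle) {
    String[] fields = line.split(",", 2);
    emit(shuffle, Integer.parseInt(fields[0]), new Tagged("A", fields[1]));
  }

  /** Stand-in for mapper 1b: parses "colour:id" lines. */
  static void mapB(String line, Map<Integer, List<Tagged>> shuffle) {
    String[] fields = line.split(":", 2);
    emit(shuffle, Integer.parseInt(fields[1]), new Tagged("B", fields[0]));
  }

  /** The one reducer: sees values from both mappers and separates them by tag. */
  static String reduce(int key, List<Tagged> values) {
    List<String> fromA = new ArrayList<>();
    List<String> fromB = new ArrayList<>();
    for (Tagged value : values) {
      ("A".equals(value.source) ? fromA : fromB).add(value.payload);
    }
    return key + " A=" + fromA + " B=" + fromB;
  }

  /** Runs both mappers over their own input, then the reducer once per key. */
  static List<String> run(List<String> inputA, List<String> inputB) {
    Map<Integer, List<Tagged>> shuffle = new TreeMap<>(); // sorted by key
    for (String line : inputA) mapA(line, shuffle);
    for (String line : inputB) mapB(line, shuffle);
    List<String> output = new ArrayList<>();
    for (Map.Entry<Integer, List<Tagged>> entry : shuffle.entrySet()) {
      output.add(reduce(entry.getKey(), entry.getValue()));
    }
    return output;
  }

  public static void main(String[] args) {
    List<String> inputA = Arrays.asList("1,apple", "2,pear");
    List<String> inputB = Arrays.asList("red:1", "green:2");
    for (String line : run(inputA, inputB)) {
      System.out.println(line);
    }
  }
}
```

With the sample input in main, this prints "1 A=[apple] B=[red]" followed by
"2 A=[pear] B=[green]".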

On Sun, Sep 23, 2012 at 9:34 AM, kumudu harshani
<ku...@gmail.com> wrote:

> I am sorry.. I didn't get you.. shouldn't i handle that with jobconf code.
>
> The confusion i have is, if i put like:
>
> JobConf conf2 = new JobConf(WordCount.class);
> Job job2 = new Job(conf2);
> conf2.setOutputKeyClass(IntWritable.class);
> conf2.setOutputValueClass(Text.class);
>
> conf2.setMapperClass(Map1.class);
> conf2.setReducerClass(Reduce1.class);
>
>  ---it will execute Map1.class and then Reduce1.class.
>
> so if i have Mapper1a.class and Mapper2a.class, how should i write the
> code of job to execute both and then execute Reducer.class such that,
> Reducer will take both mappers (1a, 1b) emit outputs...
>
> thanks
> kumudu
>
> On Sun, Sep 23, 2012 at 9:23 AM, Bertrand Dechoux <de...@gmail.com> wrote:
>
>> You can use the map.input.file property to decide which logic should your
>> mapper apply.
>> Regards
>> Bertrand
>>
>>
>> On Sun, Sep 23, 2012 at 5:40 AM, kumudu harshani <
>> kumuduharshani@gmail.com> wrote:
>>
>>> Hi...
>>> Could someone help me with following scenario..
>>>
>>> I want implement a job which should get 2 mapper outputs and send them
>>> to 1 reducer. Attached image show the flow I wanted....
>>>
>>>
>>>
>>>
>>> Normal flow is like:
>>>
>>> JobConf conf2 = new JobConf(WordCount.class);
>>> Job job2 = new Job(conf2);
>>> conf2.setOutputKeyClass(IntWritable.class);
>>> conf2.setOutputValueClass(Text.class);
>>>
>>> conf2.setMapperClass(Map1.class);
>>> conf2.setReducerClass(Reduce1.class);
>>>
>>> --- where it takes 1 mapper and 1 reducer. What i want is to set 2
>>> maps(mapper1a, mapper1b) and 1 reducer...
>>> Is that possible, if so could someone please help..
>>>
>>> thanks
>>> kumudu
>>> --
>>>
>>> *Kumudu Samarappuli* | Creative Search Technologies, a Microsoft IEG
>>> Partner | Software Engineer I | m: +94 719 258 242 |
>>> www.microsoft.com/enterprisesearch
>>>
>>>
>>
>>
>> --
>> Bertrand Dechoux
>>
>
>
>
> --
>
> *Kumudu Samarappuli* | Creative Search Technologies, a Microsoft IEG
> Partner | Software Engineer I | m: +94 719 258 242 |
> www.microsoft.com/enterprisesearch
>
>


-- 
Harsh J

Re: How to set 2mappers on 1 job

Posted by kumudu harshani <ku...@gmail.com>.
I am sorry, I didn't get you. Shouldn't I handle that in the JobConf code?

The confusion I have is this. If I write:
JobConf conf2 = new JobConf(WordCount.class);
Job job2 = new Job(conf2);
conf2.setOutputKeyClass(IntWritable.class);
conf2.setOutputValueClass(Text.class);

conf2.setMapperClass(Map1.class);
conf2.setReducerClass(Reduce1.class);

 --- it will execute Map1.class and then Reduce1.class.

So if I have Mapper1a.class and Mapper1b.class, how should I write the job
code so that both mappers run and Reducer.class then takes the output emitted
by both of them (1a, 1b)?

thanks
kumudu

On Sun, Sep 23, 2012 at 9:23 AM, Bertrand Dechoux <de...@gmail.com> wrote:

> You can use the map.input.file property to decide which logic should your
> mapper apply.
> Regards
> Bertrand
>
>
> On Sun, Sep 23, 2012 at 5:40 AM, kumudu harshani <kumuduharshani@gmail.com
> > wrote:
>
>> Hi...
>> Could someone help me with following scenario..
>>
>> I want implement a job which should get 2 mapper outputs and send them to
>> 1 reducer. Attached image show the flow I wanted....
>>
>>
>>
>>
>> Normal flow is like:
>>
>> JobConf conf2 = new JobConf(WordCount.class);
>> Job job2 = new Job(conf2);
>> conf2.setOutputKeyClass(IntWritable.class);
>> conf2.setOutputValueClass(Text.class);
>>
>> conf2.setMapperClass(Map1.class);
>> conf2.setReducerClass(Reduce1.class);
>>
>> --- where it takes 1 mapper and 1 reducer. What i want is to set 2
>> maps(mapper1a, mapper1b) and 1 reducer...
>> Is that possible, if so could someone please help..
>>
>> thanks
>> kumudu
>> --
>>
>> *Kumudu Samarappuli* | Creative Search Technologies, a Microsoft IEG
>> Partner | Software Engineer I | m: +94 719 258 242 |
>> www.microsoft.com/enterprisesearch
>>
>>
>
>
> --
> Bertrand Dechoux
>



-- 

*Kumudu Samarappuli* | Creative Search Technologies, a Microsoft IEG
Partner | Software Engineer I | m: +94 719 258 242 |
www.microsoft.com/enterprisesearch

Re: How to set 2mappers on 1 job

Posted by Bertrand Dechoux <de...@gmail.com>.
You can use the map.input.file property to decide which logic your mapper
should apply.
Regards
Bertrand
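
For illustration, the idea can be sketched in plain Java without any Hadoop
classes: a single map function looks at the path of the file it is reading
and picks the "1a" or "1b" logic from it. This is only a rough sketch under
assumptions. The directory names, the tab-separated record layout and the
names InputFileDispatch and mapRecord are hypothetical, not taken from this
thread. In an old-API (org.apache.hadoop.mapred) mapper, the path would
typically be read once in configure(JobConf job) with
job.get("map.input.file").

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Plain-Java sketch of "one mapper class, two kinds of logic".
// No Hadoop dependency: the input file path is passed in as a String.
class InputFileDispatch {

    // Hypothetical directory names; substitute the real input locations.
    static final String INPUT_A = "/data/inputA/";
    static final String INPUT_B = "/data/inputB/";

    // Emits one (key, value) pair, choosing the logic from the input file path.
    static Map.Entry<String, String> mapRecord(String inputFile, String line) {
        String[] fields = line.split("\t");
        if (inputFile.contains(INPUT_A)) {
            // "Mapper 1a" logic (assumed): the key is the first field.
            return new SimpleEntry<>(fields[0], "A:" + line);
        } else if (inputFile.contains(INPUT_B)) {
            // "Mapper 1b" logic (assumed): the key is the last field.
            return new SimpleEntry<>(fields[fields.length - 1], "B:" + line);
        }
        throw new IllegalArgumentException("Unexpected input file: " + inputFile);
    }

    public static void main(String[] args) {
        // Records from both inputs end up under the same key, so a single
        // reducer would see both of them together.
        System.out.println(mapRecord("hdfs://nn/data/inputA/part-00000", "k1\tfoo"));
        System.out.println(mapRecord("hdfs://nn/data/inputB/part-00000", "bar\tk1"));
    }
}
```

Tagging each value with its source ("A:" or "B:") is one way to let the
reducer tell the two inputs apart. Whether that is needed depends on what
the reducer does with the merged records.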

On Sun, Sep 23, 2012 at 5:40 AM, kumudu harshani
<ku...@gmail.com> wrote:

> Hi...
> Could someone help me with following scenario..
>
> I want implement a job which should get 2 mapper outputs and send them to
> 1 reducer. Attached image show the flow I wanted....
>
>
>
>
> Normal flow is like:
>
> JobConf conf2 = new JobConf(WordCount.class);
> Job job2 = new Job(conf2);
> conf2.setOutputKeyClass(IntWritable.class);
> conf2.setOutputValueClass(Text.class);
>
> conf2.setMapperClass(Map1.class);
> conf2.setReducerClass(Reduce1.class);
>
> --- where it takes 1 mapper and 1 reducer. What i want is to set 2
> maps(mapper1a, mapper1b) and 1 reducer...
> Is that possible, if so could someone please help..
>
> thanks
> kumudu
> --
>
> *Kumudu Samarappuli* | Creative Search Technologies, a Microsoft IEG
> Partner | Software Engineer I | m: +94 719 258 242 |
> www.microsoft.com/enterprisesearch
>
>


-- 
Bertrand Dechoux
