Posted to user@hadoop.apache.org by Jason Yang <li...@gmail.com> on 2012/09/11 05:25:17 UTC

how to make different mappers execute different processing on same data ?

Hi, all

I have a question: how can I make different mappers apply different
processing to the same data?

Here is my scenario:
I have to process a data set, but there are multiple ways to process it and
I have no idea which one is better. So I was thinking that maybe I could
run multiple mappers, each applying a different processing solution, and
eventually choose the best one according to some evaluation functions.

But I'm not sure whether this could be done in MapReduce.

Any help would be appreciated.

-- 
YANG, Lin

Re: how to make different mappers execute different processing on same data ?

Posted by Jason Yang <li...@gmail.com>.
Thanks for your reply.

But I'm not sure that works, since the data volume is large, which makes the
cost of shuffling quite high if all the processes are applied in the Reducer.

I thought Hadoop transfers all the Mapper output to the Reducers over
HTTP, right?
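
To make the shuffle concern concrete, here is a minimal sketch in plain Python (no Hadoop API involved) of one way the shuffle could be kept small: every mapper applies all candidate processes to its own split and emits only one (process id, partial score) pair per candidate. The names PROCESSES, evaluate, map_split and reduce_scores are hypothetical, and the sketch assumes the evaluation function can be summed across splits, which may not hold for every evaluation function.

```python
from collections import defaultdict

# Hypothetical candidate processes; each maps one record to a numeric result.
PROCESSES = {
    1: lambda x: x * 2,
    2: lambda x: x + 10,
}


def evaluate(result):
    """Hypothetical per-record score, assumed to be summable across records."""
    return float(result)


def map_split(records):
    """Apply every candidate to one input split; emit one (id, score) pair each."""
    for pid, process in PROCESSES.items():
        yield pid, sum(evaluate(process(r)) for r in records)


def reduce_scores(pairs):
    """Sum the partial scores per process id."""
    totals = defaultdict(float)
    for pid, partial in pairs:
        totals[pid] += partial
    return dict(totals)


# Two input splits stand in for the blocks handed to two mappers.
splits = [[1, 2, 3], [4, 5]]

# Only these small pairs would cross the shuffle, not the data itself.
shuffled = [pair for split in splits for pair in map_split(split)]

totals = reduce_scores(shuffled)
best = max(totals, key=totals.get)
```

With this layout the shuffle carries one pair per candidate per split, so its size depends on the number of candidates and splits rather than on the data volume.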

2012/9/11 Narasingu Ramesh <ra...@gmail.com>

> Hi Jason,
>             Mehmet said is exactly correct ,without reducers we cannot
> increase performance please you can add mappers and reducers in any
> processing data you can get output and performance is good.
> Thanks & Regards,
> Ramesh.Narasingu
>
>
> On Tue, Sep 11, 2012 at 9:31 AM, Mehmet Tepedelenlioglu <
> mehmetsino@gmail.com> wrote:
>
>> If you have n processes to evaluate, make a reducer that calls the
>> process i when it receives key i, 1<=i<=n. Either replicate the data for
>> the n reducers, or cache it for it to be read on the reducer side. The
>> reducers will output the process id i and the performance.
>>
>>
>> On Sep 10, 2012, at 8:25 PM, Jason Yang wrote:
>>
>> > Hi, all
>> >
>> > I've got a question about how to make different mappers execute
>> different processing on a same data?
>> >
>> > Here is my scenario:
>> > I got to process a data, however, there multiple choices to process
>> this data and I have no idea which one is better, so I was thinking that
>> maybe I could execute multiple mappers, in which different processing
>> solution is applied, and eventually the best one is chosen according to
>> some evaluation functions.
>> >
>> > But I'm not sure whether this could be done in MapReduce.
>> >
>> > Any help would be appreciated.
>> >
>> > --
>> > YANG, Lin
>> >
>>
>>
>


-- 
YANG, Lin

Re: how to make different mappers execute different processing on same data ?

Posted by Narasingu Ramesh <ra...@gmail.com>.
Hi Jason,
What Mehmet said is exactly right. Without reducers we cannot improve
performance. You can add mappers and reducers to any data-processing job,
get the output, and the performance is good.
Thanks & Regards,
Ramesh.Narasingu

On Tue, Sep 11, 2012 at 9:31 AM, Mehmet Tepedelenlioglu <
mehmetsino@gmail.com> wrote:

> If you have n processes to evaluate, make a reducer that calls the process
> i when it receives key i, 1<=i<=n. Either replicate the data for the n
> reducers, or cache it for it to be read on the reducer side. The reducers
> will output the process id i and the performance.
>
>
> On Sep 10, 2012, at 8:25 PM, Jason Yang wrote:
>
> > Hi, all
> >
> > I've got a question about how to make different mappers execute
> different processing on a same data?
> >
> > Here is my scenario:
> > I got to process a data, however, there multiple choices to process this
> data and I have no idea which one is better, so I was thinking that maybe I
> could execute multiple mappers, in which different processing solution is
> applied, and eventually the best one is chosen according to some evaluation
> functions.
> >
> > But I'm not sure whether this could be done in MapReduce.
> >
> > Any help would be appreciated.
> >
> > --
> > YANG, Lin
> >
>
>

Re: how to make different mappers execute different processing on same data ?

Posted by Mehmet Tepedelenlioglu <me...@gmail.com>.
If you have n processes to evaluate, write a reducer that calls process i when it receives key i, 1<=i<=n. Either replicate the data for the n reducers, or cache it so that it can be read on the reducer side. Each reducer will output the process id i and its performance.
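
A minimal sketch of this suggestion in plain Python (no Hadoop API involved), using the "replicate the data" option: the mapper emits the same data once per key i, and the reducer for key i runs process i and outputs (i, performance). The functions process_a, process_b and evaluate are hypothetical stand-ins for the real candidates and the real evaluation function.

```python
def process_a(data):
    """Hypothetical candidate 1: sort ascending."""
    return sorted(data)


def process_b(data):
    """Hypothetical candidate 2: sort descending."""
    return sorted(data, reverse=True)


# Key i selects process i, 1 <= i <= n.
PROCESSES = {1: process_a, 2: process_b}


def evaluate(result):
    """Hypothetical performance measure: count of adjacent ascending pairs."""
    return sum(1 for a, b in zip(result, result[1:]) if a <= b)


def mapper(data, n):
    """Replicate the data once for each process id."""
    for i in range(1, n + 1):
        yield i, data


def reducer(key, data):
    """Run the process selected by the key; output (process id, performance)."""
    result = PROCESSES[key](data)
    return key, evaluate(result)


data = [3, 1, 2]
outputs = [reducer(key, value) for key, value in mapper(data, len(PROCESSES))]

# Choosing the best process is a small final step over n pairs.
best_id, best_score = max(outputs, key=lambda pair: pair[1])
```

Note that the replication step sends n copies of the data through the shuffle, which is where the cost discussed elsewhere in the thread comes from. The caching option would avoid that by having each reducer read the data itself.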


On Sep 10, 2012, at 8:25 PM, Jason Yang wrote:

> Hi, all
> 
> I've got a question about how to make different mappers execute different processing on a same data? 
> 
> Here is my scenario:
> I got to process a data, however, there multiple choices to process this data and I have no idea which one is better, so I was thinking that maybe I could execute multiple mappers, in which different processing solution is applied, and eventually the best one is chosen according to some evaluation functions.
> 
> But I'm not sure whether this could be done in MapReduce.
> 
> Any help would be appreciated.
> 
> -- 
> YANG, Lin
> 

