Posted to common-user@hadoop.apache.org by Mark Olimpiati <ma...@gmail.com> on 2013/01/14 06:57:05 UTC

Multi-threaded map task

Hi, this is a simple question, but why weren't map or reduce tasks
programmed to be multi-threaded? I.e., instead of spawning 6 map tasks for 6
cores, run one map task with 6 parallel threads.

In fact I tried this myself, but it turns out that threading is not helping
as it would in regular Java programs, for some reason. Any feedback on this
topic?

Thanks,
Mark

Re: Multi-threaded map task

Posted by Mark Olimpiati <ma...@gmail.com>.
Never mind, it depends on the platform; in my case it would work fine.
Thanks guys!
Mark



Re: Multi-threaded map task

Posted by Mark Olimpiati <ma...@gmail.com>.
Thanks Bertrand, I shall try it and hope to gain some speed. One last
question though: do you think the threads used in MultithreadedMapper are
user-level or kernel-level threads?

Mark


Re: Multi-threaded map task

Posted by Bertrand Dechoux <de...@gmail.com>.
Well... it all depends on where your bottleneck is. Benchmark your use case
if it is critical. Multi-threading is not always useful, and you would
rather avoid locally shared mutable state because it can become a pain to
manage. But that doesn't mean you can't do multi-threading...

You only need to browse the type hierarchy a bit to find MultithreadedMapper:
http://hadoop.apache.org/docs/r1.0.4/api/org/apache/hadoop/mapreduce/lib/map/MultithreadedMapper.html

Regards

Bertrand
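
To illustrate the advice above about avoiding locally shared mutable state,
here is a minimal sketch in plain Java. It is not Hadoop code: the class
ThreadedSplitSketch and the methods mapRecord and countWords are hypothetical
names made up for this example, and a "split" is assumed to be just a list of
text records. Each thread counts words into its own local map, and the
partial results are merged only after the threads have finished.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for a single map task; none of this is Hadoop API.
class ThreadedSplitSketch {

    // The "map function": count the words of one record into the caller's map.
    static void mapRecord(String record, Map<String, Integer> localCounts) {
        for (String word : record.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                localCounts.merge(word, 1, Integer::sum);
            }
        }
    }

    // Process one split (a list of records) with the given number of threads.
    static Map<String, Integer> countWords(List<String> split, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<Map<String, Integer>>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int offset = t;
                parts.add(CompletableFuture.supplyAsync(() -> {
                    // Each worker owns its map, so no locking is needed.
                    Map<String, Integer> local = new HashMap<>();
                    // Worker t handles records t, t + threads, t + 2*threads, ...
                    for (int i = offset; i < split.size(); i += threads) {
                        mapRecord(split.get(i), local);
                    }
                    return local;
                }, pool));
            }
            // Merge on the calling thread once each worker has finished.
            Map<String, Integer> total = new HashMap<>();
            for (CompletableFuture<Map<String, Integer>> part : parts) {
                part.join().forEach((word, n) -> total.merge(word, n, Integer::sum));
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> split = Arrays.asList("a b", "b c", "c c");
        System.out.println(countWords(split, 2));
    }
}
```

In Hadoop itself, the class linked above is configured on the job with
MultithreadedMapper.setMapperClass(job, ...) and
MultithreadedMapper.setNumberOfThreads(job, n), with MultithreadedMapper set
as the job's mapper class. Its Javadoc requires the wrapped mapper to be
thread-safe and describes it as useful when the map operation is not
CPU-bound, which may be one reason a CPU-bound job sees little gain over
simply running more map tasks. Benchmarking remains the only way to know.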

-- 
Bertrand Dechoux

Re: Multi-threaded map task

Posted by Mark Olimpiati <ma...@gmail.com>.
Thanks for the reply Nitin, but I don't see what the bottleneck is in having
it distributed with multi-threaded maps.

I see your point that each map processes a different split, but my question
is: if each map task had 2 threads, multiplexed or running in parallel,
processing the same split, wouldn't that be faster given enough cores?

Mark



Re: Multi-threaded map task

Posted by Nitin Pawar <ni...@gmail.com>.
That's because it's a distributed processing framework over a network.