Posted to common-user@hadoop.apache.org by Mark Olimpiati <ma...@gmail.com> on 2013/01/14 06:57:05 UTC
Multi-threaded map task
Hi, this is a simple question, but why weren't map or reduce tasks
programmed to be multi-threaded? I.e., instead of spawning 6 map tasks for 6
cores, run one map task with 6 parallel threads.
In fact I tried this myself, but it turns out that threading is not helping
as it would in regular Java programs for some reason. Any feedback on this
topic?
Thanks,
Mark
Re: Multi-threaded map task
Posted by Mark Olimpiati <ma...@gmail.com>.
Never mind, it depends on the platform; in my case it would work fine. Thanks guys!
Mark
On Mon, Jan 14, 2013 at 12:23 PM, Mark Olimpiati <ma...@gmail.com> wrote:
> Thanks Bertrand, I shall try it and hope to gain some speed. One last
> question though: do you think the threads used in MultithreadedMapper are
> user-level or kernel-level threads?
>
> Mark
Re: Multi-threaded map task
Posted by Mark Olimpiati <ma...@gmail.com>.
Thanks Bertrand, I shall try it and hope to gain some speed. One last
question though: do you think the threads used in MultithreadedMapper are
user-level or kernel-level threads?
Mark
On Mon, Jan 14, 2013 at 12:06 AM, Bertrand Dechoux <de...@gmail.com> wrote:
> Bertrand
Re: Multi-threaded map task
Posted by Bertrand Dechoux <de...@gmail.com>.
Well... it all depends on where your bottleneck is. Run a benchmark for your
use case if it is critical. Multi-threading is not always useful, and
you would rather avoid locally shared mutable state
because it can become a pain to manage. But that doesn't mean you can't do
multi-threading...
You only need to browse the type hierarchy a bit to find
http://hadoop.apache.org/docs/r1.0.4/api/org/apache/hadoop/mapreduce/lib/map/MultithreadedMapper.html
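To make the idea concrete, here is a minimal plain-Java sketch of several threads working through one split. It is not Hadoop's implementation: SplitWorkers, SharedReader and the word-counting map function are hypothetical names chosen for illustration. The point is only that the threads must share a synchronized record source and a thread-safe output, which is the shared mutable state mentioned above. In an actual job, the equivalent setup would be job.setMapperClass(MultithreadedMapper.class) together with MultithreadedMapper.setMapperClass(job, YourMapper.class) and MultithreadedMapper.setNumberOfThreads(job, n), where YourMapper is a placeholder for your own thread-safe mapper.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for one map task working on a single input split.
class SplitWorkers {

    // Hands out records one at a time; synchronized so that no two
    // threads ever receive the same record.
    static final class SharedReader {
        private final Iterator<String> records;

        SharedReader(List<String> split) {
            this.records = split.iterator();
        }

        synchronized String next() {
            return records.hasNext() ? records.next() : null;
        }
    }

    // The per-record "map" logic: count whitespace-separated words.
    static int map(String record) {
        String trimmed = record.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Runs numThreads workers over one split and returns the total word count.
    static long process(List<String> split, int numThreads) {
        SharedReader reader = new SharedReader(split);
        AtomicLong total = new AtomicLong(); // the only shared mutable output
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < numThreads; i++) {
            Thread t = new Thread(() -> {
                String record;
                while ((record = reader.next()) != null) {
                    total.addAndGet(map(record));
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return total.get();
    }

    // Crude timing loop: compare 1, 2 and 6 threads on the same split.
    public static void main(String[] args) {
        List<String> split = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            split.add("the quick brown fox");
        }
        for (int threads : new int[] {1, 2, 6}) {
            long start = System.nanoTime();
            long words = process(split, threads);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println(threads + " thread(s): " + words + " words in " + millis + " ms");
        }
    }
}
```

With a map function this cheap, the threads spend most of their time contending for the reader lock, so extra threads may not help at all; that is one reason to benchmark your own use case. The Javadoc linked above describes MultithreadedMapper as intended for map operations that are not CPU bound.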
Regards
Bertrand
On Mon, Jan 14, 2013 at 8:22 AM, Mark Olimpiati <ma...@gmail.com> wrote:
> Thanks for the reply, Nitin, but I don't see why being distributed would
> be a bottleneck for multi-threaded maps.
>
> I see your point that each map processes a different split, but my
> question is: if each map task had 2 threads, multiplexed or running in
> parallel when there are enough cores, processing the same split, wouldn't
> that be faster?
>
> Mark
--
Bertrand Dechoux
Re: Multi-threaded map task
Posted by Mark Olimpiati <ma...@gmail.com>.
Thanks for the reply, Nitin, but I don't see why being distributed would
be a bottleneck for multi-threaded maps.
I see your point that each map processes a different split, but my
question is: if each map task had 2 threads, multiplexed or running in
parallel when there are enough cores, processing the same split, wouldn't
that be faster?
Mark
On Sun, Jan 13, 2013 at 10:34 PM, Nitin Pawar <ni...@gmail.com> wrote:
> That's because it's a distributed processing framework over a network.
Re: Multi-threaded map task
Posted by Nitin Pawar <ni...@gmail.com>.
That's because it's a distributed processing framework over a network.
On Jan 14, 2013 11:27 AM, "Mark Olimpiati" <ma...@gmail.com> wrote:
> Hi, this is a simple question, but why weren't map or reduce tasks
> programmed to be multi-threaded? I.e., instead of spawning 6 map tasks for 6
> cores, run one map task with 6 parallel threads.
>
> In fact I tried this myself, but it turns out that threading is not helping
> as it would in regular Java programs for some reason. Any feedback on this
> topic?
>
> Thanks,
> Mark
>