Posted to common-user@hadoop.apache.org by Massimo Schiavon <ms...@volunia.com> on 2011/04/15 17:03:55 UTC

Force single map task execution per node for a job

During the execution of a particular job, I need at most one map task
to run on each cluster node.
I've tried setting mapred.tasktracker.map.tasks.maximum=1 in the job
configuration, but it does not seem to work.

Can anyone help?
Thanks

Massimo

-- 
DISCLAIMER: This e-mail and any attachment is for authorised use by
the intended recipient(s) only. It may contain proprietary material,
confidential information and/or be subject to legal privilege. It
should not be copied, disclosed to, retained or used by, any other
party. If you are not an intended recipient then please promptly
delete this e-mail and any attachment and all copies and inform
the sender. Thank you.


Re: Force single map task execution per node for a job

Posted by Juwei Shi <sh...@gmail.com>.
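You should set mapred.tasktracker.map.tasks.maximum=1 on each node, in
that node's mapred-site.xml. It is a TaskTracker setting read when the
daemon starts, so it has no effect in a job configuration, and it limits
every job that runs on that node.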

2011/4/15 baran cakici <ba...@gmail.com>

> Did you try setting mapred.map.tasks=1?
>
> 2011/4/15 Massimo Schiavon <ms...@volunia.com>
>
> > During the execution of a particular job, I need at most one map task
> > to run on each cluster node.
> > I've tried setting mapred.tasktracker.map.tasks.maximum=1 in the job
> > configuration, but it does not seem to work.
> >
> > Can anyone help?
> > Thanks
> >
> > Massimo



-- 
- Juwei

Re: Force single map task execution per node for a job

Posted by baran cakici <ba...@gmail.com>.
Did you try setting mapred.map.tasks=1?

2011/4/15 Massimo Schiavon <ms...@volunia.com>

> During the execution of a particular job, I need at most one map task
> to run on each cluster node.
> I've tried setting mapred.tasktracker.map.tasks.maximum=1 in the job
> configuration, but it does not seem to work.
>
> Can anyone help?
> Thanks
>
> Massimo

RE: Force single map task execution per node for a job

Posted by Jim Falgout <ji...@pervasive.com>.
I'm not sure that is possible. You can use NLineInputFormat with a control file that has one line per node in the cluster. I've used that technique for a data generation program and it works well. This runs a predetermined number of mappers. However, it's up to the scheduler to decide when and where they run. If other jobs are running concurrently, I don't believe you are guaranteed a distinct mapper per node.

Running my data generator job on a quiet cluster did run one mapper per node as I wanted. But if you don't have more control over your cluster, I believe the behavior is not deterministic.
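
A minimal sketch of the control-file side of this technique, assuming plain Java and nothing from Hadoop itself. The class name ControlFileWriter, the host names, and the "index<TAB>host" line layout are hypothetical; the line content is only payload for the mapper and does not pin a task to a host. The job would then read this file through NLineInputFormat with one line per split (in the old mapred API the relevant property is believed to be mapred.line.input.format.linespermap, default 1; check it against your Hadoop version).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Builds a control file with one line per cluster node (hypothetical helper). */
class ControlFileWriter {

    /**
     * Returns one control line per host in the form "index<TAB>host".
     * The layout is an assumption; any one-line-per-node content works,
     * because each line simply becomes the input of one mapper.
     */
    static List<String> controlLines(List<String> hosts) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < hosts.size(); i++) {
            lines.add(i + "\t" + hosts.get(i));
        }
        return lines;
    }

    /** Writes the control lines to the given path, one per line, as UTF-8. */
    static void write(Path target, List<String> hosts) throws IOException {
        Files.write(target, controlLines(hosts), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical host names; replace with the nodes of your cluster.
        List<String> hosts = Arrays.asList("node01", "node02", "node03");
        Path target = Files.createTempFile("control", ".txt");
        write(target, hosts);
        System.out.println("Wrote " + hosts.size() + " control lines to " + target);
    }
}
```

The file still has to be copied into HDFS and used as the job's input path. The number of lines fixes the number of mappers, but where they run remains the scheduler's decision, as noted above.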

-----Original Message-----
From: Massimo Schiavon [mailto:mschiavon@volunia.com] 
Sent: Friday, April 15, 2011 10:04 AM
To: common-user@hadoop.apache.org
Subject: Force single map task execution per node for a job

During the execution of a particular job, I need at most one map task to run on each cluster node.
I've tried setting mapred.tasktracker.map.tasks.maximum=1 in the job configuration, but it does not seem to work.

Can anyone help?
Thanks

Massimo





Re: Force single map task execution per node for a job

Posted by Harsh J <ha...@cloudera.com>.
Hello Massimo,

This is sort of possible with a custom InputFormat: the splits returned
by getSplits() carry the host locations that the scheduler uses as
placement hints. But the host-to-task mapping is not strongly
guaranteed: if the slots on a node are full when a task launches, the
task can get scheduled elsewhere.
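
The caveat above can be illustrated with a toy model. This is not Hadoop code and not the JobTracker's actual algorithm: the class LocalityHintModel, its place method, and the host names are hypothetical. It only assumes what the message states, namely that each split names a preferred host (what InputSplit.getLocations() would report) and that a task goes elsewhere when that host has no free slot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model (not Hadoop code) of locality hints with fallback placement. */
class LocalityHintModel {

    /**
     * Places one task per entry in preferredHosts. A task runs on its
     * preferred host if that host has a free slot; otherwise it falls back
     * to the first host that still has one. If no slot is free anywhere,
     * the task is recorded as "unassigned".
     */
    static List<String> place(List<String> preferredHosts, Map<String, Integer> freeSlots) {
        // Copy so the caller's slot counts are left untouched.
        Map<String, Integer> remaining = new LinkedHashMap<>(freeSlots);
        List<String> placement = new ArrayList<>();
        for (String preferred : preferredHosts) {
            String chosen = null;
            if (remaining.getOrDefault(preferred, 0) > 0) {
                chosen = preferred; // the hint is honoured
            } else {
                for (Map.Entry<String, Integer> entry : remaining.entrySet()) {
                    if (entry.getValue() > 0) {
                        chosen = entry.getKey(); // the hint is ignored
                        break;
                    }
                }
            }
            if (chosen == null) {
                placement.add("unassigned");
            } else {
                remaining.put(chosen, remaining.get(chosen) - 1);
                placement.add(chosen);
            }
        }
        return placement;
    }

    public static void main(String[] args) {
        Map<String, Integer> freeSlots = new LinkedHashMap<>();
        freeSlots.put("node01", 2);
        freeSlots.put("node02", 0); // slots taken by another job
        freeSlots.put("node03", 1);
        List<String> hints = Arrays.asList("node01", "node02", "node03");
        System.out.println(place(hints, freeSlots)); // prints [node01, node01, node03]
    }
}
```

In this run the task that preferred node02 lands on node01, so node01 executes two map tasks. That is the situation the original question wants to avoid, and location hints alone cannot rule it out.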

On Fri, Apr 15, 2011 at 8:33 PM, Massimo Schiavon <ms...@volunia.com> wrote:
> During the execution of a particular job, I need at most one map task
> to run on each cluster node.
> I've tried setting mapred.tasktracker.map.tasks.maximum=1 in the job
> configuration, but it does not seem to work.

-- 
Harsh J