Posted to mapreduce-user@hadoop.apache.org by Shaojun Zhao <zh...@cs.rochester.edu> on 2010/07/14 20:50:08 UTC

specify different number of mapper tasks for different machines

Hi,

I am running MapReduce on 5 machines: 3 of them have 8 cores and 2 of
them have 2 cores. The 8-core machines are also more powerful (faster,
more memory, more disk).

Currently, I am using only the three 8-core machines, and the max
number of mapper tasks is 8.
I could use one of the 2-core machines as the master, but it turns out
I need a powerful master.

Is there any way to specify that some machines run, say, 8 mapper
tasks, while some machines run only 2 tasks?

What I can imagine is extending the slaves file to look like this:
machine1:8
machine2:8
machine3:8
machine4:2
machine5:2
but I have never seen this format.

Another option could be to list the 8-core machines several times in
the slaves file:
machine1
machine1
machine1
machine1
<same for machines 2 and 3>
machine4
machine5

But I believe there is a proper way to do this. I just cannot find the
information on the Hadoop website.

Thanks in advance.
-Sam

Re: specify different number of mapper tasks for different machines

Posted by Ted Yu <yu...@gmail.com>.
hadoop-daemon.sh also needs to be modified. When HADOOP_MASTER is set,
its rsync step (run with --delete) would wipe your custom config files:
    if [ "$HADOOP_MASTER" != "" ]; then
      echo rsync from $HADOOP_MASTER
      rsync -a -e ssh --delete --exclude=.svn --exclude='logs/*' --exclude='contrib/hod/logs/*' $HADOOP_MASTER/ "$HADOOP_HOME"
    fi
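
One way to read this suggestion, sketched here under assumptions: keep the rsync step but add an --exclude for the per-machine file, so the sync neither overwrites nor deletes it (rsync does not delete excluded files on the receiving side unless --delete-excluded is given). The path conf/mapred-site.xml assumes the default conf directory under HADOOP_HOME, and the RSYNC variable is a hypothetical hook for dry runs, not part of hadoop-daemon.sh. This may not be the exact change intended above.

```shell
#!/bin/sh
# Sketch of the rsync step from hadoop-daemon.sh with one extra --exclude,
# so a per-machine conf/mapred-site.xml is neither overwritten nor deleted.
# RSYNC is a hypothetical override for dry runs; it defaults to rsync.
sync_from_master() {
  if [ "$HADOOP_MASTER" != "" ]; then
    echo "rsync from $HADOOP_MASTER"
    # Assumption: the per-machine file lives at conf/mapred-site.xml
    # relative to HADOOP_HOME.
    "${RSYNC:-rsync}" -a -e ssh --delete \
      --exclude=.svn --exclude='logs/*' --exclude='contrib/hod/logs/*' \
      --exclude='conf/mapred-site.xml' \
      "$HADOOP_MASTER/" "$HADOOP_HOME"
  fi
}
```

Leaving HADOOP_MASTER unset skips the sync entirely, which also avoids the problem.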

On Wed, Jul 14, 2010 at 11:53 AM, Allen Wittenauer <awittenauer@linkedin.com> wrote:

>
> On Jul 14, 2010, at 11:50 AM, Shaojun Zhao wrote:
> > Is there any way to specify that some machines run, say, 8 mapper
> > tasks, while some machines run only 2 tasks?
>
> A custom mapred-site.xml per machine.
>
>
>

Re: specify different number of mapper tasks for different machines

Posted by Allen Wittenauer <aw...@linkedin.com>.
On Jul 14, 2010, at 11:50 AM, Shaojun Zhao wrote:
> Is there any way to specify that some machines run, say, 8 mapper
> tasks, while some machines run only 2 tasks?

A custom mapred-site.xml per machine.
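
For anyone following along, a minimal sketch of what that could look like. It assumes the Hadoop 0.20-era property names mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum, which set how many map and reduce tasks a single TaskTracker runs at once. The helper name write_mapred_site and the slot counts are illustrative only, and a real mapred-site.xml would also carry the cluster's other settings (for example the JobTracker address), which this sketch omits.

```shell
#!/bin/sh
# Write a minimal per-machine mapred-site.xml containing only the slot limits.
# Usage: write_mapred_site <output-file> <max-map-slots> <max-reduce-slots>
# (write_mapred_site is a hypothetical helper, not part of Hadoop.)
write_mapred_site() {
  out_file=$1
  map_slots=$2
  reduce_slots=$3
  cat > "$out_file" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>$map_slots</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>$reduce_slots</value>
  </property>
</configuration>
EOF
}

# Illustrative calls (slot counts are assumptions, tune them per machine):
#   on an 8-core machine: write_mapred_site "$HADOOP_HOME/conf/mapred-site.xml" 8 4
#   on a 2-core machine:  write_mapred_site "$HADOOP_HOME/conf/mapred-site.xml" 2 1
```

The TaskTracker reads these values at startup, so it has to be restarted on a machine after its file changes.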



Re: specify different number of mapper tasks for different machines

Posted by Vitaliy Semochkin <vi...@gmail.com>.
Thank you Sam,
I'll give it a try.


On Mon, Aug 30, 2010 at 4:39 PM, Vitaliy Semochkin <vi...@gmail.com> wrote:
> To tell the truth, I didn't understand Ted's proposal for solving the
> configuration wiping.
> If you manage to make such a configuration work, please report back :-)

Re: specify different number of mapper tasks for different machines

Posted by Vitaliy Semochkin <vi...@gmail.com>.
To tell the truth, I didn't understand Ted's proposal for solving the
configuration wiping.
If you manage to make such a configuration work, please report back :-)

On Mon, Aug 30, 2010 at 3:59 PM, Shaojun Zhao <zh...@cs.rochester.edu> wrote:
> I believe what Allen and Ted said, but so far I have not tried it out.
> -Sam

Re: specify different number of mapper tasks for different machines

Posted by Shaojun Zhao <zh...@cs.rochester.edu>.
I believe what Allen and Ted said, but so far I have not tried it out.
-Sam

On Mon, Aug 30, 2010 at 4:42 AM, Vitaliy Semochkin <vi...@gmail.com> wrote:
> Hi,
>
> Have you found a way to set a different number of mappers/reducers on
> a particular node?

Re: specify different number of mapper tasks for different machines

Posted by Vitaliy Semochkin <vi...@gmail.com>.
Hi,

Have you found a way to set a different number of mappers/reducers on
a particular node?

On Wed, Jul 14, 2010 at 10:50 PM, Shaojun Zhao <zh...@cs.rochester.edu> wrote:
> Is there any way to specify that some machines run, say, 8 mapper
> tasks, while some machines run only 2 tasks?