Posted to user@hadoop.apache.org by Libo Yu <yu...@hotmail.com> on 2014/04/02 03:08:31 UTC

number of map tasks on yarn

Hi all,

I am using mostly default YARN settings to run the word count example on
a 3-node cluster. Here are my settings:
yarn.nodemanager.resource.memory-mb 8192
yarn.scheduler.minimum-allocation-mb 1024
yarn.scheduler.maximum-allocation-vcores 32

I would expect to see 8192/1024 * 3 = 24 map tasks.
However, I see 32 map tasks. Does anybody know why? Thanks.

Libo

Re: number of map tasks on yarn

Posted by Mingjiang Shi <ms...@gopivotal.com>.
+1 for Wangda's comment.

My 2 cents:
There are two aspects to the problem:
1. How many map tasks there are in a job.
2. How many map tasks can run concurrently.

For #1, see Wangda's comments.
For #2, it depends on the cluster resources. In your case, the cluster can
run at most 24 map tasks concurrently.
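
The concurrency limit in #2 can be worked through with a small sketch. It is only a rough model under stated assumptions: every map container requests exactly 1024 MB, memory is the only limiting resource, and the container used by the job's ApplicationMaster is ignored (in practice it takes some of this capacity). The function names are hypothetical and are not part of any Hadoop API.

```python
def max_concurrent_containers(node_memory_mb, container_memory_mb, nodes):
    """Upper bound on containers that fit in the cluster at one time."""
    # Whole containers that fit on one NodeManager (assumed memory-bound).
    per_node = node_memory_mb // container_memory_mb
    return per_node * nodes


def waves(map_tasks, concurrent):
    """Minimum number of rounds needed to run all map tasks."""
    return -(-map_tasks // concurrent)  # ceiling division


# Values from the original post: 8192 MB per node, 1024 MB containers, 3 nodes.
print(max_concurrent_containers(8192, 1024, 3))  # 24
print(waves(32, 24))                             # 2
```

Under these assumptions, a job with 32 map tasks still has 32 map tasks; they simply cannot all run at once and need at least two rounds.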



On Wed, Apr 2, 2014 at 10:45 AM, Wangda Tan <wh...@gmail.com> wrote:

> More specifically, Number of map tasks for each job is depended on
> InputFormat.getSplits(...). The number of map tasks is as same as number of
> splits returned by InputFormat.getSplits(...). You can read source code of
> FileInputFormat to get more understanding about this.
>
>
>
> Regards,
> Wangda Tan
>
>
> On Wed, Apr 2, 2014 at 10:23 AM, Stanley Shi <ss...@gopivotal.com> wrote:
>
>> map task number is not decided by the resources you need.
>> It's decided by something else.
>>
>> Regards,
>> *Stanley Shi,*
>>
>>
>>
>> On Wed, Apr 2, 2014 at 9:08 AM, Libo Yu <yu...@hotmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I pretty much use the default yarn setting to run a word count example on
>>> a 3 node cluster. Here are my settings:
>>> yarn.nodemanager.resource.memory-mb 8192
>>> yarn.scheduler.minimum-allocation-mb 1024
>>> yarn.scheduler.maximum-allocation-vcores 32
>>>
>>> I would expect to see 8192/1024 * 3 = 24 map tasks.
>>> However, I see 32 map tasks. Anybody knows why? Thanks.
>>>
>>> Libo
>>>
>>>
>>
>


-- 
Cheers
-MJ

Re: number of map tasks on yarn

Posted by Wangda Tan <wh...@gmail.com>.
More specifically, the number of map tasks for each job depends on
InputFormat.getSplits(...). The number of map tasks is the same as the
number of splits returned by InputFormat.getSplits(...). You can read the
source code of FileInputFormat to understand this better.
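
As a rough illustration of the point above, here is a minimal Python model of how FileInputFormat counts splits for splittable files. It is a sketch, not Hadoop code: the function names are hypothetical, the default minimum and maximum split sizes (1 and Long.MAX_VALUE) and the 1.1 slop factor are taken from the FileInputFormat source as far as I know, and empty or non-splittable files are out of scope. The file sizes are made up; the thread does not say what input the job used.

```python
SPLIT_SLOP = 1.1  # the last split may be up to 10% larger than split_size


def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Split size: the block size, clamped to [min_size, max_size]."""
    return max(min_size, min(max_size, block_size))


def count_splits(file_length, split_size):
    """Number of splits produced for one non-empty, splittable file."""
    splits = 0
    remaining = file_length
    # Carve off full-size splits while more than 1.1 split sizes remain.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left becomes one final split.
    if remaining > 0:
        splits += 1
    return splits


def count_map_tasks(file_lengths, block_size):
    """One map task per split, summed over all input files."""
    split_size = compute_split_size(block_size)
    return sum(count_splits(n, split_size) for n in file_lengths)


MIB = 1024 * 1024
# Hypothetical inputs that both yield 32 map tasks with 128 MiB blocks:
print(count_map_tasks([4096 * MIB], 128 * MIB))     # one 4 GiB file: 32
print(count_map_tasks([10 * MIB] * 32, 128 * MIB))  # 32 small files: 32
```

Note that no YARN memory setting appears anywhere in this calculation; only the input files and the split size matter.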



Regards,
Wangda Tan


On Wed, Apr 2, 2014 at 10:23 AM, Stanley Shi <ss...@gopivotal.com> wrote:

> map task number is not decided by the resources you need.
> It's decided by something else.
>
> Regards,
> *Stanley Shi,*
>
>
>
> On Wed, Apr 2, 2014 at 9:08 AM, Libo Yu <yu...@hotmail.com> wrote:
>
>> Hi all,
>>
>> I pretty much use the default yarn setting to run a word count example on
>> a 3 node cluster. Here are my settings:
>> yarn.nodemanager.resource.memory-mb 8192
>> yarn.scheduler.minimum-allocation-mb 1024
>> yarn.scheduler.maximum-allocation-vcores 32
>>
>> I would expect to see 8192/1024 * 3 = 24 map tasks.
>> However, I see 32 map tasks. Anybody knows why? Thanks.
>>
>> Libo
>>
>>
>

Re: number of map tasks on yarn

Posted by Stanley Shi <ss...@gopivotal.com>.
The number of map tasks is not determined by the resources available to
the job. It is determined by the job's input.

Regards,
*Stanley Shi,*



On Wed, Apr 2, 2014 at 9:08 AM, Libo Yu <yu...@hotmail.com> wrote:

> Hi all,
>
> I pretty much use the default yarn setting to run a word count example on
> a 3 node cluster. Here are my settings:
> yarn.nodemanager.resource.memory-mb 8192
> yarn.scheduler.minimum-allocation-mb 1024
> yarn.scheduler.maximum-allocation-vcores 32
>
> I would expect to see 8192/1024 * 3 = 24 map tasks.
> However, I see 32 map tasks. Anybody knows why? Thanks.
>
> Libo
>
>
