Posted to mapreduce-user@hadoop.apache.org by SACHINGUPTA <sa...@datametica.com> on 2014/10/15 14:09:31 UTC

number of mappers allowed in a container in hadoop2

Hi guys

I have a situation in which I have a machine with 4 processors and 5
containers, so does that mean I can have only 4 mappers running in
parallel at a time?

And if the number of mappers is not dependent on the number of containers
on a machine, then what is the use of the container concept?

Sorry if I have asked anything obvious.

-- 
Thanks
Sachin Gupta


Hadoop2 on Windows in pseudo-distributed or cluster mode?

Posted by Wa...@Instinet.com.
Has anybody been able to run Hadoop 2 on a Windows machine in
pseudo-distributed or cluster mode? I am able to run it on a single machine
but have not been able to deploy it across multiple machines.

Wadood


Re: number of mappers allowed in a container in hadoop2

Posted by Shahab Yunus <sh...@gmail.com>.
No. In YARN, a container is just a container; you could call it a normal
container. It is the M/R framework that you run on top of YARN that treats
them as 'map' and 'reduce' containers.

I recommend reading the documentation about the architecture and design of
YARN at the link below. It will answer a lot of your questions.

http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html

Regards,
Shahab
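
A minimal sketch of that idea (illustrative values; the property names are
standard Hadoop 2.x MapReduce settings, the class and job name are
hypothetical): the M/R framework asks YARN for the same generic containers,
just with different sizes for map and reduce tasks.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class ContainerSizingSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Size of the generic YARN container requested per map task (MB).
            conf.setInt("mapreduce.map.memory.mb", 2048);
            // Size of the generic YARN container requested per reduce task (MB).
            conf.setInt("mapreduce.reduce.memory.mb", 4096);
            // JVM heap inside each container, kept below the container size.
            conf.set("mapreduce.map.java.opts", "-Xmx1638m");
            conf.set("mapreduce.reduce.java.opts", "-Xmx3276m");
            Job job = Job.getInstance(conf, "container-sizing-sketch");
            // ...set mapper, reducer, input and output paths as usual...
        }
    }

YARN itself only sees "a container of N MB"; the map/reduce distinction
lives entirely in these MapReduce-level settings.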

On Wed, Oct 15, 2014 at 10:18 AM, SACHINGUPTA <sa...@datametica.com> wrote:

>  Thanks for the reply.
>
> I have one more doubt:
>
> Are there three kinds of containers with different memory sizes in Hadoop 2?
>
> 1. normal container
> 2. map task container
> 3. reduce task container
>
> On Wednesday 15 October 2014 07:33 PM, Shahab Yunus wrote:
>
> The data that each map task will process is different from the memory
> the task itself might require, depending on whatever processing you
> plan to do in the task.
>
> A very trivial example: let us say your map gets 128 MB of input data, but
> your task logic is such that it creates lots of String objects and
> ArrayList objects; wouldn't your memory requirement for the task then be
> greater than your input data?
>
> I think you are confusing the size of the input data to the map/task
> with the actual memory required by the map/task itself to do its work.
>
>  Regards,
> Shahab
>
> On Wed, Oct 15, 2014 at 9:44 AM, SACHINGUPTA <sa...@datametica.com>
> wrote:
>
>>  It is still not clear to me.
>> Let's suppose the block size of my HDFS is 128 MB, so every mapper will
>> process only 128 MB of data. Then what is the meaning of setting the
>> property mapreduce.map.memory.mb? If that is already known from the block
>> size, then why this property?
>>
>>
>>
>> On Wednesday 15 October 2014 07:06 PM, Shahab Yunus wrote:
>>
>> Explanation here.
>>
>>
>> http://stackoverflow.com/questions/24070557/what-is-the-relation-between-mapreduce-map-memory-mb-and-mapred-map-child-jav
>>
>> https://support.pivotal.io/hc/en-us/articles/201462036-Mapreduce-YARN-Memory-Parameters
>>
>> http://hadoop.apache.org/docs/r2.5.1/hadoop-project-dist/hadoop-common/ClusterSetup.html
>> (scroll towards the end.)
>>
>>  Regards,
>> Shahab
>>
>> On Wed, Oct 15, 2014 at 9:24 AM, SACHINGUPTA <sa...@datametica.com>
>> wrote:
>>
>>>  I have one more doubt. I was reading this:
>>>
>>>
>>> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_installing_manually_book/content/rpm-chap1-11.html
>>>
>>> There are properties such as:
>>>
>>> mapreduce.map.memory.mb    = 2 * 1024 MB
>>> mapreduce.reduce.memory.mb = 2 * 2048 = 4 * 1024 MB
>>>
>>> What are these properties mapreduce.map.memory.mb and
>>> mapreduce.reduce.memory.mb?
>>>
>>> On Wednesday 15 October 2014 06:17 PM, Shahab Yunus wrote:
>>>
>>> It cannot run more mappers (tasks) in parallel than the underlying cores
>>> available, just as it cannot run multiple mappers in parallel if each
>>> mapper's (task's) memory requirement is greater than the allocated and
>>> available container size configured on each node.
>>>
>>>  From the links that I provided earlier, see the following section in
>>> the first one:
>>> Section: "Configuring YARN"
>>>
>>>  Also this:
>>> http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/
>>> Section "1. YARN Concurrency (aka “What Happened to Slots?”)"
>>>
>>>  This should help put things in perspective regarding how the resource
>>> allocation for each task, the container, and the resources available on
>>> the node relate to each other.
>>>
>>>  Regards,
>>> Shahab
>>>
>>> On Wed, Oct 15, 2014 at 8:18 AM, SACHINGUPTA <sa...@datametica.com>
>>> wrote:
>>>
>>>>  But Shahab, if I have only a 4-core machine, then how can YARN run
>>>> more than 4 mappers in parallel?
>>>> On Wednesday 15 October 2014 05:45 PM, Shahab Yunus wrote:
>>>>
>>>> It depends on the memory settings as well, i.e., how many resources you
>>>> want to assign to each container. YARN will then run as many mappers in
>>>> parallel as possible.
>>>>
>>>>  See this:
>>>> http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/
>>>>
>>>> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_installing_manually_book/content/rpm-chap1-11.html
>>>>
>>>>  Regards,
>>>> Shahab
>>>>
>>>> On Wed, Oct 15, 2014 at 8:09 AM, SACHINGUPTA <sa...@datametica.com>
>>>> wrote:
>>>>
>>>>> Hi guys
>>>>>
>>>>> I have a situation in which I have a machine with 4 processors and 5
>>>>> containers, so does that mean I can have only 4 mappers running in
>>>>> parallel at a time?
>>>>>
>>>>> And if the number of mappers is not dependent on the number of
>>>>> containers on a machine, then what is the use of the container concept?
>>>>>
>>>>> Sorry if I have asked anything obvious.
>>>>>
>>>>> --
>>>>> Thanks
>>>>> Sachin Gupta
>>>>>
>>>>>
>>>>
>>>> --
>>>> Thanks
>>>> Sachin Gupta
>>>>
>>>>
>>>
>>> --
>>> Thanks
>>> Sachin Gupta
>>>
>>>
>>
>>   --
>> Thanks
>> Sachin Gupta
>>
>>
>
> --
> Thanks
> Sachin Gupta
>
>
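
Putting the numbers from this thread together: below is a back-of-envelope
sketch (all values illustrative; the commented property names are standard
YARN/MapReduce settings) of how per-node memory and cores bound the number
of concurrently running map containers, per the explanations quoted above.

    public class NodeCapacitySketch {
        public static void main(String[] args) {
            int nodeMemoryMb = 10 * 1024;  // yarn.nodemanager.resource.memory-mb
            int nodeVcores = 4;            // yarn.nodemanager.resource.cpu-vcores
            int mapContainerMb = 2 * 1024; // mapreduce.map.memory.mb

            // By memory alone, 5 containers fit on this node...
            int containersByMemory = nodeMemoryMb / mapContainerMb;
            // ...but only 4 can make progress truly simultaneously.
            int hardwareParallelism = nodeVcores;

            System.out.println("containers admitted by memory: " + containersByMemory);
            System.out.println("cores available in hardware:   " + hardwareParallelism);
            // With a memory-only scheduler, YARN may launch all 5 containers
            // and let the OS time-slice them across the 4 cores.
        }
    }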


Re: number of mappers allowed in a container in hadoop2

Posted by SACHINGUPTA <sa...@datametica.com>.
Thanks for the reply.

I have one more doubt:

Are there three kinds of containers with different memory sizes in Hadoop 2?

1. normal container
2. map task container
3. reduce task container

On Wednesday 15 October 2014 07:33 PM, Shahab Yunus wrote:
> The data that each map task will process is different from the memory
> the task itself might require, depending on whatever processing you
> plan to do in the task.
>
> A very trivial example: let us say your map gets 128 MB of input data, but
> your task logic is such that it creates lots of String objects and
> ArrayList objects; wouldn't your memory requirement for the task then be
> greater than your input data?
>
> I think you are confusing the size of the input data to the map/task
> with the actual memory required by the map/task itself to do its work.
>
> Regards,
> Shahab

-- 
Thanks
Sachin Gupta


Re: number of mappers allowed in a container in hadoop2

Posted by Shahab Yunus <sh...@gmail.com>.
The data that each map task will process is different from the memory the
task itself might require, depending on whatever processing you plan to do
in the task.

A very trivial example: let us say your map gets 128 MB of input data, but
your task logic is such that it creates lots of String objects and ArrayList
objects; wouldn't your memory requirement for the task then be greater than
your input data?

I think you are confusing the size of the input data to the map/task with
the actual memory required by the map/task itself to do its work.

Regards,
Shahab
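
As a concrete illustration of that point, here is a minimal (hypothetical)
mapper sketch: the input split may be 128 MB, yet the heap the task needs
can be far larger because of the objects the map logic accumulates.

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class MemoryHungryMapper
            extends Mapper<LongWritable, Text, Text, LongWritable> {

        // Grows for the lifetime of the task, independent of split size.
        private final List<String> seen = new ArrayList<>();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Each input line spawns several String objects; Java object
            // overhead means heap usage can exceed the raw input size.
            for (String token : value.toString().split("\\s+")) {
                seen.add(token.toUpperCase());
            }
            context.write(new Text("tokens"), new LongWritable(seen.size()));
        }
    }

This is why mapreduce.map.memory.mb is set independently of the HDFS block
size: it budgets for the task's working memory, not its input.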

On Wed, Oct 15, 2014 at 9:44 AM, SACHINGUPTA <sa...@datametica.com> wrote:

>  It is still not clear to me.
> Let's suppose the block size of my HDFS is 128 MB, so every mapper will
> process only 128 MB of data. Then what is the meaning of setting the
> property mapreduce.map.memory.mb? If that is already known from the block
> size, then why this property?

Re: number of mappers allowed in a container in hadoop2

Posted by Shahab Yunus <sh...@gmail.com>.
The data that the each map task will process is different from the memory
the task itself might require depending upon whatever processing that you
plan to do in the task.

Very trivial example: Let us say your map gets 128mb input data but your
task logic is such that it creates lots of String objects and ArrayList
objects, then wouldn't your memory requirement for the task be greater than
your input data?

I think you are confusing the size of the input data to the map/task with
the actual memory required by the map/task itself to do its work.

Regards,
Shahab

On Wed, Oct 15, 2014 at 9:44 AM, SACHINGUPTA <sa...@datametica.com> wrote:

>  It is still not clear to me.
> Let's suppose the block size of my HDFS is 128 MB, so every mapper will
> process only 128 MB of data. Then what is the meaning of setting the
> property mapreduce.map.memory.mb? If that is already known from the block
> size, why is this property needed?

Re: number of mappers allowed in a container in hadoop2

Posted by SACHINGUPTA <sa...@datametica.com>.
It is still not clear to me.
Let's suppose the block size of my HDFS is 128 MB, so every mapper will
process only 128 MB of data. Then what is the meaning of setting the
property mapreduce.map.memory.mb? If that is already known from the block
size, why is this property needed?


On Wednesday 15 October 2014 07:06 PM, Shahab Yunus wrote:
> Explanation here.
>
> http://stackoverflow.com/questions/24070557/what-is-the-relation-between-mapreduce-map-memory-mb-and-mapred-map-child-jav
> https://support.pivotal.io/hc/en-us/articles/201462036-Mapreduce-YARN-Memory-Parameters
> http://hadoop.apache.org/docs/r2.5.1/hadoop-project-dist/hadoop-common/ClusterSetup.html 
> (scroll towards the end.)
>
> Regards,
> Shahab

-- 
Thanks
Sachin Gupta


Re: number of mappers allowed in a container in hadoop2

Posted by Shahab Yunus <sh...@gmail.com>.
Explanation here.

http://stackoverflow.com/questions/24070557/what-is-the-relation-between-mapreduce-map-memory-mb-and-mapred-map-child-jav
https://support.pivotal.io/hc/en-us/articles/201462036-Mapreduce-YARN-Memory-Parameters
http://hadoop.apache.org/docs/r2.5.1/hadoop-project-dist/hadoop-common/ClusterSetup.html
(scroll towards the end.)
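
In short (a minimal sketch of the relationship those links describe; the
values below are only an example): mapreduce.map.memory.mb is the size of
the YARN container requested for each map task, and mapreduce.map.java.opts
bounds the JVM heap inside that container, so it has to be smaller:

<property>
  <name>mapreduce.map.memory.mb</name>
  <!-- container size requested from YARN for each map task -->
  <value>4096</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <!-- JVM heap inside that container, commonly ~80% of it -->
  <value>-Xmx3276m</value>
</property>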

Regards,
Shahab

On Wed, Oct 15, 2014 at 9:24 AM, SACHINGUPTA <sa...@datametica.com> wrote:

>  I have one more doubt. I was reading this:
>
> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_installing_manually_book/content/rpm-chap1-11.html
>
> There are these properties:
>
> mapreduce.map.memory.mb    = 2 * 1024 MB
> mapreduce.reduce.memory.mb = 2 * 2 * 1024 = 4 * 1024 MB
>
> What are these properties mapreduce.map.memory.mb and
> mapreduce.reduce.memory.mb?


Re: number of mappers allowed in a container in hadoop2

Posted by SACHINGUPTA <sa...@datametica.com>.
I have one more doubt. I was reading this:

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_installing_manually_book/content/rpm-chap1-11.html

There are these properties:

mapreduce.map.memory.mb    = 2 * 1024 MB
mapreduce.reduce.memory.mb = 2 * 2 * 1024 = 4 * 1024 MB


What are these properties mapreduce.map.memory.mb and
mapreduce.reduce.memory.mb?

On Wednesday 15 October 2014 06:17 PM, Shahab Yunus wrote:
> It cannot run more mappers (tasks) in parallel than there are underlying
> cores available, just as it cannot run multiple mappers in parallel if
> each mapper's (task's) memory requirement is greater than the allocated
> and available container size configured on each node.

-- 
Thanks
Sachin Gupta


Re: number of mappers allowed in a container in hadoop2

Posted by Shahab Yunus <sh...@gmail.com>.
It cannot run more mappers (tasks) in parallel than there are underlying
cores available, just as it cannot run multiple mappers in parallel if
each mapper's (task's) memory requirement is greater than the allocated
and available container size configured on each node.

In the links that I provided earlier, see the section "Configuring YARN".

Also this:
http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/
Section "1. YARN Concurrency (aka “What Happened to Slots?”)"

This should help put things in perspective regarding how the resource
allocation for each task, the container size, and the resources available
on the node relate to one another.
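
As a rough worked example (hypothetical yarn-site.xml values, purely to
illustrate the arithmetic): a node that offers 8 GB and 4 vcores to YARN
can fit at most 8192 / 2048 = 4 concurrent 2 GB map containers, and its
4 vcores cap parallelism at the same point:

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <!-- memory this node advertises to YARN; example value -->
  <value>8192</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <!-- vcores this node advertises to YARN; example value -->
  <value>4</value>
</property>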

Regards,
Shahab

On Wed, Oct 15, 2014 at 8:18 AM, SACHINGUPTA <sa...@datametica.com> wrote:

>  But Shahab, if I have only a 4-core machine, then how can YARN run more
> than 4 mappers in parallel?

Re: number of mappers allowed in a container in hadoop2

Posted by SACHINGUPTA <sa...@datametica.com>.
But Shahab, if I have only a 4-core machine, then how can YARN run more
than 4 mappers in parallel?
On Wednesday 15 October 2014 05:45 PM, Shahab Yunus wrote:
> It depends on the memory settings as well: how many resources you want
> to assign to each container. YARN will then run as many mappers in
> parallel as possible.
>
> See this:
> http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/
> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_installing_manually_book/content/rpm-chap1-11.html
>
> Regards,
> Shahab
>
> On Wed, Oct 15, 2014 at 8:09 AM, SACHINGUPTA <sachin@datametica.com 
> <ma...@datametica.com>> wrote:
>
>     Hi guys
>
>     I have a situation in which I have a machine with 4 processors
>     and 5 containers, so does it mean I can have only 4 mappers
>     running in parallel at a time?
>
>     And if the number of mappers does not depend on the number of
>     containers on a machine, then what is the use of the container
>     concept?
>
>     Sorry if I have asked anything obvious.
>
>     -- 
>     Thanks
>     Sachin Gupta
>
>

-- 
Thanks
Sachin Gupta


Re: number of mappers allowed in a container in hadoop2

Posted by Shahab Yunus <sh...@gmail.com>.
It depends on the memory settings as well: how many resources you want to
assign to each container. YARN will then run as many mappers in parallel
as possible.
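
As a rough illustration (the values are made up, not recommendations), the
per-task request that YARN sizes each map container against lives in
mapred-site.xml:

    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>1024</value>
    </property>
    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx820m</value>
    </property>

The smaller you keep mapreduce.map.memory.mb (typically rounded up to a
multiple of yarn.scheduler.minimum-allocation-mb), the more map containers
fit into a node's advertised memory, and so the more mappers run in
parallel.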

See this:
http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_installing_manually_book/content/rpm-chap1-11.html

Regards,
Shahab

On Wed, Oct 15, 2014 at 8:09 AM, SACHINGUPTA <sa...@datametica.com> wrote:

> Hi guys
>
> I have a situation in which I have a machine with 4 processors and 5
> containers, so does it mean I can have only 4 mappers running in parallel
> at a time?
>
> And if the number of mappers does not depend on the number of containers
> on a machine, then what is the use of the container concept?
>
> Sorry if I have asked anything obvious.
>
> --
> Thanks
> Sachin Gupta
>
>
