Posted to mapreduce-user@hadoop.apache.org by Jakub Stransky <st...@gmail.com> on 2014/10/23 09:46:18 UTC

Memory consumption by AM

Hello experienced users,

We are new to Hadoop and are therefore running a nearly default
configuration, including the scheduler - which I believe defaults to the
Capacity Scheduler.

Lately we ran into the following behaviour on the cluster. We use Apache
Oozie to submit various data pipelines, and we are the single tenant of the
cluster. Several jobs were submitted at once, so YARN allocated a container
for each job's ApplicationMaster (AM), but after those allocations there
were not enough resources left to run any mappers or reducers, so the
cluster was effectively deadlocked: all resources were consumed by AMs, and
all of them were waiting for resources.

We are using HDP 2.0, hence Hadoop 2.2.0. Is there any way to prevent this
from happening?

Thanks for suggestions
Jakub

Re: Memory consumption by AM

Posted by Girish Lingappa <gl...@pivotal.io>.
Jakub

If you are using 2.2, one option is to limit the number of concurrent
applications that get launched by setting a property in the scheduler
configuration. You can refer to it here:
http://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html.
Look for yarn.scheduler.capacity.maximum-applications.
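As a rough sketch (the value 4 below is only a placeholder - size it to how
many AMs your cluster can host while still leaving room for their map/reduce
containers), the entry in capacity-scheduler.xml would look something like:

  <property>
    <name>yarn.scheduler.capacity.maximum-applications</name>
    <value>4</value>
    <description>Maximum number of applications that can be pending and
    running concurrently; submissions beyond this limit are rejected.
    </description>
  </property>

You should be able to pick the change up with "yarn rmadmin -refreshQueues"
or by restarting the ResourceManager.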

There is a similar setting for the Fair Scheduler as well, maxRunningApps:
http://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-site/FairScheduler.html
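For the Fair Scheduler the limit goes into the allocation file (whatever
yarn.scheduler.fair.allocation.file points to); the queue name and the
value here are just placeholders:

  <?xml version="1.0"?>
  <allocations>
    <queue name="default">
      <maxRunningApps>4</maxRunningApps>
    </queue>
  </allocations>

Jobs submitted beyond maxRunningApps stay queued rather than starting an
AM, which should avoid the situation where AMs alone fill the cluster.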

hth

On Thu, Oct 23, 2014 at 12:46 AM, Jakub Stransky <st...@gmail.com>
wrote:

> Hello experienced users,
>
> We are new to Hadoop and are therefore running a nearly default
> configuration, including the scheduler - which I believe defaults to the
> Capacity Scheduler.
>
> Lately we ran into the following behaviour on the cluster. We use Apache
> Oozie to submit various data pipelines, and we are the single tenant of the
> cluster. Several jobs were submitted at once, so YARN allocated a container
> for each job's ApplicationMaster (AM), but after those allocations there
> were not enough resources left to run any mappers or reducers, so the
> cluster was effectively deadlocked: all resources were consumed by AMs, and
> all of them were waiting for resources.
>
> We are using HDP 2.0, hence Hadoop 2.2.0. Is there any way to prevent this
> from happening?
>
> Thanks for suggestions
> Jakub
>
>
>
