Posted to hdfs-user@hadoop.apache.org by Laxman Ch <la...@gmail.com> on 2015/10/01 11:33:40 UTC

Re: Concurrency control

Hi Naga,

Like most app-level configurations, the admin can configure the defaults,
which a user may want to override at the application level.

If this is at queue-level, then all applications in a queue will have the
same limits. But all our applications in a queue may not have the same SLA,
and we may need to restrict them differently. That again requires splitting
queues further, which I feel is more overhead.
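
To illustrate the defaults-plus-override model (a sketch using a standard
MR property, not the limit discussed here; jar/class names are only
illustrative, and the per-job -D override assumes the driver uses
ToolRunner/GenericOptionsParser):

  <!-- mapred-site.xml: cluster-wide default set by the admin -->
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>1536</value>
  </property>

  # per-application override by the user at submit time
  hadoop jar myjob.jar com.example.MyJob -Dmapreduce.map.memory.mb=3072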


On 30 September 2015 at 09:00, Naganarasimha G R (Naga) <
garlanaganarasimha@huawei.com> wrote:

> Hi Laxman,
>
> Ideally, I understand it would be better if it were available at the
> application level, but then each user is expected to ensure he gives the
> right configuration, within the limits of max capacity.
> And what if a user submits some app *(kind of a query execution app)*
> without this setting, *or* he doesn't know how much it should take? In
> general, users specifying resources for containers is itself a difficult
> task.
> And it might not be right to expect the admin to do it for each
> application in the queue either.  Basically, governing will be difficult
> if it's not enforced from the queue/scheduler side.
>
> + Naga
>
> ------------------------------
> *From:* Laxman Ch [laxman.lux@gmail.com]
> *Sent:* Tuesday, September 29, 2015 16:52
>
> *To:* user@hadoop.apache.org
> *Subject:* Re: Concurrency control
>
> IMO, it's better to have an application-level configuration than a
> scheduler/queue-level configuration.
> Having a queue-level configuration will restrict every single application
> that runs in that queue.
> But we may want to configure these limits for only some set of jobs, and
> the limits can differ for every application.
>
> Regarding the FairOrdering policy, the order of jobs can't be enforced, as
> these are ad-hoc jobs, scheduled/owned independently by different teams.
>
> On 29 September 2015 at 16:43, Naganarasimha G R (Naga) <
> garlanaganarasimha@huawei.com> wrote:
>
>> Hi Laxman,
>>
>> What I meant was: suppose we support
>> yarn.scheduler.capacity.<queue-path>.app-limit-factor and configure it
>> to 0.25; then a single app should not take more than 25% of the
>> resources in the queue. This would be a more generic configuration that
>> the admin can enforce, rather than expecting the user to configure it
>> per app.
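>> A sketch of how that proposed (not-yet-existing) property might look in
>> capacity-scheduler.xml; the queue name root.adhoc is only illustrative:
>>
>>   <!-- hypothetical: app-limit-factor, analogous to user-limit-factor;
>>        a single app in root.adhoc could then hold at most 25% of the
>>        queue's resources -->
>>   <property>
>>     <name>yarn.scheduler.capacity.root.adhoc.app-limit-factor</name>
>>     <value>0.25</value>
>>   </property>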
>>
>> And as for Rohith's suggestion of the FairOrdering policy, I think it
>> should solve the problem only if the app submitted first has not already
>> hogged all the queue's resources.
>>
>> + Naga
>>
>> ------------------------------
>> *From:* Laxman Ch [laxman.lux@gmail.com]
>> *Sent:* Tuesday, September 29, 2015 16:03
>>
>> *To:* user@hadoop.apache.org
>> *Subject:* Re: Concurrency control
>>
>> Thanks Rohith, Naga and Lloyd for the responses.
>>
>> > I think Laxman should also tell us more about which application type he
>> is running.
>>
>> We run MR jobs mostly with the default core/memory allocation (1 vcore,
>> 1.5 GB).
>> Our problem is more about controlling the *resources used simultaneously
>> by all running containers* at any given point of time, per application.
>>
>> Example:
>> 1. App1 and App2 are two MR apps.
>> 2. App1 and App2 belong to same queue (capacity: 100 vcores, 150 GB).
>> 3. Each App1 task takes 8 hrs to complete.
>> 4. Each App2 task takes 5 mins to complete.
>> 5. App1 is triggered at time "t1" and uses all the slots of the queue.
>> 6. App2 is triggered at time "t2" (where t2 > t1) and waits a long time
>> for App1 tasks to release their resources.
>> 7. We can't have preemption enabled as we don't want to lose the work
>> completed so far by App1.
>> 8. We can't have separate queues for App1 and App2 as we have lots of
>> jobs like this and it will explode the number of queues.
>> 9. We use CapacityScheduler.
>>
>> In this scenario, if I can limit App1's concurrent usage to 50 vcores
>> and 75 GB, then App1 may take longer to finish, but there won't be any
>> starvation for App2 (or other jobs running in the same queue).
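>> (Arithmetic for that cap, given the defaults above: 50 concurrently
>> running tasks x 1 vcore = 50 vcores, and 50 x 1.5 GB = 75 GB, leaving
>> the other half of the 100-vcore / 150 GB queue free for App2.)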
>>
>> @Rohith, the FairOrdering policy may not solve this starvation problem.
>>
>> @Naga, I couldn't think through the expected behavior of
>> "yarn.scheduler.capacity.<queue-path>.app-limit-factor".
>> I will get back on this.
>>
>> On 29 September 2015 at 14:57, Namikaze Minato <ll...@gmail.com>
>> wrote:
>>
>>> I think Laxman should also tell us more about which application type
>>> he is running. The normal use case of MAPREDUCE should be working as
>>> intended, but if he has, for example, one map using 100 vcores, then
>>> the second map will have to wait until the app completes. The same
>>> would happen if the applications were Spark, as Spark does not free
>>> what is allocated to it.
>>>
>>> Regards,
>>> LLoyd
>>>
>>> On 29 September 2015 at 11:22, Naganarasimha G R (Naga)
>>> <ga...@huawei.com> wrote:
>>> > Thanks Rohith for your thoughts,
>>> >       But I think this configuration might not completely solve the
>>> > scenario Laxman mentioned: if there is some time gap between the
>>> > first and the second app, then even though we have fairness or
>>> > priority set for apps, starvation will still occur.
>>> > IIUC, we can think of an approach similar to
>>> > "yarn.scheduler.capacity.<queue-path>.user-limit-factor", where we
>>> > provide functionality like
>>> > "yarn.scheduler.capacity.<queue-path>.app-limit-factor": the multiple
>>> > of the queue capacity which a single app can be allowed to acquire.
>>> > Thoughts ?
>>> >
>>> > + Naga
>>> >
>>> >
>>> >
>>> > ________________________________
>>> > From: Rohith Sharma K S [rohithsharmaks@huawei.com]
>>> > Sent: Tuesday, September 29, 2015 14:07
>>> > To: user@hadoop.apache.org
>>> > Subject: RE: Concurrency control
>>> >
>>> > Hi Laxman,
>>> >
>>> >
>>> >
>>> > In Hadoop 2.8 (not released yet), CapacityScheduler provides
>>> > configuration for the ordering policy.  By configuring
>>> > FAIR_ORDERING_POLICY in CS, you should probably be able to achieve
>>> > your goal, i.e. avoid starving applications of resources.
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.FairOrderingPolicy<S>
>>> >
>>> > An OrderingPolicy which orders SchedulableEntities for fairness (see
>>> > FairScheduler FairSharePolicy), generally, processes with lesser
>>> > usage are lesser. If sizedBasedWeight is set to true then an
>>> > application with high demand may be prioritized ahead of an
>>> > application with less usage. This is to offset the tendency to favor
>>> > small apps, which could result in starvation for large apps if many
>>> > small ones enter and leave the queue continuously (optional, default
>>> > false)
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > Community Issue Id :  https://issues.apache.org/jira/browse/YARN-3463
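>>> >
>>> > A sketch of enabling it in capacity-scheduler.xml, assuming the 2.8
>>> > property names (the queue name root.adhoc is only illustrative):
>>> >
>>> >   <property>
>>> >     <name>yarn.scheduler.capacity.root.adhoc.ordering-policy</name>
>>> >     <value>fair</value>
>>> >   </property>
>>> >   <property>
>>> >     <name>yarn.scheduler.capacity.root.adhoc.ordering-policy.fair.enable-size-based-weight</name>
>>> >     <value>true</value>
>>> >   </property>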
>>> >
>>> >
>>> >
>>> > Thanks & Regards
>>> >
>>> > Rohith Sharma K S
>>> >
>>> >
>>> >
>>> > From: Laxman Ch [mailto:laxman.lux@gmail.com]
>>> > Sent: 29 September 2015 13:36
>>> > To: user@hadoop.apache.org
>>> > Subject: Re: Concurrency control
>>> >
>>> >
>>> >
>>> > Bouncing this thread again. Any other thoughts please?
>>> >
>>> >
>>> >
>>> > On 17 September 2015 at 23:21, Laxman Ch <la...@gmail.com> wrote:
>>> >
>>> > No Naga. That won't help.
>>> >
>>> >
>>> >
>>> > I am running two applications (app1 - 100 vcores, app2 - 100 vcores)
>>> > with the same user in the same queue (capacity = 100 vcores). In this
>>> > scenario, if app1 triggers first, occupies all the slots, and runs
>>> > long, then app2 will starve for a long time.
>>> >
>>> >
>>> >
>>> > Let me reiterate my problem statement. I want "to control the amount
>>> > of resources (vcores, memory) used by an application SIMULTANEOUSLY"
>>> >
>>> >
>>> >
>>> > On 17 September 2015 at 22:28, Naganarasimha Garla
>>> > <na...@gmail.com> wrote:
>>> >
>>> > Hi Laxman,
>>> >
>>> > For the example you have stated, maybe we can do the following:
>>> >
>>> > 1. Create/modify the queue with capacity and max capacity set such
>>> > that it's equivalent to 100 vcores. As there is then no elasticity, a
>>> > given application will not use resources beyond the configured
>>> > capacity.
>>> >
>>> > 2. Set yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent
>>> > so that each active user is assured the minimum guaranteed resources.
>>> > The default value of 100 implies no user limits are imposed.
>>> >
>>> >
>>> >
>>> > Additionally, we can think of
>>> > "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage",
>>> > which will enforce strict cpu usage for a given container if required.
>>> > A sketch of both settings follows.
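>>> >
>>> > (Queue name root.adhoc is only illustrative:)
>>> >
>>> >   <!-- capacity-scheduler.xml: with 2 active users, each is
>>> >        guaranteed at least 50% of the queue -->
>>> >   <property>
>>> >     <name>yarn.scheduler.capacity.root.adhoc.minimum-user-limit-percent</name>
>>> >     <value>50</value>
>>> >   </property>
>>> >
>>> >   <!-- yarn-site.xml: hard-cap each container's cpu via cgroups -->
>>> >   <property>
>>> >     <name>yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage</name>
>>> >     <value>true</value>
>>> >   </property>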
>>> >
>>> >
>>> >
>>> > + Naga
>>> >
>>> >
>>> >
>>> > On Thu, Sep 17, 2015 at 4:42 PM, Laxman Ch <la...@gmail.com>
>>> wrote:
>>> >
>>> > Yes. I'm already using cgroups. Cgroups help in controlling resources
>>> > at the container level. But my requirement is more about controlling
>>> > the concurrent resource usage of an application at the whole-cluster
>>> > level.
>>> >
>>> >
>>> >
>>> > And yes, we do configure queues properly. But that won't help.
>>> >
>>> >
>>> >
>>> > For example, I have an application with a requirement of 1000 vcores,
>>> > but I want to keep this application from going beyond 100 vcores at
>>> > any point of time in the cluster/queue. This makes the application
>>> > run longer even when my cluster is free, but I will be able to meet
>>> > the guaranteed SLAs of other applications.
>>> >
>>> >
>>> >
>>> > Hope this helps in understanding my question.
>>> >
>>> >
>>> >
>>> > And thanks, Narasimha, for the quick response.
>>> >
>>> >
>>> >
>>> > On 17 September 2015 at 16:17, Naganarasimha Garla
>>> > <na...@gmail.com> wrote:
>>> >
>>> > Hi Laxman,
>>> >
>>> > Yes, if cgroups are enabled and
>>> > "yarn.scheduler.capacity.resource-calculator" is configured to
>>> > DominantResourceCalculator, then cpu and memory can both be
>>> > controlled.
>>> >
>>> > Please further refer to the official documentation:
>>> > http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html
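>>> >
>>> > That setting, for reference (a sketch):
>>> >
>>> >   <!-- capacity-scheduler.xml: account for cpu as well as memory -->
>>> >   <property>
>>> >     <name>yarn.scheduler.capacity.resource-calculator</name>
>>> >     <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
>>> >   </property>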
>>> >
>>> >
>>> >
>>> > But maybe if you say more about the problem, we can suggest an ideal
>>> > configuration; it seems the capacity configuration and splitting of
>>> > the queue is not rightly done. Or you might look at the Fair
>>> > Scheduler if you want more fairness in container allocation across
>>> > different apps.
>>> >
>>> >
>>> >
>>> > On Thu, Sep 17, 2015 at 4:10 PM, Laxman Ch <la...@gmail.com>
>>> wrote:
>>> >
>>> > Hi,
>>> >
>>> >
>>> >
>>> > In YARN, do we have any way to control the amount of resources
>>> > (vcores, memory) used by an application SIMULTANEOUSLY?
>>> >
>>> >
>>> >
>>> > - In my cluster, I noticed a large, long-running MR app occupying all
>>> > the slots of the queue and blocking other apps from getting started.
>>> >
>>> > - I'm using the CapacityScheduler (hierarchical queues, preemption
>>> > disabled).
>>> >
>>> > - Using Hadoop version 2.6.0.
>>> >
>>> > - I did some googling around this and went through the configuration
>>> > docs, but I'm not able to find anything that matches my requirement.
>>> >
>>> >
>>> >
>>> > If needed, I can provide more details on the use case and problem.
>>> >
>>> >
>>> >
>>> > --
>>> >
>>> > Thanks,
>>> > Laxman
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> >
>>> > Thanks,
>>> > Laxman
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> >
>>> > Thanks,
>>> > Laxman
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> >
>>> > Thanks,
>>> > Laxman
>>>
>>
>>
>>
>> --
>> Thanks,
>> Laxman
>>
>
>
>
> --
> Thanks,
> Laxman
>



-- 
Thanks,
Laxman

Re: Concurrency control

Posted by Naganarasimha Garla <na...@gmail.com>.
Thanks Harsh J for sharing the useful info. But can we think of some way
to support this scenario from the YARN side, like the queue configuration
I mentioned, or the app-specific override Laxman mentioned?

On Fri, Oct 2, 2015 at 10:26 PM, Laxman Ch <la...@gmail.com> wrote:

> Thanks, Harsh. Perfect; this is exactly what I am looking for. Most of
> our applications are MR, so this should be sufficient for us. I will give
> these configurations a try and post my findings here again. Thanks again.
>
> Thanks Naga, Rohith & Lloyd for your suggestions and discussion.
>
> On 2 October 2015 at 07:37, Harsh J <ha...@cloudera.com> wrote:
>
>> If all your Apps are MR, then what you are looking for is MAPREDUCE-5583
>> (it can be set per-job).
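>>
>> A minimal sketch of using it, assuming the per-job properties added by
>> MAPREDUCE-5583 are mapreduce.job.running.map.limit and
>> mapreduce.job.running.reduce.limit (0 = unlimited), and that the job's
>> driver parses generic options via ToolRunner; jar/class names are only
>> illustrative:
>>
>>   # Cap App1 at 50 concurrently running map tasks: with 1-vcore /
>>   # 1.5 GB containers, that is ~50 vcores and ~75 GB at any instant.
>>   hadoop jar app1.jar com.example.App1 \
>>       -Dmapreduce.job.running.map.limit=50 \
>>       -Dmapreduce.job.running.reduce.limit=10 \
>>       /input /output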
>>
>
>
> --
> Thanks,
> Laxman
>

Re: Concurrency control

Posted by Naganarasimha Garla <na...@gmail.com>.
Thanks Harsh J for the useful sharing of the info, but can we think of some
way to support this scenario from YARN side ?
like the queue configuration i mentioned or in the way Laxman mentioned
(app specific override)  ?

On Fri, Oct 2, 2015 at 10:26 PM, Laxman Ch <la...@gmail.com> wrote:

> Thanks and Perfect Harsh. Exactly what I am looking for. Most of our
> applications are MR.
> So, this should be sufficient for us. These configurations, I will give a
> try and post my findings again here. Thanks again.
>
> Thanks Naga, Rohit & Lloyd for your suggestions and discussion.
>
> On 2 October 2015 at 07:37, Harsh J <ha...@cloudera.com> wrote:
>
>> If all your Apps are MR, then what you are looking for is MAPREDUCE-5583
>> (it can be set per-job).
>>
>> On Thu, Oct 1, 2015 at 3:03 PM Laxman Ch <la...@gmail.com> wrote:
>>
>>> Hi Naga,
>>>
>>> Like most of the app-level configurations, admin can configure the
>>> defaults which user may want override at application level.
>>>
>>> If this is at queue-level then all applications in a queue will have the
>>> same limits. But all our applications in a queue may not have same SLA and
>>> we may need to restrict them differently. This requires again splitting
>>> queues further which I feel is more overhead.
>>>
>>>
>>> On 30 September 2015 at 09:00, Naganarasimha G R (Naga) <
>>> garlanaganarasimha@huawei.com> wrote:
>>>
>>>> Hi Laxman,
>>>>
>>>> Ideally i understand it would be better its available @ application
>>>> level, but  its like each user is expected to ensure that he gives the
>>>> right configuration which is within the limits of max capacity.
>>>> And what if user submits some app *(kind of a query execution app**)*
>>>> with out this setting *or* he doesn't know how much it should take ?
>>>> In general, users specifying resources for containers itself is a difficult
>>>> task.
>>>> And it might not be right to expect that the admin will do it for each
>>>> application in the queue either.  Basically governing will be difficult if
>>>> its not enforced from queue/scheduler side.
>>>>
>>>> + Naga
>>>>
>>>> ------------------------------
>>>> *From:* Laxman Ch [laxman.lux@gmail.com]
>>>> *Sent:* Tuesday, September 29, 2015 16:52
>>>>
>>>> *To:* user@hadoop.apache.org
>>>> *Subject:* Re: Concurrency control
>>>>
>>>> IMO, its better to have a application level configuration than to have
>>>> a scheduler/queue level configuration.
>>>> Having a queue level configuration will restrict every single
>>>> application that runs in that queue.
>>>> But, we may want to configure these limits for only some set of jobs
>>>> and also for every application these limits can be different.
>>>>
>>>> FairOrdering policy thing, order of jobs can't be enforced as these are
>>>> adhoc jobs and scheduled/owned independently by different teams.
>>>>
>>>> On 29 September 2015 at 16:43, Naganarasimha G R (Naga) <
>>>> garlanaganarasimha@huawei.com> wrote:
>>>>
>>>>> Hi Laxman,
>>>>>
>>>>> What i meant was,  suppose if we support and configure
>>>>> yarn.scheduler.capacity.<queue-path>.app-limit-factor to .25  then a
>>>>> single app should not take more than 25 % of resources in the queue.
>>>>> This would be a more generic configuration which can be enforced by
>>>>> the admin, than expecting it to be configured for per app by the user.
>>>>>
>>>>> And for Rohith's suggestion of FairOrdering policy , I think it should
>>>>> solve the problem if the App which is submitted first is not already hogged
>>>>> all the queue's resources.
>>>>>
>>>>> + Naga
>>>>>
>>>>> ------------------------------
>>>>> *From:* Laxman Ch [laxman.lux@gmail.com]
>>>>> *Sent:* Tuesday, September 29, 2015 16:03
>>>>>
>>>>> *To:* user@hadoop.apache.org
>>>>> *Subject:* Re: Concurrency control
>>>>>
>>>>> Thanks Rohit, Naga and Lloyd for the responses.
>>>>>
>>>>> > I think Laxman should also tell us more about which application
>>>>> type he is running.
>>>>>
>>>>> We run mr jobs mostly with default core/memory allocation (1 vcore,
>>>>> 1.5GB).
>>>>> Our problem is more about controlling the * resources used
>>>>> simultaneously by all running containers *at any given point of time
>>>>> per application.
>>>>>
>>>>> Example:
>>>>> 1. App1 and App2 are two MR apps.
>>>>> 2. App1 and App2 belong to same queue (capacity: 100 vcores, 150 GB).
>>>>> 3. Each App1 task takes 8 hrs for completion
>>>>> 4. Each App2 task takes 5 mins for completion
>>>>> 5. App1 triggered at time "t1" and using all the slots of queue.
>>>>> 6. App2 triggered at time "t2" (where t2 > t1) and waits longer fot
>>>>> App1 tasks to release the resources.
>>>>> 7. We can't have preemption enabled as we don't want to lose the work
>>>>> completed so far by App1.
>>>>> 8. We can't have separate queues for App1 and App2 as we have lots of
>>>>> jobs like this and it will explode the number of queues.
>>>>> 9. We use CapacityScheduler.
>>>>>
>>>>> In this scenario, if I can control App1 concurrent usage limits to
>>>>> 50vcores and 75GB, then App1 may take longer time to finish but there won't
>>>>> be any starvation for App2 (and other jobs running in same queue)
>>>>>
>>>>> @Rohit, FairOrdering policy may not solve this starvation problem.
>>>>>
>>>>> @Naga, I couldn't think through the expected behavior of "
>>>>> yarn.scheduler.capacity.<queue-path>.app-limit-factor"
>>>>> I will revert on this.
>>>>>
>>>>> On 29 September 2015 at 14:57, Namikaze Minato <ll...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I think Laxman should also tell us more about which application type
>>>>>> he is running. The normal use cas of MAPREDUCE should be working as
>>>>>> intended, but if he has for example one MAP using 100 vcores, then the
>>>>>> second map will have to wait until the app completes. Same would
>>>>>> happen if the applications running were spark, as spark does not free
>>>>>> what is allocated to it.
>>>>>>
>>>>>> Regards,
>>>>>> LLoyd
>>>>>>
>>>>>> On 29 September 2015 at 11:22, Naganarasimha G R (Naga)
>>>>>> <ga...@huawei.com> wrote:
>>>>>> > Thanks Rohith for your thoughts ,
>>>>>> >       But i think by this configuration it might not completely
>>>>>> solve the
>>>>>> > scenario mentioned by Laxman, As if the there is some time gap
>>>>>> between first
>>>>>> > and and the second app then though we have fairness or priority set
>>>>>> for apps
>>>>>> > starvation will be there.
>>>>>> > IIUC we can think of an approach where in we can have something
>>>>>> similar to
>>>>>> > "yarn.scheduler.capacity.<queue-path>.user-limit-factor"  where in
>>>>>> it can
>>>>>> > provide  the functionality like
>>>>>> > "yarn.scheduler.capacity.<queue-path>.app-limit-factor" : The
>>>>>> multiple of
>>>>>> > the queue capacity which can be configured to allow a single app to
>>>>>> acquire
>>>>>> > more resources.  Thoughts ?
>>>>>> >
>>>>>> > + Naga
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > ________________________________
>>>>>> > From: Rohith Sharma K S [rohithsharmaks@huawei.com]
>>>>>> > Sent: Tuesday, September 29, 2015 14:07
>>>>>> > To: user@hadoop.apache.org
>>>>>> > Subject: RE: Concurrency control
>>>>>> >
>>>>>> > Hi Laxman,
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > In Hadoop-2.8(Not released  yet),  CapacityScheduler provides
>>>>>> configuration
>>>>>> > for configuring ordering policy.  By configuring
>>>>>> FAIR_ORDERING_POLICY in CS
>>>>>> > , probably you should be able to achieve  your goal i.e avoiding
>>>>>> starving of
>>>>>> > applications for resources.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.FairOrderingPolicy<S>
>>>>>> >
>>>>>> > An OrderingPolicy which orders SchedulableEntities for fairness (see
>>>>>> > FairScheduler FairSharePolicy), generally, processes with lesser
>>>>>> usage are
>>>>>> > lesser. If sizedBasedWeight is set to true then an application with
>>>>>> high
>>>>>> > demand may be prioritized ahead of an application with less usage.
>>>>>> This is
>>>>>> > to offset the tendency to favor small apps, which could result in
>>>>>> starvation
>>>>>> > for large apps if many small ones enter and leave the queue
>>>>>> continuously
>>>>>> > (optional, default false)
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Community Issue Id :
>>>>>> https://issues.apache.org/jira/browse/YARN-3463
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Thanks & Regards
>>>>>> >
>>>>>> > Rohith Sharma K S
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > From: Laxman Ch [mailto:laxman.lux@gmail.com]
>>>>>> > Sent: 29 September 2015 13:36
>>>>>> > To: user@hadoop.apache.org
>>>>>> > Subject: Re: Concurrency control
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Bouncing this thread again. Any other thoughts please?
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On 17 September 2015 at 23:21, Laxman Ch <la...@gmail.com>
>>>>>> wrote:
>>>>>> >
>>>>>> > No Naga. That wont help.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > I am running two applications (app1 - 100 vcores, app2 - 100
>>>>>> vcores) with
>>>>>> > same user which runs in same queue (capacity=100vcores). In this
>>>>>> scenario,
>>>>>> > if app1 triggers first occupies all the slots and runs longs then
>>>>>> app2 will
>>>>>> > starve longer.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Let me reiterate my problem statement. I wanted "to control the
>>>>>> amount of
>>>>>> > resources (vcores, memory) used by an application SIMULTANEOUSLY"
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On 17 September 2015 at 22:28, Naganarasimha Garla
>>>>>> > <na...@gmail.com> wrote:
>>>>>> >
>>>>>> > Hi Laxman,
>>>>>> >
>>>>>> > For the example you have stated may be we can do the following
>>>>>> things :
>>>>>> >
>>>>>> > 1. Create/modify the queue with capacity and max cap set such that
>>>>>> its
>>>>>> > equivalent to 100 vcores. So as there is no elasticity, given
>>>>>> application
>>>>>> > will not be using the resources beyond the capacity configured
>>>>>> >
>>>>>> > 2. yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent
>>>>>>  so that
>>>>>> > each active user would be assured with the minimum guaranteed
>>>>>> resources . By
>>>>>> > default value is 100 implies no user limits are imposed.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Additionally we can think of
>>>>>> >
>>>>>> "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage"
>>>>>> > which will enforce strict cpu usage for a given container if
>>>>>> required.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > + Naga
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On Thu, Sep 17, 2015 at 4:42 PM, Laxman Ch <la...@gmail.com>
>>>>>> wrote:
>>>>>> >
>>>>>> > Yes. I'm already using cgroups. Cgroups helps in controlling the
>>>>>> resources
>>>>>> > at container level. But my requirement is more about controlling the
>>>>>> > concurrent resource usage of an application at whole cluster level.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > And yes, we do configure queues properly. But, that won't help.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > For example, I have an application with a requirement of 1000
>>>>>> vcores. But, I
>>>>>> > wanted to control this application not to go beyond 100 vcores at
>>>>>> any point
>>>>>> > of time in the cluster/queue. This makes that application to run
>>>>>> longer even
>>>>>> > when my cluster is free but I will be able meet the guaranteed SLAs
>>>>>> of other
>>>>>> > applications.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Hope this helps to understand my question.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > And thanks Narasimha for quick response.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On 17 September 2015 at 16:17, Naganarasimha Garla
>>>>>> > <na...@gmail.com> wrote:
>>>>>> >
>>>>>> > Hi Laxman,
>>>>>> >
>>>>>> > Yes if cgroups are enabled and
>>>>>> "yarn.scheduler.capacity.resource-calculator"
>>>>>> > configured to DominantResourceCalculator then cpu and memory can be
>>>>>> > controlled.
>>>>>> >
>>>>>> > Please Kindly  furhter refer to the official documentation
>>>>>> > http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > But may be if say more about problem then we can suggest ideal
>>>>>> > configuration, seems like capacity configuration and splitting of
>>>>>> the queue
>>>>>> > is not rightly done or you might refer to Fair Scheduler if you
>>>>>> want more
>>>>>> > fairness for container allocation for different apps.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On Thu, Sep 17, 2015 at 4:10 PM, Laxman Ch <la...@gmail.com>
>>>>>> wrote:
>>>>>> >
>>>>>> > Hi,
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > In YARN, do we have any way to control the amount of resources
>>>>>> (vcores,
>>>>>> > memory) used by an application SIMULTANEOUSLY.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > - In my cluster, noticed some large and long running mr-app
>>>>>> occupied all the
>>>>>> > slots of the queue and blocking other apps to get started.
>>>>>> >
>>>>>> > - I'm using Capacity schedulers (using hierarchical queues and
>>>>>> preemption
>>>>>> > disabled)
>>>>>> >
>>>>>> > - Using Hadoop version 2.6.0
>>>>>> >
>>>>>> > - Did some googling around this and gone through configuration docs
>>>>>> but I'm
>>>>>> > not able to find anything that matches my requirement.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > If needed, I can provide more details on the usecase and problem.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Laxman
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Laxman
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Laxman
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Laxman
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> Laxman
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>> Laxman
>>>>
>>>
>>>
>>>
>>> --
>>> Thanks,
>>> Laxman
>>>
>>
>
>
> --
> Thanks,
> Laxman
>

Re: Concurrency control

Posted by Naganarasimha Garla <na...@gmail.com>.
Thanks Harsh J for the useful sharing of the info, but can we think of some
way to support this scenario from YARN side ?
like the queue configuration i mentioned or in the way Laxman mentioned
(app specific override)  ?

On Fri, Oct 2, 2015 at 10:26 PM, Laxman Ch <la...@gmail.com> wrote:

> Thanks and Perfect Harsh. Exactly what I am looking for. Most of our
> applications are MR.
> So, this should be sufficient for us. These configurations, I will give a
> try and post my findings again here. Thanks again.
>
> Thanks Naga, Rohit & Lloyd for your suggestions and discussion.
>
> On 2 October 2015 at 07:37, Harsh J <ha...@cloudera.com> wrote:
>
>> If all your Apps are MR, then what you are looking for is MAPREDUCE-5583
>> (it can be set per-job).
>>
>> On Thu, Oct 1, 2015 at 3:03 PM Laxman Ch <la...@gmail.com> wrote:
>>
>>> Hi Naga,
>>>
>>> Like most of the app-level configurations, admin can configure the
>>> defaults which user may want override at application level.
>>>
>>> If this is at queue-level then all applications in a queue will have the
>>> same limits. But all our applications in a queue may not have same SLA and
>>> we may need to restrict them differently. This requires again splitting
>>> queues further which I feel is more overhead.
>>>
>>>
>>> On 30 September 2015 at 09:00, Naganarasimha G R (Naga) <
>>> garlanaganarasimha@huawei.com> wrote:
>>>
>>>> Hi Laxman,
>>>>
>>>> Ideally i understand it would be better its available @ application
>>>> level, but  its like each user is expected to ensure that he gives the
>>>> right configuration which is within the limits of max capacity.
>>>> And what if user submits some app *(kind of a query execution app**)*
>>>> with out this setting *or* he doesn't know how much it should take ?
>>>> In general, users specifying resources for containers itself is a difficult
>>>> task.
>>>> And it might not be right to expect that the admin will do it for each
>>>> application in the queue either.  Basically governing will be difficult if
>>>> its not enforced from queue/scheduler side.
>>>>
>>>> + Naga
>>>>
>>>> ------------------------------
>>>> *From:* Laxman Ch [laxman.lux@gmail.com]
>>>> *Sent:* Tuesday, September 29, 2015 16:52
>>>>
>>>> *To:* user@hadoop.apache.org
>>>> *Subject:* Re: Concurrency control
>>>>
>>>> IMO, its better to have a application level configuration than to have
>>>> a scheduler/queue level configuration.
>>>> Having a queue level configuration will restrict every single
>>>> application that runs in that queue.
>>>> But, we may want to configure these limits for only some set of jobs
>>>> and also for every application these limits can be different.
>>>>
>>>> FairOrdering policy thing, order of jobs can't be enforced as these are
>>>> adhoc jobs and scheduled/owned independently by different teams.
>>>>
>>>> On 29 September 2015 at 16:43, Naganarasimha G R (Naga) <
>>>> garlanaganarasimha@huawei.com> wrote:
>>>>
>>>>> Hi Laxman,
>>>>>
>>>>> What i meant was,  suppose if we support and configure
>>>>> yarn.scheduler.capacity.<queue-path>.app-limit-factor to .25  then a
>>>>> single app should not take more than 25 % of resources in the queue.
>>>>> This would be a more generic configuration which can be enforced by
>>>>> the admin, than expecting it to be configured for per app by the user.
>>>>>
>>>>> And for Rohith's suggestion of FairOrdering policy , I think it should
>>>>> solve the problem if the App which is submitted first is not already hogged
>>>>> all the queue's resources.
>>>>>
>>>>> + Naga
>>>>>
>>>>> ------------------------------
>>>>> *From:* Laxman Ch [laxman.lux@gmail.com]
>>>>> *Sent:* Tuesday, September 29, 2015 16:03
>>>>>
>>>>> *To:* user@hadoop.apache.org
>>>>> *Subject:* Re: Concurrency control
>>>>>
>>>>> Thanks Rohit, Naga and Lloyd for the responses.
>>>>>
>>>>> > I think Laxman should also tell us more about which application
>>>>> type he is running.
>>>>>
>>>>> We run mr jobs mostly with default core/memory allocation (1 vcore,
>>>>> 1.5GB).
>>>>> Our problem is more about controlling the * resources used
>>>>> simultaneously by all running containers *at any given point of time
>>>>> per application.
>>>>>
>>>>> Example:
>>>>> 1. App1 and App2 are two MR apps.
>>>>> 2. App1 and App2 belong to same queue (capacity: 100 vcores, 150 GB).
>>>>> 3. Each App1 task takes 8 hrs for completion
>>>>> 4. Each App2 task takes 5 mins for completion
>>>>> 5. App1 triggered at time "t1" and using all the slots of queue.
>>>>> 6. App2 triggered at time "t2" (where t2 > t1) and waits longer fot
>>>>> App1 tasks to release the resources.
>>>>> 7. We can't have preemption enabled as we don't want to lose the work
>>>>> completed so far by App1.
>>>>> 8. We can't have separate queues for App1 and App2 as we have lots of
>>>>> jobs like this and it will explode the number of queues.
>>>>> 9. We use CapacityScheduler.
>>>>>
>>>>> In this scenario, if I can control App1 concurrent usage limits to
>>>>> 50vcores and 75GB, then App1 may take longer time to finish but there won't
>>>>> be any starvation for App2 (and other jobs running in same queue)
>>>>>
>>>>> @Rohit, FairOrdering policy may not solve this starvation problem.
>>>>>
>>>>> @Naga, I couldn't think through the expected behavior of "
>>>>> yarn.scheduler.capacity.<queue-path>.app-limit-factor"
>>>>> I will revert on this.
>>>>>
>>>>> On 29 September 2015 at 14:57, Namikaze Minato <ll...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I think Laxman should also tell us more about which application type
>>>>>> he is running. The normal use cas of MAPREDUCE should be working as
>>>>>> intended, but if he has for example one MAP using 100 vcores, then the
>>>>>> second map will have to wait until the app completes. Same would
>>>>>> happen if the applications running were spark, as spark does not free
>>>>>> what is allocated to it.
>>>>>>
>>>>>> Regards,
>>>>>> LLoyd
>>>>>>
>>>>>> On 29 September 2015 at 11:22, Naganarasimha G R (Naga)
>>>>>> <ga...@huawei.com> wrote:
>>>>>> > Thanks Rohith for your thoughts ,
>>>>>> >       But i think by this configuration it might not completely
>>>>>> solve the
>>>>>> > scenario mentioned by Laxman, As if the there is some time gap
>>>>>> between first
>>>>>> > and and the second app then though we have fairness or priority set
>>>>>> for apps
>>>>>> > starvation will be there.
>>>>>> > IIUC we can think of an approach where in we can have something
>>>>>> similar to
>>>>>> > "yarn.scheduler.capacity.<queue-path>.user-limit-factor"  where in
>>>>>> it can
>>>>>> > provide  the functionality like
>>>>>> > "yarn.scheduler.capacity.<queue-path>.app-limit-factor" : The
>>>>>> multiple of
>>>>>> > the queue capacity which can be configured to allow a single app to
>>>>>> acquire
>>>>>> > more resources.  Thoughts ?
>>>>>> >
>>>>>> > + Naga
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > ________________________________
>>>>>> > From: Rohith Sharma K S [rohithsharmaks@huawei.com]
>>>>>> > Sent: Tuesday, September 29, 2015 14:07
>>>>>> > To: user@hadoop.apache.org
>>>>>> > Subject: RE: Concurrency control
>>>>>> >
>>>>>> > Hi Laxman,
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > In Hadoop-2.8(Not released  yet),  CapacityScheduler provides
>>>>>> configuration
>>>>>> > for configuring ordering policy.  By configuring
>>>>>> FAIR_ORDERING_POLICY in CS
>>>>>> > , probably you should be able to achieve  your goal i.e avoiding
>>>>>> starving of
>>>>>> > applications for resources.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.FairOrderingPolicy<S>
>>>>>> >
>>>>>> > An OrderingPolicy which orders SchedulableEntities for fairness (see
>>>>>> > FairScheduler FairSharePolicy), generally, processes with lesser
>>>>>> usage are
>>>>>> > lesser. If sizedBasedWeight is set to true then an application with
>>>>>> high
>>>>>> > demand may be prioritized ahead of an application with less usage.
>>>>>> This is
>>>>>> > to offset the tendency to favor small apps, which could result in
>>>>>> starvation
>>>>>> > for large apps if many small ones enter and leave the queue
>>>>>> continuously
>>>>>> > (optional, default false)
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Community Issue Id :
>>>>>> https://issues.apache.org/jira/browse/YARN-3463
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Thanks & Regards
>>>>>> >
>>>>>> > Rohith Sharma K S
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > From: Laxman Ch [mailto:laxman.lux@gmail.com]
>>>>>> > Sent: 29 September 2015 13:36
>>>>>> > To: user@hadoop.apache.org
>>>>>> > Subject: Re: Concurrency control
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Bouncing this thread again. Any other thoughts please?
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On 17 September 2015 at 23:21, Laxman Ch <la...@gmail.com>
>>>>>> wrote:
>>>>>> >
>>>>>> > No Naga. That wont help.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > I am running two applications (app1 - 100 vcores, app2 - 100
>>>>>> vcores) with
>>>>>> > same user which runs in same queue (capacity=100vcores). In this
>>>>>> scenario,
>>>>>> > if app1 triggers first occupies all the slots and runs longs then
>>>>>> app2 will
>>>>>> > starve longer.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Let me reiterate my problem statement. I wanted "to control the
>>>>>> amount of
>>>>>> > resources (vcores, memory) used by an application SIMULTANEOUSLY"
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On 17 September 2015 at 22:28, Naganarasimha Garla
>>>>>> > <na...@gmail.com> wrote:
>>>>>> >
>>>>>> > Hi Laxman,
>>>>>> >
>>>>>> > For the example you have stated may be we can do the following
>>>>>> things :
>>>>>> >
>>>>>> > 1. Create/modify the queue with capacity and max cap set such that
>>>>>> its
>>>>>> > equivalent to 100 vcores. So as there is no elasticity, given
>>>>>> application
>>>>>> > will not be using the resources beyond the capacity configured
>>>>>> >
>>>>>> > 2. yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent
>>>>>>  so that
>>>>>> > each active user would be assured with the minimum guaranteed
>>>>>> resources . By
>>>>>> > default value is 100 implies no user limits are imposed.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Additionally we can think of
>>>>>> >
>>>>>> "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage"
>>>>>> > which will enforce strict cpu usage for a given container if
>>>>>> required.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > + Naga
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On Thu, Sep 17, 2015 at 4:42 PM, Laxman Ch <la...@gmail.com>
>>>>>> wrote:
>>>>>> >
>>>>>> > Yes. I'm already using cgroups. Cgroups helps in controlling the
>>>>>> resources
>>>>>> > at container level. But my requirement is more about controlling the
>>>>>> > concurrent resource usage of an application at whole cluster level.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > And yes, we do configure queues properly. But, that won't help.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > For example, I have an application with a requirement of 1000
>>>>>> vcores. But, I
>>>>>> > wanted to control this application not to go beyond 100 vcores at
>>>>>> any point
>>>>>> > of time in the cluster/queue. This makes that application to run
>>>>>> longer even
>>>>>> > when my cluster is free but I will be able meet the guaranteed SLAs
>>>>>> of other
>>>>>> > applications.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Hope this helps to understand my question.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > And thanks Narasimha for quick response.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On 17 September 2015 at 16:17, Naganarasimha Garla
>>>>>> > <na...@gmail.com> wrote:
>>>>>> >
>>>>>> > Hi Laxman,
>>>>>> >
>>>>>> > Yes if cgroups are enabled and
>>>>>> "yarn.scheduler.capacity.resource-calculator"
>>>>>> > configured to DominantResourceCalculator then cpu and memory can be
>>>>>> > controlled.
>>>>>> >
>>>>>> > Please Kindly  furhter refer to the official documentation
>>>>>> > http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > But may be if say more about problem then we can suggest ideal
>>>>>> > configuration, seems like capacity configuration and splitting of
>>>>>> the queue
>>>>>> > is not rightly done or you might refer to Fair Scheduler if you
>>>>>> want more
>>>>>> > fairness for container allocation for different apps.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On Thu, Sep 17, 2015 at 4:10 PM, Laxman Ch <la...@gmail.com>
>>>>>> wrote:
>>>>>> >
>>>>>> > Hi,
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > In YARN, do we have any way to control the amount of resources
>>>>>> (vcores,
>>>>>> > memory) used by an application SIMULTANEOUSLY.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > - In my cluster, noticed some large and long running mr-app
>>>>>> occupied all the
>>>>>> > slots of the queue and blocking other apps to get started.
>>>>>> >
>>>>>> > - I'm using Capacity schedulers (using hierarchical queues and
>>>>>> preemption
>>>>>> > disabled)
>>>>>> >
>>>>>> > - Using Hadoop version 2.6.0
>>>>>> >
>>>>>> > - Did some googling around this and gone through configuration docs
>>>>>> but I'm
>>>>>> > not able to find anything that matches my requirement.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > If needed, I can provide more details on the usecase and problem.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Laxman
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Laxman
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Laxman
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Laxman
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> Laxman
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>> Laxman
>>>>
>>>
>>>
>>>
>>> --
>>> Thanks,
>>> Laxman
>>>
>>
>
>
> --
> Thanks,
> Laxman
>

Re: Concurrency control

Posted by Naganarasimha Garla <na...@gmail.com>.
Thanks Harsh J for the useful sharing of the info, but can we think of some
way to support this scenario from YARN side ?
like the queue configuration i mentioned or in the way Laxman mentioned
(app specific override)  ?

On Fri, Oct 2, 2015 at 10:26 PM, Laxman Ch <la...@gmail.com> wrote:

> Thanks and Perfect Harsh. Exactly what I am looking for. Most of our
> applications are MR.
> So, this should be sufficient for us. These configurations, I will give a
> try and post my findings again here. Thanks again.
>
> Thanks Naga, Rohit & Lloyd for your suggestions and discussion.
>
> On 2 October 2015 at 07:37, Harsh J <ha...@cloudera.com> wrote:
>
>> If all your Apps are MR, then what you are looking for is MAPREDUCE-5583
>> (it can be set per-job).
>>
>> On Thu, Oct 1, 2015 at 3:03 PM Laxman Ch <la...@gmail.com> wrote:
>>
>>> Hi Naga,
>>>
>>> Like most of the app-level configurations, admin can configure the
>>> defaults which user may want override at application level.
>>>
>>> If this is at queue-level then all applications in a queue will have the
>>> same limits. But all our applications in a queue may not have same SLA and
>>> we may need to restrict them differently. This requires again splitting
>>> queues further which I feel is more overhead.
>>>
>>>
>>> On 30 September 2015 at 09:00, Naganarasimha G R (Naga) <
>>> garlanaganarasimha@huawei.com> wrote:
>>>
>>>> Hi Laxman,
>>>>
>>>> Ideally i understand it would be better its available @ application
>>>> level, but  its like each user is expected to ensure that he gives the
>>>> right configuration which is within the limits of max capacity.
>>>> And what if user submits some app *(kind of a query execution app**)*
>>>> with out this setting *or* he doesn't know how much it should take ?
>>>> In general, users specifying resources for containers itself is a difficult
>>>> task.
>>>> And it might not be right to expect that the admin will do it for each
>>>> application in the queue either.  Basically governing will be difficult if
>>>> its not enforced from queue/scheduler side.
>>>>
>>>> + Naga
>>>>
>>>> ------------------------------
>>>> *From:* Laxman Ch [laxman.lux@gmail.com]
>>>> *Sent:* Tuesday, September 29, 2015 16:52
>>>>
>>>> *To:* user@hadoop.apache.org
>>>> *Subject:* Re: Concurrency control
>>>>
>>>> IMO, its better to have a application level configuration than to have
>>>> a scheduler/queue level configuration.
>>>> Having a queue level configuration will restrict every single
>>>> application that runs in that queue.
>>>> But, we may want to configure these limits for only some set of jobs
>>>> and also for every application these limits can be different.
>>>>
>>>> FairOrdering policy thing, order of jobs can't be enforced as these are
>>>> adhoc jobs and scheduled/owned independently by different teams.
>>>>
>>>> On 29 September 2015 at 16:43, Naganarasimha G R (Naga) <
>>>> garlanaganarasimha@huawei.com> wrote:
>>>>
>>>>> Hi Laxman,
>>>>>
>>>>> What I meant was: suppose we support and configure
>>>>> yarn.scheduler.capacity.<queue-path>.app-limit-factor to .25; then a
>>>>> single app should not take more than 25% of the resources in the
>>>>> queue. This would be a more generic configuration that the admin can
>>>>> enforce, rather than expecting the user to configure it per app.
>>>>>
>>>>> And as for Rohith's suggestion of the FairOrdering policy, I think it
>>>>> should solve the problem only if the app submitted first has not
>>>>> already hogged all the queue's resources.
>>>>>
>>>>> + Naga
>>>>>
>>>>> ------------------------------
>>>>> *From:* Laxman Ch [laxman.lux@gmail.com]
>>>>> *Sent:* Tuesday, September 29, 2015 16:03
>>>>>
>>>>> *To:* user@hadoop.apache.org
>>>>> *Subject:* Re: Concurrency control
>>>>>
>>>>> Thanks, Rohith, Naga and Lloyd, for the responses.
>>>>>
>>>>> > I think Laxman should also tell us more about which application
>>>>> type he is running.
>>>>>
>>>>> We run MR jobs, mostly with the default core/memory allocation (1
>>>>> vcore, 1.5 GB).
>>>>> Our problem is more about controlling the *resources used
>>>>> simultaneously by all running containers* at any given point of time,
>>>>> per application.
>>>>>
>>>>> Example:
>>>>> 1. App1 and App2 are two MR apps.
>>>>> 2. App1 and App2 belong to same queue (capacity: 100 vcores, 150 GB).
>>>>> 3. Each App1 task takes 8 hrs for completion
>>>>> 4. Each App2 task takes 5 mins for completion
>>>>> 5. App1 triggered at time "t1" and using all the slots of queue.
>>>>> 6. App2 triggered at time "t2" (where t2 > t1) and waits longer for
>>>>> App1 tasks to release the resources.
>>>>> 7. We can't have preemption enabled as we don't want to lose the work
>>>>> completed so far by App1.
>>>>> 8. We can't have separate queues for App1 and App2 as we have lots of
>>>>> jobs like this and it will explode the number of queues.
>>>>> 9. We use CapacityScheduler.
>>>>>
>>>>> In this scenario, if I can limit App1's concurrent usage to 50 vcores
>>>>> and 75 GB, then App1 may take longer to finish, but there won't be
>>>>> any starvation for App2 (and other jobs running in the same queue).
>>>>>
>>>>> @Rohith, the FairOrdering policy may not solve this starvation
>>>>> problem.
>>>>>
>>>>> @Naga, I couldn't think through the expected behavior of
>>>>> "yarn.scheduler.capacity.<queue-path>.app-limit-factor".
>>>>> I will get back on this.
>>>>>
>>>>> On 29 September 2015 at 14:57, Namikaze Minato <ll...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I think Laxman should also tell us more about which application type
>>>>>> he is running. The normal use case of MAPREDUCE should work as
>>>>>> intended, but if he has, for example, one map using 100 vcores, then
>>>>>> the second map will have to wait until the app completes. The same
>>>>>> would happen if the applications running were Spark, as Spark does
>>>>>> not free what is allocated to it.
>>>>>>
>>>>>> Regards,
>>>>>> LLoyd
>>>>>>
>>>>>> On 29 September 2015 at 11:22, Naganarasimha G R (Naga)
>>>>>> <ga...@huawei.com> wrote:
>>>>>> > Thanks, Rohith, for your thoughts.
>>>>>> > But I think this configuration might not completely solve the
>>>>>> > scenario mentioned by Laxman: if there is some time gap between the
>>>>>> > first and the second app, then even with fairness or priority set
>>>>>> > for apps, starvation will still be there.
>>>>>> > IIUC, we can think of an approach where we have something similar
>>>>>> > to "yarn.scheduler.capacity.<queue-path>.user-limit-factor",
>>>>>> > providing functionality like
>>>>>> > "yarn.scheduler.capacity.<queue-path>.app-limit-factor": the
>>>>>> > multiple of the queue capacity which can be configured to allow a
>>>>>> > single app to acquire more resources. Thoughts?
>>>>>> >
>>>>>> > + Naga
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > ________________________________
>>>>>> > From: Rohith Sharma K S [rohithsharmaks@huawei.com]
>>>>>> > Sent: Tuesday, September 29, 2015 14:07
>>>>>> > To: user@hadoop.apache.org
>>>>>> > Subject: RE: Concurrency control
>>>>>> >
>>>>>> > Hi Laxman,
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > In Hadoop 2.8 (not released yet), the CapacityScheduler provides
>>>>>> > configuration for the ordering policy. By configuring
>>>>>> > FAIR_ORDERING_POLICY in CS, you should probably be able to achieve
>>>>>> > your goal, i.e. avoid starving applications of resources.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.FairOrderingPolicy<S>
>>>>>> >
>>>>>> > An OrderingPolicy which orders SchedulableEntities for fairness
>>>>>> > (see FairScheduler FairSharePolicy); generally, processes with
>>>>>> > lesser usage are ordered first. If sizedBasedWeight is set to true,
>>>>>> > then an application with high demand may be prioritized ahead of an
>>>>>> > application with less usage. This is to offset the tendency to
>>>>>> > favor small apps, which could result in starvation for large apps
>>>>>> > if many small ones enter and leave the queue continuously
>>>>>> > (optional, default false).
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Community Issue Id: https://issues.apache.org/jira/browse/YARN-3463
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Thanks & Regards
>>>>>> >
>>>>>> > Rohith Sharma K S
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > From: Laxman Ch [mailto:laxman.lux@gmail.com]
>>>>>> > Sent: 29 September 2015 13:36
>>>>>> > To: user@hadoop.apache.org
>>>>>> > Subject: Re: Concurrency control
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Bouncing this thread again. Any other thoughts please?
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On 17 September 2015 at 23:21, Laxman Ch <la...@gmail.com>
>>>>>> wrote:
>>>>>> >
>>>>>> > No, Naga. That won't help.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > I am running two applications (app1 - 100 vcores, app2 - 100
>>>>>> > vcores) as the same user in the same queue (capacity = 100
>>>>>> > vcores). In this scenario, if app1 starts first, occupies all the
>>>>>> > slots, and runs long, then app2 will starve for longer.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Let me reiterate my problem statement. I wanted "to control the
>>>>>> > amount of resources (vcores, memory) used by an application
>>>>>> > SIMULTANEOUSLY".
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On 17 September 2015 at 22:28, Naganarasimha Garla
>>>>>> > <na...@gmail.com> wrote:
>>>>>> >
>>>>>> > Hi Laxman,
>>>>>> >
>>>>>> > For the example you have stated, maybe we can do the following
>>>>>> > things:
>>>>>> >
>>>>>> > 1. Create/modify the queue with capacity and max capacity set such
>>>>>> > that it is equivalent to 100 vcores. As there is then no
>>>>>> > elasticity, a given application will not use resources beyond the
>>>>>> > configured capacity.
>>>>>> >
>>>>>> > 2. Set
>>>>>> > yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent so
>>>>>> > that each active user is assured the minimum guaranteed resources.
>>>>>> > The default value of 100 implies no user limits are imposed.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Additionally, we can think of
>>>>>> > "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage",
>>>>>> > which will enforce strict CPU usage for a given container if
>>>>>> > required.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > + Naga
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On Thu, Sep 17, 2015 at 4:42 PM, Laxman Ch <la...@gmail.com>
>>>>>> wrote:
>>>>>> >
>>>>>> > Yes. I'm already using cgroups. Cgroups help in controlling
>>>>>> > resources at the container level. But my requirement is more about
>>>>>> > controlling the concurrent resource usage of an application at the
>>>>>> > whole-cluster level.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > And yes, we do configure queues properly. But that won't help.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > For example, I have an application with a requirement of 1000
>>>>>> > vcores. But I want to keep this application from going beyond 100
>>>>>> > vcores at any point in time in the cluster/queue. This makes the
>>>>>> > application run longer even when my cluster is free, but I will be
>>>>>> > able to meet the guaranteed SLAs of other applications.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Hope this helps to understand my question.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > And thanks, Narasimha, for the quick response.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On 17 September 2015 at 16:17, Naganarasimha Garla
>>>>>> > <na...@gmail.com> wrote:
>>>>>> >
>>>>>> > Hi Laxman,
>>>>>> >
>>>>>> > Yes, if cgroups are enabled and
>>>>>> > "yarn.scheduler.capacity.resource-calculator" is configured to
>>>>>> > DominantResourceCalculator, then CPU and memory can be controlled.
>>>>>> >
>>>>>> > Please kindly refer further to the official documentation:
>>>>>> > http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > But maybe if you say more about the problem, we can suggest an
>>>>>> > ideal configuration. It seems the capacity configuration and the
>>>>>> > splitting of the queue are not done right, or you might look at the
>>>>>> > Fair Scheduler if you want more fairness in container allocation
>>>>>> > across different apps.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On Thu, Sep 17, 2015 at 4:10 PM, Laxman Ch <la...@gmail.com>
>>>>>> wrote:
>>>>>> >
>>>>>> > Hi,
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > In YARN, do we have any way to control the amount of resources
>>>>>> > (vcores, memory) used by an application SIMULTANEOUSLY?
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > - In my cluster, I noticed a large, long-running MR app occupied
>>>>>> > all the slots of the queue, blocking other apps from getting
>>>>>> > started.
>>>>>> >
>>>>>> > - I'm using the Capacity Scheduler (hierarchical queues, preemption
>>>>>> > disabled).
>>>>>> >
>>>>>> > - Using Hadoop version 2.6.0.
>>>>>> >
>>>>>> > - I did some googling around this and went through the
>>>>>> > configuration docs, but I could not find anything that matches my
>>>>>> > requirement.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > If needed, I can provide more details on the use case and problem.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Laxman
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Laxman
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Laxman
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Laxman
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> Laxman
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>> Laxman
>>>>
>>>
>>>
>>>
>>> --
>>> Thanks,
>>> Laxman
>>>
>>
>
>
> --
> Thanks,
> Laxman
>

Re: Concurrency control

Posted by Laxman Ch <la...@gmail.com>.
Thanks, Harsh. Perfect: exactly what I am looking for. Most of our
applications are MR, so this should be sufficient for us. I will give
these configurations a try and post my findings here. Thanks again.

Thanks, Naga, Rohith and Lloyd, for your suggestions and discussion.
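
For the record, here is roughly what I plan to try. This is only a
sketch: the two property names are the ones I understand MAPREDUCE-5583
adds (in releases newer than the 2.6.0 we run), and the jar name, class
name, and paths are placeholders for our actual job:

  # Cap this job at 50 concurrently running map tasks and 10 running
  # reduce tasks, regardless of free capacity in the queue.
  # Passing -D options like this assumes the job's main class uses
  # ToolRunner/GenericOptionsParser.
  hadoop jar my-app.jar com.example.MyLongJob \
    -Dmapreduce.job.running.map.limit=50 \
    -Dmapreduce.job.running.reduce.limit=10 \
    /input/path /output/path

With these limits and our default 1-vcore tasks, App1 from my earlier
example should hold at most about 50 vcores at a time, leaving headroom
in the queue for App2.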

On 2 October 2015 at 07:37, Harsh J <ha...@cloudera.com> wrote:

> If all your Apps are MR, then what you are looking for is MAPREDUCE-5583
> (it can be set per-job).
>
> On Thu, Oct 1, 2015 at 3:03 PM Laxman Ch <la...@gmail.com> wrote:
>
>> Hi Naga,
>>
>> Like most app-level configurations, the admin can configure the
>> defaults, which the user may want to override at the application level.
>>
>> If this is at the queue level, then all applications in a queue will
>> have the same limits. But our applications in a queue may not all have
>> the same SLA, and we may need to restrict them differently. That again
>> requires splitting queues further, which I feel is more overhead.
>>
>>
>> On 30 September 2015 at 09:00, Naganarasimha G R (Naga) <
>> garlanaganarasimha@huawei.com> wrote:
>>
>>> Hi Laxman,
>>>
>>> Ideally, I understand it would be better if this were available at the
>>> application level, but then each user is expected to ensure he gives
>>> the right configuration, within the limits of max capacity.
>>> And what if a user submits some app *(say, a query-execution app)*
>>> without this setting, *or* doesn't know how much it should take? In
>>> general, specifying resources for containers is itself a difficult
>>> task for users.
>>> And it might not be right to expect the admin to do it for each
>>> application in the queue either. Basically, governing will be
>>> difficult if it's not enforced from the queue/scheduler side.
>>>
>>> + Naga
>>>
>>> ------------------------------
>>> *From:* Laxman Ch [laxman.lux@gmail.com]
>>> *Sent:* Tuesday, September 29, 2015 16:52
>>>
>>> *To:* user@hadoop.apache.org
>>> *Subject:* Re: Concurrency control
>>>
>>> IMO, it's better to have an application-level configuration than a
>>> scheduler/queue-level one.
>>> A queue-level configuration will restrict every single application
>>> that runs in that queue.
>>> But we may want to configure these limits for only some set of jobs,
>>> and the limits can be different for every application.
>>>
>>> As for the FairOrdering policy, the order of jobs can't be enforced,
>>> as these are ad hoc jobs scheduled/owned independently by different
>>> teams.
>>>
>>> On 29 September 2015 at 16:43, Naganarasimha G R (Naga) <
>>> garlanaganarasimha@huawei.com> wrote:
>>>
>>>> Hi Laxman,
>>>>
>>>> What I meant was: suppose we support and configure
>>>> yarn.scheduler.capacity.<queue-path>.app-limit-factor to .25; then a
>>>> single app should not take more than 25% of the resources in the
>>>> queue. This would be a more generic configuration that the admin can
>>>> enforce, rather than expecting the user to configure it per app.
>>>>
>>>> And as for Rohith's suggestion of the FairOrdering policy, I think it
>>>> should solve the problem only if the app submitted first has not
>>>> already hogged all the queue's resources.
>>>>
>>>> + Naga
>>>>
>>>> ------------------------------
>>>> *From:* Laxman Ch [laxman.lux@gmail.com]
>>>> *Sent:* Tuesday, September 29, 2015 16:03
>>>>
>>>> *To:* user@hadoop.apache.org
>>>> *Subject:* Re: Concurrency control
>>>>
>>>> Thanks, Rohith, Naga and Lloyd, for the responses.
>>>>
>>>> > I think Laxman should also tell us more about which application type he
>>>> is running.
>>>>
>>>> We run MR jobs, mostly with the default core/memory allocation (1
>>>> vcore, 1.5 GB).
>>>> Our problem is more about controlling the *resources used
>>>> simultaneously by all running containers* at any given point of time,
>>>> per application.
>>>>
>>>> Example:
>>>> 1. App1 and App2 are two MR apps.
>>>> 2. App1 and App2 belong to same queue (capacity: 100 vcores, 150 GB).
>>>> 3. Each App1 task takes 8 hrs for completion
>>>> 4. Each App2 task takes 5 mins for completion
>>>> 5. App1 triggered at time "t1" and using all the slots of queue.
>>>> 6. App2 triggered at time "t2" (where t2 > t1) and waits longer for
>>>> App1 tasks to release the resources.
>>>> 7. We can't have preemption enabled as we don't want to lose the work
>>>> completed so far by App1.
>>>> 8. We can't have separate queues for App1 and App2 as we have lots of
>>>> jobs like this and it will explode the number of queues.
>>>> 9. We use CapacityScheduler.
>>>>
>>>> In this scenario, if I can limit App1's concurrent usage to 50 vcores
>>>> and 75 GB, then App1 may take longer to finish, but there won't be
>>>> any starvation for App2 (and other jobs running in the same queue).
>>>>
>>>> @Rohith, the FairOrdering policy may not solve this starvation
>>>> problem.
>>>>
>>>> @Naga, I couldn't think through the expected behavior of
>>>> "yarn.scheduler.capacity.<queue-path>.app-limit-factor".
>>>> I will get back on this.
>>>>
>>>> On 29 September 2015 at 14:57, Namikaze Minato <ll...@gmail.com>
>>>> wrote:
>>>>
>>>>> I think Laxman should also tell us more about which application type
>>>>> he is running. The normal use case of MAPREDUCE should work as
>>>>> intended, but if he has, for example, one map using 100 vcores, then
>>>>> the second map will have to wait until the app completes. The same
>>>>> would happen if the applications running were Spark, as Spark does
>>>>> not free what is allocated to it.
>>>>>
>>>>> Regards,
>>>>> LLoyd
>>>>>
>>>>> On 29 September 2015 at 11:22, Naganarasimha G R (Naga)
>>>>> <ga...@huawei.com> wrote:
>>>>> > Thanks, Rohith, for your thoughts.
>>>>> > But I think this configuration might not completely solve the
>>>>> > scenario mentioned by Laxman: if there is some time gap between the
>>>>> > first and the second app, then even with fairness or priority set
>>>>> > for apps, starvation will still be there.
>>>>> > IIUC, we can think of an approach where we have something similar
>>>>> > to "yarn.scheduler.capacity.<queue-path>.user-limit-factor",
>>>>> > providing functionality like
>>>>> > "yarn.scheduler.capacity.<queue-path>.app-limit-factor": the
>>>>> > multiple of the queue capacity which can be configured to allow a
>>>>> > single app to acquire more resources. Thoughts?
>>>>> >
>>>>> > + Naga
>>>>> >
>>>>> >
>>>>> >
>>>>> > ________________________________
>>>>> > From: Rohith Sharma K S [rohithsharmaks@huawei.com]
>>>>> > Sent: Tuesday, September 29, 2015 14:07
>>>>> > To: user@hadoop.apache.org
>>>>> > Subject: RE: Concurrency control
>>>>> >
>>>>> > Hi Laxman,
>>>>> >
>>>>> >
>>>>> >
>>>>> > In Hadoop 2.8 (not released yet), the CapacityScheduler provides
>>>>> > configuration for the ordering policy. By configuring
>>>>> > FAIR_ORDERING_POLICY in CS, you should probably be able to achieve
>>>>> > your goal, i.e. avoid starving applications of resources.
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.FairOrderingPolicy<S>
>>>>> >
>>>>> > An OrderingPolicy which orders SchedulableEntities for fairness
>>>>> > (see FairScheduler FairSharePolicy); generally, processes with
>>>>> > lesser usage are ordered first. If sizedBasedWeight is set to true,
>>>>> > then an application with high demand may be prioritized ahead of an
>>>>> > application with less usage. This is to offset the tendency to
>>>>> > favor small apps, which could result in starvation for large apps
>>>>> > if many small ones enter and leave the queue continuously
>>>>> > (optional, default false).
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > Community Issue Id: https://issues.apache.org/jira/browse/YARN-3463
>>>>> >
>>>>> >
>>>>> >
>>>>> > Thanks & Regards
>>>>> >
>>>>> > Rohith Sharma K S
>>>>> >
>>>>> >
>>>>> >
>>>>> > From: Laxman Ch [mailto:laxman.lux@gmail.com]
>>>>> > Sent: 29 September 2015 13:36
>>>>> > To: user@hadoop.apache.org
>>>>> > Subject: Re: Concurrency control
>>>>> >
>>>>> >
>>>>> >
>>>>> > Bouncing this thread again. Any other thoughts please?
>>>>> >
>>>>> >
>>>>> >
>>>>> > On 17 September 2015 at 23:21, Laxman Ch <la...@gmail.com>
>>>>> wrote:
>>>>> >
>>>>> > No, Naga. That won't help.
>>>>> >
>>>>> >
>>>>> >
>>>>> > I am running two applications (app1 - 100 vcores, app2 - 100
>>>>> > vcores) as the same user in the same queue (capacity = 100
>>>>> > vcores). In this scenario, if app1 starts first, occupies all the
>>>>> > slots, and runs long, then app2 will starve for longer.
>>>>> >
>>>>> >
>>>>> >
>>>>> > Let me reiterate my problem statement. I wanted "to control the
>>>>> > amount of resources (vcores, memory) used by an application
>>>>> > SIMULTANEOUSLY".
>>>>> >
>>>>> >
>>>>> >
>>>>> > On 17 September 2015 at 22:28, Naganarasimha Garla
>>>>> > <na...@gmail.com> wrote:
>>>>> >
>>>>> > Hi Laxman,
>>>>> >
>>>>> > For the example you have stated, maybe we can do the following
>>>>> > things:
>>>>> >
>>>>> > 1. Create/modify the queue with capacity and max capacity set such
>>>>> > that it is equivalent to 100 vcores. As there is then no
>>>>> > elasticity, a given application will not use resources beyond the
>>>>> > configured capacity.
>>>>> >
>>>>> > 2. Set
>>>>> > yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent so
>>>>> > that each active user is assured the minimum guaranteed resources.
>>>>> > The default value of 100 implies no user limits are imposed.
>>>>> >
>>>>> >
>>>>> >
>>>>> > Additionally, we can think of
>>>>> > "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage",
>>>>> > which will enforce strict CPU usage for a given container if
>>>>> > required.
>>>>> >
>>>>> >
>>>>> >
>>>>> > + Naga
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thu, Sep 17, 2015 at 4:42 PM, Laxman Ch <la...@gmail.com>
>>>>> wrote:
>>>>> >
>>>>> > Yes. I'm already using cgroups. Cgroups help in controlling
>>>>> > resources at the container level. But my requirement is more about
>>>>> > controlling the concurrent resource usage of an application at the
>>>>> > whole-cluster level.
>>>>> >
>>>>> >
>>>>> >
>>>>> > And yes, we do configure queues properly. But that won't help.
>>>>> >
>>>>> >
>>>>> >
>>>>> > For example, I have an application with a requirement of 1000
>>>>> > vcores. But I want to keep this application from going beyond 100
>>>>> > vcores at any point in time in the cluster/queue. This makes the
>>>>> > application run longer even when my cluster is free, but I will be
>>>>> > able to meet the guaranteed SLAs of other applications.
>>>>> >
>>>>> >
>>>>> >
>>>>> > Hope this helps to understand my question.
>>>>> >
>>>>> >
>>>>> >
>>>>> > And thanks, Narasimha, for the quick response.
>>>>> >
>>>>> >
>>>>> >
>>>>> > On 17 September 2015 at 16:17, Naganarasimha Garla
>>>>> > <na...@gmail.com> wrote:
>>>>> >
>>>>> > Hi Laxman,
>>>>> >
>>>>> > Yes, if cgroups are enabled and
>>>>> > "yarn.scheduler.capacity.resource-calculator" is configured to
>>>>> > DominantResourceCalculator, then CPU and memory can be controlled.
>>>>> >
>>>>> > Please kindly refer further to the official documentation:
>>>>> > http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html
>>>>> >
>>>>> >
>>>>> >
>>>>> > But maybe if you say more about the problem, we can suggest an
>>>>> > ideal configuration. It seems the capacity configuration and the
>>>>> > splitting of the queue are not done right, or you might look at the
>>>>> > Fair Scheduler if you want more fairness in container allocation
>>>>> > across different apps.
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thu, Sep 17, 2015 at 4:10 PM, Laxman Ch <la...@gmail.com>
>>>>> wrote:
>>>>> >
>>>>> > Hi,
>>>>> >
>>>>> >
>>>>> >
>>>>> > In YARN, do we have any way to control the amount of resources
>>>>> > (vcores, memory) used by an application SIMULTANEOUSLY?
>>>>> >
>>>>> >
>>>>> >
>>>>> > - In my cluster, I noticed a large, long-running MR app occupied
>>>>> > all the slots of the queue, blocking other apps from getting
>>>>> > started.
>>>>> >
>>>>> > - I'm using the Capacity Scheduler (hierarchical queues, preemption
>>>>> > disabled).
>>>>> >
>>>>> > - Using Hadoop version 2.6.0.
>>>>> >
>>>>> > - I did some googling around this and went through the
>>>>> > configuration docs, but I could not find anything that matches my
>>>>> > requirement.
>>>>> >
>>>>> >
>>>>> >
>>>>> > If needed, I can provide more details on the use case and problem.
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>> Laxman
>>>>
>>>
>>>
>>>
>>> --
>>> Thanks,
>>> Laxman
>>>
>>
>>
>>
>> --
>> Thanks,
>> Laxman
>>
>


-- 
Thanks,
Laxman

Re: Concurrency control

Posted by Laxman Ch <la...@gmail.com>.
Thanks and Perfect Harsh. Exactly what I am looking for. Most of our
applications are MR.
So, this should be sufficient for us. These configurations, I will give a
try and post my findings again here. Thanks again.

Thanks Naga, Rohit & Lloyd for your suggestions and discussion.

On 2 October 2015 at 07:37, Harsh J <ha...@cloudera.com> wrote:

> If all your Apps are MR, then what you are looking for is MAPREDUCE-5583
> (it can be set per-job).
>
> On Thu, Oct 1, 2015 at 3:03 PM Laxman Ch <la...@gmail.com> wrote:
>
>> Hi Naga,
>>
>> Like most of the app-level configurations, admin can configure the
>> defaults which user may want override at application level.
>>
>> If this is at queue-level then all applications in a queue will have the
>> same limits. But all our applications in a queue may not have same SLA and
>> we may need to restrict them differently. This requires again splitting
>> queues further which I feel is more overhead.
>>
>>
>> On 30 September 2015 at 09:00, Naganarasimha G R (Naga) <
>> garlanaganarasimha@huawei.com> wrote:
>>
>>> Hi Laxman,
>>>
>>> Ideally i understand it would be better its available @ application
>>> level, but  its like each user is expected to ensure that he gives the
>>> right configuration which is within the limits of max capacity.
>>> And what if user submits some app *(kind of a query execution app**)*
>>> with out this setting *or* he doesn't know how much it should take ? In
>>> general, users specifying resources for containers itself is a difficult
>>> task.
>>> And it might not be right to expect that the admin will do it for each
>>> application in the queue either.  Basically governing will be difficult if
>>> its not enforced from queue/scheduler side.
>>>
>>> + Naga
>>>
>>> ------------------------------
>>> *From:* Laxman Ch [laxman.lux@gmail.com]
>>> *Sent:* Tuesday, September 29, 2015 16:52
>>>
>>> *To:* user@hadoop.apache.org
>>> *Subject:* Re: Concurrency control
>>>
>>> IMO, its better to have a application level configuration than to have a
>>> scheduler/queue level configuration.
>>> Having a queue level configuration will restrict every single
>>> application that runs in that queue.
>>> But, we may want to configure these limits for only some set of jobs and
>>> also for every application these limits can be different.
>>>
>>> FairOrdering policy thing, order of jobs can't be enforced as these are
>>> adhoc jobs and scheduled/owned independently by different teams.
>>>
>>> On 29 September 2015 at 16:43, Naganarasimha G R (Naga) <
>>> garlanaganarasimha@huawei.com> wrote:
>>>
>>>> Hi Laxman,
>>>>
>>>> What i meant was,  suppose if we support and configure
>>>> yarn.scheduler.capacity.<queue-path>.app-limit-factor to .25  then a
>>>> single app should not take more than 25 % of resources in the queue.
>>>> This would be a more generic configuration which can be enforced by the
>>>> admin, than expecting it to be configured for per app by the user.
>>>>
>>>> And for Rohith's suggestion of FairOrdering policy , I think it should
>>>> solve the problem if the App which is submitted first is not already hogged
>>>> all the queue's resources.
>>>>
>>>> + Naga
>>>>
>>>> ------------------------------
>>>> *From:* Laxman Ch [laxman.lux@gmail.com]
>>>> *Sent:* Tuesday, September 29, 2015 16:03
>>>>
>>>> *To:* user@hadoop.apache.org
>>>> *Subject:* Re: Concurrency control
>>>>
>>>> Thanks Rohit, Naga and Lloyd for the responses.
>>>>
>>>> > I think Laxman should also tell us more about which application type he
>>>> is running.
>>>>
>>>> We run mr jobs mostly with default core/memory allocation (1 vcore,
>>>> 1.5GB).
>>>> Our problem is more about controlling the * resources used
>>>> simultaneously by all running containers *at any given point of time
>>>> per application.
>>>>
>>>> Example:
>>>> 1. App1 and App2 are two MR apps.
>>>> 2. App1 and App2 belong to same queue (capacity: 100 vcores, 150 GB).
>>>> 3. Each App1 task takes 8 hrs for completion
>>>> 4. Each App2 task takes 5 mins for completion
>>>> 5. App1 triggered at time "t1" and using all the slots of queue.
>>>> 6. App2 triggered at time "t2" (where t2 > t1) and waits longer fot
>>>> App1 tasks to release the resources.
>>>> 7. We can't have preemption enabled as we don't want to lose the work
>>>> completed so far by App1.
>>>> 8. We can't have separate queues for App1 and App2 as we have lots of
>>>> jobs like this and it will explode the number of queues.
>>>> 9. We use CapacityScheduler.
>>>>
>>>> In this scenario, if I can control App1 concurrent usage limits to
>>>> 50vcores and 75GB, then App1 may take longer time to finish but there won't
>>>> be any starvation for App2 (and other jobs running in same queue)
>>>>
>>>> @Rohit, FairOrdering policy may not solve this starvation problem.
>>>>
>>>> @Naga, I couldn't think through the expected behavior of "
>>>> yarn.scheduler.capacity.<queue-path>.app-limit-factor"
>>>> I will revert on this.
>>>>
>>>> On 29 September 2015 at 14:57, Namikaze Minato <ll...@gmail.com>
>>>> wrote:
>>>>
>>>>> I think Laxman should also tell us more about which application type
>>>>> he is running. The normal use cas of MAPREDUCE should be working as
>>>>> intended, but if he has for example one MAP using 100 vcores, then the
>>>>> second map will have to wait until the app completes. Same would
>>>>> happen if the applications running were spark, as spark does not free
>>>>> what is allocated to it.
>>>>>
>>>>> Regards,
>>>>> LLoyd
>>>>>
>>>>> On 29 September 2015 at 11:22, Naganarasimha G R (Naga)
>>>>> <ga...@huawei.com> wrote:
>>>>> > Thanks Rohith for your thoughts ,
>>>>> >       But i think by this configuration it might not completely
>>>>> solve the
>>>>> > scenario mentioned by Laxman, As if the there is some time gap
>>>>> between first
>>>>> > and and the second app then though we have fairness or priority set
>>>>> for apps
>>>>> > starvation will be there.
>>>>> > IIUC we can think of an approach where in we can have something
>>>>> similar to
>>>>> > "yarn.scheduler.capacity.<queue-path>.user-limit-factor"  where in
>>>>> it can
>>>>> > provide  the functionality like
>>>>> > "yarn.scheduler.capacity.<queue-path>.app-limit-factor" : The
>>>>> multiple of
>>>>> > the queue capacity which can be configured to allow a single app to
>>>>> acquire
>>>>> > more resources.  Thoughts ?
>>>>> >
>>>>> > + Naga
>>>>> >
>>>>> >
>>>>> >
>>>>> > ________________________________
>>>>> > From: Rohith Sharma K S [rohithsharmaks@huawei.com]
>>>>> > Sent: Tuesday, September 29, 2015 14:07
>>>>> > To: user@hadoop.apache.org
>>>>> > Subject: RE: Concurrency control
>>>>> >
>>>>> > Hi Laxman,
>>>>> >
>>>>> >
>>>>> >
>>>>> > In Hadoop-2.8(Not released  yet),  CapacityScheduler provides
>>>>> configuration
>>>>> > for configuring ordering policy.  By configuring
>>>>> FAIR_ORDERING_POLICY in CS
>>>>> > , probably you should be able to achieve  your goal i.e avoiding
>>>>> starving of
>>>>> > applications for resources.
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.FairOrderingPolicy<S>
>>>>> >
>>>>> > An OrderingPolicy which orders SchedulableEntities for fairness (see
>>>>> > FairScheduler FairSharePolicy), generally, processes with lesser
>>>>> usage are
>>>>> > lesser. If sizedBasedWeight is set to true then an application with
>>>>> high
>>>>> > demand may be prioritized ahead of an application with less usage.
>>>>> This is
>>>>> > to offset the tendency to favor small apps, which could result in
>>>>> starvation
>>>>> > for large apps if many small ones enter and leave the queue
>>>>> continuously
>>>>> > (optional, default false)
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > Community Issue Id :
>>>>> https://issues.apache.org/jira/browse/YARN-3463
>>>>> >
>>>>> >
>>>>> >
>>>>> > Thanks & Regards
>>>>> >
>>>>> > Rohith Sharma K S
>>>>> >
>>>>> >
>>>>> >
>>>>> > From: Laxman Ch [mailto:laxman.lux@gmail.com]
>>>>> > Sent: 29 September 2015 13:36
>>>>> > To: user@hadoop.apache.org
>>>>> > Subject: Re: Concurrency control
>>>>> >
>>>>> >
>>>>> >
>>>>> > Bouncing this thread again. Any other thoughts please?
>>>>> >
>>>>> >
>>>>> >
>>>>> > On 17 September 2015 at 23:21, Laxman Ch <la...@gmail.com>
>>>>> wrote:
>>>>> >
>>>>> > No Naga. That wont help.
>>>>> >
>>>>> >
>>>>> >
>>>>> > I am running two applications (app1 - 100 vcores, app2 - 100 vcores)
>>>>> with
>>>>> > same user which runs in same queue (capacity=100vcores). In this
>>>>> scenario,
>>>>> > if app1 triggers first occupies all the slots and runs longs then
>>>>> app2 will
>>>>> > starve longer.
>>>>> >
>>>>> >
>>>>> >
>>>>> > Let me reiterate my problem statement. I wanted "to control the
>>>>> amount of
>>>>> > resources (vcores, memory) used by an application SIMULTANEOUSLY"
>>>>> >
>>>>> >
>>>>> >
>>>>> > On 17 September 2015 at 22:28, Naganarasimha Garla
>>>>> > <na...@gmail.com> wrote:
>>>>> >
>>>>> > Hi Laxman,
>>>>> >
>>>>> > For the example you have stated may be we can do the following
>>>>> things :
>>>>> >
>>>>> > 1. Create/modify the queue with capacity and max cap set such that
>>>>> its
>>>>> > equivalent to 100 vcores. So as there is no elasticity, given
>>>>> application
>>>>> > will not be using the resources beyond the capacity configured
>>>>> >
>>>>> > 2. yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent
>>>>>  so that
>>>>> > each active user would be assured with the minimum guaranteed
>>>>> resources . By
>>>>> > default value is 100 implies no user limits are imposed.
>>>>> >
>>>>> >
>>>>> >
>>>>> > Additionally we can think of
>>>>> >
>>>>> "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage"
>>>>> > which will enforce strict cpu usage for a given container if
>>>>> required.
>>>>> >
>>>>> >
>>>>> >
>>>>> > + Naga
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thu, Sep 17, 2015 at 4:42 PM, Laxman Ch <la...@gmail.com>
>>>>> wrote:
>>>>> >
>>>>> > Yes. I'm already using cgroups. Cgroups helps in controlling the
>>>>> resources
>>>>> > at container level. But my requirement is more about controlling the
>>>>> > concurrent resource usage of an application at whole cluster level.
>>>>> >
>>>>> >
>>>>> >
>>>>> > And yes, we do configure queues properly. But, that won't help.
>>>>> >
>>>>> >
>>>>> >
>>>>> > For example, I have an application with a requirement of 1000
>>>>> vcores. But, I
>>>>> > wanted to control this application not to go beyond 100 vcores at
>>>>> any point
>>>>> > of time in the cluster/queue. This makes that application to run
>>>>> longer even
>>>>> > when my cluster is free but I will be able meet the guaranteed SLAs
>>>>> of other
>>>>> > applications.
>>>>> >
>>>>> >
>>>>> >
>>>>> > Hope this helps to understand my question.
>>>>> >
>>>>> >
>>>>> >
>>>>> > And thanks Narasimha for quick response.
>>>>> >
>>>>> >
>>>>> >
>>>>> > On 17 September 2015 at 16:17, Naganarasimha Garla
>>>>> > <na...@gmail.com> wrote:
>>>>> >
>>>>> > Hi Laxman,
>>>>> >
>>>>> > Yes if cgroups are enabled and
>>>>> "yarn.scheduler.capacity.resource-calculator"
>>>>> > configured to DominantResourceCalculator then cpu and memory can be
>>>>> > controlled.
>>>>> >
>>>>> > Please Kindly  furhter refer to the official documentation
>>>>> > http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html
>>>>> >
>>>>> >
>>>>> >
>>>>> > But may be if say more about problem then we can suggest ideal
>>>>> > configuration, seems like capacity configuration and splitting of
>>>>> the queue
>>>>> > is not rightly done or you might refer to Fair Scheduler if you want
>>>>> more
>>>>> > fairness for container allocation for different apps.
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thu, Sep 17, 2015 at 4:10 PM, Laxman Ch <la...@gmail.com>
>>>>> wrote:
>>>>> >
>>>>> > Hi,
>>>>> >
>>>>> >
>>>>> >
>>>>> > In YARN, do we have any way to control the amount of resources
>>>>> (vcores,
>>>>> > memory) used by an application SIMULTANEOUSLY.
>>>>> >
>>>>> >
>>>>> >
>>>>> > - In my cluster, noticed some large and long running mr-app occupied
>>>>> all the
>>>>> > slots of the queue and blocking other apps to get started.
>>>>> >
>>>>> > - I'm using Capacity schedulers (using hierarchical queues and
>>>>> preemption
>>>>> > disabled)
>>>>> >
>>>>> > - Using Hadoop version 2.6.0
>>>>> >
>>>>> > - Did some googling around this and gone through configuration docs
>>>>> but I'm
>>>>> > not able to find anything that matches my requirement.
>>>>> >
>>>>> >
>>>>> >
>>>>> > If needed, I can provide more details on the usecase and problem.
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>> Laxman
>>>>
>>>
>>>
>>>
>>> --
>>> Thanks,
>>> Laxman
>>>
>>
>>
>>
>> --
>> Thanks,
>> Laxman
>>
>


-- 
Thanks,
Laxman

Re: Concurrency control

Posted by Laxman Ch <la...@gmail.com>.
Thanks and Perfect Harsh. Exactly what I am looking for. Most of our
applications are MR.
So, this should be sufficient for us. These configurations, I will give a
try and post my findings again here. Thanks again.

Thanks Naga, Rohit & Lloyd for your suggestions and discussion.

On 2 October 2015 at 07:37, Harsh J <ha...@cloudera.com> wrote:

> If all your Apps are MR, then what you are looking for is MAPREDUCE-5583
> (it can be set per-job).
>
> On Thu, Oct 1, 2015 at 3:03 PM Laxman Ch <la...@gmail.com> wrote:
>
>> Hi Naga,
>>
>> Like most of the app-level configurations, admin can configure the
>> defaults which user may want override at application level.
>>
>> If this is at queue-level then all applications in a queue will have the
>> same limits. But all our applications in a queue may not have same SLA and
>> we may need to restrict them differently. This requires again splitting
>> queues further which I feel is more overhead.
>>
>>
>> On 30 September 2015 at 09:00, Naganarasimha G R (Naga) <
>> garlanaganarasimha@huawei.com> wrote:
>>
>>> Hi Laxman,
>>>
>>> Ideally i understand it would be better its available @ application
>>> level, but  its like each user is expected to ensure that he gives the
>>> right configuration which is within the limits of max capacity.
>>> And what if user submits some app *(kind of a query execution app**)*
>>> with out this setting *or* he doesn't know how much it should take ? In
>>> general, users specifying resources for containers itself is a difficult
>>> task.
>>> And it might not be right to expect that the admin will do it for each
>>> application in the queue either.  Basically governing will be difficult if
>>> its not enforced from queue/scheduler side.
>>>
>>> + Naga
>>>
>>> ------------------------------
>>> *From:* Laxman Ch [laxman.lux@gmail.com]
>>> *Sent:* Tuesday, September 29, 2015 16:52
>>>
>>> *To:* user@hadoop.apache.org
>>> *Subject:* Re: Concurrency control
>>>
>>> IMO, its better to have a application level configuration than to have a
>>> scheduler/queue level configuration.
>>> Having a queue level configuration will restrict every single
>>> application that runs in that queue.
>>> But, we may want to configure these limits for only some set of jobs and
>>> also for every application these limits can be different.
>>>
>>> FairOrdering policy thing, order of jobs can't be enforced as these are
>>> adhoc jobs and scheduled/owned independently by different teams.
>>>
>>> On 29 September 2015 at 16:43, Naganarasimha G R (Naga) <
>>> garlanaganarasimha@huawei.com> wrote:
>>>
>>>> Hi Laxman,
>>>>
>>>> What i meant was,  suppose if we support and configure
>>>> yarn.scheduler.capacity.<queue-path>.app-limit-factor to .25  then a
>>>> single app should not take more than 25 % of resources in the queue.
>>>> This would be a more generic configuration which can be enforced by the
>>>> admin, than expecting it to be configured for per app by the user.
>>>>
>>>> And for Rohith's suggestion of FairOrdering policy , I think it should
>>>> solve the problem if the App which is submitted first is not already hogged
>>>> all the queue's resources.
>>>>
>>>> + Naga
>>>>
>>>> ------------------------------
>>>> *From:* Laxman Ch [laxman.lux@gmail.com]
>>>> *Sent:* Tuesday, September 29, 2015 16:03
>>>>
>>>> *To:* user@hadoop.apache.org
>>>> *Subject:* Re: Concurrency control
>>>>
>>>> Thanks Rohit, Naga and Lloyd for the responses.
>>>>
>>>> > I think Laxman should also tell us more about which application type he
>>>> is running.
>>>>
>>>> We run mr jobs mostly with default core/memory allocation (1 vcore,
>>>> 1.5GB).
>>>> Our problem is more about controlling the * resources used
>>>> simultaneously by all running containers *at any given point of time
>>>> per application.
>>>>
>>>> Example:
>>>> 1. App1 and App2 are two MR apps.
>>>> 2. App1 and App2 belong to same queue (capacity: 100 vcores, 150 GB).
>>>> 3. Each App1 task takes 8 hrs for completion
>>>> 4. Each App2 task takes 5 mins for completion
>>>> 5. App1 triggered at time "t1" and using all the slots of queue.
>>>> 6. App2 triggered at time "t2" (where t2 > t1) and waits longer fot
>>>> App1 tasks to release the resources.
>>>> 7. We can't have preemption enabled as we don't want to lose the work
>>>> completed so far by App1.
>>>> 8. We can't have separate queues for App1 and App2 as we have lots of
>>>> jobs like this and it will explode the number of queues.
>>>> 9. We use CapacityScheduler.
>>>>
>>>> In this scenario, if I can control App1 concurrent usage limits to
>>>> 50vcores and 75GB, then App1 may take longer time to finish but there won't
>>>> be any starvation for App2 (and other jobs running in same queue)
>>>>
>>>> @Rohit, FairOrdering policy may not solve this starvation problem.
>>>>
>>>> @Naga, I couldn't think through the expected behavior of "
>>>> yarn.scheduler.capacity.<queue-path>.app-limit-factor"
>>>> I will revert on this.
>>>>
>>>> On 29 September 2015 at 14:57, Namikaze Minato <ll...@gmail.com>
>>>> wrote:
>>>>
>>>>> I think Laxman should also tell us more about which application type
>>>>> he is running. The normal use cas of MAPREDUCE should be working as
>>>>> intended, but if he has for example one MAP using 100 vcores, then the
>>>>> second map will have to wait until the app completes. Same would
>>>>> happen if the applications running were spark, as spark does not free
>>>>> what is allocated to it.
>>>>>
>>>>> Regards,
>>>>> LLoyd
>>>>>
>>>>> On 29 September 2015 at 11:22, Naganarasimha G R (Naga)
>>>>> <ga...@huawei.com> wrote:
>>>>> > Thanks Rohith for your thoughts ,
>>>>> >       But i think by this configuration it might not completely
>>>>> solve the
>>>>> > scenario mentioned by Laxman, As if the there is some time gap
>>>>> between first
>>>>> > and and the second app then though we have fairness or priority set
>>>>> for apps
>>>>> > starvation will be there.
>>>>> > IIUC we can think of an approach where in we can have something
>>>>> similar to
>>>>> > "yarn.scheduler.capacity.<queue-path>.user-limit-factor"  where in
>>>>> it can
>>>>> > provide  the functionality like
>>>>> > "yarn.scheduler.capacity.<queue-path>.app-limit-factor" : The
>>>>> multiple of
>>>>> > the queue capacity which can be configured to allow a single app to
>>>>> acquire
>>>>> > more resources.  Thoughts ?
>>>>> >
>>>>> > + Naga
>>>>> >
>>>>> >
>>>>> >
>>>>> > ________________________________
>>>>> > From: Rohith Sharma K S [rohithsharmaks@huawei.com]
>>>>> > Sent: Tuesday, September 29, 2015 14:07
>>>>> > To: user@hadoop.apache.org
>>>>> > Subject: RE: Concurrency control
>>>>> >
>>>>> > Hi Laxman,
>>>>> >
>>>>> >
>>>>> >
>>>>> > In Hadoop-2.8(Not released  yet),  CapacityScheduler provides
>>>>> configuration
>>>>> > for configuring ordering policy.  By configuring
>>>>> FAIR_ORDERING_POLICY in CS
>>>>> > , probably you should be able to achieve  your goal i.e avoiding
>>>>> starving of
>>>>> > applications for resources.
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.FairOrderingPolicy<S>
>>>>> >
>>>>> > An OrderingPolicy which orders SchedulableEntities for fairness (see
>>>>> > FairScheduler FairSharePolicy), generally, processes with lesser
>>>>> usage are
>>>>> > lesser. If sizedBasedWeight is set to true then an application with
>>>>> high
>>>>> > demand may be prioritized ahead of an application with less usage.
>>>>> This is
>>>>> > to offset the tendency to favor small apps, which could result in
>>>>> starvation
>>>>> > for large apps if many small ones enter and leave the queue
>>>>> continuously
>>>>> > (optional, default false)
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > Community Issue Id :
>>>>> https://issues.apache.org/jira/browse/YARN-3463
>>>>> >
>>>>> >
>>>>> >
>>>>> > Thanks & Regards
>>>>> >
>>>>> > Rohith Sharma K S
>>>>> >
>>>>> >
>>>>> >
>>>>> > From: Laxman Ch [mailto:laxman.lux@gmail.com]
>>>>> > Sent: 29 September 2015 13:36
>>>>> > To: user@hadoop.apache.org
>>>>> > Subject: Re: Concurrency control
>>>>> >
>>>>> >
>>>>> >
>>>>> > Bouncing this thread again. Any other thoughts please?
>>>>> >
>>>>> >
>>>>> >
>>>>> > On 17 September 2015 at 23:21, Laxman Ch <la...@gmail.com>
>>>>> wrote:
>>>>> >
>>>>> > No Naga. That wont help.
>>>>> >
>>>>> >
>>>>> >
>>>>> > I am running two applications (app1 - 100 vcores, app2 - 100 vcores)
>>>>> with
>>>>> > same user which runs in same queue (capacity=100vcores). In this
>>>>> scenario,
>>>>> > if app1 triggers first occupies all the slots and runs longs then
>>>>> app2 will
>>>>> > starve longer.
>>>>> >
>>>>> >
>>>>> >
>>>>> > Let me reiterate my problem statement. I wanted "to control the
>>>>> amount of
>>>>> > resources (vcores, memory) used by an application SIMULTANEOUSLY"
>>>>> >
>>>>> >
>>>>> >
>>>>> > On 17 September 2015 at 22:28, Naganarasimha Garla
>>>>> > <na...@gmail.com> wrote:
>>>>> >
>>>>> > Hi Laxman,
>>>>> >
>>>>> > For the example you have stated may be we can do the following
>>>>> things :
>>>>> >
>>>>> > 1. Create/modify the queue with capacity and max cap set such that
>>>>> its
>>>>> > equivalent to 100 vcores. So as there is no elasticity, given
>>>>> application
>>>>> > will not be using the resources beyond the capacity configured
>>>>> >
>>>>> > 2. yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent
>>>>>  so that
>>>>> > each active user would be assured with the minimum guaranteed
>>>>> resources . By
>>>>> > default value is 100 implies no user limits are imposed.
>>>>> >
>>>>> >
>>>>> >
>>>>> > Additionally we can think of
>>>>> >
>>>>> "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage"
>>>>> > which will enforce strict cpu usage for a given container if
>>>>> required.
>>>>> >
>>>>> >
>>>>> >
>>>>> > + Naga
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thu, Sep 17, 2015 at 4:42 PM, Laxman Ch <la...@gmail.com>
>>>>> wrote:
>>>>> >
>>>>> > Yes. I'm already using cgroups. Cgroups helps in controlling the
>>>>> resources
>>>>> > at container level. But my requirement is more about controlling the
>>>>> > concurrent resource usage of an application at whole cluster level.
>>>>> >
>>>>> >
>>>>> >
>>>>> > And yes, we do configure queues properly. But, that won't help.
>>>>> >
>>>>> >
>>>>> >
>>>>> > For example, I have an application with a requirement of 1000
>>>>> vcores. But, I
>>>>> > wanted to control this application not to go beyond 100 vcores at
>>>>> any point
>>>>> > of time in the cluster/queue. This makes that application to run
>>>>> longer even
>>>>> > when my cluster is free but I will be able meet the guaranteed SLAs
>>>>> of other
>>>>> > applications.
>>>>> >
>>>>> >
>>>>> >
>>>>> > Hope this helps to understand my question.
>>>>> >
>>>>> >
>>>>> >
>>>>> > And thanks Narasimha for quick response.
>>>>> >
>>>>> >
>>>>> >
>>>>> > On 17 September 2015 at 16:17, Naganarasimha Garla
>>>>> > <na...@gmail.com> wrote:
>>>>> >
>>>>> > Hi Laxman,
>>>>> >
>>>>> > Yes if cgroups are enabled and
>>>>> "yarn.scheduler.capacity.resource-calculator"
>>>>> > configured to DominantResourceCalculator then cpu and memory can be
>>>>> > controlled.
>>>>> >
>>>>> > Please Kindly  furhter refer to the official documentation
>>>>> > http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html
>>>>> >
>>>>> >
>>>>> >
>>>>> > But may be if say more about problem then we can suggest ideal
>>>>> > configuration, seems like capacity configuration and splitting of
>>>>> the queue
>>>>> > is not rightly done or you might refer to Fair Scheduler if you want
>>>>> more
>>>>> > fairness for container allocation for different apps.
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thu, Sep 17, 2015 at 4:10 PM, Laxman Ch <la...@gmail.com>
>>>>> wrote:
>>>>> >
>>>>> > Hi,
>>>>> >
>>>>> >
>>>>> >
>>>>> > In YARN, do we have any way to control the amount of resources
>>>>> (vcores,
>>>>> > memory) used by an application SIMULTANEOUSLY.
>>>>> >
>>>>> >
>>>>> >
>>>>> > - In my cluster, noticed some large and long running mr-app occupied
>>>>> all the
>>>>> > slots of the queue and blocking other apps to get started.
>>>>> >
>>>>> > - I'm using Capacity schedulers (using hierarchical queues and
>>>>> preemption
>>>>> > disabled)
>>>>> >
>>>>> > - Using Hadoop version 2.6.0
>>>>> >
>>>>> > - Did some googling around this and gone through configuration docs
>>>>> but I'm
>>>>> > not able to find anything that matches my requirement.
>>>>> >
>>>>> >
>>>>> >
>>>>> > If needed, I can provide more details on the usecase and problem.
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>> Laxman
>>>>
>>>
>>>
>>>
>>> --
>>> Thanks,
>>> Laxman
>>>
>>
>>
>>
>> --
>> Thanks,
>> Laxman
>>
>


-- 
Thanks,
Laxman

Re: Concurrency control

Posted by Laxman Ch <la...@gmail.com>.
Thanks and Perfect Harsh. Exactly what I am looking for. Most of our
applications are MR.
So, this should be sufficient for us. These configurations, I will give a
try and post my findings again here. Thanks again.

Thanks Naga, Rohit & Lloyd for your suggestions and discussion.

On 2 October 2015 at 07:37, Harsh J <ha...@cloudera.com> wrote:

> If all your Apps are MR, then what you are looking for is MAPREDUCE-5583
> (it can be set per-job).
>
> On Thu, Oct 1, 2015 at 3:03 PM Laxman Ch <la...@gmail.com> wrote:
>
>> Hi Naga,
>>
>> Like most of the app-level configurations, admin can configure the
>> defaults which user may want override at application level.
>>
>> If this is at queue-level then all applications in a queue will have the
>> same limits. But all our applications in a queue may not have same SLA and
>> we may need to restrict them differently. This requires again splitting
>> queues further which I feel is more overhead.
>>
>>
>> On 30 September 2015 at 09:00, Naganarasimha G R (Naga) <
>> garlanaganarasimha@huawei.com> wrote:
>>
>>> Hi Laxman,
>>>
>>> Ideally i understand it would be better its available @ application
>>> level, but  its like each user is expected to ensure that he gives the
>>> right configuration which is within the limits of max capacity.
>>> And what if user submits some app *(kind of a query execution app**)*
>>> with out this setting *or* he doesn't know how much it should take ? In
>>> general, users specifying resources for containers itself is a difficult
>>> task.
>>> And it might not be right to expect that the admin will do it for each
>>> application in the queue either.  Basically governing will be difficult if
>>> its not enforced from queue/scheduler side.
>>>
>>> + Naga
>>>
>>> ------------------------------
>>> *From:* Laxman Ch [laxman.lux@gmail.com]
>>> *Sent:* Tuesday, September 29, 2015 16:52
>>>
>>> *To:* user@hadoop.apache.org
>>> *Subject:* Re: Concurrency control
>>>
>>> IMO, its better to have a application level configuration than to have a
>>> scheduler/queue level configuration.
>>> Having a queue level configuration will restrict every single
>>> application that runs in that queue.
>>> But, we may want to configure these limits for only some set of jobs and
>>> also for every application these limits can be different.
>>>
>>> FairOrdering policy thing, order of jobs can't be enforced as these are
>>> adhoc jobs and scheduled/owned independently by different teams.
>>>
>>> On 29 September 2015 at 16:43, Naganarasimha G R (Naga) <
>>> garlanaganarasimha@huawei.com> wrote:
>>>
>>>> Hi Laxman,
>>>>
>>>> What i meant was,  suppose if we support and configure
>>>> yarn.scheduler.capacity.<queue-path>.app-limit-factor to .25  then a
>>>> single app should not take more than 25 % of resources in the queue.
>>>> This would be a more generic configuration which can be enforced by the
>>>> admin, than expecting it to be configured for per app by the user.
>>>>
>>>> And for Rohith's suggestion of FairOrdering policy , I think it should
>>>> solve the problem if the App which is submitted first is not already hogged
>>>> all the queue's resources.
>>>>
>>>> + Naga
>>>>
>>>> ------------------------------
>>>> *From:* Laxman Ch [laxman.lux@gmail.com]
>>>> *Sent:* Tuesday, September 29, 2015 16:03
>>>>
>>>> *To:* user@hadoop.apache.org
>>>> *Subject:* Re: Concurrency control
>>>>
>>>> Thanks Rohit, Naga and Lloyd for the responses.
>>>>
>>>> > I think Laxman should also tell us more about which application type he
>>>> is running.
>>>>
>>>> We run mr jobs mostly with default core/memory allocation (1 vcore,
>>>> 1.5GB).
>>>> Our problem is more about controlling the * resources used
>>>> simultaneously by all running containers *at any given point of time
>>>> per application.
>>>>
>>>> Example:
>>>> 1. App1 and App2 are two MR apps.
>>>> 2. App1 and App2 belong to same queue (capacity: 100 vcores, 150 GB).
>>>> 3. Each App1 task takes 8 hrs for completion
>>>> 4. Each App2 task takes 5 mins for completion
>>>> 5. App1 triggered at time "t1" and using all the slots of queue.
>>>> 6. App2 triggered at time "t2" (where t2 > t1) and waits longer fot
>>>> App1 tasks to release the resources.
>>>> 7. We can't have preemption enabled as we don't want to lose the work
>>>> completed so far by App1.
>>>> 8. We can't have separate queues for App1 and App2 as we have lots of
>>>> jobs like this and it will explode the number of queues.
>>>> 9. We use CapacityScheduler.
>>>>
>>>> In this scenario, if I can control App1 concurrent usage limits to
>>>> 50vcores and 75GB, then App1 may take longer time to finish but there won't
>>>> be any starvation for App2 (and other jobs running in same queue)
>>>>
>>>> @Rohit, FairOrdering policy may not solve this starvation problem.
>>>>
>>>> @Naga, I couldn't think through the expected behavior of "
>>>> yarn.scheduler.capacity.<queue-path>.app-limit-factor"
>>>> I will revert on this.
>>>>
>>>> On 29 September 2015 at 14:57, Namikaze Minato <ll...@gmail.com>
>>>> wrote:
>>>>
>>>>> I think Laxman should also tell us more about which application type
>>>>> he is running. The normal use cas of MAPREDUCE should be working as
>>>>> intended, but if he has for example one MAP using 100 vcores, then the
>>>>> second map will have to wait until the app completes. Same would
>>>>> happen if the applications running were spark, as spark does not free
>>>>> what is allocated to it.
>>>>>
>>>>> Regards,
>>>>> LLoyd
>>>>>
>>>>> On 29 September 2015 at 11:22, Naganarasimha G R (Naga)
>>>>> <ga...@huawei.com> wrote:
>>>>> > Thanks Rohith for your thoughts ,
>>>>> >       But i think by this configuration it might not completely
>>>>> solve the
>>>>> > scenario mentioned by Laxman, As if the there is some time gap
>>>>> between first
>>>>> > and and the second app then though we have fairness or priority set
>>>>> for apps
>>>>> > starvation will be there.
>>>>> > IIUC we can think of an approach where in we can have something
>>>>> similar to
>>>>> > "yarn.scheduler.capacity.<queue-path>.user-limit-factor"  where in
>>>>> it can
>>>>> > provide  the functionality like
>>>>> > "yarn.scheduler.capacity.<queue-path>.app-limit-factor" : The
>>>>> multiple of
>>>>> > the queue capacity which can be configured to allow a single app to
>>>>> acquire
>>>>> > more resources.  Thoughts ?
>>>>> >
>>>>> > + Naga
>>>>> >
>>>>> >
>>>>> >
>>>>> > ________________________________
>>>>> > From: Rohith Sharma K S [rohithsharmaks@huawei.com]
>>>>> > Sent: Tuesday, September 29, 2015 14:07
>>>>> > To: user@hadoop.apache.org
>>>>> > Subject: RE: Concurrency control
>>>>> >
>>>>> > Hi Laxman,
>>>>> >
>>>>> >
>>>>> >
>>>>> > In Hadoop 2.8 (not released yet), CapacityScheduler provides
>>>>> > configuration for the ordering policy. By configuring
>>>>> > FAIR_ORDERING_POLICY in CS, you should probably be able to achieve
>>>>> > your goal, i.e. avoid starvation of applications waiting for
>>>>> > resources.
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.FairOrderingPolicy<S>
>>>>> >
>>>>> > An OrderingPolicy which orders SchedulableEntities for fairness (see
>>>>> > FairScheduler FairSharePolicy), generally, processes with lesser
>>>>> usage are
>>>>> > lesser. If sizedBasedWeight is set to true then an application with
>>>>> high
>>>>> > demand may be prioritized ahead of an application with less usage.
>>>>> This is
>>>>> > to offset the tendency to favor small apps, which could result in
>>>>> starvation
>>>>> > for large apps if many small ones enter and leave the queue
>>>>> continuously
>>>>> > (optional, default false)
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > Community Issue Id :
>>>>> https://issues.apache.org/jira/browse/YARN-3463
>>>>> >
>>>>> >
>>>>> >
>>>>> > Thanks & Regards
>>>>> >
>>>>> > Rohith Sharma K S
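
For reference, a minimal sketch of how the fair ordering policy is enabled
in a 2.8-era capacity-scheduler.xml, assuming an example queue "root.adhoc"
(property names as documented for the 2.8 CapacityScheduler; verify against
your release):

  <property>
    <name>yarn.scheduler.capacity.root.adhoc.ordering-policy</name>
    <value>fair</value>
  </property>

  <!-- Optional: factor demand into the ordering so large apps are not
       starved by a stream of small ones (the sizedBasedWeight behavior
       described above). -->
  <property>
    <name>yarn.scheduler.capacity.root.adhoc.ordering-policy.fair.enable-size-based-weight</name>
    <value>true</value>
  </property>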
>>>>> >
>>>>> >
>>>>> >
>>>>> > From: Laxman Ch [mailto:laxman.lux@gmail.com]
>>>>> > Sent: 29 September 2015 13:36
>>>>> > To: user@hadoop.apache.org
>>>>> > Subject: Re: Concurrency control
>>>>> >
>>>>> >
>>>>> >
>>>>> > Bouncing this thread again. Any other thoughts please?
>>>>> >
>>>>> >
>>>>> >
>>>>> > On 17 September 2015 at 23:21, Laxman Ch <la...@gmail.com>
>>>>> wrote:
>>>>> >
>>>>> > No Naga. That won't help.
>>>>> >
>>>>> >
>>>>> >
>>>>> > I am running two applications (app1 - 100 vcores, app2 - 100 vcores)
>>>>> > as the same user in the same queue (capacity = 100 vcores). In this
>>>>> > scenario, if app1 triggers first, occupies all the slots, and runs
>>>>> > long, then app2 will starve longer.
>>>>> >
>>>>> >
>>>>> >
>>>>> > Let me reiterate my problem statement. I wanted "to control the
>>>>> amount of
>>>>> > resources (vcores, memory) used by an application SIMULTANEOUSLY"
>>>>> >
>>>>> >
>>>>> >
>>>>> > On 17 September 2015 at 22:28, Naganarasimha Garla
>>>>> > <na...@gmail.com> wrote:
>>>>> >
>>>>> > Hi Laxman,
>>>>> >
>>>>> > For the example you have stated, maybe we can do the following things:
>>>>> >
>>>>> > 1. Create/modify the queue with capacity and max capacity set such
>>>>> > that it's equivalent to 100 vcores. As there is no elasticity, a given
>>>>> > application will not use resources beyond the configured capacity.
>>>>> >
>>>>> > 2. Set yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent
>>>>> > so that each active user is assured the minimum guaranteed resources.
>>>>> > The default value of 100 implies no user limits are imposed.
>>>>> >
>>>>> >
>>>>> >
>>>>> > Additionally we can think of
>>>>> > "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage",
>>>>> > which will enforce strict CPU usage for a given container if required.
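
Taken together, a minimal sketch of these suggestions, assuming an example
queue "root.adhoc" sized so that its share works out to 100 vcores on this
cluster; the property names are real, the values are illustrative:

  <!-- capacity-scheduler.xml: capacity == maximum-capacity, so the
       queue cannot elastically grow beyond its share. -->
  <property>
    <name>yarn.scheduler.capacity.root.adhoc.capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.adhoc.maximum-capacity</name>
    <value>50</value>
  </property>

  <!-- Guarantee each active user at least half of the queue. -->
  <property>
    <name>yarn.scheduler.capacity.root.adhoc.minimum-user-limit-percent</name>
    <value>50</value>
  </property>

  <!-- yarn-site.xml: hard-cap each container's CPU via cgroups. -->
  <property>
    <name>yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage</name>
    <value>true</value>
  </property>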
>>>>> >
>>>>> >
>>>>> >
>>>>> > + Naga
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thu, Sep 17, 2015 at 4:42 PM, Laxman Ch <la...@gmail.com>
>>>>> wrote:
>>>>> >
>>>>> > Yes. I'm already using cgroups. Cgroups help in controlling
>>>>> > resources at the container level. But my requirement is more about
>>>>> > controlling the concurrent resource usage of an application at the
>>>>> > whole-cluster level.
>>>>> >
>>>>> >
>>>>> >
>>>>> > And yes, we do configure queues properly. But that won't help.
>>>>> >
>>>>> >
>>>>> >
>>>>> > For example, I have an application with a requirement of 1000
>>>>> > vcores. But I want to control this application so that it does not go
>>>>> > beyond 100 vcores at any point of time in the cluster/queue. This
>>>>> > makes that application run longer even when my cluster is free, but I
>>>>> > will be able to meet the guaranteed SLAs of other applications.
>>>>> >
>>>>> >
>>>>> >
>>>>> > Hope this helps to understand my question.
>>>>> >
>>>>> >
>>>>> >
>>>>> > And thanks Narasimha for quick response.
>>>>> >
>>>>> >
>>>>> >
>>>>> > On 17 September 2015 at 16:17, Naganarasimha Garla
>>>>> > <na...@gmail.com> wrote:
>>>>> >
>>>>> > Hi Laxman,
>>>>> >
>>>>> > Yes, if cgroups are enabled and
>>>>> > "yarn.scheduler.capacity.resource-calculator" is configured to
>>>>> > DominantResourceCalculator, then CPU and memory can be controlled.
>>>>> >
>>>>> > Please refer further to the official documentation:
>>>>> > http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html
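
A sketch of the settings this implies on a Hadoop 2.6-era cluster; the
class and property names are real, but the LinuxContainerExecutor also
needs the usual container-executor.cfg and cgroup mount setup, omitted
here:

  <!-- capacity-scheduler.xml: schedule on CPU as well as memory. -->
  <property>
    <name>yarn.scheduler.capacity.resource-calculator</name>
    <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
  </property>

  <!-- yarn-site.xml: cgroups require the LinuxContainerExecutor. -->
  <property>
    <name>yarn.nodemanager.container-executor.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
  </property>
  <property>
    <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
  </property>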
>>>>> >
>>>>> >
>>>>> >
>>>>> > But maybe if you say more about the problem, then we can suggest an
>>>>> > ideal configuration; it seems like the capacity configuration and
>>>>> > splitting of the queue is not rightly done, or you might refer to the
>>>>> > Fair Scheduler if you want more fairness in container allocation
>>>>> > across different apps.
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thu, Sep 17, 2015 at 4:10 PM, Laxman Ch <la...@gmail.com>
>>>>> wrote:
>>>>> >
>>>>> > Hi,
>>>>> >
>>>>> >
>>>>> >
>>>>> > In YARN, do we have any way to control the amount of resources
>>>>> > (vcores, memory) used by an application SIMULTANEOUSLY?
>>>>> >
>>>>> >
>>>>> >
>>>>> > - In my cluster, I noticed a large, long-running MR app occupied all
>>>>> > the slots of the queue, blocking other apps from getting started.
>>>>> >
>>>>> > - I'm using the CapacityScheduler (hierarchical queues, preemption
>>>>> > disabled).
>>>>> >
>>>>> > - Using Hadoop version 2.6.0.
>>>>> >
>>>>> > - Did some googling around this and went through the configuration
>>>>> > docs, but I'm not able to find anything that matches my requirement.
>>>>> >
>>>>> >
>>>>> >
>>>>> > If needed, I can provide more details on the use case and problem.
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>> Laxman
>>>>
>>>
>>>
>>>
>>> --
>>> Thanks,
>>> Laxman
>>>
>>
>>
>>
>> --
>> Thanks,
>> Laxman
>>
>


-- 
Thanks,
Laxman

Re: Concurrency control

Posted by Harsh J <ha...@cloudera.com>.
If all your Apps are MR, then what you are looking for is MAPREDUCE-5583
(it can be set per-job).
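
For the App1/App2 example above, a sketch of how MAPREDUCE-5583 (shipped in
Hadoop 2.7.0) would be applied: with 1-vcore map tasks, capping App1 at
roughly 50 vcores means capping its concurrently running maps at 50. The
limits default to 0 (unlimited); the values below are illustrative:

  <!-- Per-job caps on simultaneously RUNNING tasks; the remaining
       tasks simply wait in queue, so no completed work is lost
       (unlike preemption). -->
  <property>
    <name>mapreduce.job.running.map.limit</name>
    <value>50</value>
  </property>
  <property>
    <name>mapreduce.job.running.reduce.limit</name>
    <value>10</value>
  </property>

  Equivalent command line, assuming the job driver goes through
  ToolRunner/GenericOptionsParser:

  hadoop jar app1.jar -Dmapreduce.job.running.map.limit=50 \
      -Dmapreduce.job.running.reduce.limit=10 ...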
