Posted to user@hadoop.apache.org by Lars Francke <la...@gmail.com> on 2019/10/09 12:52:34 UTC

CapacityScheduler questions - (AM) preemption

Hi,

I've got a question about behavior we're seeing.

Two queues on the CapacityScheduler, preemption enabled (happy to provide
more config if needed), 50% of resources assigned to each.

Submit a job to queue 1, which then uses 100% of the cluster.
Submit a job to queue 2; it doesn't get allocated because there are not
enough resources left for its AM.

I'd have expected this to trigger preemption but it doesn't happen.
Is this expected?

Second question:
Once we get the second job running, it only gets three of the six containers
it requested. Utilization at this point is still "unfair", i.e. queue 1 holds
~75% of resources and queue 2 ~25%. We saw two preemptions happen to get to
this 25% utilization, but then no more.

Any idea why this could be happening?

Thank you!

Lars
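For context, preemption on the CapacityScheduler is driven by the ResourceManager's scheduler monitor, configured in yarn-site.xml. A minimal sketch of how it is typically switched on (upstream Hadoop property names; values illustrative, not taken from the cluster discussed in this thread):

```xml
<!-- yarn-site.xml: enable the scheduler monitor that performs preemption.
     Illustrative sketch, not the actual config from this thread. -->
<property>
  <name>yarn.resourcemanager.scheduler.monitor.enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.monitor.policies</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy</value>
</property>
```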

Re: CapacityScheduler questions - (AM) preemption

Posted by Lars Francke <la...@gmail.com>.
Eric,

sorry for the slow reply.

I'm afraid I'm not 100% sure anymore (this was at a client and I don't have
access to the system at the moment). I believe we submitted to foo_ultimo
first and then to foo_daily but I've reached out to them to get feedback.

Cheers,
Lars

On Fri, Oct 11, 2019 at 8:21 PM epayne@apache.org <ep...@apache.org> wrote:

> > Submit a job to queue 1 which uses 100% of the cluster.
> > Submit a job to queue 2 which doesn't get allocated because there are
> not enough resources for the AM.
>
> Lars,
> can you please correlate the 'queue 1' and 'queue 2' with the config that
> you provided? That is, based on your configs what is the queue name that
> was using 100% and which was being starved for resources?
>
> Thanks,
> -Eric
>
>
>
> On Friday, October 11, 2019, 6:41:08 AM CDT, Lars Francke <
> lars.francke@gmail.com> wrote:
>
> Sunil,
>
> this is the full capacity-scheduler.xml file:
>
>
>   <configuration  xmlns:xi="http://www.w3.org/2001/XInclude">
>
>     <property>
>       <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
>       <value>0.2</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.maximum-applications</name>
>       <value>10000</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.node-locality-delay</name>
>       <value>40</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
>       <value>false</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.resource-calculator</name>
>       <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.accessible-node-labels</name>
>       <value>*</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.acl_administer_queue</name>
>       <value>*</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.acl_submit_applications</name>
>       <value>*</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.capacity</name>
>       <value>100</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
>       <value>*</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.default.capacity</name>
>       <value>1</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
>       <value>100</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.default.priority</name>
>       <value>0</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.default.state</name>
>       <value>RUNNING</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
>       <value>50</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.priority</name>
>       <value>0</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.queues</name>
>       <value>default,foo</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.acl_administer_queue</name>
>       <value>*</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.acl_submit_applications</name>
>       <value>*</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.capacity</name>
>       <value>99</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.maximum-capacity</name>
>       <value>100</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.minimum-user-limit-percent</name>
>       <value>100</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.priority</name>
>       <value>0</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.queues</name>
>       <value>foo_daily,foo_ultimo</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.foo_daily.acl_administer_queue</name>
>       <value>*</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.foo_daily.acl_submit_applications</name>
>       <value>*</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.foo_daily.capacity</name>
>       <value>50</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.foo_daily.maximum-capacity</name>
>       <value>100</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.foo_daily.minimum-user-limit-percent</name>
>       <value>100</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.foo_daily.ordering-policy</name>
>       <value>fifo</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.foo_daily.priority</name>
>       <value>0</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.foo_daily.state</name>
>       <value>RUNNING</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.foo_daily.user-limit-factor</name>
>       <value>2</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.foo_ultimo.acl_administer_queue</name>
>       <value>*</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.foo_ultimo.acl_submit_applications</name>
>       <value>*</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.foo_ultimo.capacity</name>
>       <value>50</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.foo_ultimo.maximum-capacity</name>
>       <value>100</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.foo_ultimo.minimum-user-limit-percent</name>
>       <value>100</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.foo_ultimo.ordering-policy</name>
>       <value>fifo</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.foo_ultimo.priority</name>
>       <value>0</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.foo_ultimo.state</name>
>       <value>RUNNING</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.foo_ultimo.user-limit-factor</name>
>       <value>2</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.state</name>
>       <value>RUNNING</value>
>     </property>
>
>     <property>
>       <name>yarn.scheduler.capacity.root.foo.user-limit-factor</name>
>       <value>1</value>
>     </property>
>
>   </configuration>
>
> On Wed, Oct 9, 2019 at 10:56 PM Lars Francke <la...@gmail.com>
> wrote:
> > Sunil,
> >
> > thank you for the answer.
> >
> > This is HDP 3.1 based on Hadoop 3.1.1.
> > No preemption defaults were changed I believe. The only thing changed is
> to enable preemption (monitor.enabled) in Ambari but I can get the full XML
> for you if that's helpful. I'll get back to you on that.
> >
> > Cheers,
> > Lars
> >
> >
> >
> > On Wed, Oct 9, 2019 at 7:57 PM Sunil Govindan <su...@apache.org> wrote:
> >> Hi,
> >>
> >> The expectation for the scenario you describe is more or less correct.
> >> A bit more information is needed to get a clearer picture:
> >> 1. the Hadoop version
> >> 2. the capacity-scheduler XML (to be precise, all preemption-related
> configs that were added)
> >>
> >> - Sunil
> >>
> >> On Wed, Oct 9, 2019 at 6:23 PM Lars Francke <la...@gmail.com>
> wrote:
> >>> Hi,
> >>>
> >>> I've got a question about behavior we're seeing.
> >>>
> >>> Two queues: Preemption enabled, CapacityScheduler (happy to provide
> more config if needed), 50% of resources to each
> >>>
> >>> Submit a job to queue 1 which uses 100% of the cluster.
> >>> Submit a job to queue 2 which doesn't get allocated because there are
> not enough resources for the AM.
> >>>
> >>> I'd have expected this to trigger preemption but it doesn't happen.
> >>> Is this expected?
> >>>
> >>> Second question:
> >>> Once we get the second job running it only gets three containers but
> requested six. Utilization at this point is still "unfair" i.e. queue 1
> ~75% resources, queue 2 ~25% resources. We saw two preemptions happening to
> get to this 25% utilization but then no more.
> >>>
> >>> Any idea why this could be happening?
> >>>
> >>> Thank you!
> >>>
> >>> Lars
> >>>
> >>
> >
>

Re: CapacityScheduler questions - (AM) preemption

Posted by "epayne@apache.org" <ep...@apache.org>.
> Submit a job to queue 1 which uses 100% of the cluster.
> Submit a job to queue 2 which doesn't get allocated because there are not enough resources for the AM.

Lars,
can you please correlate the 'queue 1' and 'queue 2' with the config that you provided? That is, based on your configs what is the queue name that was using 100% and which was being starved for resources?

Thanks,
-Eric



On Friday, October 11, 2019, 6:41:08 AM CDT, Lars Francke <la...@gmail.com> wrote: 





Sunil,

this is the full capacity-scheduler.xml file:


  <configuration  xmlns:xi="http://www.w3.org/2001/XInclude">

    <property>
      <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
      <value>0.2</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.maximum-applications</name>
      <value>10000</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.node-locality-delay</name>
      <value>40</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
      <value>false</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.resource-calculator</name>
      <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.accessible-node-labels</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.acl_administer_queue</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.acl_submit_applications</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.capacity</name>
      <value>100</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.default.capacity</name>
      <value>1</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
      <value>100</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.default.priority</name>
      <value>0</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.default.state</name>
      <value>RUNNING</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
      <value>50</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.priority</name>
      <value>0</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.queues</name>
      <value>default,foo</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.acl_administer_queue</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.acl_submit_applications</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.capacity</name>
      <value>99</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.maximum-capacity</name>
      <value>100</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.minimum-user-limit-percent</name>
      <value>100</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.priority</name>
      <value>0</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.queues</name>
      <value>foo_daily,foo_ultimo</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_daily.acl_administer_queue</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_daily.acl_submit_applications</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_daily.capacity</name>
      <value>50</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_daily.maximum-capacity</name>
      <value>100</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_daily.minimum-user-limit-percent</name>
      <value>100</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_daily.ordering-policy</name>
      <value>fifo</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_daily.priority</name>
      <value>0</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_daily.state</name>
      <value>RUNNING</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_daily.user-limit-factor</name>
      <value>2</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_ultimo.acl_administer_queue</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_ultimo.acl_submit_applications</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_ultimo.capacity</name>
      <value>50</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_ultimo.maximum-capacity</name>
      <value>100</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_ultimo.minimum-user-limit-percent</name>
      <value>100</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_ultimo.ordering-policy</name>
      <value>fifo</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_ultimo.priority</name>
      <value>0</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_ultimo.state</name>
      <value>RUNNING</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_ultimo.user-limit-factor</name>
      <value>2</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.state</name>
      <value>RUNNING</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.user-limit-factor</name>
      <value>1</value>
    </property>

  </configuration>

On Wed, Oct 9, 2019 at 10:56 PM Lars Francke <la...@gmail.com> wrote:
> Sunil,
> 
> thank you for the answer.
> 
> This is HDP 3.1 based on Hadoop 3.1.1.
> No preemption defaults were changed I believe. The only thing changed is to enable preemption (monitor.enabled) in Ambari but I can get the full XML for you if that's helpful. I'll get back to you on that.
> 
> Cheers,
> Lars
> 
> 
> 
> On Wed, Oct 9, 2019 at 7:57 PM Sunil Govindan <su...@apache.org> wrote:
>> Hi,
>> 
>> The expectation for the scenario you describe is more or less correct.
>> A bit more information is needed to get a clearer picture:
>> 1. the Hadoop version
>> 2. the capacity-scheduler XML (to be precise, all preemption-related configs that were added)
>> 
>> - Sunil
>> 
>> On Wed, Oct 9, 2019 at 6:23 PM Lars Francke <la...@gmail.com> wrote:
>>> Hi,
>>> 
>>> I've got a question about behavior we're seeing.
>>> 
>>> Two queues: Preemption enabled, CapacityScheduler (happy to provide more config if needed), 50% of resources to each
>>> 
>>> Submit a job to queue 1 which uses 100% of the cluster.
>>> Submit a job to queue 2 which doesn't get allocated because there are not enough resources for the AM.
>>> 
>>> I'd have expected this to trigger preemption but it doesn't happen.
>>> Is this expected?
>>> 
>>> Second question:
>>> Once we get the second job running it only gets three containers but requested six. Utilization at this point is still "unfair" i.e. queue 1 ~75% resources, queue 2 ~25% resources. We saw two preemptions happening to get to this 25% utilization but then no more.
>>> 
>>> Any idea why this could be happening?
>>> 
>>> Thank you!
>>> 
>>> Lars
>>> 
>> 
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@hadoop.apache.org
For additional commands, e-mail: user-help@hadoop.apache.org


Re: CapacityScheduler questions - (AM) preemption

Posted by Lars Francke <la...@gmail.com>.
Sunil,

this is the full capacity-scheduler.xml file:


  <configuration  xmlns:xi="http://www.w3.org/2001/XInclude">

    <property>
      <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
      <value>0.2</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.maximum-applications</name>
      <value>10000</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.node-locality-delay</name>
      <value>40</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
      <value>false</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.resource-calculator</name>
      <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.accessible-node-labels</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.acl_administer_queue</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.acl_submit_applications</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.capacity</name>
      <value>100</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.default.capacity</name>
      <value>1</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
      <value>100</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.default.priority</name>
      <value>0</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.default.state</name>
      <value>RUNNING</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
      <value>50</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.priority</name>
      <value>0</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.queues</name>
      <value>default,foo</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.acl_administer_queue</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.acl_submit_applications</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.capacity</name>
      <value>99</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.maximum-capacity</name>
      <value>100</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.minimum-user-limit-percent</name>
      <value>100</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.priority</name>
      <value>0</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.queues</name>
      <value>foo_daily,foo_ultimo</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_daily.acl_administer_queue</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_daily.acl_submit_applications</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_daily.capacity</name>
      <value>50</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_daily.maximum-capacity</name>
      <value>100</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_daily.minimum-user-limit-percent</name>
      <value>100</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_daily.ordering-policy</name>
      <value>fifo</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_daily.priority</name>
      <value>0</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_daily.state</name>
      <value>RUNNING</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_daily.user-limit-factor</name>
      <value>2</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_ultimo.acl_administer_queue</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_ultimo.acl_submit_applications</name>
      <value>*</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_ultimo.capacity</name>
      <value>50</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_ultimo.maximum-capacity</name>
      <value>100</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_ultimo.minimum-user-limit-percent</name>
      <value>100</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_ultimo.ordering-policy</name>
      <value>fifo</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_ultimo.priority</name>
      <value>0</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_ultimo.state</name>
      <value>RUNNING</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.foo_ultimo.user-limit-factor</name>
      <value>2</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.state</name>
      <value>RUNNING</value>
    </property>

    <property>
      <name>yarn.scheduler.capacity.root.foo.user-limit-factor</name>
      <value>1</value>
    </property>

  </configuration>
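One setting in the config above that interacts with the "AM never starts" symptom is yarn.scheduler.capacity.maximum-am-resource-percent (0.2 here), which caps the share of a queue's resources that AMs may consume. It can also be overridden per queue; a hypothetical sketch for one of the leaf queues (queue name and value chosen for illustration only, not a recommendation from this thread):

```xml
<!-- Hypothetical per-queue override of the global 0.2 AM resource cap. -->
<property>
  <name>yarn.scheduler.capacity.root.foo.foo_daily.maximum-am-resource-percent</name>
  <value>0.5</value>
</property>
```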

On Wed, Oct 9, 2019 at 10:56 PM Lars Francke <la...@gmail.com> wrote:

> Sunil,
>
> thank you for the answer.
>
> This is HDP 3.1 based on Hadoop 3.1.1.
> No preemption defaults were changed I believe. The only thing changed is
> to enable preemption (monitor.enabled) in Ambari but I can get the full XML
> for you if that's helpful. I'll get back to you on that.
>
> Cheers,
> Lars
>
>
>
> On Wed, Oct 9, 2019 at 7:57 PM Sunil Govindan <su...@apache.org> wrote:
>
>> Hi,
>>
>> The expectation for the scenario you describe is more or less correct.
>> A bit more information is needed to get a clearer picture:
>> 1. the Hadoop version
>> 2. the capacity-scheduler XML (to be precise, all preemption-related configs
>> that were added)
>>
>> - Sunil
>>
>> On Wed, Oct 9, 2019 at 6:23 PM Lars Francke <la...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I've got a question about behavior we're seeing.
>>>
>>> Two queues: Preemption enabled, CapacityScheduler (happy to provide more
>>> config if needed), 50% of resources to each
>>>
>>> Submit a job to queue 1 which uses 100% of the cluster.
>>> Submit a job to queue 2 which doesn't get allocated because there are
>>> not enough resources for the AM.
>>>
>>> I'd have expected this to trigger preemption but it doesn't happen.
>>> Is this expected?
>>>
>>> Second question:
>>> Once we get the second job running it only gets three containers but
>>> requested six. Utilization at this point is still "unfair" i.e. queue 1
>>> ~75% resources, queue 2 ~25% resources. We saw two preemptions happening to
>>> get to this 25% utilization but then no more.
>>>
>>> Any idea why this could be happening?
>>>
>>> Thank you!
>>>
>>> Lars
>>>
>>

Re: CapacityScheduler questions - (AM) preemption

Posted by Lars Francke <la...@gmail.com>.
Sunil,

thank you for the answer.

This is HDP 3.1 based on Hadoop 3.1.1.
No preemption defaults were changed I believe. The only thing changed is to
enable preemption (monitor.enabled) in Ambari but I can get the full XML
for you if that's helpful. I'll get back to you on that.

Cheers,
Lars



On Wed, Oct 9, 2019 at 7:57 PM Sunil Govindan <su...@apache.org> wrote:

> Hi,
>
> The expectation for the scenario you describe is more or less correct.
> A bit more information is needed to get a clearer picture:
> 1. the Hadoop version
> 2. the capacity-scheduler XML (to be precise, all preemption-related configs
> that were added)
>
> - Sunil
>
> On Wed, Oct 9, 2019 at 6:23 PM Lars Francke <la...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I've got a question about behavior we're seeing.
>>
>> Two queues: Preemption enabled, CapacityScheduler (happy to provide more
>> config if needed), 50% of resources to each
>>
>> Submit a job to queue 1 which uses 100% of the cluster.
>> Submit a job to queue 2 which doesn't get allocated because there are not
>> enough resources for the AM.
>>
>> I'd have expected this to trigger preemption but it doesn't happen.
>> Is this expected?
>>
>> Second question:
>> Once we get the second job running it only gets three containers but
>> requested six. Utilization at this point is still "unfair" i.e. queue 1
>> ~75% resources, queue 2 ~25% resources. We saw two preemptions happening to
>> get to this 25% utilization but then no more.
>>
>> Any idea why this could be happening?
>>
>> Thank you!
>>
>> Lars
>>
>

Re: CapacityScheduler questions - (AM) preemption

Posted by Sunil Govindan <su...@apache.org>.
Hi,

The expectation for the scenario you describe is more or less correct.
A bit more information is needed to get a clearer picture:
1. the Hadoop version
2. the capacity-scheduler XML (to be precise, all preemption-related configs
that were added)

- Sunil
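The preemption-related configs being asked about live in yarn-site.xml next to the monitor switch. A sketch of the commonly tuned ProportionalCapacityPreemptionPolicy knobs, shown with their upstream default values (illustrative; not the settings of the cluster in this thread):

```xml
<!-- ProportionalCapacityPreemptionPolicy tuning; values are upstream defaults. -->
<property>
  <!-- How often (ms) the policy checks for imbalance. -->
  <name>yarn.resourcemanager.monitor.capacity.preemption.monitoring_interval</name>
  <value>3000</value>
</property>
<property>
  <!-- Grace period (ms) between marking a container and killing it. -->
  <name>yarn.resourcemanager.monitor.capacity.preemption.max_wait_before_kill</name>
  <value>15000</value>
</property>
<property>
  <!-- Max fraction of cluster resources preempted per round. -->
  <name>yarn.resourcemanager.monitor.capacity.preemption.total_preemption_per_round</name>
  <value>0.1</value>
</property>
<property>
  <!-- Dead zone: over-capacity below this fraction is ignored. -->
  <name>yarn.resourcemanager.monitor.capacity.preemption.max_ignored_over_capacity</name>
  <value>0.1</value>
</property>
<property>
  <!-- Only this fraction of the remaining gap is reclaimed per round. -->
  <name>yarn.resourcemanager.monitor.capacity.preemption.natural_termination_factor</name>
  <value>0.2</value>
</property>
```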

On Wed, Oct 9, 2019 at 6:23 PM Lars Francke <la...@gmail.com> wrote:

> Hi,
>
> I've got a question about behavior we're seeing.
>
> Two queues: Preemption enabled, CapacityScheduler (happy to provide more
> config if needed), 50% of resources to each
>
> Submit a job to queue 1 which uses 100% of the cluster.
> Submit a job to queue 2 which doesn't get allocated because there are not
> enough resources for the AM.
>
> I'd have expected this to trigger preemption but it doesn't happen.
> Is this expected?
>
> Second question:
> Once we get the second job running it only gets three containers but
> requested six. Utilization at this point is still "unfair" i.e. queue 1
> ~75% resources, queue 2 ~25% resources. We saw two preemptions happening to
> get to this 25% utilization but then no more.
>
> Any idea why this could be happening?
>
> Thank you!
>
> Lars
>