Posted to mapreduce-user@hadoop.apache.org by Wangda Tan <wh...@gmail.com> on 2015/07/17 02:31:00 UTC

Re: FW: Is it valid to use userLimit in calculating maxApplicationsPerUser ?

I think it's not valid. Multiplying it by ULF (userLimitFactor) does not seem
reasonable; I think it should be:

max(1, maxApplications * max(userlimit/100, 1/#activeUsers))

If an admin sets a very large ULF (e.g. 100), maxApplicationsPerUser
can become much larger than the queue's maxApplications.
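
To make both failure modes concrete, here is a minimal sketch that lifts the formula quoted in this thread into a standalone class. The class and method names are hypothetical and this is not the actual scheduler code; it only reproduces the arithmetic.

```java
// Sketch only: the formula quoted in this thread, isolated for illustration.
// CurrentMaxAppsPerUser and compute() are hypothetical names, not Hadoop APIs.
class CurrentMaxAppsPerUser {

    // maxApplications: application cap already derived for the queue
    // userLimit:       minimum-user-limit-percent
    // userLimitFactor: the ULF multiplier
    static int compute(int maxApplications, int userLimit, float userLimitFactor) {
        return (int) (maxApplications * (userLimit / 100.0f) * userLimitFactor);
    }

    public static void main(String[] args) {
        // Queue cap of 2, userLimit 20, default ULF of 1: 0.4 truncates to 0.
        System.out.println(compute(2, 20, 1.0f));     // prints 0
        // Queue cap of 10, userLimit 100, ULF of 100: far above the queue cap.
        System.out.println(compute(10, 100, 100.0f)); // prints 1000
    }
}
```

The first call is the case reported in the quoted message below (a user ends up with a limit of zero); the second is the large-ULF case, where the per-user limit exceeds the queue's own limit.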

Also, multiplying by ULF to compute userAMResourceLimit does not seem valid
either: it can let a single user run many more applications than expected,
which we should avoid.

Thoughts?

Thanks,
Wangda

On Thu, Jul 16, 2015 at 12:53 AM, Naganarasimha G R (Naga) <
garlanaganarasimha@huawei.com> wrote:

>   Hi Folks,
>  I came across a scenario where maxApplications at the cluster level (2
> nodes) was set to a low value such as 10, and based on the capacity
> configuration for a particular queue it came to 2. The formula then used
> to calculate maxApplicationsPerUser is:
>
>  *maxApplicationsPerUser = (int)(maxApplications * (userLimit / 100.0f) *
> userLimitFactor);*
>
>  But the definition of *userLimit* in the documentation is:
> *Each queue enforces a limit on the percentage of resources allocated to a
> user at any given time, if there is demand for resources. The user limit
> can vary between a minimum and maximum value. The former (the minimum
> value) is set to this property value and the latter (the maximum value)
> depends on the number of users who have submitted applications. For e.g.,
> suppose the value of this property is 25. If two users have submitted
> applications to a queue, no single user can use more than 50% of the queue
> resources. If a third user submits an application, no single user can use
> more than 33% of the queue resources. With 4 or more users, no user can use
> more than 25% of the queue's resources. A value of 100 implies no user
> limits are imposed. The default is 100. Value is specified as an integer.*
>
>  So I was wondering how a *minimum limit can be used in a formula to
> calculate the maximum applications for a user*. Suppose I set
> "yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent" to *20*,
> assuming that at least 20% of the queue is available to a user at the
> minimum; based on the formula, *maxApplicationsPerUser* is set to *zero*.
> According to the definition of the property, the maximum is based on the
> current number of active users, so I feel this formula is wrong.
> *P.S.* userLimitFactor was configured with its default of 1, but what I am
> wondering is whether it's valid to use it in combination with userLimit to
> find the max apps per user.
>
>  Please correct me if my understanding is wrong. If it's a bug, I would
> like to raise it and fix it.
>
>  + Naga
>

RE: FW: Is it valid to use userLimit in calculating maxApplicationsPerUser ?

Posted by "Naganarasimha G R (Naga)" <ga...@huawei.com>.
Thanks, Wangda, for the clarification.
I was thinking about
max(1, maxApplications * userlimit/100)
but
max(1, maxApplications * max(userlimit/100, 1/#activeUsers))
will be much more dynamic and accurate as per the description of userlimit. I will raise an issue and start working on it.
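
For comparison, the two candidate formulas above can be sketched side by side. This is only an illustration under assumptions: the class and method names are made up, the result is truncated to an int as in the existing formula, and an active-user count below 1 is treated as 1 to avoid dividing by zero.

```java
// Sketch only: the two proposals from this thread, with hypothetical names.
class ProposedMaxAppsPerUser {

    // max(1, maxApplications * userlimit/100)
    static int fixedShare(int maxApplications, int userLimit) {
        return Math.max(1, (int) (maxApplications * (userLimit / 100.0f)));
    }

    // max(1, maxApplications * max(userlimit/100, 1/#activeUsers))
    static int dynamicShare(int maxApplications, int userLimit, int activeUsers) {
        // Assumption: a queue with no active users is treated as having one.
        int users = Math.max(1, activeUsers);
        float share = Math.max(userLimit / 100.0f, 1.0f / users);
        return Math.max(1, (int) (maxApplications * share));
    }

    public static void main(String[] args) {
        int maxApplications = 10;
        int userLimit = 25;
        System.out.println("fixed: " + fixedShare(maxApplications, userLimit)); // fixed: 2
        // With 1..5 active users the dynamic limit is 10, 5, 3, 2, 2.
        for (int users = 1; users <= 5; users++) {
            System.out.println(users + " active: "
                + dynamicShare(maxApplications, userLimit, users));
        }
    }
}
```

With userlimit at 25, the dynamic version follows the 100% / 50% / 33% / 25% progression from the property description, while the fixed version always gives the 25% floor. Both return at least 1, so the zero case from the original report cannot occur.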

+Naga

