Posted to yarn-issues@hadoop.apache.org by "zhengchenyu (JIRA)" <ji...@apache.org> on 2017/05/10 01:57:04 UTC
[jira] [Comment Edited] (YARN-6568) A queue which runs a long time
job couldn't acquire any container for long time.
[ https://issues.apache.org/jira/browse/YARN-6568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16003908#comment-16003908 ]
zhengchenyu edited comment on YARN-6568 at 5/10/17 1:56 AM:
------------------------------------------------------------
[~yufeigu]
Sorry, I did not express myself clearly!
I meant that the minShare1 that is big enough is the value configured in fair-scheduler.xml, not the variable 'minShare1'. The configured value equals s1.getMinShare().
Look at the code below. If the minShare1 configured in fair-scheduler.xml is big enough, the variable 'minShare1' equals s1.getDemand(). That means the variable 'minShare1' = resourceUsage + request.
{
Resource minShare1 = Resources.min(RESOURCE_CALCULATOR, null, s1.getMinShare(), s1.getDemand());
}
Look at the code below. In that case, minShareRatio1 = resourceUsage1 / 'minShare1' = resourceUsage1 / (resourceUsage1 + request1).
{
minShareRatio1 = (double) resourceUsage1.getMemory() / Resources.max(RESOURCE_CALCULATOR, null, minShare1, ONE).getMemory();
}
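To make the arithmetic concrete, here is a minimal sketch of the two expressions above. It uses plain long values (memory in MB) instead of YARN Resource objects; the class name, method names, and all numbers are hypothetical and are not part of the FairScheduler code. It assumes demand = usage + pending request, as described above.

```java
// Simplified model of the min-share ratio discussed above.
// Plain long values (memory in MB) stand in for YARN Resource objects.
class MinShareRatioSketch {

    /** Effective min share: min(configured min share, demand). */
    static long effectiveMinShare(long configuredMinShareMb, long demandMb) {
        return Math.min(configuredMinShareMb, demandMb);
    }

    /** usage / max(effective min share, 1), mirroring the minShareRatio1 expression. */
    static double minShareRatio(long usageMb, long configuredMinShareMb, long pendingRequestMb) {
        // Assumption from the comment above: demand = current usage + outstanding request.
        long demandMb = usageMb + pendingRequestMb;
        long minShareMb = effectiveMinShare(configuredMinShareMb, demandMb);
        return (double) usageMb / Math.max(minShareMb, 1L);
    }

    public static void main(String[] args) {
        long configuredMinShareMb = 1_000_000L; // hypothetical, deliberately "big enough"

        // Same usage, different amounts of pending requests.
        double fewRequests = minShareRatio(8192L, configuredMinShareMb, 1024L);
        double manyRequests = minShareRatio(8192L, configuredMinShareMb, 65536L);

        System.out.println("few requests:  " + fewRequests);
        System.out.println("many requests: " + manyRequests);
    }
}
```

With the same usage, the queue with fewer pending requests gets the larger ratio. If queues are ordered by ascending ratio, that queue sorts after the queue with many pending requests, which matches the behavior reported in this issue.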
> A queue which runs a long time job couldn't acquire any container for long time.
> --------------------------------------------------------------------------------
>
> Key: YARN-6568
> URL: https://issues.apache.org/jira/browse/YARN-6568
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Affects Versions: 2.7.1
> Environment: CentOS 7.1
> Reporter: zhengchenyu
> Fix For: 2.7.4
>
> Original Estimate: 1m
> Remaining Estimate: 1m
>
> In our cluster, we found that some applications couldn't acquire any container for a long time. (Note: we use FairScheduler with FairSharePolicy.)
> First, I found an unreasonable configuration: we had set minRes=maxRes. As a result, some applications stayed pending for a long time, and we killed some large applications to work around the problem. After we changed this configuration, the problem was alleviated.
> But the problem is not completely solved. In our cluster, I found that applications in some queues which request few containers still stay pending for a long time.
> I reproduced this in a test cluster. I submitted a DistributedShell application which runs many loo applications to queueA, then I submitted my own YARN application, which requests and releases containers constantly, to queueB. At that point, any applications submitted to queueA stay pending!
> We believe this is a problem in FairSharePolicy: it takes each queue's requests into account. So after the queues are sorted, queues which have few requests are always ordered last.
> We know that once the AM container is launched, the requests will increase. But FairSharePolicy can't distinguish which request is the AM request. I think that once the AM container is assigned, the problem is solved.
> My colleagues and I discussed this problem. We recommend setting a timeout for each queue, measured as the length of time the queue has gone without being assigned a container. When the timeout expires, we move this queue to the first place in the queue list.
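The timeout idea proposed above could be sketched roughly as follows. This is only an illustration of the proposal, not FairScheduler code: QueueInfo, withTimeout, order, and the field names are all hypothetical, and a single double stands in for whatever ordering the base policy applies.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class StarvationTimeoutSketch {

    /** Hypothetical stand-in for a scheduler queue; not a YARN class. */
    static final class QueueInfo {
        final String name;
        final double minShareRatio;    // ordering key of the base policy in this sketch
        final long lastAssignedMillis; // last time the queue was assigned a container

        QueueInfo(String name, double minShareRatio, long lastAssignedMillis) {
            this.name = name;
            this.minShareRatio = minShareRatio;
            this.lastAssignedMillis = lastAssignedMillis;
        }
    }

    /** Queues unassigned for longer than timeoutMillis sort first; others keep the base order. */
    static Comparator<QueueInfo> withTimeout(long nowMillis, long timeoutMillis) {
        Comparator<QueueInfo> base = Comparator.comparingDouble(q -> q.minShareRatio);
        return (a, b) -> {
            boolean aTimedOut = nowMillis - a.lastAssignedMillis > timeoutMillis;
            boolean bTimedOut = nowMillis - b.lastAssignedMillis > timeoutMillis;
            if (aTimedOut != bTimedOut) {
                return aTimedOut ? -1 : 1; // the timed-out queue moves to the front
            }
            return base.compare(a, b);
        };
    }

    /** Returns queue names in scheduling order without modifying the input list. */
    static List<String> order(List<QueueInfo> queues, long nowMillis, long timeoutMillis) {
        List<QueueInfo> sorted = new ArrayList<>(queues);
        sorted.sort(withTimeout(nowMillis, timeoutMillis));
        List<String> names = new ArrayList<>();
        for (QueueInfo q : sorted) {
            names.add(q.name);
        }
        return names;
    }
}
```

A real implementation would presumably also need to check that the timed-out queue still has pending requests, so that idle queues are not moved to the front; the sketch leaves that out.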
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)