Posted to yarn-issues@hadoop.apache.org by "zhengchenyu (JIRA)" <ji...@apache.org> on 2017/03/29 06:53:41 UTC
[jira] [Comment Edited] (YARN-6407) Improve and fix locks of RM scheduler
[ https://issues.apache.org/jira/browse/YARN-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946630#comment-15946630 ]
zhengchenyu edited comment on YARN-6407 at 3/29/17 6:53 AM:
------------------------------------------------------------
[~vinodkv]
Can you give me some advice? Thanks!
was (Author: zhengchenyu):
[~vinodkv]
> Improve and fix locks of RM scheduler
> -------------------------------------
>
> Key: YARN-6407
> URL: https://issues.apache.org/jira/browse/YARN-6407
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Affects Versions: 2.7.1
> Environment: CentOS 7, 1 Gigabit Ethernet
> Reporter: zhengchenyu
> Fix For: 2.7.1
>
> Original Estimate: 2m
> Remaining Estimate: 2m
>
> First, this issue does not duplicate YARN-3091.
> Our cluster has 5k nodes, and each server is configured with 1 Gigabit Ethernet, so the network is the bottleneck in our cluster.
> We must distcp data from the warehouse. Because of the 1 Gigabit Ethernet, we must set yarn.scheduler.fair.max.assign to 5; otherwise it leads to hotspots.
> Setting max.assign to 5 decreases the scheduler's assignment capacity, so we started the ContinuousSchedulingThread.
> As more applications run in our cluster, and with the ContinuousSchedulingThread running, the lock contention problem becomes more serious.
> In our cluster, the call queue of the ApplicationMasterService's RPC is occasionally high. We are worried that more problems will occur in the future as more applications run.
> Here is our chain of reasoning:
> "1 Gigabit Ethernet" and "data hot spot" ==> "set yarn.scheduler.fair.max.assign to 5" ==> "ContinuousSchedulingThread is started" and "more applications" ==> "lock contention"
> I know YARN-3091 addresses this problem, but its patch aims to change the object lock to a read-write lock. That change is still coarse-grained. So I think we should lock only the resources themselves, rather than locking large sections of code.
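The difference between a coarse-grained lock and locking only the resource can be sketched in Java as below. This is only an illustration under stated assumptions, not FairScheduler code and not the patch from YARN-3091 or this issue: the class names NodeResources and LockGranularitySketch, their methods, and the memory figures are all hypothetical. A shared read lock guards the set of nodes, while each node's counters are protected by that node's own monitor, so allocations on different nodes do not block one another.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a scheduler's per-node bookkeeping (not YARN code).
class NodeResources {
    private int availableMb;

    NodeResources(int availableMb) {
        this.availableMb = availableMb;
    }

    // Fine-grained: only this node is locked while its counter changes.
    synchronized boolean tryAllocate(int mb) {
        if (mb <= 0 || mb > availableMb) {
            return false;
        }
        availableMb -= mb;
        return true;
    }

    synchronized int getAvailableMb() {
        return availableMb;
    }
}

// Hypothetical scheduler skeleton showing two lock granularities.
class LockGranularitySketch {
    private final ConcurrentMap<String, NodeResources> nodes = new ConcurrentHashMap<>();
    // Guards membership changes (nodes joining); allocations share the read lock.
    private final ReentrantReadWriteLock topologyLock = new ReentrantReadWriteLock();

    void addNode(String id, int mb) {
        topologyLock.writeLock().lock();
        try {
            nodes.put(id, new NodeResources(mb));
        } finally {
            topologyLock.writeLock().unlock();
        }
    }

    // Many threads may allocate concurrently; they contend only on the same node.
    boolean allocate(String id, int mb) {
        topologyLock.readLock().lock();
        try {
            NodeResources node = nodes.get(id);
            return node != null && node.tryAllocate(mb);
        } finally {
            topologyLock.readLock().unlock();
        }
    }

    // Returns -1 for an unknown node.
    int available(String id) {
        topologyLock.readLock().lock();
        try {
            NodeResources node = nodes.get(id);
            return node == null ? -1 : node.getAvailableMb();
        } finally {
            topologyLock.readLock().unlock();
        }
    }
}
```

With a single object lock, the heartbeat-driven path and a continuous scheduling thread would serialize on every call. In this sketch they would serialize only when touching the same node, or when a node is being added.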
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org