Posted to yarn-dev@hadoop.apache.org by "Nathan Roberts (JIRA)" <ji...@apache.org> on 2015/10/22 00:32:27 UTC

[jira] [Created] (YARN-4287) Capacity Scheduler: Rack Locality improvement

Nathan Roberts created YARN-4287:
------------------------------------

             Summary: Capacity Scheduler: Rack Locality improvement
                 Key: YARN-4287
                 URL: https://issues.apache.org/jira/browse/YARN-4287
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: capacityscheduler
    Affects Versions: 2.7.1
            Reporter: Nathan Roberts
            Assignee: Nathan Roberts


YARN-4189 does an excellent job describing the issues with the current delay scheduling algorithms within the capacity scheduler. The design proposal also seems like a good direction.

This JIRA proposes a simple interim solution to the key issue we've been experiencing on a regular basis:
 - rackLocal assignments trickle out due to nodeLocalityDelay. This can have a significant impact on things like CombineFileInputFormat, which targets very specific nodes in its split calculations.

I'm not sure when YARN-4189 will become reality, so I thought a simple interim patch might make sense. The basic idea is simple:
1) Separate delays for rackLocal and OffSwitch (today there is only one delay)
2) When we're getting rackLocal assignments, subsequent rackLocal assignments should not be delayed
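
To make the two points above concrete, here is a minimal sketch of the intended behavior. It is not the patch and not actual CapacityScheduler code: the class, the field names, the use of a "missed opportunities" counter, and the reset-on-nodeLocal rule are all assumptions made purely for illustration.

```java
/**
 * Hypothetical illustration only. None of these names exist in YARN;
 * the counter-based delay and the reset rule are assumptions.
 */
class LocalityDelaySketch {
    enum Locality { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

    // Point 1: two separate thresholds instead of a single shared delay.
    private final int rackLocalityDelay;
    private final int offSwitchDelay;

    // Scheduling opportunities skipped since the last nodeLocal assignment.
    private int missedOpportunities = 0;

    // Point 2: remember whether the most recent assignment was rackLocal.
    private boolean lastWasRackLocal = false;

    LocalityDelaySketch(int rackLocalityDelay, int offSwitchDelay) {
        this.rackLocalityDelay = rackLocalityDelay;
        this.offSwitchDelay = offSwitchDelay;
    }

    /** Returns true if an assignment at the given locality may proceed now. */
    boolean canAssign(Locality type) {
        if (type == Locality.NODE_LOCAL) {
            return true; // nodeLocal is never delayed
        }
        if (type == Locality.RACK_LOCAL) {
            // Once rackLocal assignments are flowing, do not delay the next one.
            return lastWasRackLocal || missedOpportunities >= rackLocalityDelay;
        }
        return missedOpportunities >= offSwitchDelay;
    }

    /** Called when a node was offered but nothing was assigned on it. */
    void recordMissedOpportunity() {
        missedOpportunities++;
    }

    /** Called after a container is assigned at the given locality. */
    void recordAssignment(Locality type) {
        lastWasRackLocal = (type == Locality.RACK_LOCAL);
        if (type == Locality.NODE_LOCAL) {
            missedOpportunities = 0; // assumed: start waiting again after nodeLocal
        }
    }

    public static void main(String[] args) {
        LocalityDelaySketch s = new LocalityDelaySketch(2, 5);
        s.recordMissedOpportunity();
        s.recordMissedOpportunity();
        System.out.println("rackLocal allowed: " + s.canAssign(Locality.RACK_LOCAL));
        System.out.println("offSwitch allowed: " + s.canAssign(Locality.OFF_SWITCH));
    }
}
```

In this sketch, the first rackLocal assignment still waits out its own delay, but a request that has started receiving rackLocal containers keeps receiving them without waiting again, which is the "trickle" this issue is trying to avoid.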

Patch will be uploaded shortly. No big deal if the consensus is to go straight to YARN-4189. 

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)