Posted to issues@tez.apache.org by "Jason Lowe (JIRA)" <ji...@apache.org> on 2017/06/22 16:26:00 UTC

[jira] [Updated] (TEZ-3770) DAG-aware YARN task scheduler

     [ https://issues.apache.org/jira/browse/TEZ-3770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated TEZ-3770:
----------------------------
    Attachment: TEZ-3770.001.patch

Attaching a patch that provides a new scheduler class, DagAwareYarnTaskScheduler.  The scheduler is a very tricky place to change, so to mitigate risk I implemented it as a separate scheduler that must be enabled via configuration.  This scheduler has the following high-level behavioral differences from the existing YarnTaskSchedulerService class:
- When a new container arrives, it first tries to assign the container to a task request that matches the container's priority before falling back to the highest priority task request.  This avoids hanging onto unused, lower priority containers because higher priority requests are pending (see TEZ-3535).
- New task allocation requests are first matched against idle containers before requesting resources from the RM.  This cuts down on AM-RM protocol churn.
- Requests for tasks that are DAG-descendants of tasks with pending requests will not be allocated, which helps reduce priority inversions that could lead to preemption.
- Running tasks will only be preempted if they are DAG-descendants of tasks that have pending allocation requests.
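
To illustrate the last two points, here is a minimal sketch of the descendant checks.  It is not code from the patch: the class DagAwareSketch, its methods, and the vertex names are hypothetical, and it models the DAG as a simple vertex-to-children map rather than using any Tez data structure.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the DagAwareYarnTaskScheduler implementation.
final class DagAwareSketch {
    // Vertex name -> direct downstream vertices (assumed representation).
    private final Map<String, List<String>> children = new HashMap<>();

    void addEdge(String from, String to) {
        children.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        children.computeIfAbsent(to, k -> new ArrayList<>());
    }

    // All vertices reachable from the given vertex, excluding the vertex itself.
    Set<String> descendants(String vertex) {
        Set<String> seen = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        List<String> direct = children.get(vertex);
        if (direct != null) {
            stack.addAll(direct);
        }
        while (!stack.isEmpty()) {
            String v = stack.pop();
            if (seen.add(v)) {
                stack.addAll(children.get(v));
            }
        }
        return seen;
    }

    // Union of the descendants of every vertex that has pending requests.
    Set<String> descendantsOfPending(Collection<String> pendingVertices) {
        Set<String> result = new HashSet<>();
        for (String p : pendingVertices) {
            result.addAll(descendants(p));
        }
        return result;
    }

    // A request is held back while one of its DAG ancestors still has pending requests.
    boolean mayAllocate(String vertex, Collection<String> pendingVertices) {
        return !descendantsOfPending(pendingVertices).contains(vertex);
    }

    // A running task is a preemption candidate only if it descends from a pending vertex.
    boolean mayPreempt(String runningVertex, Collection<String> pendingVertices) {
        return descendantsOfPending(pendingVertices).contains(runningVertex);
    }

    public static void main(String[] args) {
        DagAwareSketch dag = new DagAwareSketch();
        dag.addEdge("A", "B");
        dag.addEdge("B", "C");
        dag.addEdge("D", "C");
        List<String> pending = Arrays.asList("B");
        System.out.println("allocate C: " + dag.mayAllocate("C", pending));
        System.out.println("allocate D: " + dag.mayAllocate("D", pending));
        System.out.println("preempt C: " + dag.mayPreempt("C", pending));
        System.out.println("preempt A: " + dag.mayPreempt("A", pending));
    }
}
```

In this sketch, with requests pending for B, a request for C is held back and a running C task is a preemption candidate, while A and D are unaffected because they are not descendants of B.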


> DAG-aware YARN task scheduler
> -----------------------------
>
>                 Key: TEZ-3770
>                 URL: https://issues.apache.org/jira/browse/TEZ-3770
>             Project: Apache Tez
>          Issue Type: New Feature
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>         Attachments: TEZ-3770.001.patch
>
>
> There are cases where priority alone does not convey the relationship between tasks, and this can cause problems when scheduling or preempting tasks.  If the YARN task scheduler were aware of the relationship between tasks then it could make smarter decisions when trying to assign tasks to containers or preempt running tasks to schedule pending tasks.


