Posted to issues@tez.apache.org by "Ming Ma (JIRA)" <ji...@apache.org> on 2016/08/26 23:32:20 UTC

[jira] [Updated] (TEZ-3269) Provide basic fair routing and scheduling functionality via custom VertexManager and EdgeManager

     [ https://issues.apache.org/jira/browse/TEZ-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated TEZ-3269:
-------------------------
    Attachment: TEZ-3269.patch

Here is the draft patch. It supports two policies w.r.t. fair routing.

* One policy is auto reduce, somewhat similar to ShuffleVertexManager, where the number of partitions can be reduced based on data size. Instead of using the overall size as ShuffleVertexManager does, it uses partition stats, e.g. TEZ-2962.
* Another routing policy is fair routing. Any destination task can fetch a consecutive range of partitions from a consecutive range of source tasks. Note that the patch only supports one bipartite edge. Supporting more than one bipartite edge requires more work. We can open another jira if we need to support that.
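
To make the two policies concrete, here is a minimal sketch of how per-partition sizes could drive the routing plan. It is not the code in the patch. The class name FairRoutingSketch, the method plan, the desiredMb threshold and the int[] layout are all hypothetical, and the sketch assumes that each partition's data is spread evenly across the source tasks. Small consecutive partitions are packed into one destination task (auto reduce), while a partition larger than the threshold is split across consecutive ranges of source tasks (fair routing).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: hypothetical names, not the classes added by TEZ-3269.
class FairRoutingSketch {

    /**
     * Builds one entry per destination task, laid out as
     * {firstPartition, numPartitions, firstSourceTask, numSourceTasks}.
     * Assumes desiredMb > 0 and an even spread of each partition over source tasks.
     */
    static List<int[]> plan(int[] partitionMb, int numSourceTasks, int desiredMb) {
        List<int[]> out = new ArrayList<>();
        int p = 0;
        while (p < partitionMb.length) {
            if (partitionMb[p] > desiredMb) {
                // Fair routing: split one large partition across
                // consecutive ranges of source tasks.
                int pieces = Math.min(numSourceTasks,
                        (partitionMb[p] + desiredMb - 1) / desiredMb);
                int base = numSourceTasks / pieces;
                int extra = numSourceTasks % pieces;
                int src = 0;
                for (int i = 0; i < pieces; i++) {
                    int n = base + (i < extra ? 1 : 0);
                    out.add(new int[] {p, 1, src, n});
                    src += n;
                }
                p++;
            } else {
                // Auto reduce: pack consecutive small partitions into one
                // destination task until the desired size would be exceeded.
                int start = p;
                int sum = 0;
                while (p < partitionMb.length
                        && partitionMb[p] <= desiredMb
                        && sum + partitionMb[p] <= desiredMb) {
                    sum += partitionMb[p];
                    p++;
                }
                out.add(new int[] {start, p - start, 0, numSourceTasks});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Partition sizes in MB, 4 source tasks, roughly 100 MB per destination task.
        for (int[] range : plan(new int[] {10, 20, 300, 5}, 4, 100)) {
            System.out.println(Arrays.toString(range));
        }
    }
}
```

In this sketch every destination task reads a consecutive range of partitions from a consecutive range of source tasks, which is the shape of routing the fair routing policy describes. How the patch actually picks the ranges may differ.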

Besides the core routing functionality, the patch also includes the following.

* Move global stats to per source vertex. This allows more accurate estimation of the partition size, given that one source vertex can be much larger than the others. In addition, the stats change from long to int, as the unit is MB. So there is some impact on memory even for ShuffleVertexManager, but the net impact should be acceptable. For joining, say, 20 source vertexes with 20k destination tasks, the size is 4 bytes * 20k * 20 = 1.6 MB. If we want to be safe, we can make this change specific to FairShuffleVertexManager. But we might need it anyway for TEZ-2962.
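
The memory estimate can be checked with a few lines of arithmetic. This is a rough sketch that assumes one stats entry per (source vertex, destination task) pair. The class and method names are hypothetical and do not come from the patch.

```java
// Illustrative only: hypothetical helper, not part of the patch.
class StatsMemorySketch {

    /** Bytes needed for one stats entry per (source vertex, destination task) pair. */
    static long statsBytes(int sourceVertices, int destinationTasks, int bytesPerEntry) {
        // Widen to long before multiplying so large inputs do not overflow int.
        return (long) sourceVertices * destinationTasks * bytesPerEntry;
    }

    public static void main(String[] args) {
        // 20 source vertexes, 20k destination tasks, int (4-byte) entries.
        System.out.println(statsBytes(20, 20_000, Integer.BYTES));
        // The same layout with long (8-byte) entries, for comparison.
        System.out.println(statsBytes(20, 20_000, Long.BYTES));
    }
}
```

With int entries the product is 1,600,000 bytes, and with long entries it would be twice that.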

* Refactor test code.
** The common test cases for both ShuffleVertexManager and FairShuffleVertexManager are moved to TestShuffleVertexManagerBase.
** TestShuffleVertexManager was still verifying EdgeManager via EdgeManagerPlugin#routeDataMovementEventToDestination. It should use EdgeManagerPluginOnDemand instead, as that is what is actually used.
** Break testShuffleVertexManagerAutoParallelism into individual test cases.

> Provide basic fair routing and scheduling functionality via custom VertexManager and EdgeManager
> ------------------------------------------------------------------------------------------------
>
>                 Key: TEZ-3269
>                 URL: https://issues.apache.org/jira/browse/TEZ-3269
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Ming Ma
>         Attachments: TEZ-3269.patch
>
>
> With TEZ-3206 and TEZ-3216, we can build a custom VertexManager and EdgeManager that uses partition stats to do fair routing as well as the scheduling based on destination tasks’ dependency on source tasks.


