Posted to issues@tez.apache.org by "Jason Lowe (JIRA)" <ji...@apache.org> on 2017/02/14 16:55:41 UTC
[jira] [Updated] (TEZ-394) Better scheduling for uneven DAGs

     [ https://issues.apache.org/jira/browse/TEZ-394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated TEZ-394:
---------------------------
    Attachment: TEZ-394.002.patch

bq. will this work or RootV2 will only be run after IntermediateV1 is run?

As Bikas said, this just changes the container priorities used by vertices; it doesn't otherwise change when they want to start.  If RootV2 and IntermediateV1 both ask for containers at the same time, the latter will receive allocations first, but if there is enough capacity in the cluster then all of the requests will be allocated.
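
To illustrate the point about priorities versus capacity, here is a minimal sketch. It is not Tez or YARN code: the allocate function, the priority numbers, and the container counts are all hypothetical, and it assumes a lower priority number is served first.

```python
import heapq

def allocate(requests, capacity):
    """Grant container requests in priority order until capacity runs out.

    requests: list of (priority, vertex_name, containers_wanted) tuples.
    Assumption of this sketch: a lower priority number is served first.
    Returns a dict mapping vertex_name -> containers granted.
    """
    heap = list(requests)
    heapq.heapify(heap)
    granted = {}
    while heap and capacity > 0:
        _, vertex, wanted = heapq.heappop(heap)
        give = min(wanted, capacity)
        granted[vertex] = give
        capacity -= give
    return granted

# Hypothetical requests arriving at the same time.
requests = [(2, "RootV2", 4), (1, "IntermediateV1", 4)]

# Tight cluster: IntermediateV1 is satisfied first, RootV2 gets the remainder.
tight = allocate(requests, capacity=5)

# Ample cluster: both requests are fully satisfied.
ample = allocate(requests, capacity=10)
```

In the tight case the priority decides who is served first; in the ample case it makes no difference to what each vertex ends up with.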

Thanks for the review, Bikas!

bq. The name of the assigned variable is now misleading because it's not topo sorted anymore.

I believe it is still topo sorted?  As I understand it, topo sorting simply means child vertices won't appear before parent vertices in the sort order; it doesn't imply anything further about the order (e.g. breadth first, depth first, etc.).  I believe the code using topologicalVertexStack still needs it to be topo sorted so it isn't visiting child vertices before parent vertices.  I went ahead and updated it to use two variables: topologicalVertexStack when we're detecting cycles as we did before, and critPathVertexStack after it gets reordered.  Both are still topologically sorted, but hopefully this makes the difference between the two stacks a bit clearer.
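
As a small illustration of why a reordered stack can still be topologically sorted, the sketch below checks the "no child before its parent" property for two different orderings of the same DAG. It is not taken from the patch: the is_topologically_sorted helper and the example DAG (A -> B -> C, X -> C) are hypothetical.

```python
def is_topologically_sorted(order, edges):
    """Return True if no child vertex appears before its parent in `order`.

    order: list of vertex names, each appearing once.
    edges: iterable of (parent, child) pairs.
    """
    position = {vertex: index for index, vertex in enumerate(order)}
    return all(position[parent] < position[child] for parent, child in edges)

# Hypothetical DAG: A -> B -> C, with X feeding only the final vertex C.
edges = [("A", "B"), ("B", "C"), ("X", "C")]

order_1 = ["A", "X", "B", "C"]  # X placed early
order_2 = ["A", "B", "X", "C"]  # X placed late, just before its consumer
bad = ["A", "C", "B", "X"]      # C appears before its parents B and X

both_valid = (is_topologically_sorted(order_1, edges)
              and is_topologically_sorted(order_2, edges))
```

Both order_1 and order_2 satisfy the property, so moving X later in the order does not break the topological sort; only an ordering like bad, which places a child ahead of a parent, does.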

> Better scheduling for uneven DAGs
> ---------------------------------
>
>                 Key: TEZ-394
>                 URL: https://issues.apache.org/jira/browse/TEZ-394
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Rohini Palaniswamy
>            Assignee: Jason Lowe
>         Attachments: TEZ-394.001.patch, TEZ-394.002.patch
>
>
>   Consider a series of joins or group-bys of dataset A with a few other datasets that takes 10 hours, followed by a final join with a dataset X. The vertex that loads dataset X will be one of the top vertices and will be initialized early even though its output is not consumed until the end, 10 hours later.
> 1) Could either use delayed start logic for better resource allocation
> 2) Else, if they are started upfront, need to handle failure/recovery cases where the nodes which executed the MapTask might have gone down by the time the final join happens.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)