Posted to issues@tez.apache.org by "Yingda Chen (JIRA)" <ji...@apache.org> on 2019/04/26 23:20:00 UTC

[jira] [Assigned] (TEZ-4060) NoOpVertexManager schedules tasks that are not ready to run

     [ https://issues.apache.org/jira/browse/TEZ-4060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yingda Chen reassigned TEZ-4060:
--------------------------------

    Assignee: Ying Han

> NoOpVertexManager schedules tasks that are not ready to run
> -----------------------------------------------------------
>
>                 Key: TEZ-4060
>                 URL: https://issues.apache.org/jira/browse/TEZ-4060
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.9.1
>            Reporter: Adrian Nicoara
>            Assignee: Ying Han
>            Priority: Major
>
> During recovery, vertices that have already been reconfigured are assigned a NoOpVertexManager:
> [https://github.com/apache/tez/blob/8395a9560a131799f1af49b26e1f10f12ef48752/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java#L2689-L2711]
> [https://github.com/apache/tez/blob/8395a9560a131799f1af49b26e1f10f12ef48752/tez-dag/src/main/java/org/apache/tez/dag/app/RecoveryParser.java#L970-L972]
> The NoOpVertexManager directly schedules tasks upon being started:
> [https://github.com/apache/tez/blob/8395a9560a131799f1af49b26e1f10f12ef48752/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java#L4628]
> However, in a large graph, all vertices can end up configured and started before many of their inputs (for vertices that are not attached to the roots) have been generated.
> As a result, tasks are scheduled that are not ready to run, and they keep failing until their inputs are generated.
> Besides bypassing the input dependency checking that is generally done in VertexManagerPlugin#onSourceTaskCompleted, this prevents us from executing the custom logic in our own VertexManagerPlugins that is needed to configure downstream vertices. We communicate some graph configuration metadata through global objects that are populated by calls to VertexManagerPlugin#onVertexStateUpdated, and those calls no longer reach our plugins.
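
To make the difference concrete, here is a rough, self-contained sketch of the two scheduling behaviours described above. It does not use the Tez API: ToyVertexManager, ImmediateManager and InputAwareManager are hypothetical names invented for this illustration, and the callbacks only loosely mirror onVertexStarted and onSourceTaskCompleted. The first manager schedules every task as soon as the vertex starts, as the description reports for NoOpVertexManager; the second waits until every source task it knows about has completed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SchedulingSketch {

    /** Hypothetical stand-in for a vertex manager; NOT the Tez VertexManagerPlugin API. */
    interface ToyVertexManager {
        void onVertexStarted();
        void onSourceTaskCompleted(String sourceTask);
        List<String> scheduledTasks();
    }

    /** Schedules every task as soon as the vertex starts, ignoring inputs. */
    static final class ImmediateManager implements ToyVertexManager {
        private final List<String> tasks;
        private final List<String> scheduled = new ArrayList<>();

        ImmediateManager(List<String> tasks) {
            this.tasks = tasks;
        }

        @Override
        public void onVertexStarted() {
            scheduled.addAll(tasks);
        }

        @Override
        public void onSourceTaskCompleted(String sourceTask) {
            // Source completions are ignored in this variant.
        }

        @Override
        public List<String> scheduledTasks() {
            return scheduled;
        }
    }

    /** Schedules tasks only once the vertex has started and all source tasks are done. */
    static final class InputAwareManager implements ToyVertexManager {
        private final List<String> tasks;
        private final Set<String> pendingSources;
        private final List<String> scheduled = new ArrayList<>();
        private boolean started;

        InputAwareManager(List<String> tasks, List<String> sourceTasks) {
            this.tasks = tasks;
            this.pendingSources = new HashSet<>(sourceTasks);
        }

        @Override
        public void onVertexStarted() {
            started = true;
            maybeSchedule();
        }

        @Override
        public void onSourceTaskCompleted(String sourceTask) {
            pendingSources.remove(sourceTask);
            maybeSchedule();
        }

        @Override
        public List<String> scheduledTasks() {
            return scheduled;
        }

        private void maybeSchedule() {
            // Schedule at most once, and only when nothing is still pending.
            if (started && pendingSources.isEmpty() && scheduled.isEmpty()) {
                scheduled.addAll(tasks);
            }
        }
    }

    public static void main(String[] args) {
        List<String> tasks = Arrays.asList("t0", "t1");
        List<String> sources = Arrays.asList("s0", "s1");

        ToyVertexManager immediate = new ImmediateManager(tasks);
        immediate.onVertexStarted();
        System.out.println("immediate, after start: " + immediate.scheduledTasks());

        ToyVertexManager aware = new InputAwareManager(tasks, sources);
        aware.onVertexStarted();
        System.out.println("input-aware, after start: " + aware.scheduledTasks());
        aware.onSourceTaskCompleted("s0");
        aware.onSourceTaskCompleted("s1");
        System.out.println("input-aware, after sources: " + aware.scheduledTasks());
    }
}
```

In this toy model the immediate variant hands out tasks whose inputs may not exist yet, which is the failure mode reported here. A real fix would have to work within Tez's recovery path and plugin lifecycle, which this sketch does not attempt to model.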



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)