Posted to issues@tez.apache.org by "Peter Slawski (JIRA)" <ji...@apache.org> on 2016/07/19 00:03:20 UTC

[jira] [Updated] (TEZ-3356) Fix initializing of stats when custom ShuffleVertexManager is used

     [ https://issues.apache.org/jira/browse/TEZ-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Slawski updated TEZ-3356:
-------------------------------
    Attachment: TEZ-3356.1.patch

> Fix initializing of stats when custom ShuffleVertexManager is used
> ------------------------------------------------------------------
>
>                 Key: TEZ-3356
>                 URL: https://issues.apache.org/jira/browse/TEZ-3356
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.8.4
>            Reporter: Peter Slawski
>         Attachments: TEZ-3356.1.patch
>
>
> When using a custom ShuffleVertexManager to set a vertex’s parallelism, the partition stats field is left uninitialized even after the manager itself has been initialized. As a result, an IllegalStateException is thrown when VertexManagerEvents are processed at the start of the vertex, because the stats field has not yet been initialized. Note that these events contain the partition sizes that are aggregated and stored in this stats field.
>  
> Apache Pig’s grace auto-parallelism feature uses a custom ShuffleVertexManager that sets a vertex’s parallelism upon the completion of one of its parent’s parents. It therefore hits this corner case, and Pig scripts with grace parallelism enabled fail if the DAG contains at least one vertex that has grandparents.
>  
> The fix should be straightforward: update pending tasks before, rather than after, VertexManagerEvents are processed, so that the partition stats field is initialized in time.
>  
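
The ordering problem described above can be illustrated with a minimal, self-contained sketch. This is not the actual Tez code or the attached patch: the class SketchVertexManager, its methods, and the fixedOrder flag are hypothetical stand-ins, and the sketch assumes only what the description states, namely that the stats field is set up by the "update pending tasks" step and that events processed at vertex start fail if that step has not run yet.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the ordering issue; not the real Tez classes.
class SketchVertexManager {
    private long[] partitionStats;   // stays null until updatePendingTasks() runs
    private final int numPartitions;
    private final List<long[]> queuedEvents = new ArrayList<>();

    SketchVertexManager(int numPartitions) {
        this.numPartitions = numPartitions;
    }

    // Stand-in for the step that initializes the partition stats field.
    void updatePendingTasks() {
        if (partitionStats == null) {
            partitionStats = new long[numPartitions];
        }
    }

    // Stand-in for an event carrying partition sizes, held until the vertex starts.
    void queueEvent(long[] partitionSizes) {
        queuedEvents.add(partitionSizes);
    }

    // Aggregates partition sizes into the stats field; fails if it is uninitialized.
    private void processEvent(long[] partitionSizes) {
        if (partitionStats == null) {
            throw new IllegalStateException("partition stats not initialized");
        }
        for (int i = 0; i < numPartitions; i++) {
            partitionStats[i] += partitionSizes[i];
        }
    }

    // fixedOrder == true: initialize stats BEFORE processing events (proposed order).
    // fixedOrder == false: initialize stats AFTER processing events (failing order).
    void onVertexStarted(boolean fixedOrder) {
        if (fixedOrder) {
            updatePendingTasks();
        }
        for (long[] event : queuedEvents) {
            processEvent(event);
        }
        queuedEvents.clear();
        if (!fixedOrder) {
            updatePendingTasks();
        }
    }

    long statFor(int partition) {
        return partitionStats[partition];
    }
}
```

In this sketch, calling onVertexStarted(false) with a queued event throws IllegalStateException, while onVertexStarted(true) aggregates the queued partition sizes without error.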



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)