Posted to dev@pig.apache.org by "Rohini Palaniswamy (JIRA)" <ji...@apache.org> on 2016/06/08 20:06:21 UTC

[jira] [Updated] (PIG-4911) Provide option to disable DAG recovery

     [ https://issues.apache.org/jira/browse/PIG-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-4911:
------------------------------------
    Status: Patch Available  (was: Open)

> Provide option to disable DAG recovery
> --------------------------------------
>
>                 Key: PIG-4911
>                 URL: https://issues.apache.org/jira/browse/PIG-4911
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.17.0
>
>         Attachments: PIG-4911-1.patch
>
>
>   Tez 0.7 has many issues with DAG recovery when auto parallelism is in use, causing hung DAGs in many cases, because it did not write auto parallelism decisions to the recovery history. Recovery was rewritten in Tez 0.8 to handle that.
>   Code was added to Tez to automatically disable recovery when there is auto parallelism, so that it would benefit both Pig and Tez. It works fine: the second AM attempt fails with a "DAG cannot be recovered" error when it sees that there are vertices with auto parallelism. The problem is that it is hard for users to see what the actual failure was, and hard to debug as well, because the whole UI state is rewritten with the partial recovery information.
>   Disabling recovery in Pig itself by setting tez.dag.recovery.enabled=false avoids the second attempt altogether, which would eventually fail anyway. It also makes it easier to debug the original failure.
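The approach described above could be sketched roughly as follows. This is a minimal illustration and is not taken from PIG-4911-1.patch: the class and method names are hypothetical, plain java.util.Properties stands in for the job configuration, and the choice to leave an explicit user setting untouched is an assumption. Only the property name tez.dag.recovery.enabled comes from the issue description.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not the code from the patch.
final class RecoveryDefaults {
    // Property name quoted in the issue description.
    static final String DAG_RECOVERY_ENABLED = "tez.dag.recovery.enabled";

    // Returns a copy of the given properties with DAG recovery turned off,
    // unless the property was already set explicitly (assumed behaviour).
    static Properties withRecoveryDisabled(Properties userProps) {
        Properties effective = new Properties();
        effective.putAll(userProps);
        if (effective.getProperty(DAG_RECOVERY_ENABLED) == null) {
            effective.setProperty(DAG_RECOVERY_ENABLED, "false");
        }
        return effective;
    }

    public static void main(String[] args) {
        Properties user = new Properties();
        Properties effective = withRecoveryDisabled(user);
        System.out.println(DAG_RECOVERY_ENABLED + "=" + effective.getProperty(DAG_RECOVERY_ENABLED));
    }
}
```

With recovery off, a failed AM is not followed by a second attempt that tries to recover the DAG, so the original failure stays visible for debugging.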



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)