Posted to issues@tez.apache.org by "Hitesh Shah (JIRA)" <ji...@apache.org> on 2013/11/05 01:56:17 UTC
[jira] [Resolved] (TEZ-591) Provide mode specific diagnostic information to the Tez client

     [ https://issues.apache.org/jira/browse/TEZ-591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hitesh Shah resolved TEZ-591.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.2.0

Committed to master.

> Provide mode specific diagnostic information to the Tez client
> --------------------------------------------------------------
>
>                 Key: TEZ-591
>                 URL: https://issues.apache.org/jira/browse/TEZ-591
>             Project: Apache Tez
>          Issue Type: Wish
>            Reporter: Cheolsoo Park
>            Assignee: Hitesh Shah
>            Priority: Minor
>             Fix For: 0.2.0
>
>         Attachments: TEZ-591.1.patch
>
>
> While developing Pig on Tez, I found it hard to debug DAG failures because of the lack of diagnostic information. Currently, the MR Pig client reports the backend error message when a job fails. For example, if I have a UDF that throws a runtime exception, I will see the following stack trace in the front-end log file:
> {code}
> Pig Stack Trace
> ---------------
> ERROR 1066: Unable to open iterator for alias b. Backend error : FAIL IT NOW!
> ...
> Caused by: java.lang.RuntimeException: FAIL IT NOW!
>     at Kill.exec(Kill.java:9)
>     at Kill.exec(Kill.java:6)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:334)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:383)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:346)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:372)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:297)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:771)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
> {code}
> Basically, I'd like to do something similar in Pig on Tez.
> If there are multiple failed vertices and tasks, it may not be possible to propagate all the backend exceptions to the frontend. But would it be possible to propagate at least some of the first ones? Perhaps one per failed vertex? Given that DAGStatus.getDiagnostics() returns a list of Strings, it seems feasible.
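
The "one per failed vertex" idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: TaskFailure and firstPerVertex are hypothetical names, not Tez or Pig classes, and the only thing taken from the issue is that the result is a list of Strings, the shape mentioned for DAGStatus.getDiagnostics().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DiagnosticsSketch {

    /** Hypothetical record of one failed task; NOT a Tez or Pig class. */
    static final class TaskFailure {
        final String vertexName;
        final String message;

        TaskFailure(String vertexName, String message) {
            this.vertexName = vertexName;
            this.message = message;
        }
    }

    /**
     * Keeps only the first failure message seen for each vertex,
     * preserving the order in which vertices first failed.
     */
    static List<String> firstPerVertex(List<TaskFailure> failures) {
        // LinkedHashMap keeps insertion order; putIfAbsent ignores later failures.
        Map<String, String> firstByVertex = new LinkedHashMap<>();
        for (TaskFailure f : failures) {
            firstByVertex.putIfAbsent(f.vertexName, f.message);
        }
        List<String> diagnostics = new ArrayList<>();
        for (Map.Entry<String, String> e : firstByVertex.entrySet()) {
            diagnostics.add("Vertex " + e.getKey() + " failed: " + e.getValue());
        }
        return diagnostics;
    }

    public static void main(String[] args) {
        List<TaskFailure> failures = new ArrayList<>();
        failures.add(new TaskFailure("v1", "java.lang.RuntimeException: FAIL IT NOW!"));
        failures.add(new TaskFailure("v1", "java.lang.RuntimeException: FAIL IT NOW!"));
        // Hypothetical second vertex failure, invented for the example.
        failures.add(new TaskFailure("v2", "java.lang.IllegalStateException: example"));
        for (String line : firstPerVertex(failures)) {
            System.out.println(line);
        }
    }
}
```

Capping the list at one entry per vertex keeps the client-side report short even when many tasks of the same vertex fail for the same reason, which is the common case for a UDF that throws on every record.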



--
This message was sent by Atlassian JIRA
(v6.1#6144)