Posted to issues@tez.apache.org by "Jeff Zhang (JIRA)" <ji...@apache.org> on 2014/07/04 06:40:33 UTC

[jira] [Updated] (TEZ-1238) Display more clear diagnostics info on client side if missing jar in LocalResource or Exception happen in Processor

     [ https://issues.apache.org/jira/browse/TEZ-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Zhang updated TEZ-1238:
----------------------------

    Attachment: Tez-1238.patch

> Display more clear diagnostics info on client side if missing jar in LocalResource or Exception happen in Processor
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: TEZ-1238
>                 URL: https://issues.apache.org/jira/browse/TEZ-1238
>             Project: Apache Tez
>          Issue Type: Sub-task
>    Affects Versions: 0.4.0
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>         Attachments: Tez-1238.patch
>
>
> I have a Tez job that failed because I did not add my jar to the local resources. On the client side, however, the exception is not clear enough for me to figure out what went wrong. The real reason is that it could not load the Processor class. I had to run the "yarn logs" command to find the real exception in the container logs.
> In another case, an exception was thrown in my Processor, and the message on the client side was still not clear to me. I think we should pass the real exception to the diagnostics and display it on the client side; this would help users find out what is wrong with their program.
> *Exception on client side*
> {code}
> 14/06/26 14:57:15 INFO rpc.DAGClientRPCImpl: VertexStatus: VertexName: summer Progress: 0% TotalTasks: 1 Succeeded: 0 Running: 0 Failed: 0 Killed: 1
> 14/06/26 14:57:15 INFO rpc.DAGClientRPCImpl: VertexStatus: VertexName: tokenizer Progress: 0% TotalTasks: 1 Succeeded: 0 Running: 0 Failed: 1 Killed: 0
> 14/06/26 14:57:15 INFO rpc.DAGClientRPCImpl: DAG completed. FinalState=FAILED
> DAG diagnostics:[Vertex failed, vertexName=tokenizer,
> vertexId=vertex_1403765612557_0004_1_00, diagnostics=[Task failed,
> taskId=task_1403765612557_0004_1_00_000000, diagnostics=[TaskAttempt 0
> failed, info=[Container container_1403765612557_0004_01_000002 COMPLETED
> with diagnostics set to [Exception from container-launch:
> org.apache.hadoop.util.Shell$ExitCodeException:
> org.apache.hadoop.util.Shell$ExitCodeException:
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
>     at org.apache.hadoop.util.Shell.run(Shell.java:418)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Container exited with a non-zero exit code 1
> {code}
> *The real exception in container log:*
> {code}
> 2014-06-26 14:57:02,146 ERROR [main] org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[main,5,main] threw an Exception.
> org.apache.tez.dag.api.TezUncheckedException: Unable to load class: com.zjffdu.tutorial.tez.WordCount$TokenProcessor
>     at org.apache.tez.common.RuntimeUtils.getClazz(RuntimeUtils.java:44)
>     at org.apache.tez.common.RuntimeUtils.createClazzInstance(RuntimeUtils.java:66)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.createProcessor(LogicalIOProcessorRuntimeTask.java:533)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.<init>(LogicalIOProcessorRuntimeTask.java:146)
>     at org.apache.tez.runtime.task.TezTaskRunner.<init>(TezTaskRunner.java:78)
>     at org.apache.tez.runtime.task.TezChild.run(TezChild.java:208)
>     at org.apache.tez.runtime.task.TezChild.main(TezChild.java:363)
> {code}
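
The idea in the description, passing the real exception into the diagnostics, could be sketched roughly as follows. This is only an illustration in plain Java and is not taken from the attached patch: the class DiagnosticsSketch and the methods toDiagnostics and tryLoad are hypothetical names, and no Tez API is used. It shows a class-loading failure being rendered, with its cause chain, into a string that could be reported back to the client instead of only the container exit code.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class DiagnosticsSketch {

    /** Renders a throwable and its cause chain as a diagnostics string. */
    public static String toDiagnostics(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw, true));
        return sw.toString();
    }

    /**
     * Hypothetical stand-in for the task-side step that loads the Processor
     * class. Returns an empty string on success, or the rendered exception
     * on failure so it can be attached to the task diagnostics.
     */
    public static String tryLoad(String className) {
        try {
            Class.forName(className);
            return "";
        } catch (ClassNotFoundException e) {
            // Wrap with a readable message and keep the original as the cause.
            return toDiagnostics(
                new RuntimeException("Unable to load class: " + className, e));
        }
    }

    public static void main(String[] args) {
        // The class name below is made up and assumed not to be on the classpath.
        System.out.println(tryLoad("com.example.MissingProcessor"));
    }
}
```

With something along these lines, the client-side diagnostics would carry the "Unable to load class" message and the underlying ClassNotFoundException, so the user would not need to run "yarn logs" to see them.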



--
This message was sent by Atlassian JIRA
(v6.2#6252)