Posted to issues@tez.apache.org by "Arpit Gupta (JIRA)" <ji...@apache.org> on 2014/04/11 23:54:16 UTC

[jira] [Commented] (TEZ-1048) NPE when previous task attempts fail without generating any data

    [ https://issues.apache.org/jira/browse/TEZ-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967150#comment-13967150 ] 

Arpit Gupta commented on TEZ-1048:
----------------------------------

Here is the client-side log:

{code}
Vertex re-running, vertexName=Reducer 2, vertexId=vertex_1397208530054_0009_1_02
Vertex failed, vertexName=Reducer 3, vertexId=vertex_1397208530054_0009_1_01, diagnostics=[Task failed, taskId=task_1397208530054_0009_1_01_000034, diagnostics=[AttemptID:attempt_1397208530054_0009_1_01_000034_0 Info:Container container_1397208530054_0009_02_000013 COMPLETED with diagnostics set to [Container [pid=13375,containerID=container_1397208530054_0009_02_000013] is running beyond physical memory limits. Current usage: 1.2 GB of 1 GB physical memory used; 1.8 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1397208530054_0009_02_000013 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 13375 3478 13375 13375 (bash) 0 1 108638208 304 /bin/bash -c /usr/hadoop-jdk1.6.0_31/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -server -Xmx1024m -Djava.net.preferIPv4Stack=true -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/grid/1/hdp/yarn/log/application_1397208530054_0009/container_1397208530054_0009_02_000013 -Dtez.root.logger=INFO,CLA  -Djava.io.tmpdir=/grid/0/hdp/yarn/local/usercache/hrt_qa/appcache/application_1397208530054_0009/container_1397208530054_0009_02_000013/tmp org.apache.hadoop.mapred.YarnTezDagChild 68.142.246.51 59344 container_1397208530054_0009_02_000013 application_1397208530054_0009 2 1>/grid/1/hdp/yarn/log/application_1397208530054_0009/container_1397208530054_0009_02_000013/stdout 2>/grid/1/hdp/yarn/log/application_1397208530054_0009/container_1397208530054_0009_02_000013/stderr
|- 13639 13375 13375 13375 (java) 50163 1880 1867436032 302067 /usr/hadoop-jdk1.6.0_31/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -server -Xmx1024m -Djava.net.preferIPv4Stack=true -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/grid/1/hdp/yarn/log/application_1397208530054_0009/container_1397208530054_0009_02_000013 -Dtez.root.logger=INFO,CLA -Djava.io.tmpdir=/grid/0/hdp/yarn/local/usercache/hrt_qa/appcache/application_1397208530054_0009/container_1397208530054_0009_02_000013/tmp org.apache.hadoop.mapred.YarnTezDagChild 68.142.246.51 59344 container_1397208530054_0009_02_000013 application_1397208530054_0009 2

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
], AttemptID:attempt_1397208530054_0009_1_01_000034_1 Info:Node blacklisted, AttemptID:attempt_1397208530054_0009_1_01_000034_2 Info:Error: java.lang.NullPointerException
at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleScheduler.copySucceeded(ShuffleScheduler.java:206)
at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleInputEventHandler.processDataMovementEvent(ShuffleInputEventHandler.java:94)
at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleInputEventHandler.handleEvent(ShuffleInputEventHandler.java:63)
at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleInputEventHandler.handleEvents(ShuffleInputEventHandler.java:56)
at org.apache.tez.runtime.library.common.shuffle.impl.Shuffle.handleEvents(Shuffle.java:180)
at org.apache.tez.runtime.library.input.ShuffledMergedInput.handleEvents(ShuffledMergedInput.java:241)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.handleEvent(LogicalIOProcessorRuntimeTask.java:578)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.access$1100(LogicalIOProcessorRuntimeTask.java:82)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$1.run(LogicalIOProcessorRuntimeTask.java:640)
at java.lang.Thread.run(Thread.java:722)

Container released by application, AttemptID:attempt_1397208530054_0009_1_01_000034_3 Info:Error: java.lang.NullPointerException
at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleScheduler.copySucceeded(ShuffleScheduler.java:206)
at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleInputEventHandler.processDataMovementEvent(ShuffleInputEventHandler.java:94)
at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleInputEventHandler.handleEvent(ShuffleInputEventHandler.java:63)
at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleInputEventHandler.handleEvents(ShuffleInputEventHandler.java:56)
at org.apache.tez.runtime.library.common.shuffle.impl.Shuffle.handleEvents(Shuffle.java:180)
at org.apache.tez.runtime.library.input.ShuffledMergedInput.handleEvents(ShuffledMergedInput.java:241)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.handleEvent(LogicalIOProcessorRuntimeTask.java:578)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.access$1100(LogicalIOProcessorRuntimeTask.java:82)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$1.run(LogicalIOProcessorRuntimeTask.java:640)
at java.lang.Thread.run(Thread.java:722)

Container released by application, AttemptID:attempt_1397208530054_0009_1_01_000034_4 Info:Error: java.lang.NullPointerException
at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleScheduler.copySucceeded(ShuffleScheduler.java:206)
at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleInputEventHandler.processDataMovementEvent(ShuffleInputEventHandler.java:94)
at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleInputEventHandler.handleEvent(ShuffleInputEventHandler.java:63)
at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleInputEventHandler.handleEvents(ShuffleInputEventHandler.java:56)
at org.apache.tez.runtime.library.common.shuffle.impl.Shuffle.handleEvents(Shuffle.java:180)
at org.apache.tez.runtime.library.input.ShuffledMergedInput.handleEvents(ShuffledMergedInput.java:241)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.handleEvent(LogicalIOProcessorRuntimeTask.java:578)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.access$1100(LogicalIOProcessorRuntimeTask.java:82)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$1.run(LogicalIOProcessorRuntimeTask.java:640)
at java.lang.Thread.run(Thread.java:722)
], Vertex failed as one or more tasks failed. failedTasks:1]
Vertex killed, vertexName=Reducer 4, vertexId=vertex_1397208530054_0009_1_00, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0]
Vertex killed, vertexName=Reducer 8, vertexId=vertex_1397208530054_0009_1_03, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0]
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1397208530054_0009_1_02, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0]
DAG failed due to vertex failure. failedVertices:1 killedVertices:3
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask

{code}

> NPE when previous task attempts fail without generating any data
> ----------------------------------------------------------------
>
>                 Key: TEZ-1048
>                 URL: https://issues.apache.org/jira/browse/TEZ-1048
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.4.0
>            Reporter: Arpit Gupta
>
> We hit an NPE when previous task attempts fail without generating any data.


