Posted to issues@tez.apache.org by "Bikas Saha (JIRA)" <ji...@apache.org> on 2015/02/12 00:44:13 UTC
[jira] [Commented] (TEZ-2082) Failing test: TestPreemption::testPreemptionWithSession/
[ https://issues.apache.org/jira/browse/TEZ-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317224#comment-14317224 ]
Bikas Saha commented on TEZ-2082:
---------------------------------
This is likely a race condition introduced in TEZ-2045 and hence I am removing the 0.6.1 target version and reducing priority.
Explanation below. /cc [~sseth]
In TaskAttemptListenerImpTezDag.getTask(TaskAttemptListenerImpTezDag.java)
{code}
// === registeredContainers.containsKey(containerId) returns true here ===
if (!registeredContainers.containsKey(containerId)) {
  if (context.getAllContainers().get(containerId) == null) {
    LOG.info("Container with id: " + containerId
        + " is invalid and will be killed");
  } else {
    LOG.info("Container with id: " + containerId
        + " is valid, but no longer registered, and will be killed");
  }
  task = TASK_FOR_INVALID_JVM;
} else {
  pingContainerHeartbeatHandler(containerId);
  // === registeredContainers returns null for the same cId inside getContainerTask, ===
  // === so it returns TASK_FOR_INVALID_JVM, but the code below only checks for null ===
  task = getContainerTask(containerId);
  if (task == null) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("No task current assigned to Container with id: " + containerId);
    }
  } else {
    // === so it crashes here while accessing getTaskSpec().getTaskAttemptID(), ===
    // === since getTaskSpec() is null for TASK_FOR_INVALID_JVM ===
    context.getEventHandler().handle(
        new TaskAttemptEventStartedRemotely(task.getTaskSpec()
            .getTaskAttemptID(), containerId, context
            .getApplicationACLs()));
    LOG.info("Container with id: " + containerId + " given task: "
        + task.getTaskSpec().getTaskAttemptID());
  }
}
{code}
I can't think of any way to test for this race condition, so I added a precondition that will help catch it more easily if it occurs again.
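For illustration only, here is a minimal, self-contained sketch of the check-then-act race described above and of one way to guard against it. It is not the TEZ-2082 patch and does not use the Tez classes: `SentinelRaceSketch`, its `Task` type, the `attemptId` field, and the `betweenCheckAndLookup` hook are all hypothetical stand-ins, and the hook exists only to make the interleaving deterministic. The sketch assumes the sentinel is compared by identity and that a plain state check plays the role of the precondition.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the listener; none of these names come from Tez.
class SentinelRaceSketch {

    // Hypothetical task type; attemptId is null only for the sentinel below.
    static final class Task {
        final String attemptId;
        Task(String attemptId) { this.attemptId = attemptId; }
    }

    // Sentinel returned when the container is not (or no longer) registered.
    static final Task TASK_FOR_INVALID_JVM = new Task(null);

    private final Map<String, Task> registeredContainers = new ConcurrentHashMap<>();

    void register(String containerId, Task task) { registeredContainers.put(containerId, task); }

    void unregister(String containerId) { registeredContainers.remove(containerId); }

    // Second lookup: returns the sentinel rather than null for an unknown container.
    Task getContainerTask(String containerId) {
        Task task = registeredContainers.get(containerId);
        return (task != null) ? task : TASK_FOR_INVALID_JVM;
    }

    // betweenCheckAndLookup simulates another thread running between the two lookups.
    Task getTask(String containerId, Runnable betweenCheckAndLookup) {
        if (!registeredContainers.containsKey(containerId)) {
            return TASK_FOR_INVALID_JVM;
        }
        betweenCheckAndLookup.run();
        Task task = getContainerTask(containerId);
        if (task == TASK_FOR_INVALID_JVM) {
            // The container was unregistered after the first check; handle the sentinel explicitly.
            return task;
        }
        // Precondition-style check: fail with a clear message instead of a bare NullPointerException.
        if (task.attemptId == null) {
            throw new IllegalStateException("Task for container " + containerId + " has no attempt id");
        }
        return task;
    }

    public static void main(String[] args) {
        SentinelRaceSketch listener = new SentinelRaceSketch();
        listener.register("container_1", new Task("attempt_1"));

        // No interleaving: the registered task is returned.
        Task normal = listener.getTask("container_1", () -> { });
        System.out.println(normal.attemptId);

        // Simulated race: the container is unregistered between the two lookups.
        Task raced = listener.getTask("container_1", () -> listener.unregister("container_1"));
        System.out.println(raced == TASK_FOR_INVALID_JVM);
    }
}
```

The point of the sketch is that a null check alone does not cover a non-null sentinel, so the caller has to test for the sentinel (or assert on the field it is about to dereference) before using the result of the second lookup.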
> Failing test: TestPreemption::testPreemptionWithSession/
> --------------------------------------------------------
>
> Key: TEZ-2082
> URL: https://issues.apache.org/jira/browse/TEZ-2082
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.7.0
> Reporter: Hitesh Shah
> Assignee: Bikas Saha
> Attachments: TEZ-2082.1.patch
>
>
> From https://builds.apache.org/job/Tez-Build/891/testReport/junit/org.apache.tez.dag.app/TestPreemption/testPreemptionWithSession/
> Exception in thread "Thread-27" java.lang.NullPointerException
> at org.apache.tez.dag.app.TaskAttemptListenerImpTezDag.getTask(TaskAttemptListenerImpTezDag.java:222)
> at org.apache.tez.dag.app.MockDAGAppMaster$MockContainerLauncher.run(MockDAGAppMaster.java:230)
> at java.lang.Thread.run(Thread.java:662)