Posted to dev@tez.apache.org by "Prabhu Joseph (JIRA)" <ji...@apache.org> on 2017/06/09 11:25:18 UTC
[jira] [Created] (TEZ-3756) Tez Query fails because of a weed node
and all four attempts are placed on same node
Prabhu Joseph created TEZ-3756:
----------------------------------
Summary: Tez Query fails because of a weed node and all four attempts are placed on same node
Key: TEZ-3756
URL: https://issues.apache.org/jira/browse/TEZ-3756
Project: Apache Tez
Issue Type: Bug
Affects Versions: 0.7.1
Reporter: Prabhu Joseph
A Tez query fails because one task fails on all four attempts with "Error: Could not find or load main class org.apache.tez.runtime.task.TezChild". There is one bad node on which all containers fail with this error: the cached copy of the Tez library tez.tar.gz on that machine is corrupt. The concern, however, is that all four attempts are placed on the same problematic node.
{code}
HW12691:TEZ pjoseph$ cat application_1495721159191_10342.log | grep attempt_1495721159191_10342_6_00_001808 | grep "Assigning container"
Assigning container to task: containerId=container_1495721159191_10342_01_000395, task=attempt_1495721159191_10342_6_00_001808_0
Assigning container to task: containerId=container_1495721159191_10342_01_000397, task=attempt_1495721159191_10342_6_00_001808_1
Assigning container to task: containerId=container_1495721159191_10342_01_000399, task=attempt_1495721159191_10342_6_00_001808_2
Assigning container to task: containerId=container_1495721159191_10342_01_000401, task=attempt_1495721159191_10342_6_00_001808_3
{code}
All four containers were placed on the same NodeManager:
{code}
Container: container_1495721159191_10342_01_000395 on xxx_45454
Container: container_1495721159191_10342_01_000397 on xxx_45454
Container: container_1495721159191_10342_01_000399 on xxx_45454
Container: container_1495721159191_10342_01_000401 on xxx_45454
{code}
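The manual grep above can be approximated with a small script. This is only a sketch: the two regular expressions assume the line shapes quoted in this report ("Assigning container to task: ..." and "Container: ... on ..."), and the function names and the min_attempts parameter are hypothetical, not part of Tez. Real application logs may format these lines differently.

```python
import re
from collections import defaultdict

# Assumed line shapes, copied from the excerpts in this report only.
ASSIGN_RE = re.compile(r"containerId=(container_\w+), task=(attempt_\w+)_(\d+)\b")
NODE_RE = re.compile(r"Container: (container_\w+) on (\S+)")


def placements(lines):
    """Map each task to a sorted list of (attempt_number, node) pairs."""
    container_task = {}
    container_node = {}
    for line in lines:
        m = ASSIGN_RE.search(line)
        if m:
            container_task[m.group(1)] = (m.group(2), int(m.group(3)))
            continue
        m = NODE_RE.search(line)
        if m:
            container_node[m.group(1)] = m.group(2)

    result = defaultdict(list)
    for container, (task, attempt) in container_task.items():
        node = container_node.get(container)
        if node is not None:  # skip containers whose node is unknown
            result[task].append((attempt, node))
    return {task: sorted(entries) for task, entries in result.items()}


def tasks_stuck_on_one_node(lines, min_attempts=4):
    """Return {task: node} for tasks whose attempts all ran on a single node."""
    stuck = {}
    for task, entries in placements(lines).items():
        nodes = {node for _, node in entries}
        if len(entries) >= min_attempts and len(nodes) == 1:
            stuck[task] = nodes.pop()
    return stuck
```

Calling tasks_stuck_on_one_node(open(path)) on a log containing both kinds of lines lists every task whose attempts never left one node, which helps confirm whether the pattern is limited to a single task.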
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)