Posted to issues@tez.apache.org by "Siddharth Seth (JIRA)" <ji...@apache.org> on 2014/10/14 04:09:34 UTC

[jira] [Commented] (TEZ-1659) One Pig on tez hang due to a tez setting

    [ https://issues.apache.org/jira/browse/TEZ-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170377#comment-14170377 ] 

Siddharth Seth commented on TEZ-1659:
-------------------------------------

It looks like this will be fixed by TEZ-1649.
What's happening here is that a vertex (v4) has two incoming SCATTER-GATHER edges. Initially, one source vertex (v0) has 1 task and the other (v3) has 2 tasks.

Once the single task on v0 completes (v3 has not started yet), the ShuffleVertexManager sees 1 of 3 source tasks completed, which exceeds the 10% threshold, and ends up starting the tasks for v4 with their Inputs configured to expect 1 (v0) and 2 (v3) events.
The parallelism of v3 then gets updated from 2 to 1.
The v4 tasks still have the old parallelism and end up hanging, presumably waiting for an event from a second v3 task that will never arrive.
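
To make the sequence above concrete, here is a minimal Python sketch of the arithmetic. It is a simplified model and not the actual ShuffleVertexManager code: the names SourceVertex, completed_fraction and should_start_downstream are hypothetical, and it assumes the completed fraction is computed over the combined task count of all source vertices, as described in this comment.

```python
from dataclasses import dataclass


@dataclass
class SourceVertex:
    """Hypothetical stand-in for a source vertex feeding v4."""
    name: str
    total_tasks: int
    completed_tasks: int = 0


def completed_fraction(sources):
    """Fraction of source tasks completed, summed over all source vertices."""
    total = sum(s.total_tasks for s in sources)
    done = sum(s.completed_tasks for s in sources)
    return done / total if total else 1.0


def should_start_downstream(sources, min_src_fraction):
    """Assumed slow-start rule: start once the threshold is reached."""
    return completed_fraction(sources) >= min_src_fraction


# State at the moment the single v0 task finishes; v3 has not started.
v0 = SourceVertex("v0", total_tasks=1, completed_tasks=1)
v3 = SourceVertex("v3", total_tasks=2, completed_tasks=0)
sources = [v0, v3]

# 1 of 3 tasks done is about 0.33, which is above the 0.1 threshold.
started = should_start_downstream(sources, min_src_fraction=0.1)

# v4 tasks capture the parallelism they see when they are started.
expected_at_start = {s.name: s.total_tasks for s in sources}

# Later, the parallelism of v3 is reduced from 2 to 1.
v3.total_tasks = 1
actual = {s.name: s.total_tasks for s in sources}

# Any entry here is a source whose expected task count is now stale.
stale = {
    name: (expected_at_start[name], actual[name])
    for name in actual
    if expected_at_start[name] != actual[name]
}
print(started, stale)
```

In this model the v4 tasks would keep expecting output from two v3 tasks while only one will ever run, which matches the hang described above.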

[~rajesh.balamohan] - could you please confirm that TEZ-1649 fixes the same issue?

> One Pig on tez hang due to a tez setting
> ----------------------------------------
>
>                 Key: TEZ-1659
>                 URL: https://issues.apache.org/jira/browse/TEZ-1659
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Daniel Dai
>            Assignee: Siddharth Seth
>         Attachments: application_1413229270748_0020.tar.gz, stack.log
>
>
> A particular Pig on Tez e2e test (Cross_5 in the Pig e2e tests) hangs with the following Tez settings:
> {code}
>   <property>
>     <name>tez.shuffle-vertex-manager.min-src-fraction</name>
>     <value>0.1</value>
>   </property>
>   <property>
>     <name>tez.shuffle-vertex-manager.max-src-fraction</name>
>     <value>0.1</value>
>   </property>
> {code}
> With the default settings, the test passes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)