Posted to issues@tez.apache.org by "Jeff Zhang (JIRA)" <ji...@apache.org> on 2015/01/22 03:12:34 UTC

[jira] [Comment Edited] (TEZ-1642) TestAMRecovery sometimes fail

    [ https://issues.apache.org/jira/browse/TEZ-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286788#comment-14286788 ] 

Jeff Zhang edited comment on TEZ-1642 at 1/22/15 2:11 AM:
----------------------------------------------------------

Update master's CHANGES.txt: Move TEZ-1642, TEZ-1934 to 0.5.4

commit 758d5a6b9441fcd92e37fae5885e793ece8f0457 (HEAD, master)
Author: Jeff Zhang <zj...@apache.org>
Date:   Thu Jan 22 10:09:29 2015 +0800

    Update CHNAGES.txt: Move TEZ-1642, TEZ-1934 to 0.5.4 (zjffdu)


was (Author: zjffdu):
Update master's CHANGES.txt : Move TEZ-1642, TEZ-1943 to 0.5.4

commit 758d5a6b9441fcd92e37fae5885e793ece8f0457 (HEAD, master)
Author: Jeff Zhang <zj...@apache.org>
Date:   Thu Jan 22 10:09:29 2015 +0800

    Update CHNAGES.txt: Move TEZ-1642, TEZ-1934 to 0.5.4 (zjffdu)

> TestAMRecovery sometimes fail
> -----------------------------
>
>                 Key: TEZ-1642
>                 URL: https://issues.apache.org/jira/browse/TEZ-1642
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>             Fix For: 0.6.0
>
>         Attachments: TEZ-1642-2.patch, TEZ-1642-3.patch, TEZ-1642-4.patch, TEZ-1642-5.patch, TEZ-1642.patch
>
>
> TestAMRecovery sometimes fails on testVertexPartiallyFinished_XXX.
> The scenario is that we'd like to kill the AM when a vertex is partially finished (with 2 tasks, task_0 is finished and task_1 is running). On recovery, task_0 should not rerun and task_1 should rerun. (We use the recovery log (TaskAttemptFinishedEvent) to judge whether a task is rerun.)
> Currently, the test uses VertexManager.onSourceTaskCompleted to control when to kill the AM, but this is not reliable. VertexManager.onSourceTaskCompleted is not invoked at the moment the task attempt finishes (the TaskAttempt sends an event to the Task to tell it that the TaskAttempt is finished, and then the Task sends an event to the Vertex to trigger VM.onSourceTaskCompleted).
> The following case is possible: task_0 finished -> task_1 finished -> VM.onSourceTaskCompleted -> VM.onSourceTaskCompleted
> In this case, we will treat the vertex as partially completed in the first VM.onSourceTaskCompleted, but the vertex is actually fully completed.
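
The ordering problem described above can be illustrated with a small, self-contained sketch. This is not Tez code: the class RecoveryRaceSketch, its queue, and its method names are all hypothetical stand-ins for the asynchronous event dispatch between TaskAttempt, Task, and Vertex. It only assumes that the completion callback is delivered some time after the attempt has actually finished.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical model of the race; none of these names exist in Tez.
class RecoveryRaceSketch {
    // Notifications queued for the vertex but not yet delivered.
    private final Queue<Integer> pendingVertexEvents = new ArrayDeque<>();
    // Number of task attempts that have really finished so far.
    private int finishedAttempts = 0;
    // What each callback observes as the finished count when it runs.
    private final List<Integer> finishedSeenByCallback = new ArrayList<>();

    // A task attempt finishes: its state changes immediately,
    // but the vertex-level notification is only queued.
    void attemptFinished(int taskIndex) {
        finishedAttempts++;
        pendingVertexEvents.add(taskIndex);
    }

    // Deliver queued notifications; each delivery plays the role
    // of one VM.onSourceTaskCompleted invocation.
    void drain() {
        while (!pendingVertexEvents.isEmpty()) {
            pendingVertexEvents.remove();
            finishedSeenByCallback.add(finishedAttempts);
        }
    }

    // drainBetween = true models prompt delivery;
    // drainBetween = false models both tasks finishing before any callback.
    static List<Integer> run(boolean drainBetween) {
        RecoveryRaceSketch sketch = new RecoveryRaceSketch();
        sketch.attemptFinished(0);
        if (drainBetween) {
            sketch.drain();
        }
        sketch.attemptFinished(1);
        sketch.drain();
        return sketch.finishedSeenByCallback;
    }

    public static void main(String[] args) {
        System.out.println(run(true));  // prints [1, 2]
        System.out.println(run(false)); // prints [2, 2]
    }
}
```

In the second run the first callback already sees both tasks finished, so a test that kills the AM inside the first callback on the assumption that only task_0 is done would be acting on a fully completed vertex.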


