Posted to mapreduce-issues@hadoop.apache.org by "Karthik Kambatla (JIRA)" <ji...@apache.org> on 2012/08/27 19:22:07 UTC

[jira] [Updated] (MAPREDUCE-4595) TestLostTracker failing - possibly due to a race in JobHistory.JobHistoryFilesManager#run()

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated MAPREDUCE-4595:
----------------------------------------

    Attachment: MR-4595.patch

Uploading a patch for branch-1.

I understand this is not an absolutely foolproof approach, since the test still fails if the thread moving the file takes longer than 5 minutes. However, a move that takes longer than that would itself be a cause for concern.
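
For readers without the attachment at hand, the general idea of a bounded wait can be sketched roughly as below. This is only an illustration and not the contents of MR-4595.patch: the class name HistoryFileWaiter, the method waitForFile, and the use of java.nio.file in place of Hadoop's FileSystem API are all assumptions made for the example.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not taken from the patch: polls for a file with an upper bound on the wait.
final class HistoryFileWaiter {

    private HistoryFileWaiter() {
    }

    /**
     * Polls until the given path exists or the timeout elapses.
     *
     * @return true if the file showed up in time, false if the wait timed out
     */
    static boolean waitForFile(Path path, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        // Use the monotonic clock so wall-clock adjustments do not affect the deadline.
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (Files.exists(path)) {
                return true;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            // Sleep for the poll interval, but never past the deadline.
            long remainingMillis = Math.max(1L, remainingNanos / 1_000_000L);
            Thread.sleep(Math.min(pollMillis, remainingMillis));
        }
    }
}
```

A test could then assert on something like HistoryFileWaiter.waitForFile(doneFile, 5 * 60 * 1000L, 100L) before inspecting the file, where doneFile is a placeholder for the expected location in the done folder. The wait returns as soon as the file appears, so the 5-minute bound only matters when the move is unusually slow or never happens.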

Please feel free to suggest alternate/better approaches.
                
> TestLostTracker failing - possibly due to a race in JobHistory.JobHistoryFilesManager#run()
> -------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4595
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4595
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 1.0.3
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>            Priority: Critical
>              Labels: test
>         Attachments: force-TestLostTracker-failure.patch, MR-4595.patch
>
>
> The occasional failure of TestLostTracker seems to have the following cause:
> On job completion, JobHistoryFilesManager#run() spawns another thread to move the history files to the done folder. TestLostTracker waits for job completion before checking the file format of the history file. However, the move of the history files might still be in progress, or might not have started at all.
> I am uploading a patch that significantly increases the chance of hitting this race.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira