Posted to mapreduce-issues@hadoop.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2009/12/17 12:48:18 UTC

[jira] Updated: (MAPREDUCE-913) TaskRunner crashes with NPE resulting in held up slots, UNINITIALIZED tasks and hung TaskTracker

     [ https://issues.apache.org/jira/browse/MAPREDUCE-913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated MAPREDUCE-913:
----------------------------------------------

    Attachment: patch-913.txt

The patch does the following:
1. Changes the reportTaskFinished code so that the slot is always released, by calling releaseSlot in a finally block.
2. Reverts the change that threw an exception when the arguments to the debug script could not be constructed, since the code already initializes them to an empty String.
3. Modifies the test case to use the new API.

bq.  In test case can we verify the correct number of the map slot is actually reported back to JobTracker after the failing job completes, this would test the actual slot management.
4. Added asserts for slot management. Verified that the test passes with the patch and fails without it.
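
To illustrate the release-in-finally idea and the kind of slot assertion described above, here is a minimal sketch. It is not the TaskTracker code from the patch: the class SlotReleaseSketch, the freeSlots counter, acquireSlot, getFreeSlots and the Runnable cleanup argument are all hypothetical stand-ins. Only the names reportTaskFinished and releaseSlot come from the patch description.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for slot accounting; not the real TaskTracker. */
public class SlotReleaseSketch {
    private final AtomicInteger freeSlots;

    public SlotReleaseSketch(int totalSlots) {
        this.freeSlots = new AtomicInteger(totalSlots);
    }

    /** Takes one slot, or fails if none is free. */
    public void acquireSlot() {
        if (freeSlots.getAndDecrement() <= 0) {
            freeSlots.incrementAndGet(); // undo the decrement
            throw new IllegalStateException("no free slots");
        }
    }

    private void releaseSlot() {
        freeSlots.incrementAndGet();
    }

    public int getFreeSlots() {
        return freeSlots.get();
    }

    /**
     * Runs the (assumed) cleanup work for a finished task. The slot is
     * released in the finally block, so it is returned even if cleanup
     * throws, for example with a NullPointerException.
     */
    public void reportTaskFinished(Runnable cleanup) {
        try {
            cleanup.run();
        } finally {
            releaseSlot();
        }
    }

    public static void main(String[] args) {
        SlotReleaseSketch tracker = new SlotReleaseSketch(2);
        tracker.acquireSlot();
        try {
            tracker.reportTaskFinished(() -> {
                throw new NullPointerException("simulated failure");
            });
        } catch (NullPointerException expected) {
            // The exception still propagates, but the slot has been freed.
        }
        System.out.println("free slots: " + tracker.getFreeSlots()); // prints "free slots: 2"
    }
}
```

Without the finally block, the simulated NullPointerException would skip releaseSlot and the counter would stay at 1, which corresponds to the held-up slot this issue describes.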

bq. Can we check if the workDir is non-null in the run-debug script and throw an exception if the same is null? Would prevent launch of task-controller code.
If workDir is null or does not exist, the current code already throws an IOException.
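
For readers without the source at hand, a guard of this kind could look roughly like the sketch below. This is an assumption about the shape of the check, not a copy of the existing code: the class WorkDirCheckSketch, the method ensureWorkDir and the error message are hypothetical.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical illustration of a workDir guard; not the actual Hadoop code. */
public class WorkDirCheckSketch {

    /** Throws IOException if workDir is null or does not exist on disk. */
    public static void ensureWorkDir(File workDir) throws IOException {
        if (workDir == null || !workDir.exists()) {
            throw new IOException(
                "Working directory is null or does not exist: " + workDir);
        }
    }

    public static void main(String[] args) {
        try {
            ensureWorkDir(null);
        } catch (IOException e) {
            // Reached before any debug script would be launched.
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```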

bq. Wouldn't it be much better that we add a check to figure out if the taskJVM was launched or not and then run debug script accordingly.
This may need more discussion, since it changes the feature so that the debug script is launched only when the task JVM has been launched properly.


> TaskRunner crashes with NPE resulting in held up slots, UNINITIALIZED tasks and hung TaskTracker
> ------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-913
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-913
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.20.1
>            Reporter: Vinod K V
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: mapreduce-913-1.patch, MAPREDUCE-913-20091119.1.txt, MAPREDUCE-913-20091119.2.txt, MAPREDUCE-913-20091120.1.txt, patch-913.txt
>
>

