Posted to common-dev@hadoop.apache.org by "Hemanth Yamijala (JIRA)" <ji...@apache.org> on 2009/03/13 14:05:50 UTC

[jira] Created: (HADOOP-5487) Few tasks failed while creating the work directory for a job, when job tracker was restarted

Few tasks failed while creating the work directory for a job, when job tracker was restarted
--------------------------------------------------------------------------------------------

                 Key: HADOOP-5487
                 URL: https://issues.apache.org/jira/browse/HADOOP-5487
             Project: Hadoop Core
          Issue Type: Bug
          Components: mapred
            Reporter: Hemanth Yamijala


A randomwriter job was running when the JobTracker was restarted. After the restart, some TaskTrackers were sent a reinit action, and new tasks of the randomwriter job were then scheduled on those same TaskTrackers. These tasks failed during job localization, while creating the job's work directory. However, the next attempts ran successfully and the job succeeded. About 1% of the total number of tasks were affected.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5487) Few tasks failed while creating the work directory for a job, when job tracker was restarted

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681696#action_12681696 ] 

Hemanth Yamijala commented on HADOOP-5487:
------------------------------------------

The following exception trace was seen on one such task:

java.io.IOException: Mkdirs failed to create /path/to/mapred-local/taskTracker/jobcache/job_200903130908_0051/work
  at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:829)
  at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1743)
  at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:97)
  at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1708)
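
The report does not identify a root cause. Purely as an illustration, the sketch below shows one common way a "Mkdirs failed to create" error can be raised and tolerated in plain Java: java.io.File.mkdirs() returns false both when creation fails and when the directory already exists, so a caller that treats every false as fatal can fail spuriously if another thread creates or cleans up the same path at the same time. The class WorkDirSketch, the method createWorkDir, and the retry count are hypothetical names chosen for this example; they are not Hadoop code, and the sketch is not the fix applied for this issue.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical illustration only; not taken from the Hadoop source tree.
final class WorkDirSketch {

    // Creates workDir (and any missing parents), retrying a few times.
    // An already existing directory counts as success, because mkdirs()
    // returns false in that case as well.
    static void createWorkDir(File workDir, int maxAttempts) throws IOException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (workDir.mkdirs() || workDir.isDirectory()) {
                return;
            }
        }
        // Same message shape as the trace above, raised only after all attempts fail.
        throw new IOException("Mkdirs failed to create " + workDir);
    }

    public static void main(String[] args) throws IOException {
        // Assumed layout for the demo: a temp directory stands in for mapred-local.
        File base = Files.createTempDirectory("mapred-local").toFile();
        File work = new File(base, "taskTracker/jobcache/job_example/work");
        createWorkDir(work, 3);
        System.out.println("created: " + work.isDirectory());
    }
}
```

Whether a retry of this kind would be appropriate in the TaskTracker depends on what actually removes or blocks the directory after the reinit action, which the report leaves open.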


[jira] Updated: (HADOOP-5487) Few tasks failed while creating the work directory for a job, when job tracker was restarted

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-5487:
--------------------------------------------

    Fix Version/s: 0.21.0
         Assignee: Amareshwari Sriramadasu
