Posted to common-issues@hadoop.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2011/04/19 10:46:06 UTC

[jira] [Commented] (HADOOP-7130) Map Reduce Tasks are continuously failing, when one among the several harddisks available on the TaskTracker fails.

    [ https://issues.apache.org/jira/browse/HADOOP-7130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021489#comment-13021489 ] 

Hadoop QA commented on HADOOP-7130:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12471931/HADOOP-7130.patch
  against trunk revision 1094750.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 4 new or modified tests.

    -1 patch.  The patch command could not apply the patch.

Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/363//console

This message is automatically generated.

> Map Reduce Tasks are continuously failing, when one among the several harddisks available on the TaskTracker fails.
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-7130
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7130
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.20.2, 0.20.3, 0.20-append
>            Reporter: Devaraj K
>            Assignee: Devaraj K
>             Fix For: 0.20.4
>
>         Attachments: HADOOP-7130.patch
>
>
> 1. Pull out one hard disk from a TaskTracker node (one of its 10 disks). Some jobs start failing, 
> but processing continues. 
> 2. Wait for some time (15 minutes) and pull out one disk from another TaskTracker. 
> 3. More jobs now fail, as can be seen in the UI, and processing pauses.
> The following exception appears in the JobTracker UI for a failed job.
> {code:xml} 
> Error initializing attempt_201010221528_10174_m_000011_0:
> java.io.IOException: Expecting a line not the end of stream
>  at org.apache.hadoop.fs.DF.parseExecResult(DF.java:110)
>  at org.apache.hadoop.util.Shell.runCommand(Shell.java:182)
>  at org.apache.hadoop.util.Shell.run(Shell.java:137)
>  at org.apache.hadoop.fs.DF.getAvailable(DF.java:74)
>  at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:385)
>  at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:134)
>  at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:113)
>  at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:835)
>  at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1790)
>  at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:104)
>  at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1753)
> Error initializing attempt_201010221528_10174_m_000011_1:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_201010221528_10174/work
>  at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:454)
>  at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:134)
>  at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:113)
>  at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:835)
>  at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1790)
>  at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:104)
>  at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1753)
> {code} 
> The TaskTracker log shows the following:
> {code:xml} 
> 2010-10-25 16:36:24,215 ERROR mapred.TaskTracker (TaskTracker.java:offerService(1211)) - Caught exception: java.io.IOException: Expecting a line not the end of stream
>         at org.apache.hadoop.fs.DF.parseExecResult(DF.java:110)
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:182)
>         at org.apache.hadoop.util.Shell.run(Shell.java:137)
>         at org.apache.hadoop.fs.DF.getAvailable(DF.java:74)
>         at org.apache.hadoop.mapred.TaskTracker.getFreeSpace(TaskTracker.java:1586)
>         at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1274)
>         at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1106)
>         at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1848)
>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3022)
> 2010-10-25 16:36:24,216 INFO  mapred.TaskTracker (TaskTracker.java:run(1856)) - Lost connection to JobTracker [/192.168.97.1:9001].  Retrying...
> java.lang.Exception: java.io.IOException: Expecting a line not the end of stream
>         at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1212)
>         at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1848)
>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3022)
> Caused by: java.io.IOException: Expecting a line not the end of stream
>         at org.apache.hadoop.fs.DF.parseExecResult(DF.java:110)
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:182)
>         at org.apache.hadoop.util.Shell.run(Shell.java:137)
>         at org.apache.hadoop.fs.DF.getAvailable(DF.java:74)
>         at org.apache.hadoop.mapred.TaskTracker.getFreeSpace(TaskTracker.java:1586)
>         at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1274)
>         at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1106)
>         ... 2 more
> 2010-10-25 16:36:29,550 INFO  mapred.TaskTracker (TaskTracker.java:transmitHeartBeat(1256)) - Resending 'status' to '192.168.97.1' with reponseId '18361
> 2010-10-25 16:36:29,550 WARN  mapred.TaskTracker (TaskTracker.java:checkLocalDirs(2982)) - Task Tracker local can not create directory: /hdfsdata/0/mapred/local
> 2010-10-25 16:36:32,656 WARN  mapred.TaskTracker (TaskTracker.java:checkLocalDirs(2982)) - Task Tracker local can not create directory: /hdfsdata/0/mapred/local
> {code} 
> This seems to be fixed in the trunk.
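
The traces above show the "Expecting a line not the end of stream" IOException coming from the parsing of df output and then failing task localization and the heartbeat. The following is a minimal illustrative sketch of that failure mode, not the Hadoop source and not the attached patch. It assumes that df prints nothing on stdout for the pulled disk and that skipping such a directory is acceptable. The class DfParseSketch, its methods, and the path /hdfsdata/1/mapred/local are hypothetical; only /hdfsdata/0/mapred/local and the exception message come from the logs.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; not the Hadoop DF or LocalDirAllocator code. */
public class DfParseSketch {

    /**
     * Parses the "Available" column from `df -k`-style output: one header
     * line followed by one data line. Wrapped (two-line) entries that real
     * df may print for long device names are not handled here.
     */
    public static long parseAvailableKb(String dfOutput) throws IOException {
        BufferedReader lines = new BufferedReader(new StringReader(dfOutput));
        lines.readLine(); // skip the header line, if present
        String data = lines.readLine();
        if (data == null) {
            // Same message as in the traces: there was no data line to read.
            throw new IOException("Expecting a line not the end of stream");
        }
        String[] fields = data.trim().split("\\s+");
        if (fields.length < 4) {
            throw new IOException("Unexpected df output: " + data);
        }
        try {
            // Columns: Filesystem, 1K-blocks, Used, Available, Use%, Mounted on
            return Long.parseLong(fields[3]);
        } catch (NumberFormatException e) {
            throw new IOException("Unexpected df output: " + data, e);
        }
    }

    /**
     * Returns the directories whose df output parsed cleanly. A directory
     * that fails to parse is skipped instead of aborting the whole scan.
     */
    public static List<String> usableDirs(Map<String, String> dfOutputByDir) {
        List<String> usable = new ArrayList<>();
        for (Map.Entry<String, String> entry : dfOutputByDir.entrySet()) {
            try {
                parseAvailableKb(entry.getValue());
                usable.add(entry.getKey());
            } catch (IOException e) {
                System.err.println("Skipping " + entry.getKey() + ": " + e.getMessage());
            }
        }
        return usable;
    }

    public static void main(String[] args) {
        Map<String, String> dfOutputByDir = new LinkedHashMap<>();
        // Assumed: df produced no stdout for the directory on the pulled disk.
        dfOutputByDir.put("/hdfsdata/0/mapred/local", "");
        // Hypothetical healthy directory with made-up sizes.
        dfOutputByDir.put("/hdfsdata/1/mapred/local",
            "Filesystem 1K-blocks Used Available Use% Mounted on\n"
            + "/dev/sdb1 1000 400 600 40% /hdfsdata/1\n");
        System.out.println(usableDirs(dfOutputByDir));
    }
}
```

In this sketch a single unreadable directory no longer propagates an exception to the caller; the remaining directories stay available. Whether the actual fix in trunk or in HADOOP-7130.patch takes this approach is not stated in this message.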

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira