Posted to common-issues@hadoop.apache.org by "Devaraj K (JIRA)" <ji...@apache.org> on 2011/02/03 16:52:29 UTC

[jira] Created: (HADOOP-7130) Map Reduce Tasks are continously failing, when one among the several harddisks available on the TaskTracker fails.

Map Reduce Tasks are continously failing, when one among the several harddisks available on the TaskTracker fails.
------------------------------------------------------------------------------------------------------------------

                 Key: HADOOP-7130
                 URL: https://issues.apache.org/jira/browse/HADOOP-7130
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs
    Affects Versions: 0.20.2
            Reporter: Devaraj K


1. Pull out one hard disk from a TaskTracker node (one of its 10 disks). Some jobs start failing, but processing continues.
2. Wait for some time (15 minutes) and pull out one disk from another TaskTracker node.
3. More jobs now fail, as can be seen in the UI, and processing pauses.

The following exceptions appear in the JobTracker UI for a failed job:
{code:xml} 
Error initializing attempt_201010221528_10174_m_000011_0:
java.io.IOException: Expecting a line not the end of stream
 at org.apache.hadoop.fs.DF.parseExecResult(DF.java:110)
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:182)
 at org.apache.hadoop.util.Shell.run(Shell.java:137)
 at org.apache.hadoop.fs.DF.getAvailable(DF.java:74)
 at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:385)
 at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:134)
 at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:113)
 at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:835)
 at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1790)
 at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:104)
 at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1753)

Error initializing attempt_201010221528_10174_m_000011_1:
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_201010221528_10174/work
 at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:454)
 at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:134)
 at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:113)
 at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:835)
 at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1790)
 at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:104)
 at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1753)

{code} 
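
For readers unfamiliar with where the first exception comes from, here is a minimal sketch of how parsing `df` output can fail in this way. It is an illustration under assumptions, not the Hadoop `DF` source: the class name `DfOutputSketch` and the method `parseAvailableKb` are hypothetical, and the column layout assumed is the usual POSIX-style `df -k` output. The point is only that when `df` prints no data line (presumably what happens once the disk behind a local directory is gone), the parser has nothing to read and throws an `IOException`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.StringTokenizer;

/** Illustrative sketch only; this is not the Hadoop DF class. */
class DfOutputSketch {

    /**
     * Parses POSIX-style "df -k" output and returns the "Available" column
     * (in 1K blocks) of the first data line.
     */
    static long parseAvailableKb(String dfOutput) throws IOException {
        BufferedReader lines = new BufferedReader(new StringReader(dfOutput));
        lines.readLine(); // skip the header line
        String line = lines.readLine();
        if (line == null) {
            // df produced no data line, so there is nothing to parse
            throw new IOException("Expecting a line not the end of stream");
        }
        StringTokenizer tokens = new StringTokenizer(line, " \t\n\r\f%");
        if (tokens.countTokens() < 4) {
            throw new IOException("Unexpected df output: " + line);
        }
        tokens.nextToken(); // filesystem
        tokens.nextToken(); // total 1K blocks
        tokens.nextToken(); // used
        return Long.parseLong(tokens.nextToken()); // available
    }
}
```

Because the exception propagates to the caller, anything that asks for free space on the failed directory fails as well unless the caller handles it.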


The TaskTracker log shows:
{code:xml} 
2010-10-25 16:36:24,215 ERROR mapred.TaskTracker (TaskTracker.java:offerService(1211)) - Caught exception: java.io.IOException: Expecting a line not the end of stream
        at org.apache.hadoop.fs.DF.parseExecResult(DF.java:110)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:182)
        at org.apache.hadoop.util.Shell.run(Shell.java:137)
        at org.apache.hadoop.fs.DF.getAvailable(DF.java:74)
        at org.apache.hadoop.mapred.TaskTracker.getFreeSpace(TaskTracker.java:1586)
        at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1274)
        at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1106)
        at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1848)
        at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3022)

2010-10-25 16:36:24,216 INFO  mapred.TaskTracker (TaskTracker.java:run(1856)) - Lost connection to JobTracker [/192.168.97.1:9001].  Retrying...
java.lang.Exception: java.io.IOException: Expecting a line not the end of stream
        at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1212)
        at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1848)
        at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3022)
Caused by: java.io.IOException: Expecting a line not the end of stream
        at org.apache.hadoop.fs.DF.parseExecResult(DF.java:110)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:182)
        at org.apache.hadoop.util.Shell.run(Shell.java:137)
        at org.apache.hadoop.fs.DF.getAvailable(DF.java:74)
        at org.apache.hadoop.mapred.TaskTracker.getFreeSpace(TaskTracker.java:1586)
        at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1274)
        at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1106)
        ... 2 more
2010-10-25 16:36:29,550 INFO  mapred.TaskTracker (TaskTracker.java:transmitHeartBeat(1256)) - Resending 'status' to '192.168.97.1' with reponseId '18361
2010-10-25 16:36:29,550 WARN  mapred.TaskTracker (TaskTracker.java:checkLocalDirs(2982)) - Task Tracker local can not create directory: /hdfsdata/0/mapred/local
2010-10-25 16:36:32,656 WARN  mapred.TaskTracker (TaskTracker.java:checkLocalDirs(2982)) - Task Tracker local can not create directory: /hdfsdata/0/mapred/local
{code} 


This seems to be fixed in the trunk.
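
To make the expected behaviour concrete, here is a rough sketch of the kind of handling that would let a node keep working with its remaining disks. It is not the attached patch and not the trunk code; `LocalDirChooserSketch`, `SpaceProbe`, and `pickDir` are hypothetical names used only for illustration. The idea is to treat a failed free-space check on one directory as "skip this directory" rather than letting the exception abort the whole allocation.

```java
import java.io.IOException;
import java.util.List;

/** Illustrative sketch only; names here are hypothetical, not Hadoop APIs. */
class LocalDirChooserSketch {

    /** Stand-in for whatever reports the free space of a local directory. */
    interface SpaceProbe {
        long availableBytes(String dir) throws IOException;
    }

    /**
     * Returns the first directory with at least neededBytes free,
     * skipping directories whose free-space check fails.
     */
    static String pickDir(List<String> dirs, SpaceProbe probe, long neededBytes)
            throws IOException {
        for (String dir : dirs) {
            try {
                if (probe.availableBytes(dir) >= neededBytes) {
                    return dir;
                }
            } catch (IOException e) {
                // A failed probe marks this directory as unusable; try the next one.
            }
        }
        // Only fail when no configured directory is usable.
        throw new IOException("Could not find any valid local directory");
    }
}
```

With ten configured directories and one failed disk, a caller written this way would still get one of the nine healthy directories back.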

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-7130) Map Reduce Tasks are continously failing, when one among the several harddisks available on the TaskTracker fails.

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021489#comment-13021489 ] 

Hadoop QA commented on HADOOP-7130:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12471931/HADOOP-7130.patch
  against trunk revision 1094750.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 4 new or modified tests.

    -1 patch.  The patch command could not apply the patch.

Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/363//console

This message is automatically generated.

> Map Reduce Tasks are continously failing, when one among the several harddisks available on the TaskTracker fails.
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-7130
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7130
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.20.2, 0.20.3, 0.20-append
>            Reporter: Devaraj K
>            Assignee: Devaraj K
>             Fix For: 0.20.4
>
>         Attachments: HADOOP-7130.patch
>
>


[jira] Updated: (HADOOP-7130) Map Reduce Tasks are continously failing, when one among the several harddisks available on the TaskTracker fails.

Posted by "Devaraj K (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K updated HADOOP-7130:
------------------------------

    Attachment: HADOOP-7130.patch

> Map Reduce Tasks are continously failing, when one among the several harddisks available on the TaskTracker fails.
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-7130
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7130
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.20.2, 0.20.3, 0.20-append
>            Reporter: Devaraj K
>         Attachments: HADOOP-7130.patch
>
>


        

[jira] Updated: (HADOOP-7130) Map Reduce Tasks are continously failing, when one among the several harddisks available on the TaskTracker fails.

Posted by "Devaraj K (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K updated HADOOP-7130:
------------------------------

    Affects Version/s: 0.20-append
                       0.20.3

> Map Reduce Tasks are continously failing, when one among the several harddisks available on the TaskTracker fails.
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-7130
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7130
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.20.2, 0.20.3, 0.20-append
>            Reporter: Devaraj K
>


        

[jira] [Assigned] (HADOOP-7130) Map Reduce Tasks are continously failing, when one among the several harddisks available on the TaskTracker fails.

Posted by "Devaraj K (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K reassigned HADOOP-7130:
---------------------------------

    Assignee: Devaraj K

> Map Reduce Tasks are continously failing, when one among the several harddisks available on the TaskTracker fails.
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-7130
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7130
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.20.2, 0.20.3, 0.20-append
>            Reporter: Devaraj K
>            Assignee: Devaraj K
>             Fix For: 0.20.4
>
>         Attachments: HADOOP-7130.patch
>
>


[jira] Updated: (HADOOP-7130) Map Reduce Tasks are continously failing, when one among the several harddisks available on the TaskTracker fails.

Posted by "Devaraj K (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K updated HADOOP-7130:
------------------------------

    Fix Version/s: 0.20.4
           Status: Patch Available  (was: Open)

Patch is provided for 0.20 branch.

> Map Reduce Tasks are continously failing, when one among the several harddisks available on the TaskTracker fails.
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-7130
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7130
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.20.2, 0.20.3, 0.20-append
>            Reporter: Devaraj K
>             Fix For: 0.20.4
>
>         Attachments: HADOOP-7130.patch
>
>
