Posted to common-dev@hadoop.apache.org by "Arun C Murthy (JIRA)" <ji...@apache.org> on 2009/01/30 03:38:59 UTC

[jira] Created: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

LocalDirAllocator misses files on the local filesystem
------------------------------------------------------

                 Key: HADOOP-5146
                 URL: https://issues.apache.org/jira/browse/HADOOP-5146
             Project: Hadoop Core
          Issue Type: Bug
    Affects Versions: 0.20.0
            Reporter: Arun C Murthy
            Priority: Blocker
             Fix For: 0.20.0


For some reason, LocalDirAllocator.getLocalPathToRead doesn't find files that are present. Extra logging shows:

{noformat}
2009-01-30 06:43:32,312 INFO org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext: in ifExists, /grid/2/arunc/mapred-local/taskTracker/archive/xxx.yyy.com/tera/in/_partition.lst exists
2009-01-30 06:43:32,389 WARN org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext: in getLocalPathToRead, taskTracker/archive/xxx.yyy.com/tera/in/_partition.lst doesn't exist
2009-01-30 06:43:32,390 WARN org.apache.hadoop.mapred.TaskRunner: attempt_200901300512_0007_m_000055_0 Child Error
 org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/archive/xx.yyy.com/tera/in/_partition.lst in any of the configured local directories
         at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:388)
         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:172)
{noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-5146:
--------------------------------

    Attachment: 5146.patch

Attaching a patch with the logs removed.

> LocalDirAllocator misses files on the local filesystem
> ------------------------------------------------------
>
>                 Key: HADOOP-5146
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5146
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.20.0
>            Reporter: Arun C Murthy
>            Assignee: Devaraj Das
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: 5146.patch, 5146_20090204job.output.txt, localdirallocator.patch, localdirallocator.patch
>
>
> For some reason, LocalDirAllocator.getLocalPathToRead doesn't find files that are present. Extra logging shows:
> {noformat}
> 2009-01-30 06:43:32,312 INFO org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext: in ifExists, /grid/2/arunc/mapred-local/taskTracker/archive/xxx.yyy.com/tera/in/_partition.lst exists
> 2009-01-30 06:43:32,389 WARN org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext: in getLocalPathToRead, taskTracker/archive/xxx.yyy.com/tera/in/_partition.lst doesn't exist
> 2009-01-30 06:43:32,390 WARN org.apache.hadoop.mapred.TaskRunner: attempt_200901300512_0007_m_000055_0 Child Error
>  org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/archive/xx.yyy.com/tera/in/_partition.lst in any of the configured local directories
>          at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:388)
>          at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
>          at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:172)
> {noformat}



[jira] Commented: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677325#action_12677325 ] 

Amareshwari Sriramadasu commented on HADOOP-5146:
-------------------------------------------------

I think the -1 for core and contrib tests is due to svn downtime. I'm not able to view the testReport or even the console output.
Giri, can you please confirm?

As I commented earlier, all tests passed on my machine.



[jira] Commented: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677317#action_12677317 ] 

Hadoop QA commented on HADOOP-5146:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12401022/5146.patch
  against trunk revision 748403.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta/10/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta/10/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta/10/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta/10/console

This message is automatically generated.



[jira] Commented: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670326#action_12670326 ] 

Vinod K V commented on HADOOP-5146:
-----------------------------------

Nicholas, is this problem reproducible? From the code, it seems that the exception you pointed out should not cause tasks/job to fail. Is there a possibility of anything else happening here, something like a disk failure?



[jira] Commented: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677654#action_12677654 ] 

Hadoop QA commented on HADOOP-5146:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12401022/5146.patch
  against trunk revision 748770.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/15/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/15/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/15/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/15/console

This message is automatically generated.



[jira] Updated: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-5146:
--------------------------------------------

    Status: Patch Available  (was: Open)



[jira] Updated: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-5146:
--------------------------------

    Status: Patch Available  (was: Open)



[jira] Updated: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-5146:
--------------------------------

    Attachment: 5146.patch

After some analysis, I found the cause of the race condition:
1) Assume a fresh HDFS cluster with no files. Run a job (foo) that generates a file called partition.lst in HDFS.
2) Run a job (bar) that uses the file foo generated. On a given node, one task localizes partition.lst in the dist cache and the other tasks simply use it.
3) The bar job finishes successfully without any task failures.
4) Now run foo again. This regenerates partition.lst at the same location.
5) Run bar again. On a node that was used by the previous bar job, a task t1 from the new bar job still finds partition.lst in the cache in its ifExists() check. Context now switches to another TaskRunner thread, say t2, of the new bar job.
6) t2 also finds that ifExists() returns true, but when it calls getLocalCache, it finds the file to be stale (since the foo job regenerated it) and deletes it in DistributedCache.localizeCache. Context now switches back to t1.
7) t1 calls getLocalPathToRead and doesn't find the file. For t1, ifExists() returned true, but getLocalPathToRead fails for the same path. This is the race condition.

The attached patch removes the call to ifExists/getLocalPathToRead in the TaskRunner thread during cache localization; it always calls getLocalPathForWrite. When the file is already localized, the path returned by getLocalPathForWrite is not used; getLocalCache returns the already localized path instead.
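
To make the interleaving in steps 5-7 concrete, here is a minimal, self-contained sketch. It is not the Hadoop code: the class CacheRaceSketch and the methods checkThenReadFails and localize are hypothetical names, plain java.nio.file calls stand in for ifExists, getLocalPathToRead and getLocalCache, and the two threads are replayed sequentially so the outcome is deterministic. The second method only illustrates the general idea behind the patch (do not rely on an earlier existence check; let one synchronized step decide and return the path), under those assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

/** Hypothetical stand-in for the cache localization logic; not Hadoop code. */
final class CacheRaceSketch {

    /** Replays steps 5-7 in a fixed order: t1 checks, t2 deletes, t1 reads. */
    static boolean checkThenReadFails(Path cached) throws IOException {
        boolean existed = Files.exists(cached);   // t1: the existence check succeeds
        Files.delete(cached);                     // t2: finds the copy stale and deletes it
        try {
            Files.readAllBytes(cached);           // t1: the later read no longer finds the file
            return false;
        } catch (NoSuchFileException e) {
            return existed;                       // true: the check passed but the read failed
        }
    }

    /** Sketch of the fix idea: one synchronized step validates, refreshes and returns the path. */
    static synchronized Path localize(Path cached, byte[] fresh, boolean stale) throws IOException {
        if (stale || !Files.exists(cached)) {
            Files.write(cached, fresh);           // re-localize instead of trusting an earlier check
        }
        return cached;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("cache-sketch");
        Path cached = dir.resolve("_partition.lst");
        Files.write(cached, "old".getBytes());
        System.out.println("check-then-read failed: " + checkThenReadFails(cached));
        Path p = localize(cached, "new".getBytes(), true);
        System.out.println("localized content: " + new String(Files.readAllBytes(p)));
    }
}
```

The point of the sketch is that the existence check and the later read are two separate steps, so any deletion between them invalidates the check; folding the decision into the single call that also returns the path removes that window.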



[jira] Commented: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670993#action_12670993 ] 

Amareshwari Sriramadasu commented on HADOOP-5146:
-------------------------------------------------

Nicholas, can you rerun your example with the latest trunk and see whether the problem still exists? HADOOP-4759 moved the pidFile to the task directory.



[jira] Commented: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676978#action_12676978 ] 

Vinod K V commented on HADOOP-5146:
-----------------------------------

The explanation for the race condition scenario sounds correct. The patch looks good too. +1



[jira] Updated: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-5146:
-------------------------------------------

    Attachment: 5146_20090204job.output.txt

> Nicholas, is this problem reproducible? From the code, it seems that the exception you pointed out should not cause tasks/job to fail. 

Yes, this can be reproduced deterministically on Windows.  It seems that this is not a problem on Linux.

>  Is there a possibility of anything else happening here, something like a disk failure?

No, the disk is working fine.


5146_20090204job.output.txt: the output for running PiEstimator.



[jira] Updated: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-5146:
--------------------------------

    Attachment: localdirallocator.patch

Arun, can you please apply this debug patch and let me know what you see in the TT logs around the time a task fails with the DiskErrorException?



[jira] Commented: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671123#action_12671123 ] 

Vinod K V commented on HADOOP-5146:
-----------------------------------

Ravi and I could reproduce the problem locally on a Windows box. The reason for the error is that pid files are not being created on Windows at all, due to path problems. The situation persists irrespective of HADOOP-4759 and will be tracked in a new JIRA. The present JIRA should address the original problem that Arun reported.



[jira] Commented: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677270#action_12677270 ] 

Amareshwari Sriramadasu commented on HADOOP-5146:
-------------------------------------------------

The patch looks good to me too. I also validated it over terasort runs.
test-patch result:
{noformat}
     [exec]
     [exec] -1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
     [exec]                         Please justify why no tests are needed for this patch.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
     [exec]
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
     [exec]
{noformat}
ant tests passed on my machine.



[jira] Commented: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677560#action_12677560 ] 

Hadoop QA commented on HADOOP-5146:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12401022/5146.patch
  against trunk revision 748628.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/13/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/13/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/13/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/13/console

This message is automatically generated.



[jira] Updated: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-5146:
--------------------------------------------

    Status: Open  (was: Patch Available)

Resubmitting to Hudson.



[jira] Updated: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-5146:
--------------------------------

    Attachment: localdirallocator.patch

I think I found the cause of this. There is a race condition in TaskRunner at the point where the distributed cache files are localized: multiple TaskRunner threads may end up trying to localize the same files. The attached patch prevents this from happening. Arun, could you please give this a try? Thanks!
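
To illustrate the kind of guard described above, here is a minimal Java sketch that lets only one thread localize a given cache file while the others wait and then skip the copy. It is not the attached patch: the class LocalizeOnce, its methods, and the cache key used in the demo are hypothetical names chosen for illustration, and the real TaskRunner code may serialize localization differently.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not the committed patch): one localization per cache file. */
public class LocalizeOnce {
  // One lock object per cache key; putIfAbsent makes all threads agree on the same lock.
  private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();
  // Keys whose localization has already completed.
  private final ConcurrentMap<String, Boolean> done = new ConcurrentHashMap<>();

  /** Runs copyAction at most once per cacheKey, even under concurrent calls. */
  public void localize(String cacheKey, Runnable copyAction) {
    Object fresh = new Object();
    Object existing = locks.putIfAbsent(cacheKey, fresh);
    Object lock = (existing != null) ? existing : fresh;
    synchronized (lock) {
      if (done.containsKey(cacheKey)) {
        return; // another thread already localized this file
      }
      copyAction.run();
      done.put(cacheKey, Boolean.TRUE);
    }
  }

  /** Starts eight threads that all try to localize the same file; returns the copy count. */
  public static int demo() {
    final LocalizeOnce guard = new LocalizeOnce();
    final AtomicInteger copies = new AtomicInteger();
    Thread[] runners = new Thread[8];
    for (int i = 0; i < runners.length; i++) {
      runners[i] = new Thread(
          () -> guard.localize("tera/in/_partition.lst", copies::incrementAndGet));
      runners[i].start();
    }
    for (Thread t : runners) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    return copies.get();
  }
}
```

Locking per cache key, rather than on one global lock, keeps unrelated files from blocking each other. That is a design choice of this sketch, not something the thread states about the patch.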



[jira] Commented: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681783#action_12681783 ] 

Hudson commented on HADOOP-5146:
--------------------------------

Integrated in Hadoop-trunk #778 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/778/])
    



[jira] Commented: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671322#action_12671322 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-5146:
------------------------------------------------

The problem still exists on Windows.  I filed HADOOP-5194.  Thanks, Amareshwari, Ravi and Vinod for checking this.



[jira] Commented: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669845#action_12669845 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-5146:
------------------------------------------------

I forgot to say that the sample programs still fail after applying localdirallocator.patch.



[jira] Commented: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669844#action_12669844 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-5146:
------------------------------------------------

Running the sample programs, such as PiEstimator, produces a DiskErrorException, and so the job fails.
{noformat}
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/pids/attempt_200902021632_0001_m_000002_0 in any of the configured local directories
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:381)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
	at org.apache.hadoop.mapred.TaskTracker.getPidFilePath(TaskTracker.java:430)
	at org.apache.hadoop.mapred.TaskTracker.removePidFile(TaskTracker.java:440)
	at org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.runChild(JvmManager.java:370)
	at org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.run(JvmManager.java:338)
{noformat}
(I have changed TaskTracker line 432 to print out the trace.)
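
For context, the exception in this trace is raised by a lookup that resolves a relative path against each of the configured local directories. A rough Java sketch of that kind of lookup follows, assuming plain java.io.File existence checks. LocalDirLookup, findToRead, and the file name used in the demo are hypothetical, and this is not Hadoop's LocalDirAllocator implementation.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;

/** Hypothetical sketch of a "find this relative path in any local dir" lookup. */
public class LocalDirLookup {

  /** Returns the first existing file named relPath under one of localDirs. */
  public static File findToRead(String relPath, String[] localDirs)
      throws FileNotFoundException {
    for (String dir : localDirs) {
      File candidate = new File(dir, relPath);
      if (candidate.exists()) { // point-in-time check; the file may vanish afterwards
        return candidate;
      }
    }
    throw new FileNotFoundException(
        "Could not find " + relPath + " in any of the configured local directories");
  }

  /** Finds a file in the second of two temp dirs, then checks the miss after deleting it. */
  public static boolean demo() {
    try {
      File d1 = Files.createTempDirectory("local1").toFile();
      File d2 = Files.createTempDirectory("local2").toFile();
      String[] dirs = { d1.getPath(), d2.getPath() };
      File pid = new File(d2, "attempt_0.pid");
      if (!pid.createNewFile()) {
        return false;
      }
      boolean found = findToRead("attempt_0.pid", dirs).getPath().equals(pid.getPath());
      if (!pid.delete()) {
        return false;
      }
      try {
        findToRead("attempt_0.pid", dirs);
        return false; // should not get here: the file is gone
      } catch (FileNotFoundException expected) {
        return found;
      }
    } catch (IOException e) {
      return false;
    }
  }
}
```

In a lookup of this shape, a file that was never written and a file removed between a check and a read both surface as the same "could not find" error, so the message alone does not say which of the two happened.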



[jira] Commented: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670839#action_12670839 ] 

Ravi Gummadi commented on HADOOP-5146:
--------------------------------------

pid files do not seem to be getting created on Cygwin. This could be because the shell command "echo $$ >pidFile" seems to be getting the wrong path for pidFile. Looking into the issue.



[jira] Assigned: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das reassigned HADOOP-5146:
-----------------------------------

    Assignee: Devaraj Das



[jira] Updated: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated HADOOP-5146:
-------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.19.2
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

I just committed this to 0.19.2, 0.20 and trunk. Thanks, Devaraj!

> LocalDirAllocator misses files on the local filesystem
> ------------------------------------------------------
>
>                 Key: HADOOP-5146
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5146
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Arun C Murthy
>            Assignee: Devaraj Das
>            Priority: Blocker
>             Fix For: 0.19.2, 0.20.0
>
>         Attachments: 5146.patch, 5146.patch, 5146_20090204job.output.txt, localdirallocator.patch, localdirallocator.patch
>
>
> For some reason the LocalDirAllocator.getLocaPathToRead doesn't find files which are present, extra logging shows:
> {noformat}
> 2009-01-30 06:43:32,312 INFO org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext: in ifExists, /grid/2/arunc/mapred-local/taskTracker/archive/xxx.yyy.com/tera/in/_partition.lst exists
> 2009-01-30 06:43:32,389 WARN org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext: in getLocalPathToRead, taskTracker/archive/xxx.yyy.com/tera/in/_partition.lst doesn't exist
> 2009-01-30 06:43:32,390 WARN org.apache.hadoop.mapred.TaskRunner: attempt_200901300512_0007_m_000055_0 Child Error
>  org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/archive/xx.yyy.com/tera/in/_partition.lst in any of the configured local directories
>          at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:388)
>          at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
>          at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:172)
> {noformat}



[jira] Updated: (HADOOP-5146) LocalDirAllocator misses files on the local filesystem

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-5146:
--------------------------------

    Component/s: mapred

> LocalDirAllocator misses files on the local filesystem
> ------------------------------------------------------
>
>                 Key: HADOOP-5146
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5146
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Arun C Murthy
>            Assignee: Devaraj Das
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: 5146.patch, 5146.patch, 5146_20090204job.output.txt, localdirallocator.patch, localdirallocator.patch
>
>
> For some reason, LocalDirAllocator.getLocalPathToRead doesn't find files that are present; extra logging shows:
> {noformat}
> 2009-01-30 06:43:32,312 INFO org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext: in ifExists, /grid/2/arunc/mapred-local/taskTracker/archive/xxx.yyy.com/tera/in/_partition.lst exists
> 2009-01-30 06:43:32,389 WARN org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext: in getLocalPathToRead, taskTracker/archive/xxx.yyy.com/tera/in/_partition.lst doesn't exist
> 2009-01-30 06:43:32,390 WARN org.apache.hadoop.mapred.TaskRunner: attempt_200901300512_0007_m_000055_0 Child Error
>  org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/archive/xx.yyy.com/tera/in/_partition.lst in any of the configured local directories
>          at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:388)
>          at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
>          at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:172)
> {noformat}
