You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2010/07/05 13:44:49 UTC

[jira] Commented: (HADOOP-6849) Have LocalDirAllocator.AllocatorPerContext.getLocalPathForWrite fail more meaningfully

    [ https://issues.apache.org/jira/browse/HADOOP-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12885187#action_12885187 ] 

Steve Loughran commented on HADOOP-6849:
----------------------------------------

Obviously, building up an error string during the directory search is a waste of effort for every successful operation, so the diagnostics should only be created on failure. A list of directories and the target file size may be enough, as it would catch directory config issues.

> Have LocalDirAllocator.AllocatorPerContext.getLocalPathForWrite fail more meaningfully
> --------------------------------------------------------------------------------------
>
>                 Key: HADOOP-6849
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6849
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.20.2
>            Reporter: Steve Loughran
>            Priority: Minor
>
> A stack trace makes it way to me, of a reduce failing
> {code}
> Caused by: org\.apache\.hadoop\.util\.DiskChecker$DiskErrorException: Could not find any valid local directory for file:/mnt/data/dfs/data/mapred/local/taskTracker/jobcache/job_201007011427_0001/attempt_201007011427_0001_r_000000_1/output/map_96\.out
>       at org\.apache\.hadoop\.fs\.LocalDirAllocator$AllocatorPerContext\.getLocalPathForWrite(LocalDirAllocator\.java:343)
>       at org\.apache\.hadoop\.fs\.LocalDirAllocator\.getLocalPathForWrite(LocalDirAllocator\.java:124)
>       at org\.apache\.hadoop\.mapred\.ReduceTask$ReduceCopier$LocalFSMerger\.run(ReduceTask\.java:2434)
> {code}
> We're probably running out of HDD space, if not its configuration problems. Either way, some more hints in the exception would be handy.
> # Include the size of the output file looked for if known
> # Include the list of dirs examined and their reason for rejection (not found or if not enough room, available space).
> This would make it easier to diagnose problems after the event, with nothing but emailed logs for diagnostics.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.