Posted to yarn-issues@hadoop.apache.org by "vishal.rajan (JIRA)" <ji...@apache.org> on 2015/04/01 12:00:55 UTC

[jira] [Commented] (YARN-2624) Resource Localization fails on a cluster due to existing cache directories

    [ https://issues.apache.org/jira/browse/YARN-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390305#comment-14390305 ] 

vishal.rajan commented on YARN-2624:
------------------------------------

It seems this issue still persists in YARN 2.6.0 under certain conditions.

Dump of the log relating to this issue.
================================
15/04/01 12:13:20 ERROR test.Job: Task error: Rename cannot overwrite non empty destination directory /grid/6/yarn/local/usercache/azkaban/filecache/344860
java.io.IOException: Rename cannot overwrite non empty destination directory /grid/6/yarn/local/usercache/azkaban/filecache/344860
        at org.apache.hadoop.fs.AbstractFileSystem.renameInternal(AbstractFileSystem.java:716)
        at org.apache.hadoop.fs.FilterFs.renameInternal(FilterFs.java:228)
        at org.apache.hadoop.fs.AbstractFileSystem.rename(AbstractFileSystem.java:659)
        at org.apache.hadoop.fs.FileContext.rename(FileContext.java:909)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:364)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
=============================
yarn version : hadoop-2-2-0-0-2041-yarn                   2.6.0.2.2.0.0-2041
=============================

This node was taken OOR for maintenance, and when it was added back to the cluster, it seems the 344860 directory was not removed before it was assigned to the new container.
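
For anyone trying to reproduce the failure mode outside Hadoop, here is a minimal sketch. It assumes a local filesystem and uses Python's os.rename, not Hadoop's FileContext.rename, so it only mirrors the behaviour in the trace above: a rename onto a non-empty directory is refused. The helper name rename_into_cache, the "_tmp" suffix and the file names are hypothetical and exist only for illustration. Whether deleting the leftover directory first is the right fix for the NodeManager is also an assumption, not something confirmed in this issue.

```python
import errno
import os
import shutil
import tempfile


def rename_into_cache(src_dir, dest_dir, clear_stale=False):
    """Move src_dir to dest_dir, optionally deleting a leftover dest_dir first.

    Hypothetical helper for illustration; this is not the NodeManager's logic.
    """
    if clear_stale and os.path.isdir(dest_dir):
        # Assumption: whatever already sits at dest_dir is stale and safe to delete.
        shutil.rmtree(dest_dir)
    os.rename(src_dir, dest_dir)


def demo():
    """Return (first_rename_failed, final contents of the destination)."""
    root = tempfile.mkdtemp()
    try:
        src = os.path.join(root, "344860_tmp")   # freshly downloaded resource
        dest = os.path.join(root, "344860")      # leftover from before maintenance
        os.makedirs(src)
        os.makedirs(dest)
        with open(os.path.join(src, "map.xml"), "w") as f:
            f.write("new")
        with open(os.path.join(dest, "stale.jar"), "w") as f:
            f.write("old")

        try:
            rename_into_cache(src, dest)
            failed = False
        except OSError as e:
            # The destination directory is non-empty, so the rename is refused.
            failed = e.errno in (errno.ENOTEMPTY, errno.EEXIST)

        # Retrying after clearing the stale directory lets the rename go through.
        rename_into_cache(src, dest, clear_stale=True)
        return failed, sorted(os.listdir(dest))
    finally:
        shutil.rmtree(root)
```

Calling demo() shows the first rename being rejected while the stale directory is still present, and the second one succeeding once it has been removed.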




> Resource Localization fails on a cluster due to existing cache directories
> --------------------------------------------------------------------------
>
>                 Key: YARN-2624
>                 URL: https://issues.apache.org/jira/browse/YARN-2624
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.5.1
>            Reporter: Anubhav Dhoot
>            Assignee: Anubhav Dhoot
>            Priority: Blocker
>             Fix For: 2.6.0
>
>         Attachments: YARN-2624.001.patch, YARN-2624.001.patch
>
>
> We have found resource localization fails on a cluster with following error in certain cases.
> {noformat}
> INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Failed to download rsrc { { hdfs://<blahhostname>:8020/tmp/hive-hive/hive_2014-09-29_14-55-45_184_6531377394813896912-12/-mr-10004/95a07b90-2448-48fc-bcda-cdb7400b4975/map.xml, 1412027745352, FILE, null },pending,[(container_1411670948067_0009_02_000001)],443533288192637,DOWNLOADING}
> java.io.IOException: Rename cannot overwrite non empty destination directory /data/yarn/nm/filecache/27
> 	at org.apache.hadoop.fs.AbstractFileSystem.renameInternal(AbstractFileSystem.java:716)
> 	at org.apache.hadoop.fs.FilterFs.renameInternal(FilterFs.java:228)
> 	at org.apache.hadoop.fs.AbstractFileSystem.rename(AbstractFileSystem.java:659)
> 	at org.apache.hadoop.fs.FileContext.rename(FileContext.java:906)
> 	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:366)
> 	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:59)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)