You are viewing a plain text version of this content. The canonical link for it is here.
Posted to yarn-issues@hadoop.apache.org by "Bibin A Chundatt (JIRA)" <ji...@apache.org> on 2017/07/06 07:12:05 UTC

[jira] [Updated] (YARN-6708) Nodemanager container crash after ext3 folder limit

     [ https://issues.apache.org/jira/browse/YARN-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bibin A Chundatt updated YARN-6708:
-----------------------------------
    Attachment: YARN-6708.007.patch

Apologies for missing out comment.
Following changes are made in latest patch
# After added to test class to delete {{basedir}}
# Precreate directory and set wrong permission for {{USERCACHE}} folder
# {{USER_CACHE_FOLDER_PERMS}} renamed to {{USERCACHE_FOLDER_PERMS}}

> Nodemanager container crash after ext3 folder limit
> ---------------------------------------------------
>
>                 Key: YARN-6708
>                 URL: https://issues.apache.org/jira/browse/YARN-6708
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>            Priority: Critical
>         Attachments: YARN-6708.001.patch, YARN-6708.002.patch, YARN-6708.003.patch, YARN-6708.004.patch, YARN-6708.005.patch, YARN-6708.006.patch, YARN-6708.007.patch
>
>
> Configure umask as *027* for nodemanager service user
> and {{yarn.nodemanager.local-cache.max-files-per-directory}} as {{40}}. After 4  *private* dir localization next directory will be *0/14*
> Local Directory cache manager 
> {code}
> vm2:/opt/hadoop/release/data/nmlocal/usercache/mapred/filecache # l
> total 28
> drwx--x--- 7 mapred hadoop 4096 Jun 10 14:35 ./
> drwxr-s--- 4 mapred hadoop 4096 Jun 10 12:07 ../
> drwxr-x--- 3 mapred users  4096 Jun 10 14:36 0/
> drwxr-xr-x 3 mapred users  4096 Jun 10 12:15 10/
> drwxr-xr-x 3 mapred users  4096 Jun 10 12:22 11/
> drwxr-xr-x 3 mapred users  4096 Jun 10 12:27 12/
> drwxr-xr-x 3 mapred users  4096 Jun 10 12:31 13/
> {code}
> *drwxr-x---* 3 mapred users  4096 Jun 10 14:36 0/ is only *750*
> Nodemanager user will not be able check for localization path exists or not.
> {{LocalResourcesTrackerImpl}}
> {code}
>     case REQUEST:
>       if (rsrc != null && (!isResourcePresent(rsrc))) {
>         LOG.info("Resource " + rsrc.getLocalPath()
>             + " is missing, localizing it again");
>         removeResource(req);
>         rsrc = null;
>       }
>       if (null == rsrc) {
>         rsrc = new LocalizedResource(req, dispatcher);
>         localrsrc.put(req, rsrc);
>       }
>       break;
> {code}
> *isResourcePresent* will always return false and same resource will be localized to {{0}} to next unique number



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org