Posted to yarn-issues@hadoop.apache.org by "Zhankun Tang (JIRA)" <ji...@apache.org> on 2018/12/05 02:54:00 UTC

[jira] [Updated] (YARN-9083) Support remote directory localization in yarn native service

     [ https://issues.apache.org/jira/browse/YARN-9083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhankun Tang updated YARN-9083:
-------------------------------
    Description: 
While refining YARN-8714, I found that the YARN localizer appears to handle remote directories directly. FSDownload.java#downloadAndUnpack uses "FileUtil.copy", which can copy a directory. This ability was added by YARN-2185.
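
To illustrate what a directory-capable copy means here, the following is a minimal JDK-only sketch that recurses when the source is a directory. It is not Hadoop's FileUtil.copy: the class DirCopySketch and its copy method are hypothetical names, and local paths stand in for HDFS paths.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

/** Hypothetical JDK-only stand-in for a directory-aware copy; not Hadoop code. */
public final class DirCopySketch {
    private DirCopySketch() {}

    /**
     * Copies src to dst. If src is a directory, creates dst and copies each
     * child recursively. Returns the number of regular files copied.
     */
    public static int copy(Path src, Path dst) throws IOException {
        if (Files.isDirectory(src)) {
            Files.createDirectories(dst);
            int copied = 0;
            // Files.list holds an open directory handle, so close it when done.
            try (Stream<Path> children = Files.list(src)) {
                for (Path child : (Iterable<Path>) children::iterator) {
                    copied += copy(child, dst.resolve(child.getFileName().toString()));
                }
            }
            return copied;
        }
        Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
        return 1;
    }

    public static void main(String[] args) throws IOException {
        // Build a small source tree in a temp directory (assumed layout).
        Path src = Files.createTempDirectory("mydir");
        Files.write(src.resolve("1.py"), "print(1)\n".getBytes(StandardCharsets.UTF_8));
        Files.createDirectories(src.resolve("dir1"));
        Files.write(src.resolve("dir1").resolve("2.py"),
            "print(2)\n".getBytes(StandardCharsets.UTF_8));

        Path dst = Files.createTempDirectory("filecache").resolve("mydir");
        System.out.println(copy(src, dst) + " files copied to " + dst);
    }
}
```

The point of the sketch is only that a single copy call can cover both the file and the directory case, so a caller does not need a separate code path for directories.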

For testing purposes, I changed the distributedShell client so that it localizes an HDFS directory, "mydir", directly.
{code:java}
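// Point at the HDFS directory itself, not at a file inside it.
Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" +
 "/mydir");
FileStatus scFileStatus = fs.getFileStatus(p);
// Register the directory as a plain FILE resource with its size and timestamp.
LocalResource r =
 LocalResource.newInstance(URL.fromURI(p.toUri()),
 LocalResourceType.FILE, LocalResourceVisibility.APPLICATION,
 scFileStatus.getLen(), scFileStatus.getModificationTime());
// The map key "mydir" becomes the link name in the container's working directory.
localResources.put("mydir", r);{code}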
The YARN localizer does download the HDFS directory to the local disk for DistributedShell, and links it into the container's working directory:
{code:java}
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
1.py  2.py  dir1  test_kill9.sh
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_000001/ -l
total 20
lrwxrwxrwx 1 yarn hadoop  111 12月  5 10:08 AppMaster.jar -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar
lrwxrwxrwx 1 yarn hadoop  103 12月  5 10:08 mydir -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
{code}
However, the YARN native service does not seem to be aware of this localizer ability and blocks the request:
{code:java}
2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir is a directory, which is not supported.{code}
We should enable this ability in the YARN native service.
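
As a rough illustration of the proposed change, the sketch below shows a source-path check that can either reject directories, as the client does today, or let them through. It is JDK-only and entirely hypothetical: SrcFileCheckSketch, validate, and the allowDirectories flag are names invented for this example, local paths stand in for HDFS paths, and nothing here is taken from the actual ApiServiceClient code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of a relaxed source check; not yarn native service code. */
public final class SrcFileCheckSketch {
    private SrcFileCheckSketch() {}

    /**
     * Validates a source path. Directories are rejected unless
     * allowDirectories is true (assumed flag, for illustration only).
     */
    public static void validate(Path srcFile, boolean allowDirectories) {
        if (!Files.exists(srcFile)) {
            throw new IllegalArgumentException("srcFile=" + srcFile + " does not exist.");
        }
        if (Files.isDirectory(srcFile) && !allowDirectories) {
            throw new IllegalArgumentException(
                "srcFile=" + srcFile + " is a directory, which is not supported.");
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("mydir");

        // With directories allowed, the check passes silently.
        validate(dir, true);

        // With directories disallowed, the check fails like the log above.
        try {
            validate(dir, false);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Whether the real fix should be a flag, a new resource type, or simply removing the check is a design decision for the patch; the sketch only shows where the current behavior and the desired behavior diverge.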



> Support remote directory localization in yarn native service
> ------------------------------------------------------------
>
>                 Key: YARN-9083
>                 URL: https://issues.apache.org/jira/browse/YARN-9083
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Zhankun Tang
>            Assignee: Zhankun Tang
>            Priority: Major


