Posted to hdfs-user@hadoop.apache.org by Robert Metzger <me...@gmail.com> on 2013/12/01 21:03:46 UTC

YARN: LocalResources and file distribution

Hello,

I'm currently writing code to run my application on YARN (Hadoop 2.2.0).
I used this code as a skeleton:
https://github.com/hortonworks/simple-yarn-app

Everything works fine on my local machine and on a cluster with shared
directories, but when I want to access resources outside of commonly
accessible locations, my application fails.

I have my application in a large jar file containing everything
(Submission Client, Application Master, and Workers).
The submission client registers the large jar file as a local resource for
the Application Master's context.
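
For reference, the registration in my client looks roughly like this (a
simplified sketch in the style of simple-yarn-app; method and variable
names are illustrative):

    import java.io.IOException;
    import java.util.Collections;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
    import org.apache.hadoop.yarn.api.records.LocalResource;
    import org.apache.hadoop.yarn.api.records.LocalResourceType;
    import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
    import org.apache.hadoop.yarn.util.ConverterUtils;
    import org.apache.hadoop.yarn.util.Records;

    // Registers the application jar as a local resource for the AM container.
    static void registerAppJar(Configuration conf, ContainerLaunchContext amContainer)
        throws IOException {
      FileSystem fs = FileSystem.get(conf);
      // The path is qualified against the default filesystem -- with my
      // current setup this ends up as a file:/ URL.
      Path jarPath = fs.makeQualified(new Path("/home/hadoop/robert/large_jar.jar"));
      FileStatus jarStat = fs.getFileStatus(jarPath);

      LocalResource appJar = Records.newRecord(LocalResource.class);
      appJar.setResource(ConverterUtils.getYarnUrlFromPath(jarPath));
      appJar.setSize(jarStat.getLen());
      appJar.setTimestamp(jarStat.getModificationTime());
      appJar.setType(LocalResourceType.FILE);
      appJar.setVisibility(LocalResourceVisibility.APPLICATION);

      amContainer.setLocalResources(Collections.singletonMap("app.jar", appJar));
    }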

As I understand it, YARN takes care of transferring the client-local
resources to the Application Master's container.
This is also stated here:
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html

> You can use the LocalResource to add resources to your application request.
> This will cause YARN to distribute the resource to the ApplicationMaster
> node.


If I start my application with the jar at "/home/hadoop/robert/large_jar.jar",
I get the following error from the NodeManager (another node in the
cluster):

> 2013-12-01 20:13:00,810 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
> Failed to download rsrc { { file:/home/hadoop/robert/large_jar.jar, ..

So it seems as if this node tries to access the file from its local file
system.

Do I have to use another "protocol" for the file, something like
"file://host:port/home/blabla"?

Is it true that YARN is able to distribute files (without using HDFS,
obviously)?


The distributed-shell example suggests that I have to use HDFS:
https://github.com/apache/hadoop-common/blob/50f0de14e377091c308c3a74ed089a7e4a7f0bfe/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java


Sincerely,
Robert

Re: YARN: LocalResources and file distribution

Posted by Hitesh Shah <hi...@apache.org>.
That seems wrong. 

What you need to do is:

        localResources.put("list.ksh", shellRsrc);
        ctx.setLocalResources(localResources);

shellRsrc should already have the HDFS path set in it. The key in the local resource map denotes the name of the symlink to create, and the value contains all the information YARN needs to download and verify the resource.
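
Concretely, building shellRsrc and registering it looks roughly like this
(a minimal sketch following the same pattern as the distributed-shell
example; names are illustrative, and the hdfs:// path is the one from this
thread):

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
    import org.apache.hadoop.yarn.api.records.LocalResource;
    import org.apache.hadoop.yarn.api.records.LocalResourceType;
    import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
    import org.apache.hadoop.yarn.util.ConverterUtils;
    import org.apache.hadoop.yarn.util.Records;

    // Builds the LocalResource for the script and registers it under the
    // relative name "list.ksh" (the symlink created in the container's cwd).
    static void addShellScript(Configuration conf, ContainerLaunchContext ctx)
        throws IOException {
      Path scriptPath =
          new Path("hdfs://isredeng:8020/user/kbonagir/KKDummy/list.ksh");
      FileStatus scriptStat = scriptPath.getFileSystem(conf).getFileStatus(scriptPath);

      LocalResource shellRsrc = Records.newRecord(LocalResource.class);
      shellRsrc.setResource(ConverterUtils.getYarnUrlFromPath(scriptPath)); // hdfs:// URL lives here
      shellRsrc.setSize(scriptStat.getLen());
      shellRsrc.setTimestamp(scriptStat.getModificationTime());
      shellRsrc.setType(LocalResourceType.FILE);
      shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);

      Map<String, LocalResource> localResources = new HashMap<String, LocalResource>();
      localResources.put("list.ksh", shellRsrc); // key = relative symlink name
      ctx.setLocalResources(localResources);
    }

The "Destination must be relative" error below comes from using the full
hdfs:// URL as the map key: the key is used as the symlink destination, so
it must be a relative name like list.ksh.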

-- Hitesh

On Dec 6, 2013, at 3:03 AM, Krishna Kishore Bonagiri wrote:

> Hi Omkar,
>   Thanks for the quick reply. I am now adding a resource to be localized to the ContainerLaunchContext like this:
> 
>         localResources.put("hdfs://isredeng:8020/user/kbonagir/KKDummy/list.ksh", shellRsrc);
>         ctx.setLocalResources(localResources);
> 
> and referred it as "./list.ksh". Is that enough?
> 
> With this change I have gone past the previous error, but now I am seeing this error; what else might I be missing?
> 
> 2013-12-06 05:25:59,480 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Failed to launch container.
> java.io.IOException: Destination must be relative
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch$ShellScriptBuilder.symlink(ContainerLaunch.java:474)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.writeLaunchEnv(ContainerLaunch.java:723)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:254)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:780)
> 
> Thanks,
> Kishore
> 
> 
> 
> On Fri, Dec 6, 2013 at 12:48 PM, omkar joshi <om...@gmail.com> wrote:
> Add this file to the files to be localized (LocalResourceRequest), and then refer to it as ./list.ksh. While adding it as a LocalResource, specify the path which you have mentioned.


Re: YARN: LocalResources and file distribution

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
Hi Omkar,
  Thanks for the quick reply. I am now adding a resource to be localized
to the ContainerLaunchContext like this:


        localResources.put("hdfs://isredeng:8020/user/kbonagir/KKDummy/list.ksh",
            shellRsrc);
        ctx.setLocalResources(localResources);

and referred it as "./list.ksh". Is that enough?

With this change I have gone past the previous error, but now I am seeing
this error; what else might I be missing?

2013-12-06 05:25:59,480 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Failed to launch container.
java.io.IOException: Destination must be relative
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch$ShellScriptBuilder.symlink(ContainerLaunch.java:474)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.writeLaunchEnv(ContainerLaunch.java:723)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:254)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:780)

Thanks,
Kishore



On Fri, Dec 6, 2013 at 12:48 PM, omkar joshi <omkar.vinit.joshi.86@gmail.com> wrote:

> Add this file to the files to be localized (LocalResourceRequest), and
> then refer to it as ./list.ksh. While adding it as a LocalResource, specify
> the path which you have mentioned.

Re: YARN: LocalResources and file distribution

Posted by omkar joshi <om...@gmail.com>.
Add this file to the files to be localized (LocalResourceRequest), and
then refer to it as ./list.ksh. While adding it as a LocalResource, specify
the path which you have mentioned.
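
Putting both halves together, a minimal sketch (assuming shellRsrc is a
LocalResource already built from the hdfs:// path, as in the earlier
sketch; names are illustrative):

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import org.apache.hadoop.yarn.api.ApplicationConstants;
    import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
    import org.apache.hadoop.yarn.api.records.LocalResource;
    import org.apache.hadoop.yarn.util.Records;

    static ContainerLaunchContext buildLaunchContext(LocalResource shellRsrc) {
      // 1) Localize under a *relative* name; YARN symlinks it into the
      //    container's working directory as ./list.ksh.
      Map<String, LocalResource> localResources = new HashMap<String, LocalResource>();
      localResources.put("list.ksh", shellRsrc);

      ContainerLaunchContext ctx = Records.newRecord(ContainerLaunchContext.class);
      ctx.setLocalResources(localResources);

      // 2) Run the localized copy via the symlink, not the hdfs:// URL.
      List<String> commands = new ArrayList<String>();
      commands.add("/bin/sh ./list.ksh"
          + " 1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout"
          + " 2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr");
      ctx.setCommands(commands);
      return ctx;
    }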


On Thu, Dec 5, 2013 at 10:40 PM, Krishna Kishore Bonagiri <
write2kishore@gmail.com> wrote:

> Hi Arun,
>
>   I have copied a shell script to HDFS and am trying to execute it on
> containers. How do I specify my shell script's path in the setCommands() call
> on ContainerLaunchContext? I am doing it this way:
>
>       String shellScriptPath =
> "hdfs://isredeng:8020/user/kbonagir/KKDummy/list.ksh";
>       commands.add(shellScriptPath);
>
> But my container execution is failing, saying that there is no such file or
> directory!
>
> org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash:
> hdfs://isredeng:8020/user/kbonagir/KKDummy/list.ksh: No such file or
> directory
>
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
>         at org.apache.hadoop.util.Shell.run(Shell.java:379)
>         at
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
>
> I could see this file with "hadoop fs" command and also saw messages in
> Node Manager's log saying that the resource is downloaded and localized.
> So, how do I run the downloaded shell script on a container?
>
> Thanks,
> Kishore
>
>
>
> On Tue, Dec 3, 2013 at 4:57 AM, Arun C Murthy <ac...@hortonworks.com> wrote:
>
>> Robert,
>>
>>  YARN, by default, will only download *resource* from a shared namespace
>> (e.g. HDFS).
>>
>>  If /home/hadoop/robert/large_jar.jar is available on each node then you
>> can specify path as file:///home/hadoop/robert/large_jar.jar and it should
>> work.
>>
>>  Else, you'll need to copy /home/hadoop/robert/large_jar.jar to HDFS and
>> then specify hdfs://host:port/path/to/large_jar.jar.
>>
>> hth,
>> Arun
>>
>> On Dec 1, 2013, at 12:03 PM, Robert Metzger <me...@gmail.com> wrote:
>>
>> Hello,
>>
>> I'm currently writing code to run my application using Yarn (Hadoop
>> 2.2.0).
>> I used this code as a skeleton:
>> https://github.com/hortonworks/simple-yarn-app
>>
>> Everything works fine on my local machine or on a cluster with the shared
>> directories, but when I want to access resources outside of commonly
>> accessible locations, my application fails.
>>
>> I have my application in a large jar file, containing everything
>> (Submission Client, Application Master, and Workers).
>> The submission client registers the large jar file as a local resource
>> for the Application master's context.
>>
>> In my understanding, Yarn takes care of transferring the client-local
>> resources to the application master's container.
>> This is also stated here:
>> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
>>
>> You can use the LocalResource to add resources to your application
>>> request. This will cause YARN to distribute the resource to the
>>> ApplicationMaster node.
>>
>>
>> If I'm starting my jar from the dir "/home/hadoop/robert/large_jar.jar",
>> I'll get the following error from the nodemanager (another node in the
>> cluster):
>>
>> 2013-12-01 20:13:00,810 INFO
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>>> Failed to download rsrc { { file:/home/hadoop/robert/large_jar.jar, ..
>>>
>>
>> So it seems as this node tries to access the file from its local file
>> system.
>>
>> Do I have to use another "protocol" for the file, something like "
>> file://host:port/home/blabla" ?
>>
>> Is it true that Yarn is able to distribute files (not using hdfs
>> obviously?) ?
>>
>>
>> The distributedshell-example suggests that I have to use HDFS:
>> https://github.com/apache/hadoop-common/blob/50f0de14e377091c308c3a74ed089a7e4a7f0bfe/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java
>>
>>
>> Sincerely,
>> Robert
>>
>>
>>
>>
>>
>>
>>  --
>> Arun C. Murthy
>> Hortonworks Inc.
>> http://hortonworks.com/

Re: YARN: LocalResources and file distribution

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
Hi Arun,

  I have copied a shell script to HDFS and am trying to execute it on
containers. How do I specify my shell script's path in the setCommands()
call on ContainerLaunchContext? I am doing it this way:

      String shellScriptPath =
"hdfs://isredeng:8020/user/kbonagir/KKDummy/list.ksh";
      commands.add(shellScriptPath);

But my container execution fails with "No such file or directory":

org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash:
hdfs://isredeng:8020/user/kbonagir/KKDummy/list.ksh: No such file or
directory

        at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
        at org.apache.hadoop.util.Shell.run(Shell.java:379)
        at
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)

I can see this file with the "hadoop fs" command, and I also saw messages in
the Node Manager's log saying that the resource was downloaded and localized.
So, how do I run the downloaded shell script in a container?

Thanks,
Kishore



On Tue, Dec 3, 2013 at 4:57 AM, Arun C Murthy <ac...@hortonworks.com> wrote:

> Robert,
>
>  YARN, by default, will only download *resource* from a shared namespace
> (e.g. HDFS).
>
>  If /home/hadoop/robert/large_jar.jar is available on each node then you
> can specify path as file:///home/hadoop/robert/large_jar.jar and it should
> work.
>
>  Else, you'll need to copy /home/hadoop/robert/large_jar.jar to HDFS and
> then specify hdfs://host:port/path/to/large_jar.jar.
>
> hth,
> Arun
>
> On Dec 1, 2013, at 12:03 PM, Robert Metzger <me...@gmail.com> wrote:
>
> Hello,
>
> I'm currently writing code to run my application using Yarn (Hadoop 2.2.0).
> I used this code as a skeleton:
> https://github.com/hortonworks/simple-yarn-app
>
> Everything works fine on my local machine or on a cluster with the shared
> directories, but when I want to access resources outside of commonly
> accessible locations, my application fails.
>
> I have my application in a large jar file, containing everything
> (Submission Client, Application Master, and Workers).
> The submission client registers the large jar file as a local resource for
> the Application master's context.
>
> In my understanding, Yarn takes care of transferring the client-local
> resources to the application master's container.
> This is also stated here:
> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
>
> You can use the LocalResource to add resources to your application
>> request. This will cause YARN to distribute the resource to the
>> ApplicationMaster node.
>
>
> If I'm starting my jar from the dir "/home/hadoop/robert/large_jar.jar",
> I'll get the following error from the nodemanager (another node in the
> cluster):
>
> 2013-12-01 20:13:00,810 INFO
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>> Failed to download rsrc { { file:/home/hadoop/robert/large_jar.jar, ..
>>
>
> So it seems as this node tries to access the file from its local file
> system.
>
> Do I have to use another "protocol" for the file, something like "
> file://host:port/home/blabla" ?
>
> Is it true that Yarn is able to distribute files (not using hdfs
> obviously?) ?
>
>
> The distributedshell-example suggests that I have to use HDFS:
> https://github.com/apache/hadoop-common/blob/50f0de14e377091c308c3a74ed089a7e4a7f0bfe/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java
>
>
> Sincerely,
> Robert
>
>
>
>
>
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/

Re: YARN: LocalResources and file distribution

Posted by Arun C Murthy <ac...@hortonworks.com>.
Robert,

 YARN, by default, will only download *resources* from a shared namespace (e.g. HDFS).

 If /home/hadoop/robert/large_jar.jar is available on each node then you can specify path as file:///home/hadoop/robert/large_jar.jar and it should work.

 Else, you'll need to copy /home/hadoop/robert/large_jar.jar to HDFS and then specify hdfs://host:port/path/to/large_jar.jar.
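
For example, a rough sketch of the HDFS route (conf, amContainer, and the
/user/hadoop target path are assumptions for illustration; the source jar is
the one from this thread):

    // Copy the jar into HDFS so every NodeManager can fetch it, then
    // register it as a LocalResource on the AM's ContainerLaunchContext.
    FileSystem fs = FileSystem.get(conf);    // fs.defaultFS should point at HDFS
    Path dst = new Path("/user/hadoop/large_jar.jar");
    fs.copyFromLocalFile(new Path("/home/hadoop/robert/large_jar.jar"), dst);
    Path qualified = fs.makeQualified(dst);  // e.g. hdfs://host:port/user/hadoop/large_jar.jar

    FileStatus stat = fs.getFileStatus(qualified);
    LocalResource jarRsrc = Records.newRecord(LocalResource.class);
    jarRsrc.setResource(ConverterUtils.getYarnUrlFromPath(qualified));
    jarRsrc.setSize(stat.getLen());
    jarRsrc.setTimestamp(stat.getModificationTime());
    jarRsrc.setType(LocalResourceType.FILE);
    jarRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
    amContainer.setLocalResources(Collections.singletonMap("app.jar", jarRsrc));

The size and timestamp recorded in the LocalResource must match the file
actually sitting in HDFS; otherwise the NodeManager rejects the resource
during localization.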

hth,
Arun

On Dec 1, 2013, at 12:03 PM, Robert Metzger <me...@gmail.com> wrote:

> Hello,
> 
> I'm currently writing code to run my application using Yarn (Hadoop 2.2.0).
> I used this code as a skeleton: https://github.com/hortonworks/simple-yarn-app
> 
> Everything works fine on my local machine or on a cluster with the shared directories, but when I want to access resources outside of commonly accessible locations, my application fails.
> 
> I have my application in a large jar file, containing everything (Submission Client, Application Master, and Workers). 
> The submission client registers the large jar file as a local resource for the Application master's context.
> 
> In my understanding, Yarn takes care of transferring the client-local resources to the application master's container.
> This is also stated here: http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
> 
> You can use the LocalResource to add resources to your application request. This will cause YARN to distribute the resource to the ApplicationMaster node.
> 
> If I'm starting my jar from the dir "/home/hadoop/robert/large_jar.jar", I'll get the following error from the nodemanager (another node in the cluster):
> 
> 2013-12-01 20:13:00,810 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Failed to download rsrc { { file:/home/hadoop/robert/large_jar.jar, ..
> 
> So it seems as this node tries to access the file from its local file system.
> 
> Do I have to use another "protocol" for the file, something like "file://host:port/home/blabla" ?
> 
> Is it true that Yarn is able to distribute files (not using hdfs obviously?) ?
> 
> 
> The distributedshell-example suggests that I have to use HDFS: https://github.com/apache/hadoop-common/blob/50f0de14e377091c308c3a74ed089a7e4a7f0bfe/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java
> 
> 
> Sincerely,
> Robert
> 
> 
> 
> 
> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/


