Posted to hdfs-user@hadoop.apache.org by Krishna Kishore Bonagiri <wr...@gmail.com> on 2013/08/05 11:14:53 UTC

setLocalResources() on ContainerLaunchContext

Hi,

  Can someone please tell me what the use of calling setLocalResources()
on ContainerLaunchContext is?

  Also, an example of how to use it would help...

 I couldn't guess what the String key in the map that is passed to
setLocalResources() is meant to be, as below:

      // Set the local resources
      Map<String, LocalResource> localResources = new HashMap<String,
LocalResource>();

Thanks,
Kishore

Re: setLocalResources() on ContainerLaunchContext

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
Hi Omkar,

  I will try that. I might have got the two '/' in there by mistake while
trying it in different ways to make it work. The file kishore/kk.ksh is
accessible to the same user that is running the AM container.

  Another question: what exactly are the benefits of
using this resource localization? Can you please explain briefly or
point me to some online documentation about it?

Thanks,
Kishore


On Wed, Aug 7, 2013 at 11:49 PM, Omkar Joshi <oj...@hortonworks.com> wrote:

> Good that your timestamp worked... Now for hdfs, try this:
> hdfs://<hdfs-host-name>:<hdfs-host-port><absolute-path>
> Now verify that your absolute path is correct; I hope it will work:
> bin/hadoop fs -ls <absolute-path>
>
>
> hdfs://isredeng:8020//kishore/kk.ksh... why the "//"? Do you have the hdfs
> file at absolute location /kishore/kk.ksh? Are /kishore and /kishore/kk.ksh
> accessible to the user who is making the startContainer call, or to the one
> running the AM container?
>
> Thanks,
> Omkar Joshi
> Hortonworks Inc. <http://www.hortonworks.com>
>
>
> On Tue, Aug 6, 2013 at 10:43 PM, Krishna Kishore Bonagiri <
> write2kishore@gmail.com> wrote:
>
>> Hi Harsh, Hitesh & Omkar,
>>
>>   Thanks for the replies.
>>
>> I tried getting the last modified timestamp like this, and it works. Is
>> this the right thing to do?
>>
>>       File file = new File("/home_/dsadm/kishore/kk.ksh");
>>       shellRsrc.setTimestamp(file.lastModified());
>>
>>
>> And when I tried using an hdfs file, qualifying it with both node name and
>> port, it didn't work; I get a similar error to the one earlier.
>>
>>       String shellScriptPath = "hdfs://isredeng:8020//kishore/kk.ksh";
>>
>>
>> 13/08/07 01:36:28 INFO ApplicationMaster: Got container status for
>> containerID= container_1375853431091_0005_01_000002, state=COMPLETE,
>> exitStatus=-1000, diagnostics=File does not exist:
>> hdfs://isredeng:8020/kishore/kk.ksh
>>
>> 13/08/07 01:36:28 INFO ApplicationMaster: Got failure status for a
>> container : -1000
>>
>>
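For the hdfs:// attempt above, the length and timestamp have to come from
the NameNode rather than from java.io.File, which only looks at the local
disk. A minimal sketch of that lookup (assuming /kishore/kk.ksh really
exists in HDFS and that conf is the cluster Configuration):

      // imports: org.apache.hadoop.fs.Path, FileSystem, FileStatus
      Path scriptPath = new Path("hdfs://isredeng:8020/kishore/kk.ksh");
      FileSystem fs = scriptPath.getFileSystem(conf);   // resolves the scheme to HDFS
      FileStatus status = fs.getFileStatus(scriptPath); // throws FileNotFoundException on a bad path
      shellRsrc.setTimestamp(status.getModificationTime());
      shellRsrc.setSize(status.getLen());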
>>
>> On Wed, Aug 7, 2013 at 7:45 AM, Harsh J <ha...@cloudera.com> wrote:
>>
>>> Thanks Hitesh!
>>>
>>> P.s. Port isn't a requirement (and with HA URIs, you shouldn't add a
>>> port), but "isredeng" has to be the authority component.
>>>
>>> On Wed, Aug 7, 2013 at 7:37 AM, Hitesh Shah <hi...@apache.org> wrote:
>>> > @Krishna, your logs showed the file error for
>>> "hdfs://isredeng/kishore/kk.ksh"
>>> >
>>> > I am assuming you have tried dfs -ls /kishore/kk.ksh and confirmed
>>> that the file exists? Also the qualified path seems to be missing the
>>> namenode port. I need to go back and check if a path without the port works
>>> by assuming the default namenode port.
>>> >
>>> > @Harsh, adding a helper function seems like a good idea. Let me file a
>>> jira to have the above added to one of the helper/client libraries.
>>> >
>>> > thanks
>>> > -- Hitesh
>>> >
>>> > On Aug 6, 2013, at 6:47 PM, Harsh J wrote:
>>> >
>>> >> It is kinda unnecessary to be asking developers to load in timestamps
>>> >> and length themselves. Why not provide a java.io.File, or perhaps a
>>> >> Path-accepting API, that gets them automatically on their behalf using
>>> >> the FileSystem API internally?
>>> >>
>>> >> P.s. A HDFS file gave him a FNF, while a Local file gave him a proper
>>> >> TS/Len error. I'm guessing there's a bug here w.r.t. handling HDFS
>>> >> paths.
>>> >>
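A helper along the lines Harsh suggests is easy to sketch on top of the
existing APIs (setupLocalResource is a hypothetical name, not an actual YARN
method; it just bundles the URL, size, and timestamp plumbing discussed in
this thread):

      // imports: java.io.IOException; org.apache.hadoop.fs.*;
      //          org.apache.hadoop.yarn.api.records.*; org.apache.hadoop.yarn.util.*
      public static LocalResource setupLocalResource(FileSystem fs, Path remote)
          throws IOException {
        FileStatus status = fs.getFileStatus(remote);   // fails fast on a bad path
        LocalResource rsrc = Records.newRecord(LocalResource.class);
        // makeQualified fills in scheme and authority, avoiding the
        // "Expected scheme name at index 0" parse error seen below
        rsrc.setResource(ConverterUtils.getYarnUrlFromPath(fs.makeQualified(remote)));
        rsrc.setSize(status.getLen());
        rsrc.setTimestamp(status.getModificationTime());
        rsrc.setType(LocalResourceType.FILE);
        rsrc.setVisibility(LocalResourceVisibility.APPLICATION);
        return rsrc;
      }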
>>> >> On Wed, Aug 7, 2013 at 12:35 AM, Hitesh Shah <hi...@apache.org>
>>> wrote:
>>> >>> Hi Krishna,
>>> >>>
>>> >>> YARN downloads a specified local resource on the container's node
>>> from the url specified. In all situations, the remote url needs to be a
>>> fully qualified path. To verify that the file at the remote url is still
>>> valid, YARN expects you to provide the length and last modified timestamp
>>> of that file.
>>> >>>
>>> >>> If you use an hdfs path such as hdfs://namenode:port/<absolute path
>>> to file>, you will need to get the length and timestamp from HDFS.
>>> >>> If you use file:///, the file should exist on all nodes, and all
>>> nodes should have the file with the same length and timestamp for
>>> localization to work. (For a single-node setup this works, but it is
>>> tougher to get right on a multi-node setup; deploying the file via an
>>> rpm should likely work.)
>>> >>>
>>> >>> -- Hitesh
>>> >>>
>>> >>> On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:
>>> >>>
>>> >>>> Hi,
>>> >>>>
>>> >>>> You need to match the timestamp. Probably get the timestamp locally
>>> before adding it. This check is done explicitly to ensure that the file is
>>> not updated after the user makes the call, to avoid possible errors.
>>> >>>>
>>> >>>>
>>> >>>> Thanks,
>>> >>>> Omkar Joshi
>>> >>>> Hortonworks Inc.
>>> >>>>
>>> >>>>
>>> >>>> On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri <
>>> write2kishore@gmail.com> wrote:
>>> >>>> I tried the following and it works!
>>> >>>> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
>>> >>>>
>>> >>>> But now I am getting a timestamp error like the one below, since I
>>> passed 0 to setTimestamp():
>>> >>>>
>>> >>>> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for
>>> containerID= container_1375784329048_0017_01_000002, state=COMPLETE,
>>> exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh
>>> changed on src filesystem (expected 0, was 1367580580000
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
>>> >>>> Can you try passing a fully qualified local path? That is,
>>> including the file:/ scheme
>>> >>>>
>>> >>>> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <
>>> write2kishore@gmail.com> wrote:
>>> >>>> Hi Harsh,
>>> >>>>   The setResource() call on LocalResource is expecting an
>>> argument of type org.apache.hadoop.yarn.api.records.URL, which is converted
>>> from a string in the form of a URI. This happens in the following call of
>>> the Distributed Shell example:
>>> >>>>
>>> >>>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
>>> shellScriptPath)));
>>> >>>>
>>> >>>> So, if I give a local file I get a parsing error like the one below,
>>> which is why I changed it to an HDFS file, thinking that it has to be given
>>> like that. Could you please give an example of how else it could be used,
>>> with a local file as you are saying?
>>> >>>>
>>> >>>> 2013-08-06 06:23:12,942 WARN
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>>> Failed to parse resource-request
>>> >>>> java.net.URISyntaxException: Expected scheme name at index 0:
>>> :///home_/dsadm/kishore/kk.ksh
>>> >>>>        at java.net.URI$Parser.fail(URI.java:2820)
>>> >>>>        at java.net.URI$Parser.failExpecting(URI.java:2826)
>>> >>>>        at java.net.URI$Parser.parse(URI.java:3015)
>>> >>>>        at java.net.URI.<init>(URI.java:747)
>>> >>>>        at
>>> org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>>> >>>>        at
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>>> >>>> To be honest, I've never tried loading a HDFS file onto the
>>> >>>> LocalResource this way. I usually just pass a local file and that
>>> >>>> works just fine. There may be something in the URI transformation
>>> >>>> possibly breaking a HDFS source, but try passing a local file - does
>>> >>>> that fail too? The Shell example uses a local file.
>>> >>>>
>>> >>>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>>> >>>> <wr...@gmail.com> wrote:
>>> >>>>> Hi Harsh,
>>> >>>>>
>>> >>>>>  Please see if this is useful; I got a stack trace after the error
>>> >>>>> occurred...
>>> >>>>>
>>> >>>>> 2013-08-06 00:55:30,559 INFO
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
>>> >>>>> to
>>> /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>> >>>>> =
>>> >>>>>
>>> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>> >>>>> 2013-08-06 00:55:31,017 ERROR
>>> >>>>> org.apache.hadoop.security.UserGroupInformation:
>>> PriviledgedActionException
>>> >>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
>>> does not
>>> >>>>> exist: hdfs://isredeng/kishore/kk.ksh
>>> >>>>> 2013-08-06 00:55:31,029 INFO
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>>> >>>>> DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null },
>>> File does
>>> >>>>> not exist: hdfs://isredeng/kishore/kk.ksh
>>> >>>>> 2013-08-06 00:55:31,031 INFO
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>>> >>>>> Resource hdfs://isredeng/kishore/kk.ksh transitioned from
>>> DOWNLOADING to
>>> >>>>> FAILED
>>> >>>>> 2013-08-06 00:55:31,034 INFO
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>>> >>>>> Container container_1375716148174_0004_01_000002 transitioned from
>>> >>>>> LOCALIZING to LOCALIZATION_FAILED
>>> >>>>> 2013-08-06 00:55:31,035 INFO
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>>> >>>>> Container container_1375716148174_0004_01_000002 sent RELEASE
>>> event on a
>>> >>>>> resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }
>>> not
>>> >>>>> present in cache.
>>> >>>>> 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client:
>>> interrupted
>>> >>>>> waiting to send rpc request to server
>>> >>>>> java.lang.InterruptedException
>>> >>>>>        at
>>> >>>>>
>>> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>>> >>>>>        at
>>> >>>>> java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>>> >>>>>        at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>>> >>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>>> >>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>> >>>>>        at $Proxy22.heartbeat(Unknown Source)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>> And here is my code snippet:
>>> >>>>>
>>> >>>>>      ContainerLaunchContext ctx =
>>> >>>>> Records.newRecord(ContainerLaunchContext.class);
>>> >>>>>
>>> >>>>>      ctx.setEnvironment(oshEnv);
>>> >>>>>
>>> >>>>>      // Set the local resources
>>> >>>>>      Map<String, LocalResource> localResources = new
>>> HashMap<String,
>>> >>>>> LocalResource>();
>>> >>>>>
>>> >>>>>      LocalResource shellRsrc =
>>> Records.newRecord(LocalResource.class);
>>> >>>>>      shellRsrc.setType(LocalResourceType.FILE);
>>> >>>>>      shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>>> >>>>>      String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>>> >>>>>      try {
>>> >>>>>        shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>>> >>>>> URI(shellScriptPath)));
>>> >>>>>      } catch (URISyntaxException e) {
>>> >>>>>        LOG.error("Error when trying to use shell script path
>>> specified"
>>> >>>>>            + " in env, path=" + shellScriptPath);
>>> >>>>>        e.printStackTrace();
>>> >>>>>      }
>>> >>>>>
>>> >>>>>      shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>>> >>>>>      shellRsrc.setSize(0/*shellScriptPathLen*/);
>>> >>>>>      String ExecShellStringPath = "ExecShellScript.sh";
>>> >>>>>      localResources.put(ExecShellStringPath, shellRsrc);
>>> >>>>>
>>> >>>>>      ctx.setLocalResources(localResources);
>>> >>>>>
>>> >>>>>
>>> >>>>> Please let me know if you need anything else.
>>> >>>>>
>>> >>>>> Thanks,
>>> >>>>> Kishore
>>> >>>>>
>>> >>>>>
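The zeroed setTimestamp()/setSize() calls in the snippet above are what
later produce the "changed on src filesystem (expected 0, ...)" error once
the path itself resolves. For the local-file case that eventually worked, a
corrected fragment might look like this (a sketch; setSize() from
File.length() is the natural counterpart of the lastModified() fix Kishore
reports at the top of this thread):

      File script = new File("/home_/dsadm/kishore/kk.ksh");
      // File.toURI() yields a fully qualified file:/ URI, so no parse error
      shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(script.toURI()));
      shellRsrc.setTimestamp(script.lastModified());
      shellRsrc.setSize(script.length());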
>>> >>>>>
>>> >>>>> On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com>
>>> wrote:
>>> >>>>>>
>>> >>>>>> The detail is insufficient to answer why. You should also have
>>> >>>>>> gotten a trace after it; can you post that? If possible, also the
>>> >>>>>> relevant snippets of code.
>>> >>>>>>
>>> >>>>>> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>>> >>>>>> <wr...@gmail.com> wrote:
>>> >>>>>>> Hi Harsh,
>>> >>>>>>> Thanks for the quick and detailed reply; it really helps. I am
>>> >>>>>>> trying to use it and getting this error in the node manager's log:
>>> >>>>>>>
>>> >>>>>>> 2013-08-05 08:57:28,867 ERROR
>>> >>>>>>> org.apache.hadoop.security.UserGroupInformation:
>>> >>>>>>> PriviledgedActionException
>>> >>>>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
>>> does
>>> >>>>>>> not
>>> >>>>>>> exist: hdfs://isredeng/kishore/kk.ksh
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> This file is there on the machine named "isredeng"; I could do an
>>> >>>>>>> ls on that file, as below:
>>> >>>>>>>
>>> >>>>>>> -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>>> >>>>>>> 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>>> >>>>>>> native-hadoop
>>> >>>>>>> library for your platform... using builtin-java classes where
>>> applicable
>>> >>>>>>> Found 1 items
>>> >>>>>>> -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>>> >>>>>>> kishore/kk.ksh
>>> >>>>>>>
>>> >>>>>>> Note: I am using a single node cluster
>>> >>>>>>>
>>> >>>>>>> Thanks,
>>> >>>>>>> Kishore
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com>
>>> wrote:
>>> >>>>>>>>
>>> >>>>>>>> The string for each LocalResource in the map can be anything
>>> that
>>> >>>>>>>> serves as a common identifier name for your application. At
>>> execution
>>> >>>>>>>> time, the passed resource filename will be aliased to the name
>>> you've
>>> >>>>>>>> mapped it to, so that the application code need not track
>>> special
>>> >>>>>>>> names. The behavior is very similar to how you can, in MR,
>>> define a
>>> >>>>>>>> symlink name for a DistributedCache entry (e.g.
>>> foo.jar#bar.jar).
>>> >>>>>>>>
>>> >>>>>>>> For an example, check out the DistributedShell app sources.
>>> >>>>>>>>
>>> >>>>>>>> In [1], you can see we take a user-provided file path to a
>>> >>>>>>>> shell script. This can be named anything, as it is user-supplied.
>>> >>>>>>>> In [2], we define this as a local resource [2.1] and embed it
>>> >>>>>>>> with a different name (the string you ask about) [2.2], defined at
>>> >>>>>>>> [3] as an application-referenceable constant.
>>> >>>>>>>> Note that in [4], we add to the Container arguments the aliased
>>> >>>>>>>> name we mapped it to (i.e. [3]) and not the original filename we
>>> >>>>>>>> received from the user. The resource is placed on the container
>>> >>>>>>>> with this name instead, so that's what we choose to execute.
>>> >>>>>>>>
>>> >>>>>>>> [1] -
>>> >>>>>>>>
>>> >>>>>>>>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>>> >>>>>>>>
>>> >>>>>>>> [2] - [2.1]
>>> >>>>>>>>
>>> >>>>>>>>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>>> >>>>>>>> and [2.2]
>>> >>>>>>>>
>>> >>>>>>>>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>>> >>>>>>>>
>>> >>>>>>>> [3] -
>>> >>>>>>>>
>>> >>>>>>>>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>>> >>>>>>>>
>>> >>>>>>>> [4] -
>>> >>>>>>>>
>>> >>>>>>>>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>>> >>>>>>>>
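To make the aliasing concrete with the names from this thread: the map key
is the filename the container actually sees, so the launch command must
refer to that key rather than to the original kk.ksh. A sketch (same objects
as in the snippet above; the command string is illustrative):

      // kk.ksh is localized into the container's working directory
      // under the alias name used as the map key
      localResources.put("ExecShellScript.sh", shellRsrc);
      ctx.setLocalResources(localResources);
      // so the command runs the alias, not the original filename
      ctx.setCommands(Collections.singletonList("sh ExecShellScript.sh"));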
>>> >>>>>>>> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>>> >>>>>>>> <wr...@gmail.com> wrote:
>>> >>>>>>>>> Hi,
>>> >>>>>>>>>
>>> >>>>>>>>>  Can someone please tell me what the use of calling
>>> >>>>>>>>> setLocalResources()
>>> >>>>>>>>> on ContainerLaunchContext is?
>>> >>>>>>>>>
>>> >>>>>>>>>  Also, an example of how to use it would help...
>>> >>>>>>>>>
>>> >>>>>>>>> I couldn't guess what the String key in the map that is passed
>>> >>>>>>>>> to setLocalResources() is meant to be, as below:
>>> >>>>>>>>>
>>> >>>>>>>>>      // Set the local resources
>>> >>>>>>>>>      Map<String, LocalResource> localResources = new
>>> HashMap<String,
>>> >>>>>>>>> LocalResource>();
>>> >>>>>>>>>
>>> >>>>>>>>> Thanks,
>>> >>>>>>>>> Kishore
>>> >>>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>> --
>>> >>>>>>>> Harsh J
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> --
>>> >>>>>> Harsh J
>>> >>>>>
>>> >>>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> --
>>> >>>> Harsh J
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Harsh J
>>> >
>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>
>>
>

Re: setLocalResources() on ContainerLaunchContext

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
Hi Omkar,

  I will try that. I might have got 2 of '/' wrongly while trying it in
different ways to make it work. The file kishore/kk.ksh is accessible to
the same user that is running the AM container.

  And my another questions is to understand what are the exact benefits of
using this resource localization? Can you please explain me briefly or
point me some online documentation talking about it?

Thanks,
Kishore


On Wed, Aug 7, 2013 at 11:49 PM, Omkar Joshi <oj...@hortonworks.com> wrote:

> Good that your timestamp worked... Now for hdfs try this
> hdfs://<hdfs-host-name>:<hdfs-host-port><absolute-path>
> now verify that your absolute path is correct. I hope it will work.
> bin/hadoop fs -ls <absolute-path>
>
>
> hdfs://isredeng:8020*//*kishore/kk.ksh... why "//" ?? you have hdfs file
> at absolute location /kishore/kk.sh? is /kishore and /kishore/kk.sh
> accessible to the user who is making startContainer call or the one running
> AM container?
>
> Thanks,
> Omkar Joshi
> *Hortonworks Inc.* <http://www.hortonworks.com>
>
>
> On Tue, Aug 6, 2013 at 10:43 PM, Krishna Kishore Bonagiri <
> write2kishore@gmail.com> wrote:
>
>> Hi Harsh, Hitesh & Omkar,
>>
>>   Thanks for the replies.
>>
>> I tried getting the last modified timestamp like this and it works. Is
>> this a right thing to do?
>>
>>       File file = new File("/home_/dsadm/kishore/kk.ksh");
>>       shellRsrc.setTimestamp(file.lastModified());
>>
>>
>> And, when I tried using a hdfs file qualifying it with both node name and
>> port, it didn't work, I get a similar error as earlier.
>>
>>       String shellScriptPath = "hdfs://isredeng:8020//kishore/kk.ksh";
>>
>>
>> 13/08/07 01:36:28 INFO ApplicationMaster: Got container status for
>> containerID= container_1375853431091_0005_01_000002, state=COMPLETE,
>> exitStatus=-1000, diagnostics=File does not exist:
>> hdfs://isredeng:8020/kishore/kk.ksh
>>
>> 13/08/07 01:36:28 INFO ApplicationMaster: Got failure status for a
>> container : -1000
>>
>>
>>
>> On Wed, Aug 7, 2013 at 7:45 AM, Harsh J <ha...@cloudera.com> wrote:
>>
>>> Thanks Hitesh!
>>>
>>> P.s. Port isn't a requirement (and with HA URIs, you shouldn't add a
>>> port), but "isredeng" has to be the authority component.
>>>
>>> On Wed, Aug 7, 2013 at 7:37 AM, Hitesh Shah <hi...@apache.org> wrote:
>>> > @Krishna, your logs showed the file error for
>>> "hdfs://isredeng/kishore/kk.ksh"
>>> >
>>> > I am assuming you have tried dfs -ls /kishore/kk.ksh and confirmed
>>> that the file exists? Also the qualified path seems to be missing the
>>> namenode port. I need to go back and check if a path without the port works
>>> by assuming the default namenode port.
>>> >
>>> > @Harsh, adding a helper function seems like a good idea. Let me file a
>>> jira to have the above added to one of the helper/client libraries.
>>> >
>>> > thanks
>>> > -- Hitesh
>>> >
>>> > On Aug 6, 2013, at 6:47 PM, Harsh J wrote:
>>> >
>>> >> It is kinda unnecessary to be asking developers to load in timestamps
>>> and
>>> >> length themselves. Why not provide a java.io.File, or perhaps a Path
>>> >> accepting API, that gets it automatically on their behalf using the
>>> >> FileSystem API internally?
>>> >>
>>> >> P.s. A HDFS file gave him a FNF, while a Local file gave him a proper
>>> >> TS/Len error. I'm guessing there's a bug here w.r.t. handling HDFS
>>> >> paths.
>>> >>
>>> >> On Wed, Aug 7, 2013 at 12:35 AM, Hitesh Shah <hi...@apache.org>
>>> wrote:
>>> >>> Hi Krishna,
>>> >>>
>>> >>> YARN downloads a specified local resource on the container's node
>>> from the url specified. In all situtations, the remote url needs to be a
>>> fully qualified path. To verify that the file at the remote url is still
>>> valid, YARN expects you to provide the length and last modified timestamp
>>> of that file.
>>> >>>
>>> >>> If you use an hdfs path such as hdfs://namenode:port/<absolute path
>>> to file>, you will need to get the length and timestamp from HDFS.
>>> >>> If you use file:///, the file should exist on all nodes and all
>>> nodes should have the file with the same length and timestamp for
>>> localization to work. ( For a single node setup, this works but tougher to
>>> get right on a multi-node setup - deploying the file via a rpm should
>>> likely work).
>>> >>>
>>> >>> -- Hitesh
>>> >>>
>>> >>> On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:
>>> >>>
>>> >>>> Hi,
>>> >>>>
>>> >>>> You need to match the timestamp. Probably get the timestamp locally
>>> before adding it. This is explicitly done to ensure that file is not
>>> updated after user makes the call to avoid possible errors.
>>> >>>>
>>> >>>>
>>> >>>> Thanks,
>>> >>>> Omkar Joshi
>>> >>>> Hortonworks Inc.
>>> >>>>
>>> >>>>
>>> >>>> On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri <
>>> write2kishore@gmail.com> wrote:
>>> >>>> I tried the following and it works!
>>> >>>> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
>>> >>>>
>>> >>>> But now getting a timestamp error like below, when I passed 0 to
>>> setTimestamp()
>>> >>>>
>>> >>>> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for
>>> containerID= container_1375784329048_0017_01_000002, state=COMPLETE,
>>> exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh
>>> changed on src filesystem (expected 0, was 1367580580000
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
>>> >>>> Can you try passing a fully qualified local path? That is,
>>> including the file:/ scheme
>>> >>>>
>>> >>>> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <
>>> write2kishore@gmail.com> wrote:
>>> >>>> Hi Harsh,
>>> >>>>   The setResource() call on LocalResource() is expecting an
>>> argument of type org.apache.hadoop.yarn.api.records.URL which is converted
>>> from a string in the form of URI. This happens in the following call of
>>> Distributed Shell example,
>>> >>>>
>>> >>>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
>>> shellScriptPath)));
>>> >>>>
>>> >>>> So, if I give a local file I get a parsing error like below, which
>>> is when I changed it to an HDFS file thinking that it should be given like
>>> that only. Could you please give an example of how else it could be used,
>>> using a local file as you are saying?
>>> >>>>
>>> >>>> 2013-08-06 06:23:12,942 WARN
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>>> Failed to parse resource-request
>>> >>>> java.net.URISyntaxException: Expected scheme name at index 0:
>>> :///home_/dsadm/kishore/kk.ksh
>>> >>>>        at java.net.URI$Parser.fail(URI.java:2820)
>>> >>>>        at java.net.URI$Parser.failExpecting(URI.java:2826)
>>> >>>>        at java.net.URI$Parser.parse(URI.java:3015)
>>> >>>>        at java.net.URI.<init>(URI.java:747)
>>> >>>>        at
>>> org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>>> >>>>        at
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>>> >>>> To be honest, I've never tried loading a HDFS file onto the
>>> >>>> LocalResource this way. I usually just pass a local file and that
>>> >>>> works just fine. There may be something in the URI transformation
>>> >>>> possibly breaking a HDFS source, but try passing a local file - does
>>> >>>> that fail too? The Shell example uses a local file.
>>> >>>>
>>> >>>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>>> >>>> <wr...@gmail.com> wrote:
>>> >>>>> Hi Harsh,
>>> >>>>>
>>> >>>>>  Please see if this is useful, I got a stack trace after the error
>>> has
>>> >>>>> occurred....
>>> >>>>>
>>> >>>>> 2013-08-06 00:55:30,559 INFO
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
>>> >>>>> to
>>> /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>> >>>>> =
>>> >>>>>
>>> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>> >>>>> 2013-08-06 00:55:31,017 ERROR
>>> >>>>> org.apache.hadoop.security.UserGroupInformation:
>>> PriviledgedActionException
>>> >>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
>>> does not
>>> >>>>> exist: hdfs://isredeng/kishore/kk.ksh
>>> >>>>> 2013-08-06 00:55:31,029 INFO
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>>> >>>>> DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null },
>>> File does
>>> >>>>> not exist: hdfs://isredeng/kishore/kk.ksh
>>> >>>>> 2013-08-06 00:55:31,031 INFO
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>>> >>>>> Resource hdfs://isredeng/kishore/kk.ksh transitioned from
>>> DOWNLOADING to
>>> >>>>> FAILED
>>> >>>>> 2013-08-06 00:55:31,034 INFO
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>>> >>>>> Container container_1375716148174_0004_01_000002 transitioned from
>>> >>>>> LOCALIZING to LOCALIZATION_FAILED
>>> >>>>> 2013-08-06 00:55:31,035 INFO
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>>> >>>>> Container container_1375716148174_0004_01_000002 sent RELEASE
>>> event on a
>>> >>>>> resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }
>>> not
>>> >>>>> present in cache.
>>> >>>>> 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client:
>>> interrupted
>>> >>>>> waiting to send rpc request to server
>>> >>>>> java.lang.InterruptedException
>>> >>>>>        at
>>> >>>>>
>>> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>>> >>>>>        at
>>> >>>>> java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>>> >>>>>        at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>>> >>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>>> >>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>> >>>>>        at $Proxy22.heartbeat(Unknown Source)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>> And here is my code snippet:
>>> >>>>>
>>> >>>>>      ContainerLaunchContext ctx =
>>> >>>>> Records.newRecord(ContainerLaunchContext.class);
>>> >>>>>
>>> >>>>>      ctx.setEnvironment(oshEnv);
>>> >>>>>
>>> >>>>>      // Set the local resources
>>> >>>>>      Map<String, LocalResource> localResources = new
>>> HashMap<String,
>>> >>>>> LocalResource>();
>>> >>>>>
>>> >>>>>      LocalResource shellRsrc =
>>> Records.newRecord(LocalResource.class);
>>> >>>>>      shellRsrc.setType(LocalResourceType.FILE);
>>> >>>>>      shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>>> >>>>>      String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>>> >>>>>      try {
>>> >>>>>        shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>>> >>>>> URI(shellScriptPath)));
>>> >>>>>      } catch (URISyntaxException e) {
>>> >>>>>        LOG.error("Error when trying to use shell script path
>>> specified"
>>> >>>>>            + " in env, path=" + shellScriptPath);
>>> >>>>>        e.printStackTrace();
>>> >>>>>      }
>>> >>>>>
>>> >>>>>      shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>>> >>>>>      shellRsrc.setSize(0/*shellScriptPathLen*/);
>>> >>>>>      String ExecShellStringPath = "ExecShellScript.sh";
>>> >>>>>      localResources.put(ExecShellStringPath, shellRsrc);
>>> >>>>>
>>> >>>>>      ctx.setLocalResources(localResources);
>>> >>>>>
>>> >>>>>
>>> >>>>> Please let me know if you need anything else.
>>> >>>>>
>>> >>>>> Thanks,
>>> >>>>> Kishore
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>> On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com>
>>> wrote:
>>> >>>>>>
>>> >>>>>> The detail is insufficient to answer why. You should also have
>>> gotten
>>> >>>>>> a trace after it, can you post that? If possible, also the
>>> relevant
>>> >>>>>> snippets of code.
>>> >>>>>>
>>> >>>>>> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>>> >>>>>> <wr...@gmail.com> wrote:
>>> >>>>>>> Hi Harsh,
>>> >>>>>>> Thanks for the quick and detailed reply, it really helps. I am
>>> trying
>>> >>>>>>> to
>>> >>>>>>> use it and getting this error in node manager's log:
>>> >>>>>>>
>>> >>>>>>> 2013-08-05 08:57:28,867 ERROR
>>> >>>>>>> org.apache.hadoop.security.UserGroupInformation:
>>> >>>>>>> PriviledgedActionException
>>> >>>>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
>>> does
>>> >>>>>>> not
>>> >>>>>>> exist: hdfs://isredeng/kishore/kk.ksh
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> This file is there on the machine with name "isredeng", I could
>>> do ls
>>> >>>>>>> for
>>> >>>>>>> that file as below:
>>> >>>>>>>
>>> >>>>>>> -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>>> >>>>>>> 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>>> >>>>>>> native-hadoop
>>> >>>>>>> library for your platform... using builtin-java classes where
>>> applicable
>>> >>>>>>> Found 1 items
>>> >>>>>>> -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>>> >>>>>>> kishore/kk.ksh
>>> >>>>>>>
>>> >>>>>>> Note: I am using a single node cluster
>>> >>>>>>>
>>> >>>>>>> Thanks,
>>> >>>>>>> Kishore
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com>
>>> wrote:
>>> >>>>>>>>
>>> >>>>>>>> The string for each LocalResource in the map can be anything
>>> that
>>> >>>>>>>> serves as a common identifier name for your application. At
>>> execution
>>> >>>>>>>> time, the passed resource filename will be aliased to the name
>>> you've
>>> >>>>>>>> mapped it to, so that the application code need not track
>>> special
>>> >>>>>>>> names. The behavior is very similar to how you can, in MR,
>>> define a
>>> >>>>>>>> symlink name for a DistributedCache entry (e.g.
>>> foo.jar#bar.jar).
>>> >>>>>>>>
>>> >>>>>>>> For an example, checkout the DistributedShell app sources.
>>> >>>>>>>>
>>> >>>>>>>> Over [1], you can see we take a user provided file path to a
>>> shell
>>> >>>>>>>> script. This can be named anything as it is user-supplied.
>>> >>>>>>>> Onto [2], we define this as a local resource [2.1] and embed it
>>> with a
>>> >>>>>>>> different name (the string you ask about) [2.2], as defined at
>>> [3] as
>>> >>>>>>>> an application reference-able constant.
>>> >>>>>>>> Note that in [4], we add to the Container arguments the aliased
>>> name
>>> >>>>>>>> we mapped it to (i.e. [3]) and not the original filename we
>>> received
>>> >>>>>>>> from the user. The resource is placed on the container with
>>> this name
>>> >>>>>>>> instead, so thats what we choose to execute.
>>> >>>>>>>>
>>> >>>>>>>> [1] -
>>> >>>>>>>>
>>> >>>>>>>>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>>> >>>>>>>>
>>> >>>>>>>> [2] - [2.1]
>>> >>>>>>>>
>>> >>>>>>>>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>>> >>>>>>>> and [2.2]
>>> >>>>>>>>
>>> >>>>>>>>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>>> >>>>>>>>
>>> >>>>>>>> [3] -
>>> >>>>>>>>
>>> >>>>>>>>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>>> >>>>>>>>
>>> >>>>>>>> [4] -
>>> >>>>>>>>
>>> >>>>>>>>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>>> >>>>>>>>
>>> >>>>>>>> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>>> >>>>>>>> <wr...@gmail.com> wrote:
>>> >>>>>>>>> Hi,
>>> >>>>>>>>>
>>> >>>>>>>>>  Can someone please tell me what is the use of calling
>>> >>>>>>>>> setLocalResources()
>>> >>>>>>>>> on ContainerLaunchContext?
>>> >>>>>>>>>
>>> >>>>>>>>>  And, also an example of how to use this will help...
>>> >>>>>>>>>
>>> >>>>>>>>> I couldn't guess what is the String in the map that is passed
>>> to
>>> >>>>>>>>> setLocalResources() like below:
>>> >>>>>>>>>
>>> >>>>>>>>>      // Set the local resources
>>> >>>>>>>>>      Map<String, LocalResource> localResources = new
>>> HashMap<String,
>>> >>>>>>>>> LocalResource>();
>>> >>>>>>>>>
>>> >>>>>>>>> Thanks,
>>> >>>>>>>>> Kishore
>>> >>>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>> --
>>> >>>>>>>> Harsh J
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> --
>>> >>>>>> Harsh J
>>> >>>>>
>>> >>>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> --
>>> >>>> Harsh J
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Harsh J
>>> >
>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>
>>
>

Re: setLocalResources() on ContainerLaunchContext

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
Hi Omkar,

  I will try that. I might have got 2 of '/' wrongly while trying it in
different ways to make it work. The file kishore/kk.ksh is accessible to
the same user that is running the AM container.

  And my another questions is to understand what are the exact benefits of
using this resource localization? Can you please explain me briefly or
point me some online documentation talking about it?

Thanks,
Kishore


On Wed, Aug 7, 2013 at 11:49 PM, Omkar Joshi <oj...@hortonworks.com> wrote:

> Good that your timestamp worked... Now for hdfs try this
> hdfs://<hdfs-host-name>:<hdfs-host-port><absolute-path>
> now verify that your absolute path is correct. I hope it will work.
> bin/hadoop fs -ls <absolute-path>
>
>
> hdfs://isredeng:8020*//*kishore/kk.ksh... why "//" ?? you have hdfs file
> at absolute location /kishore/kk.sh? is /kishore and /kishore/kk.sh
> accessible to the user who is making startContainer call or the one running
> AM container?
>
> Thanks,
> Omkar Joshi
> *Hortonworks Inc.* <http://www.hortonworks.com>
>
>
> On Tue, Aug 6, 2013 at 10:43 PM, Krishna Kishore Bonagiri <
> write2kishore@gmail.com> wrote:
>
>> Hi Harsh, Hitesh & Omkar,
>>
>>   Thanks for the replies.
>>
>> I tried getting the last modified timestamp like this and it works. Is
>> this a right thing to do?
>>
>>       File file = new File("/home_/dsadm/kishore/kk.ksh");
>>       shellRsrc.setTimestamp(file.lastModified());
>>
>>
>> And, when I tried using a hdfs file qualifying it with both node name and
>> port, it didn't work, I get a similar error as earlier.
>>
>>       String shellScriptPath = "hdfs://isredeng:8020//kishore/kk.ksh";
>>
>>
>> 13/08/07 01:36:28 INFO ApplicationMaster: Got container status for
>> containerID= container_1375853431091_0005_01_000002, state=COMPLETE,
>> exitStatus=-1000, diagnostics=File does not exist:
>> hdfs://isredeng:8020/kishore/kk.ksh
>>
>> 13/08/07 01:36:28 INFO ApplicationMaster: Got failure status for a
>> container : -1000
>>
>>
>>
>> On Wed, Aug 7, 2013 at 7:45 AM, Harsh J <ha...@cloudera.com> wrote:
>>
>>> Thanks Hitesh!
>>>
>>> P.s. Port isn't a requirement (and with HA URIs, you shouldn't add a
>>> port), but "isredeng" has to be the authority component.
>>>
>>> On Wed, Aug 7, 2013 at 7:37 AM, Hitesh Shah <hi...@apache.org> wrote:
>>> > @Krishna, your logs showed the file error for
>>> "hdfs://isredeng/kishore/kk.ksh"
>>> >
>>> > I am assuming you have tried dfs -ls /kishore/kk.ksh and confirmed
>>> that the file exists? Also the qualified path seems to be missing the
>>> namenode port. I need to go back and check if a path without the port works
>>> by assuming the default namenode port.
>>> >
>>> > @Harsh, adding a helper function seems like a good idea. Let me file a
>>> jira to have the above added to one of the helper/client libraries.
>>> >
>>> > thanks
>>> > -- Hitesh
>>> >
>>> > On Aug 6, 2013, at 6:47 PM, Harsh J wrote:
>>> >
>>> >> It is kinda unnecessary to be asking developers to load in timestamps
>>> and
>>> >> length themselves. Why not provide a java.io.File, or perhaps a Path
>>> >> accepting API, that gets it automatically on their behalf using the
>>> >> FileSystem API internally?
>>> >>
>>> >> P.s. A HDFS file gave him a FNF, while a Local file gave him a proper
>>> >> TS/Len error. I'm guessing there's a bug here w.r.t. handling HDFS
>>> >> paths.
>>> >>
>>> >> On Wed, Aug 7, 2013 at 12:35 AM, Hitesh Shah <hi...@apache.org>
>>> wrote:
>>> >>> Hi Krishna,
>>> >>>
>>> >>> YARN downloads a specified local resource on the container's node
>>> from the url specified. In all situtations, the remote url needs to be a
>>> fully qualified path. To verify that the file at the remote url is still
>>> valid, YARN expects you to provide the length and last modified timestamp
>>> of that file.
>>> >>>
>>> >>> If you use an hdfs path such as hdfs://namenode:port/<absolute path
>>> to file>, you will need to get the length and timestamp from HDFS.
>>> >>> If you use file:///, the file should exist on all nodes and all
>>> nodes should have the file with the same length and timestamp for
>>> localization to work. ( For a single node setup, this works but tougher to
>>> get right on a multi-node setup - deploying the file via a rpm should
>>> likely work).
>>> >>>
>>> >>> -- Hitesh
>>> >>>
>>> >>> On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:
>>> >>>
>>> >>>> Hi,
>>> >>>>
>>> >>>> You need to match the timestamp. Probably get the timestamp locally
>>> before adding it. This is explicitly done to ensure that file is not
>>> updated after user makes the call to avoid possible errors.
>>> >>>>
>>> >>>>
>>> >>>> Thanks,
>>> >>>> Omkar Joshi
>>> >>>> Hortonworks Inc.
>>> >>>>
>>> >>>>
>>> >>>> On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri <
>>> write2kishore@gmail.com> wrote:
>>> >>>> I tried the following and it works!
>>> >>>> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
>>> >>>>
>>> >>>> But now getting a timestamp error like below, when I passed 0 to
>>> setTimestamp()
>>> >>>>
>>> >>>> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for
>>> containerID= container_1375784329048_0017_01_000002, state=COMPLETE,
>>> exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh
>>> changed on src filesystem (expected 0, was 1367580580000
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
>>> >>>> Can you try passing a fully qualified local path? That is,
>>> including the file:/ scheme
>>> >>>>
>>> >>>> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <
>>> write2kishore@gmail.com> wrote:
>>> >>>> Hi Harsh,
>>> >>>>   The setResource() call on LocalResource() is expecting an
>>> argument of type org.apache.hadoop.yarn.api.records.URL which is converted
>>> from a string in the form of URI. This happens in the following call of
>>> Distributed Shell example,
>>> >>>>
>>> >>>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
>>> shellScriptPath)));
>>> >>>>
>>> >>>> So, if I give a local file I get a parsing error like below, which
>>> is when I changed it to an HDFS file thinking that it should be given like
>>> that only. Could you please give an example of how else it could be used,
>>> using a local file as you are saying?
>>> >>>>
>>> >>>> 2013-08-06 06:23:12,942 WARN
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>>> Failed to parse resource-request
>>> >>>> java.net.URISyntaxException: Expected scheme name at index 0:
>>> :///home_/dsadm/kishore/kk.ksh
>>> >>>>        at java.net.URI$Parser.fail(URI.java:2820)
>>> >>>>        at java.net.URI$Parser.failExpecting(URI.java:2826)
>>> >>>>        at java.net.URI$Parser.parse(URI.java:3015)
>>> >>>>        at java.net.URI.<init>(URI.java:747)
>>> >>>>        at
>>> org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>>> >>>>        at
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>>> >>>> To be honest, I've never tried loading a HDFS file onto the
>>> >>>> LocalResource this way. I usually just pass a local file and that
>>> >>>> works just fine. There may be something in the URI transformation
>>> >>>> possibly breaking a HDFS source, but try passing a local file - does
>>> >>>> that fail too? The Shell example uses a local file.
>>> >>>>
>>> >>>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>>> >>>> <wr...@gmail.com> wrote:
>>> >>>>> Hi Harsh,
>>> >>>>>
>>> >>>>>  Please see if this is useful, I got a stack trace after the error
>>> has
>>> >>>>> occurred....
>>> >>>>>
>>> >>>>> 2013-08-06 00:55:30,559 INFO
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
>>> >>>>> to
>>> /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>> >>>>> =
>>> >>>>>
>>> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>> >>>>> 2013-08-06 00:55:31,017 ERROR
>>> >>>>> org.apache.hadoop.security.UserGroupInformation:
>>> PriviledgedActionException
>>> >>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
>>> does not
>>> >>>>> exist: hdfs://isredeng/kishore/kk.ksh
>>> >>>>> 2013-08-06 00:55:31,029 INFO
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>>> >>>>> DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null },
>>> File does
>>> >>>>> not exist: hdfs://isredeng/kishore/kk.ksh
>>> >>>>> 2013-08-06 00:55:31,031 INFO
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>>> >>>>> Resource hdfs://isredeng/kishore/kk.ksh transitioned from
>>> DOWNLOADING to
>>> >>>>> FAILED
>>> >>>>> 2013-08-06 00:55:31,034 INFO
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>>> >>>>> Container container_1375716148174_0004_01_000002 transitioned from
>>> >>>>> LOCALIZING to LOCALIZATION_FAILED
>>> >>>>> 2013-08-06 00:55:31,035 INFO
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>>> >>>>> Container container_1375716148174_0004_01_000002 sent RELEASE
>>> event on a
>>> >>>>> resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }
>>> not
>>> >>>>> present in cache.
>>> >>>>> 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client:
>>> interrupted
>>> >>>>> waiting to send rpc request to server
>>> >>>>> java.lang.InterruptedException
>>> >>>>>        at
>>> >>>>>
>>> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>>> >>>>>        at
>>> >>>>> java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>>> >>>>>        at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>>> >>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>>> >>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>> >>>>>        at $Proxy22.heartbeat(Unknown Source)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>>> >>>>>        at
>>> >>>>>
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>> And here is my code snippet:
>>> >>>>>
>>> >>>>>      ContainerLaunchContext ctx =
>>> >>>>> Records.newRecord(ContainerLaunchContext.class);
>>> >>>>>
>>> >>>>>      ctx.setEnvironment(oshEnv);
>>> >>>>>
>>> >>>>>      // Set the local resources
>>> >>>>>      Map<String, LocalResource> localResources = new
>>> HashMap<String,
>>> >>>>> LocalResource>();
>>> >>>>>
>>> >>>>>      LocalResource shellRsrc =
>>> Records.newRecord(LocalResource.class);
>>> >>>>>      shellRsrc.setType(LocalResourceType.FILE);
>>> >>>>>      shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>>> >>>>>      String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>>> >>>>>      try {
>>> >>>>>        shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>>> >>>>> URI(shellScriptPath)));
>>> >>>>>      } catch (URISyntaxException e) {
>>> >>>>>        LOG.error("Error when trying to use shell script path
>>> specified"
>>> >>>>>            + " in env, path=" + shellScriptPath);
>>> >>>>>        e.printStackTrace();
>>> >>>>>      }
>>> >>>>>
>>> >>>>>      shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>>> >>>>>      shellRsrc.setSize(0/*shellScriptPathLen*/);
>>> >>>>>      String ExecShellStringPath = "ExecShellScript.sh";
>>> >>>>>      localResources.put(ExecShellStringPath, shellRsrc);
>>> >>>>>
>>> >>>>>      ctx.setLocalResources(localResources);
>>> >>>>>
>>> >>>>>
>>> >>>>> Please let me know if you need anything else.
>>> >>>>>
>>> >>>>> Thanks,
>>> >>>>> Kishore
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>> On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com>
>>> wrote:
>>> >>>>>>
>>> >>>>>> The detail is insufficient to answer why. You should also have
>>> gotten
>>> >>>>>> a trace after it, can you post that? If possible, also the
>>> relevant
>>> >>>>>> snippets of code.
>>> >>>>>>
>>> >>>>>> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>>> >>>>>> <wr...@gmail.com> wrote:
>>> >>>>>>> Hi Harsh,
>>> >>>>>>> Thanks for the quick and detailed reply, it really helps. I am
>>> trying
>>> >>>>>>> to
>>> >>>>>>> use it and getting this error in node manager's log:
>>> >>>>>>>
>>> >>>>>>> 2013-08-05 08:57:28,867 ERROR
>>> >>>>>>> org.apache.hadoop.security.UserGroupInformation:
>>> >>>>>>> PriviledgedActionException
>>> >>>>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
>>> does
>>> >>>>>>> not
>>> >>>>>>> exist: hdfs://isredeng/kishore/kk.ksh
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> This file is there on the machine with name "isredeng", I could
>>> do ls
>>> >>>>>>> for
>>> >>>>>>> that file as below:
>>> >>>>>>>
>>> >>>>>>> -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>>> >>>>>>> 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>>> >>>>>>> native-hadoop
>>> >>>>>>> library for your platform... using builtin-java classes where
>>> applicable
>>> >>>>>>> Found 1 items
>>> >>>>>>> -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>>> >>>>>>> kishore/kk.ksh
>>> >>>>>>>
>>> >>>>>>> Note: I am using a single node cluster
>>> >>>>>>>
>>> >>>>>>> Thanks,
>>> >>>>>>> Kishore
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com>
>>> wrote:
>>> >>>>>>>>
>>> >>>>>>>> The string for each LocalResource in the map can be anything
>>> that
>>> >>>>>>>> serves as a common identifier name for your application. At
>>> execution
>>> >>>>>>>> time, the passed resource filename will be aliased to the name
>>> you've
>>> >>>>>>>> mapped it to, so that the application code need not track
>>> special
>>> >>>>>>>> names. The behavior is very similar to how you can, in MR,
>>> define a
>>> >>>>>>>> symlink name for a DistributedCache entry (e.g.
>>> foo.jar#bar.jar).
>>> >>>>>>>>
>>> >>>>>>>> For an example, checkout the DistributedShell app sources.
>>> >>>>>>>>
>>> >>>>>>>> Over [1], you can see we take a user provided file path to a
>>> shell
>>> >>>>>>>> script. This can be named anything as it is user-supplied.
>>> >>>>>>>> Onto [2], we define this as a local resource [2.1] and embed it
>>> with a
>>> >>>>>>>> different name (the string you ask about) [2.2], as defined at
>>> [3] as
>>> >>>>>>>> an application reference-able constant.
>>> >>>>>>>> Note that in [4], we add to the Container arguments the aliased
>>> name
>>> >>>>>>>> we mapped it to (i.e. [3]) and not the original filename we
>>> received
>>> >>>>>>>> from the user. The resource is placed on the container with
>>> this name
>>> >>>>>>>> instead, so thats what we choose to execute.
>>> >>>>>>>>
>>> >>>>>>>> [1] -
>>> >>>>>>>>
>>> >>>>>>>>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>>> >>>>>>>>
>>> >>>>>>>> [2] - [2.1]
>>> >>>>>>>>
>>> >>>>>>>>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>>> >>>>>>>> and [2.2]
>>> >>>>>>>>
>>> >>>>>>>>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>>> >>>>>>>>
>>> >>>>>>>> [3] -
>>> >>>>>>>>
>>> >>>>>>>>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>>> >>>>>>>>
>>> >>>>>>>> [4] -
>>> >>>>>>>>
>>> >>>>>>>>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>>> >>>>>>>>
>>> >>>>>>>> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>>> >>>>>>>> <wr...@gmail.com> wrote:
>>> >>>>>>>>> Hi,
>>> >>>>>>>>>
>>> >>>>>>>>>  Can someone please tell me what is the use of calling
>>> >>>>>>>>> setLocalResources()
>>> >>>>>>>>> on ContainerLaunchContext?
>>> >>>>>>>>>
>>> >>>>>>>>>  And, also an example of how to use this will help...
>>> >>>>>>>>>
>>> >>>>>>>>> I couldn't guess what is the String in the map that is passed
>>> to
>>> >>>>>>>>> setLocalResources() like below:
>>> >>>>>>>>>
>>> >>>>>>>>>      // Set the local resources
>>> >>>>>>>>>      Map<String, LocalResource> localResources = new
>>> HashMap<String,
>>> >>>>>>>>> LocalResource>();
>>> >>>>>>>>>
>>> >>>>>>>>> Thanks,
>>> >>>>>>>>> Kishore
>>> >>>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>> --
>>> >>>>>>>> Harsh J
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> --
>>> >>>>>> Harsh J
>>> >>>>>
>>> >>>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> --
>>> >>>> Harsh J
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Harsh J
>>> >
>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>
>>
>

Re: setLocalResources() on ContainerLaunchContext

Posted by Omkar Joshi <oj...@hortonworks.com>.
Good that your timestamp worked. Now, for HDFS, try this form:
hdfs://<hdfs-host-name>:<hdfs-host-port><absolute-path>
and verify that your absolute path is correct; it should work then:
bin/hadoop fs -ls <absolute-path>


hdfs://isredeng:8020*//*kishore/kk.ksh... why "//"? Do you have the HDFS file
at the absolute location /kishore/kk.ksh? Are /kishore and /kishore/kk.ksh
accessible to the user who is making the startContainer call, or to the one
running the AM container?
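
For illustration, a minimal sketch of that setup, assuming the script sits
at /kishore/kk.ksh in HDFS and fs.defaultFS resolves to the isredeng
namenode; the FileStatus lookup supplies the exact length and modification
time that the NodeManager verifies before localizing:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.yarn.api.records.LocalResource;
    import org.apache.hadoop.yarn.api.records.LocalResourceType;
    import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
    import org.apache.hadoop.yarn.util.ConverterUtils;
    import org.apache.hadoop.yarn.util.Records;

    // Qualify the path against the default filesystem so the URL always
    // carries a scheme and authority, e.g. hdfs://isredeng:8020/kishore/kk.ksh.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path scriptPath = fs.makeQualified(new Path("/kishore/kk.ksh"));

    // Read the real length and modification time from HDFS instead of
    // hard-coding them; the NodeManager compares both before downloading.
    FileStatus status = fs.getFileStatus(scriptPath);

    LocalResource shellRsrc = Records.newRecord(LocalResource.class);
    shellRsrc.setType(LocalResourceType.FILE);
    shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
    shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(scriptPath.toUri()));
    shellRsrc.setSize(status.getLen());
    shellRsrc.setTimestamp(status.getModificationTime());

Pulling both values from the same FileStatus keeps them consistent with
what the NodeManager sees, which is exactly the check behind the "changed
on src filesystem" error elsewhere in this thread.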

Thanks,
Omkar Joshi
*Hortonworks Inc.* <http://www.hortonworks.com>


On Tue, Aug 6, 2013 at 10:43 PM, Krishna Kishore Bonagiri <
write2kishore@gmail.com> wrote:

> Hi Harsh, Hitesh & Omkar,
>
>   Thanks for the replies.
>
> I tried getting the last modified timestamp like this and it works. Is
> this the right thing to do?
>
>       File file = new File("/home_/dsadm/kishore/kk.ksh");
>       shellRsrc.setTimestamp(file.lastModified());
>
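A minimal sketch of that approach with the size filled in as well, since
localization verifies both fields (same local path assumed):

    File file = new File("/home_/dsadm/kishore/kk.ksh");
    shellRsrc.setTimestamp(file.lastModified()); // checked before download
    shellRsrc.setSize(file.length());            // the length is checked too
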
>
> And, when I tried using an hdfs file, qualifying it with both node name and
> port, it didn't work; I got a similar error to the one earlier.
>
>       String shellScriptPath = "hdfs://isredeng:8020//kishore/kk.ksh";
>
>
> 13/08/07 01:36:28 INFO ApplicationMaster: Got container status for
> containerID= container_1375853431091_0005_01_000002, state=COMPLETE,
> exitStatus=-1000, diagnostics=File does not exist:
> hdfs://isredeng:8020/kishore/kk.ksh
>
> 13/08/07 01:36:28 INFO ApplicationMaster: Got failure status for a
> container : -1000
>
>
>
> On Wed, Aug 7, 2013 at 7:45 AM, Harsh J <ha...@cloudera.com> wrote:
>
>> Thanks Hitesh!
>>
>> P.s. Port isn't a requirement (and with HA URIs, you shouldn't add a
>> port), but "isredeng" has to be the authority component.
>>
>> On Wed, Aug 7, 2013 at 7:37 AM, Hitesh Shah <hi...@apache.org> wrote:
>> > @Krishna, your logs showed the file error for
>> "hdfs://isredeng/kishore/kk.ksh"
>> >
>> > I am assuming you have tried dfs -ls /kishore/kk.ksh and confirmed that
>> the file exists? Also the qualified path seems to be missing the namenode
>> port. I need to go back and check if a path without the port works by
>> assuming the default namenode port.
>> >
>> > @Harsh, adding a helper function seems like a good idea. Let me file a
>> jira to have the above added to one of the helper/client libraries.
>> >
>> > thanks
>> > -- Hitesh
>> >
>> > On Aug 6, 2013, at 6:47 PM, Harsh J wrote:
>> >
>> >> It is kinda unnecessary to be asking developers to load in timestamps
>> and
>> >> length themselves. Why not provide a java.io.File, or perhaps a Path
>> >> accepting API, that gets it automatically on their behalf using the
>> >> FileSystem API internally?
>> >>
>> >> P.s. An HDFS file gave him an FNF, while a local file gave him a proper
>> >> TS/Len error. I'm guessing there's a bug here w.r.t. handling HDFS
>> >> paths.
>> >>
>> >> On Wed, Aug 7, 2013 at 12:35 AM, Hitesh Shah <hi...@apache.org>
>> wrote:
>> >>> Hi Krishna,
>> >>>
>> >>> YARN downloads a specified local resource on the container's node
>> from the url specified. In all situations, the remote url needs to be a
>> fully qualified path. To verify that the file at the remote url is still
>> valid, YARN expects you to provide the length and last modified timestamp
>> of that file.
>> >>>
>> >>> If you use an hdfs path such as hdfs://namenode:port/<absolute path
>> to file>, you will need to get the length and timestamp from HDFS.
>> >>> If you use file:///, the file should exist on all nodes and all nodes
>> should have the file with the same length and timestamp for localization to
>> work. (For a single-node setup this works, but it is tougher to get right on a
>> multi-node setup - deploying the file via an rpm should likely work).
>> >>>
>> >>> -- Hitesh
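A quick way to read both values from the shell, assuming hadoop fs -stat's
format specifiers %b (length in bytes) and %Y (modification time in
milliseconds):

    bin/hadoop fs -stat "%b %Y" /kishore/kk.ksh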
>> >>>
>> >>> On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:
>> >>>
>> >>>> Hi,
>> >>>>
>> >>>> You need to match the timestamp. Probably get the timestamp locally
>> before adding it. This is explicitly done to ensure that the file is not
>> updated after the user makes the call, to avoid possible errors.
>> >>>>
>> >>>>
>> >>>> Thanks,
>> >>>> Omkar Joshi
>> >>>> Hortonworks Inc.
>> >>>>
>> >>>>
>> >>>> On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri <
>> write2kishore@gmail.com> wrote:
>> >>>> I tried the following and it works!
>> >>>> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
>> >>>>
>> >>>> But now getting a timestamp error like below, when I passed 0 to
>> setTimestamp()
>> >>>>
>> >>>> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for
>> containerID= container_1375784329048_0017_01_000002, state=COMPLETE,
>> exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh
>> changed on src filesystem (expected 0, was 1367580580000
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
>> >>>> Can you try passing a fully qualified local path? That is, including
>> the file:/ scheme
>> >>>>
>> >>>> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <
>> write2kishore@gmail.com> wrote:
>> >>>> Hi Harsh,
>> >>>>   The setResource() call on LocalResource() is expecting an argument
>> of type org.apache.hadoop.yarn.api.records.URL which is converted from a
>> string in the form of URI. This happens in the following call of
>> Distributed Shell example,
>> >>>>
>> >>>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
>> shellScriptPath)));
>> >>>>
>> >>>> So, if I give a local file I get a parsing error like the one below, which
>> is why I changed it to an HDFS file, thinking it had to be given that way.
>> Could you please give an example of how else it could be used,
>> using a local file as you are saying?
>> >>>>
>> >>>> 2013-08-06 06:23:12,942 WARN
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>> Failed to parse resource-request
>> >>>> java.net.URISyntaxException: Expected scheme name at index 0:
>> :///home_/dsadm/kishore/kk.ksh
>> >>>>        at java.net.URI$Parser.fail(URI.java:2820)
>> >>>>        at java.net.URI$Parser.failExpecting(URI.java:2826)
>> >>>>        at java.net.URI$Parser.parse(URI.java:3015)
>> >>>>        at java.net.URI.<init>(URI.java:747)
>> >>>>        at
>> org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>> >>>>        at
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>> >>>> To be honest, I've never tried loading a HDFS file onto the
>> >>>> LocalResource this way. I usually just pass a local file and that
>> >>>> works just fine. There may be something in the URI transformation
>> >>>> possibly breaking a HDFS source, but try passing a local file - does
>> >>>> that fail too? The Shell example uses a local file.
>> >>>>
>> >>>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>> >>>> <wr...@gmail.com> wrote:
>> >>>>> Hi Harsh,
>> >>>>>
>> >>>>>  Please see if this is useful, I got a stack trace after the error
>> has
>> >>>>> occurred....
>> >>>>>
>> >>>>> 2013-08-06 00:55:30,559 INFO
>> >>>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
>> CWD set
>> >>>>> to
>> /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>> >>>>> =
>> >>>>>
>> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>> >>>>> 2013-08-06 00:55:31,017 ERROR
>> >>>>> org.apache.hadoop.security.UserGroupInformation:
>> PriviledgedActionException
>> >>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
>> does not
>> >>>>> exist: hdfs://isredeng/kishore/kk.ksh
>> >>>>> 2013-08-06 00:55:31,029 INFO
>> >>>>>
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>> >>>>> DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null },
>> File does
>> >>>>> not exist: hdfs://isredeng/kishore/kk.ksh
>> >>>>> 2013-08-06 00:55:31,031 INFO
>> >>>>>
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>> >>>>> Resource hdfs://isredeng/kishore/kk.ksh transitioned from
>> DOWNLOADING to
>> >>>>> FAILED
>> >>>>> 2013-08-06 00:55:31,034 INFO
>> >>>>>
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>> >>>>> Container container_1375716148174_0004_01_000002 transitioned from
>> >>>>> LOCALIZING to LOCALIZATION_FAILED
>> >>>>> 2013-08-06 00:55:31,035 INFO
>> >>>>>
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>> >>>>> Container container_1375716148174_0004_01_000002 sent RELEASE event
>> on a
>> >>>>> resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }
>> not
>> >>>>> present in cache.
>> >>>>> 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client:
>> interrupted
>> >>>>> waiting to send rpc request to server
>> >>>>> java.lang.InterruptedException
>> >>>>>        at
>> >>>>>
>> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>> >>>>>        at
>> >>>>> java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>> >>>>>        at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>> >>>>>        at
>> >>>>>
>> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>> >>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>> >>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>> >>>>>        at
>> >>>>>
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>> >>>>>        at $Proxy22.heartbeat(Unknown Source)
>> >>>>>        at
>> >>>>>
>> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>> >>>>>        at
>> >>>>>
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>> >>>>>        at
>> >>>>>
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>> >>>>>        at
>> >>>>>
>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>> >>>>>        at
>> >>>>>
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> And here is my code snippet:
>> >>>>>
>> >>>>>      ContainerLaunchContext ctx =
>> >>>>> Records.newRecord(ContainerLaunchContext.class);
>> >>>>>
>> >>>>>      ctx.setEnvironment(oshEnv);
>> >>>>>
>> >>>>>      // Set the local resources
>> >>>>>      Map<String, LocalResource> localResources = new HashMap<String,
>> >>>>> LocalResource>();
>> >>>>>
>> >>>>>      LocalResource shellRsrc =
>> Records.newRecord(LocalResource.class);
>> >>>>>      shellRsrc.setType(LocalResourceType.FILE);
>> >>>>>      shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>> >>>>>      String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>> >>>>>      try {
>> >>>>>        shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>> >>>>> URI(shellScriptPath)));
>> >>>>>      } catch (URISyntaxException e) {
>> >>>>>        LOG.error("Error when trying to use shell script path
>> specified"
>> >>>>>            + " in env, path=" + shellScriptPath);
>> >>>>>        e.printStackTrace();
>> >>>>>      }
>> >>>>>
>> >>>>>      shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>> >>>>>      shellRsrc.setSize(0/*shellScriptPathLen*/);
>> >>>>>      String ExecShellStringPath = "ExecShellScript.sh";
>> >>>>>      localResources.put(ExecShellStringPath, shellRsrc);
>> >>>>>
>> >>>>>      ctx.setLocalResources(localResources);
>> >>>>>
>> >>>>>
>> >>>>> Please let me know if you need anything else.
>> >>>>>
>> >>>>> Thanks,
>> >>>>> Kishore
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com>
>> wrote:
>> >>>>>>
>> >>>>>> The detail is insufficient to answer why. You should also have
>> gotten
>> >>>>>> a trace after it, can you post that? If possible, also the relevant
>> >>>>>> snippets of code.
>> >>>>>>
>> >>>>>> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>> >>>>>> <wr...@gmail.com> wrote:
>> >>>>>>> Hi Harsh,
>> >>>>>>> Thanks for the quick and detailed reply, it really helps. I am
>> trying
>> >>>>>>> to
>> >>>>>>> use it and getting this error in node manager's log:
>> >>>>>>>
>> >>>>>>> 2013-08-05 08:57:28,867 ERROR
>> >>>>>>> org.apache.hadoop.security.UserGroupInformation:
>> >>>>>>> PriviledgedActionException
>> >>>>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
>> does
>> >>>>>>> not
>> >>>>>>> exist: hdfs://isredeng/kishore/kk.ksh
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> This file is there on the machine with name "isredeng", I could
>> do ls
>> >>>>>>> for
>> >>>>>>> that file as below:
>> >>>>>>>
>> >>>>>>> -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>> >>>>>>> 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>> >>>>>>> native-hadoop
>> >>>>>>> library for your platform... using builtin-java classes where
>> applicable
>> >>>>>>> Found 1 items
>> >>>>>>> -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>> >>>>>>> kishore/kk.ksh
>> >>>>>>>
>> >>>>>>> Note: I am using a single node cluster
>> >>>>>>>
>> >>>>>>> Thanks,
>> >>>>>>> Kishore
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com>
>> wrote:
>> >>>>>>>>
>> >>>>>>>> The string for each LocalResource in the map can be anything that
>> >>>>>>>> serves as a common identifier name for your application. At
>> execution
>> >>>>>>>> time, the passed resource filename will be aliased to the name
>> you've
>> >>>>>>>> mapped it to, so that the application code need not track special
>> >>>>>>>> names. The behavior is very similar to how you can, in MR,
>> define a
>> >>>>>>>> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>> >>>>>>>>
>> >>>>>>>> For an example, check out the DistributedShell app sources.
>> >>>>>>>>
>> >>>>>>>> Over [1], you can see we take a user provided file path to a
>> shell
>> >>>>>>>> script. This can be named anything as it is user-supplied.
>> >>>>>>>> Onto [2], we define this as a local resource [2.1] and embed it
>> with a
>> >>>>>>>> different name (the string you ask about) [2.2], as defined at
>> [3] as
>> >>>>>>>> an application reference-able constant.
>> >>>>>>>> Note that in [4], we add to the Container arguments the aliased
>> name
>> >>>>>>>> we mapped it to (i.e. [3]) and not the original filename we
>> received
>> >>>>>>>> from the user. The resource is placed on the container with this
>> name
>> >>>>>>>> instead, so that's what we choose to execute.
>> >>>>>>>>
>> >>>>>>>> [1] -
>> >>>>>>>>
>> >>>>>>>>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>> >>>>>>>>
>> >>>>>>>> [2] - [2.1]
>> >>>>>>>>
>> >>>>>>>>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>> >>>>>>>> and [2.2]
>> >>>>>>>>
>> >>>>>>>>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>> >>>>>>>>
>> >>>>>>>> [3] -
>> >>>>>>>>
>> >>>>>>>>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>> >>>>>>>>
>> >>>>>>>> [4] -
>> >>>>>>>>
>> >>>>>>>>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>> >>>>>>>>
>> >>>>>>>> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>> >>>>>>>> <wr...@gmail.com> wrote:
>> >>>>>>>>> Hi,
>> >>>>>>>>>
>> >>>>>>>>>  Can someone please tell me what is the use of calling
>> >>>>>>>>> setLocalResources()
>> >>>>>>>>> on ContainerLaunchContext?
>> >>>>>>>>>
>> >>>>>>>>>  And, also an example of how to use this will help...
>> >>>>>>>>>
>> >>>>>>>>> I couldn't guess what is the String in the map that is passed to
>> >>>>>>>>> setLocalResources() like below:
>> >>>>>>>>>
>> >>>>>>>>>      // Set the local resources
>> >>>>>>>>>      Map<String, LocalResource> localResources = new
>> HashMap<String,
>> >>>>>>>>> LocalResource>();
>> >>>>>>>>>
>> >>>>>>>>> Thanks,
>> >>>>>>>>> Kishore
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>> Harsh J
>> >>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> --
>> >>>>>> Harsh J
>> >>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Harsh J
>> >>>>
>> >>>>
>> >>>>
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Harsh J
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>
>

Re: setLocalResources() on ContainerLaunchContext

Posted by Omkar Joshi <oj...@hortonworks.com>.
Good that your timestamp worked... Now for hdfs try this
hdfs://<hdfs-host-name>:<hdfs-host-port><absolute-path>
now verify that your absolute path is correct. I hope it will work.
bin/hadoop fs -ls <absolute-path>


hdfs://isredeng:8020*//*kishore/kk.ksh... why "//" ?? you have hdfs file at
absolute location /kishore/kk.sh? is /kishore and /kishore/kk.sh accessible
to the user who is making startContainer call or the one running AM
container?

Thanks,
Omkar Joshi
*Hortonworks Inc.* <http://www.hortonworks.com>


On Tue, Aug 6, 2013 at 10:43 PM, Krishna Kishore Bonagiri <
write2kishore@gmail.com> wrote:

> Hi Harsh, Hitesh & Omkar,
>
>   Thanks for the replies.
>
> I tried getting the last modified timestamp like this and it works. Is
> this a right thing to do?
>
>       File file = new File("/home_/dsadm/kishore/kk.ksh");
>       shellRsrc.setTimestamp(file.lastModified());
>
>
> And, when I tried using a hdfs file qualifying it with both node name and
> port, it didn't work, I get a similar error as earlier.
>
>       String shellScriptPath = "hdfs://isredeng:8020//kishore/kk.ksh";
>
>
> 13/08/07 01:36:28 INFO ApplicationMaster: Got container status for
> containerID= container_1375853431091_0005_01_000002, state=COMPLETE,
> exitStatus=-1000, diagnostics=File does not exist:
> hdfs://isredeng:8020/kishore/kk.ksh
>
> 13/08/07 01:36:28 INFO ApplicationMaster: Got failure status for a
> container : -1000
>
>
>
> On Wed, Aug 7, 2013 at 7:45 AM, Harsh J <ha...@cloudera.com> wrote:
>
>> Thanks Hitesh!
>>
>> P.s. Port isn't a requirement (and with HA URIs, you shouldn't add a
>> port), but "isredeng" has to be the authority component.
>>
>> On Wed, Aug 7, 2013 at 7:37 AM, Hitesh Shah <hi...@apache.org> wrote:
>> > @Krishna, your logs showed the file error for
>> "hdfs://isredeng/kishore/kk.ksh"
>> >
>> > I am assuming you have tried dfs -ls /kishore/kk.ksh and confirmed that
>> the file exists? Also the qualified path seems to be missing the namenode
>> port. I need to go back and check if a path without the port works by
>> assuming the default namenode port.
>> >
>> > @Harsh, adding a helper function seems like a good idea. Let me file a
>> jira to have the above added to one of the helper/client libraries.
>> >
>> > thanks
>> > -- Hitesh
>> >
>> > On Aug 6, 2013, at 6:47 PM, Harsh J wrote:
>> >
>> >> It is kinda unnecessary to be asking developers to load in timestamps
>> and
>> >> length themselves. Why not provide a java.io.File, or perhaps a Path
>> >> accepting API, that gets it automatically on their behalf using the
>> >> FileSystem API internally?
>> >>
>> >> P.s. A HDFS file gave him a FNF, while a Local file gave him a proper
>> >> TS/Len error. I'm guessing there's a bug here w.r.t. handling HDFS
>> >> paths.
>> >>
>> >> On Wed, Aug 7, 2013 at 12:35 AM, Hitesh Shah <hi...@apache.org>
>> wrote:
>> >>> Hi Krishna,
>> >>>
>> >>> YARN downloads a specified local resource on the container's node
>> from the url specified. In all situtations, the remote url needs to be a
>> fully qualified path. To verify that the file at the remote url is still
>> valid, YARN expects you to provide the length and last modified timestamp
>> of that file.
>> >>>
>> >>> If you use an hdfs path such as hdfs://namenode:port/<absolute path
>> to file>, you will need to get the length and timestamp from HDFS.
>> >>> If you use file:///, the file should exist on all nodes and all nodes
>> should have the file with the same length and timestamp for localization to
>> work. (For a single-node setup this works, but it's tougher to get right on
>> a multi-node setup - deploying the file via an rpm should likely work).
>> >>>
>> >>> -- Hitesh
>> >>>
>> >>> On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:
>> >>>
>> >>>> Hi,
>> >>>>
>> >>>> You need to match the timestamp. Probably get the timestamp locally
>> before adding it. This is explicitly done to ensure that the file is not
>> updated after the user makes the call, to avoid possible errors.
>> >>>>
>> >>>>
>> >>>> Thanks,
>> >>>> Omkar Joshi
>> >>>> Hortonworks Inc.
>> >>>>
>> >>>>
>> >>>> On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri <
>> write2kishore@gmail.com> wrote:
>> >>>> I tried the following and it works!
>> >>>> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
>> >>>>
>> >>>> But now getting a timestamp error like below, when I passed 0 to
>> setTimestamp()
>> >>>>
>> >>>> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for
>> containerID= container_1375784329048_0017_01_000002, state=COMPLETE,
>> exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh
>> changed on src filesystem (expected 0, was 1367580580000
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
>> >>>> Can you try passing a fully qualified local path? That is, including
>> the file:/ scheme
>> >>>>
>> >>>> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <
>> write2kishore@gmail.com> wrote:
>> >>>> Hi Harsh,
>> >>>>   The setResource() call on LocalResource() is expecting an argument
>> of type org.apache.hadoop.yarn.api.records.URL which is converted from a
>> string in the form of URI. This happens in the following call of
>> Distributed Shell example,
>> >>>>
>> >>>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
>> shellScriptPath)));
>> >>>>
>> >>>> So, if I give a local file I get a parsing error like below, which
>> is why I changed it to an HDFS file, thinking it had to be given that way.
>> Could you please give an example of how else it could be used,
>> using a local file as you are saying?
>> >>>>
>> >>>> 2013-08-06 06:23:12,942 WARN
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>> Failed to parse resource-request
>> >>>> java.net.URISyntaxException: Expected scheme name at index 0:
>> :///home_/dsadm/kishore/kk.ksh
>> >>>>        at java.net.URI$Parser.fail(URI.java:2820)
>> >>>>        at java.net.URI$Parser.failExpecting(URI.java:2826)
>> >>>>        at java.net.URI$Parser.parse(URI.java:3015)
>> >>>>        at java.net.URI.<init>(URI.java:747)
>> >>>>        at
>> org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>> >>>>        at
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>> >>>> To be honest, I've never tried loading a HDFS file onto the
>> >>>> LocalResource this way. I usually just pass a local file and that
>> >>>> works just fine. There may be something in the URI transformation
>> >>>> possibly breaking a HDFS source, but try passing a local file - does
>> >>>> that fail too? The Shell example uses a local file.
>> >>>>
>> >>>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>> >>>> <wr...@gmail.com> wrote:
>> >>>>> Hi Harsh,
>> >>>>>
>> >>>>>  Please see if this is useful, I got a stack trace after the error
>> has
>> >>>>> occurred....
>> >>>>>
>> >>>>> 2013-08-06 00:55:30,559 INFO
>> >>>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
>> CWD set
>> >>>>> to
>> /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>> >>>>> =
>> >>>>>
>> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>> >>>>> 2013-08-06 00:55:31,017 ERROR
>> >>>>> org.apache.hadoop.security.UserGroupInformation:
>> PriviledgedActionException
>> >>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
>> does not
>> >>>>> exist: hdfs://isredeng/kishore/kk.ksh
>> >>>>> 2013-08-06 00:55:31,029 INFO
>> >>>>>
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>> >>>>> DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null },
>> File does
>> >>>>> not exist: hdfs://isredeng/kishore/kk.ksh
>> >>>>> 2013-08-06 00:55:31,031 INFO
>> >>>>>
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>> >>>>> Resource hdfs://isredeng/kishore/kk.ksh transitioned from
>> DOWNLOADING to
>> >>>>> FAILED
>> >>>>> 2013-08-06 00:55:31,034 INFO
>> >>>>>
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>> >>>>> Container container_1375716148174_0004_01_000002 transitioned from
>> >>>>> LOCALIZING to LOCALIZATION_FAILED
>> >>>>> 2013-08-06 00:55:31,035 INFO
>> >>>>>
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>> >>>>> Container container_1375716148174_0004_01_000002 sent RELEASE event
>> on a
>> >>>>> resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }
>> not
>> >>>>> present in cache.
>> >>>>> 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client:
>> interrupted
>> >>>>> waiting to send rpc request to server
>> >>>>> java.lang.InterruptedException
>> >>>>>        at
>> >>>>>
>> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>> >>>>>        at
>> >>>>> java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>> >>>>>        at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>> >>>>>        at
>> >>>>>
>> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>> >>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>> >>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>> >>>>>        at
>> >>>>>
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>> >>>>>        at $Proxy22.heartbeat(Unknown Source)
>> >>>>>        at
>> >>>>>
>> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>> >>>>>        at
>> >>>>>
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>> >>>>>        at
>> >>>>>
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>> >>>>>        at
>> >>>>>
>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>> >>>>>        at
>> >>>>>
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> And here is my code snippet:
>> >>>>>
>> >>>>>      ContainerLaunchContext ctx =
>> >>>>> Records.newRecord(ContainerLaunchContext.class);
>> >>>>>
>> >>>>>      ctx.setEnvironment(oshEnv);
>> >>>>>
>> >>>>>      // Set the local resources
>> >>>>>      Map<String, LocalResource> localResources = new HashMap<String,
>> >>>>> LocalResource>();
>> >>>>>
>> >>>>>      LocalResource shellRsrc =
>> Records.newRecord(LocalResource.class);
>> >>>>>      shellRsrc.setType(LocalResourceType.FILE);
>> >>>>>      shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>> >>>>>      String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>> >>>>>      try {
>> >>>>>        shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>> >>>>> URI(shellScriptPath)));
>> >>>>>      } catch (URISyntaxException e) {
>> >>>>>        LOG.error("Error when trying to use shell script path
>> specified"
>> >>>>>            + " in env, path=" + shellScriptPath);
>> >>>>>        e.printStackTrace();
>> >>>>>      }
>> >>>>>
>> >>>>>      shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>> >>>>>      shellRsrc.setSize(0/*shellScriptPathLen*/);
>> >>>>>      String ExecShellStringPath = "ExecShellScript.sh";
>> >>>>>      localResources.put(ExecShellStringPath, shellRsrc);
>> >>>>>
>> >>>>>      ctx.setLocalResources(localResources);
>> >>>>>
>> >>>>>
>> >>>>> Please let me know if you need anything else.
>> >>>>>
>> >>>>> Thanks,
>> >>>>> Kishore
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com>
>> wrote:
>> >>>>>>
>> >>>>>> The detail is insufficient to answer why. You should also have
>> gotten
>> >>>>>> a trace after it, can you post that? If possible, also the relevant
>> >>>>>> snippets of code.
>> >>>>>>
>> >>>>>> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>> >>>>>> <wr...@gmail.com> wrote:
>> >>>>>>> Hi Harsh,
>> >>>>>>> Thanks for the quick and detailed reply, it really helps. I am
>> trying
>> >>>>>>> to
>> >>>>>>> use it and getting this error in node manager's log:
>> >>>>>>>
>> >>>>>>> 2013-08-05 08:57:28,867 ERROR
>> >>>>>>> org.apache.hadoop.security.UserGroupInformation:
>> >>>>>>> PriviledgedActionException
>> >>>>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
>> does
>> >>>>>>> not
>> >>>>>>> exist: hdfs://isredeng/kishore/kk.ksh
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> This file is there on the machine with name "isredeng", I could
>> do ls
>> >>>>>>> for
>> >>>>>>> that file as below:
>> >>>>>>>
>> >>>>>>> -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>> >>>>>>> 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>> >>>>>>> native-hadoop
>> >>>>>>> library for your platform... using builtin-java classes where
>> applicable
>> >>>>>>> Found 1 items
>> >>>>>>> -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>> >>>>>>> kishore/kk.ksh
>> >>>>>>>
>> >>>>>>> Note: I am using a single node cluster
>> >>>>>>>
>> >>>>>>> Thanks,
>> >>>>>>> Kishore
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com>
>> wrote:
>> >>>>>>>>
>> >>>>>>>> The string for each LocalResource in the map can be anything that
>> >>>>>>>> serves as a common identifier name for your application. At
>> execution
>> >>>>>>>> time, the passed resource filename will be aliased to the name
>> you've
>> >>>>>>>> mapped it to, so that the application code need not track special
>> >>>>>>>> names. The behavior is very similar to how you can, in MR,
>> define a
>> >>>>>>>> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>> >>>>>>>>
>> >>>>>>>> For an example, check out the DistributedShell app sources.
>> >>>>>>>>
>> >>>>>>>> Over [1], you can see we take a user provided file path to a
>> shell
>> >>>>>>>> script. This can be named anything as it is user-supplied.
>> >>>>>>>> Onto [2], we define this as a local resource [2.1] and embed it
>> with a
>> >>>>>>>> different name (the string you ask about) [2.2], as defined at
>> [3] as
>> >>>>>>>> an application reference-able constant.
>> >>>>>>>> Note that in [4], we add to the Container arguments the aliased
>> name
>> >>>>>>>> we mapped it to (i.e. [3]) and not the original filename we
>> received
>> >>>>>>>> from the user. The resource is placed on the container with this
>> name
>> >>>>>>>> instead, so that's what we choose to execute.
>> >>>>>>>>
>> >>>>>>>> [1] -
>> >>>>>>>>
>> >>>>>>>>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>> >>>>>>>>
>> >>>>>>>> [2] - [2.1]
>> >>>>>>>>
>> >>>>>>>>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>> >>>>>>>> and [2.2]
>> >>>>>>>>
>> >>>>>>>>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>> >>>>>>>>
>> >>>>>>>> [3] -
>> >>>>>>>>
>> >>>>>>>>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>> >>>>>>>>
>> >>>>>>>> [4] -
>> >>>>>>>>
>> >>>>>>>>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>> >>>>>>>>
>> >>>>>>>> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>> >>>>>>>> <wr...@gmail.com> wrote:
>> >>>>>>>>> Hi,
>> >>>>>>>>>
>> >>>>>>>>>  Can someone please tell me what is the use of calling
>> >>>>>>>>> setLocalResources()
>> >>>>>>>>> on ContainerLaunchContext?
>> >>>>>>>>>
>> >>>>>>>>>  And, also an example of how to use this will help...
>> >>>>>>>>>
>> >>>>>>>>> I couldn't guess what is the String in the map that is passed to
>> >>>>>>>>> setLocalResources() like below:
>> >>>>>>>>>
>> >>>>>>>>>      // Set the local resources
>> >>>>>>>>>      Map<String, LocalResource> localResources = new
>> HashMap<String,
>> >>>>>>>>> LocalResource>();
>> >>>>>>>>>
>> >>>>>>>>> Thanks,
>> >>>>>>>>> Kishore
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>> Harsh J
>> >>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> --
>> >>>>>> Harsh J
>> >>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Harsh J
>> >>>>
>> >>>>
>> >>>>
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Harsh J
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>
>

Re: setLocalResources() on ContainerLaunchContext

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
Hi Harsh, Hitesh & Omkar,

  Thanks for the replies.

I tried getting the last modified timestamp like this and it works. Is this
the right thing to do?

      File file = new File("/home_/dsadm/kishore/kk.ksh");
      shellRsrc.setTimestamp(file.lastModified());
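
For the file:/// case, setting the size the same way should complete the
picture. A fuller sketch of that local-file variant (illustrative only:
java.io.File supplies both values, File.toURI() yields the fully qualified
file:/ URI, and on a multi-node cluster every node would need an identical
copy at that path):

import java.io.File;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.api.records.LocalResourceType;
import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
import org.apache.hadoop.yarn.util.ConverterUtils;
import org.apache.hadoop.yarn.util.Records;

File script = new File("/home_/dsadm/kishore/kk.ksh");

LocalResource shellRsrc = Records.newRecord(LocalResource.class);
shellRsrc.setType(LocalResourceType.FILE);
shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
// toURI() produces file:/home_/dsadm/kishore/kk.ksh - fully qualified, so it
// avoids the "Expected scheme name at index 0" parse error seen earlier
shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(script.toURI()));
shellRsrc.setTimestamp(script.lastModified());  // matches what the NodeManager checks
shellRsrc.setSize(script.length());             // length must match too, not 0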


And when I tried using an HDFS file, qualifying it with both node name and
port, it didn't work; I get a similar error as earlier.

      String shellScriptPath = "hdfs://isredeng:8020//kishore/kk.ksh";


13/08/07 01:36:28 INFO ApplicationMaster: Got container status for
containerID= container_1375853431091_0005_01_000002, state=COMPLETE,
exitStatus=-1000, diagnostics=File does not exist:
hdfs://isredeng:8020/kishore/kk.ksh

13/08/07 01:36:28 INFO ApplicationMaster: Got failure status for a
container : -1000



On Wed, Aug 7, 2013 at 7:45 AM, Harsh J <ha...@cloudera.com> wrote:

> Thanks Hitesh!
>
> P.s. Port isn't a requirement (and with HA URIs, you shouldn't add a
> port), but "isredeng" has to be the authority component.
>
> On Wed, Aug 7, 2013 at 7:37 AM, Hitesh Shah <hi...@apache.org> wrote:
> > @Krishna, your logs showed the file error for
> "hdfs://isredeng/kishore/kk.ksh"
> >
> > I am assuming you have tried dfs -ls /kishore/kk.ksh and confirmed that
> the file exists? Also the qualified path seems to be missing the namenode
> port. I need to go back and check if a path without the port works by
> assuming the default namenode port.
> >
> > @Harsh, adding a helper function seems like a good idea. Let me file a
> jira to have the above added to one of the helper/client libraries.
> >
> > thanks
> > -- Hitesh
> >
> > On Aug 6, 2013, at 6:47 PM, Harsh J wrote:
> >
> >> It is kinda unnecessary to be asking developers to load in timestamps
> and
> >> length themselves. Why not provide a java.io.File, or perhaps a Path
> >> accepting API, that gets it automatically on their behalf using the
> >> FileSystem API internally?
> >>
> >> P.s. A HDFS file gave him a FNF, while a Local file gave him a proper
> >> TS/Len error. I'm guessing there's a bug here w.r.t. handling HDFS
> >> paths.
> >>
> >> On Wed, Aug 7, 2013 at 12:35 AM, Hitesh Shah <hi...@apache.org> wrote:
> >>> Hi Krishna,
> >>>
> >>> YARN downloads a specified local resource on the container's node from
> the url specified. In all situations, the remote url needs to be a fully
> qualified path. To verify that the file at the remote url is still valid,
> YARN expects you to provide the length and last modified timestamp of that
> file.
> >>>
> >>> If you use an hdfs path such as hdfs://namenode:port/<absolute path to
> file>, you will need to get the length and timestamp from HDFS.
> >>> If you use file:///, the file should exist on all nodes and all nodes
> should have the file with the same length and timestamp for localization to
> work. (For a single-node setup this works, but it's tougher to get right on
> a multi-node setup - deploying the file via an rpm should likely work).
> >>>
> >>> -- Hitesh
> >>>
> >>> On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> You need to match the timestamp. Probably get the timestamp locally
> before adding it. This is explicitly done to ensure that the file is not
> updated after the user makes the call, to avoid possible errors.
> >>>>
> >>>>
> >>>> Thanks,
> >>>> Omkar Joshi
> >>>> Hortonworks Inc.
> >>>>
> >>>>
> >>>> On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri <
> write2kishore@gmail.com> wrote:
> >>>> I tried the following and it works!
> >>>> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
> >>>>
> >>>> But now getting a timestamp error like below, when I passed 0 to
> setTimestamp()
> >>>>
> >>>> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for
> containerID= container_1375784329048_0017_01_000002, state=COMPLETE,
> exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh
> changed on src filesystem (expected 0, was 1367580580000
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
> >>>> Can you try passing a fully qualified local path? That is, including
> the file:/ scheme
> >>>>
> >>>> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <
> write2kishore@gmail.com> wrote:
> >>>> Hi Harsh,
> >>>>   The setResource() call on LocalResource() is expecting an argument
> of type org.apache.hadoop.yarn.api.records.URL which is converted from a
> string in the form of URI. This happens in the following call of
> Distributed Shell example,
> >>>>
> >>>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
> shellScriptPath)));
> >>>>
> >>>> So, if I give a local file I get a parsing error like below, which is
> why I changed it to an HDFS file, thinking it had to be given that way.
> Could you please give an example of how else it could be used,
> using a local file as you are saying?
> >>>>
> >>>> 2013-08-06 06:23:12,942 WARN
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> Failed to parse resource-request
> >>>> java.net.URISyntaxException: Expected scheme name at index 0:
> :///home_/dsadm/kishore/kk.ksh
> >>>>        at java.net.URI$Parser.fail(URI.java:2820)
> >>>>        at java.net.URI$Parser.failExpecting(URI.java:2826)
> >>>>        at java.net.URI$Parser.parse(URI.java:3015)
> >>>>        at java.net.URI.<init>(URI.java:747)
> >>>>        at
> org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
> >>>>        at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
> >>>> To be honest, I've never tried loading a HDFS file onto the
> >>>> LocalResource this way. I usually just pass a local file and that
> >>>> works just fine. There may be something in the URI transformation
> >>>> possibly breaking a HDFS source, but try passing a local file - does
> >>>> that fail too? The Shell example uses a local file.
> >>>>
> >>>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
> >>>> <wr...@gmail.com> wrote:
> >>>>> Hi Harsh,
> >>>>>
> >>>>>  Please see if this is useful, I got a stack trace after the error
> has
> >>>>> occurred....
> >>>>>
> >>>>> 2013-08-06 00:55:30,559 INFO
> >>>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
> CWD set
> >>>>> to
> /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
> >>>>> =
> >>>>>
> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
> >>>>> 2013-08-06 00:55:31,017 ERROR
> >>>>> org.apache.hadoop.security.UserGroupInformation:
> PriviledgedActionException
> >>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
> does not
> >>>>> exist: hdfs://isredeng/kishore/kk.ksh
> >>>>> 2013-08-06 00:55:31,029 INFO
> >>>>>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
> >>>>> DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null },
> File does
> >>>>> not exist: hdfs://isredeng/kishore/kk.ksh
> >>>>> 2013-08-06 00:55:31,031 INFO
> >>>>>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> >>>>> Resource hdfs://isredeng/kishore/kk.ksh transitioned from
> DOWNLOADING to
> >>>>> FAILED
> >>>>> 2013-08-06 00:55:31,034 INFO
> >>>>>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> >>>>> Container container_1375716148174_0004_01_000002 transitioned from
> >>>>> LOCALIZING to LOCALIZATION_FAILED
> >>>>> 2013-08-06 00:55:31,035 INFO
> >>>>>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
> >>>>> Container container_1375716148174_0004_01_000002 sent RELEASE event
> on a
> >>>>> resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }
> not
> >>>>> present in cache.
> >>>>> 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client:
> interrupted
> >>>>> waiting to send rpc request to server
> >>>>> java.lang.InterruptedException
> >>>>>        at
> >>>>>
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
> >>>>>        at
> >>>>> java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
> >>>>>        at java.util.concurrent.FutureTask.get(FutureTask.java:94)
> >>>>>        at
> >>>>>
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
> >>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1285)
> >>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1264)
> >>>>>        at
> >>>>>
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> >>>>>        at $Proxy22.heartbeat(Unknown Source)
> >>>>>        at
> >>>>>
> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
> >>>>>        at
> >>>>>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
> >>>>>        at
> >>>>>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
> >>>>>        at
> >>>>>
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
> >>>>>        at
> >>>>>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
> >>>>>
> >>>>>
> >>>>>
> >>>>> And here is my code snippet:
> >>>>>
> >>>>>      ContainerLaunchContext ctx =
> >>>>> Records.newRecord(ContainerLaunchContext.class);
> >>>>>
> >>>>>      ctx.setEnvironment(oshEnv);
> >>>>>
> >>>>>      // Set the local resources
> >>>>>      Map<String, LocalResource> localResources = new HashMap<String,
> >>>>> LocalResource>();
> >>>>>
> >>>>>      LocalResource shellRsrc =
> Records.newRecord(LocalResource.class);
> >>>>>      shellRsrc.setType(LocalResourceType.FILE);
> >>>>>      shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
> >>>>>      String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
> >>>>>      try {
> >>>>>        shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
> >>>>> URI(shellScriptPath)));
> >>>>>      } catch (URISyntaxException e) {
> >>>>>        LOG.error("Error when trying to use shell script path
> specified"
> >>>>>            + " in env, path=" + shellScriptPath);
> >>>>>        e.printStackTrace();
> >>>>>      }
> >>>>>
> >>>>>      shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
> >>>>>      shellRsrc.setSize(0/*shellScriptPathLen*/);
> >>>>>      String ExecShellStringPath = "ExecShellScript.sh";
> >>>>>      localResources.put(ExecShellStringPath, shellRsrc);
> >>>>>
> >>>>>      ctx.setLocalResources(localResources);
> >>>>>
> >>>>>
> >>>>> Please let me know if you need anything else.
> >>>>>
> >>>>> Thanks,
> >>>>> Kishore
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
> >>>>>>
> >>>>>> The detail is insufficient to answer why. You should also have
> gotten
> >>>>>> a trace after it, can you post that? If possible, also the relevant
> >>>>>> snippets of code.
> >>>>>>
> >>>>>> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
> >>>>>> <wr...@gmail.com> wrote:
> >>>>>>> Hi Harsh,
> >>>>>>> Thanks for the quick and detailed reply, it really helps. I am
> trying
> >>>>>>> to
> >>>>>>> use it and getting this error in node manager's log:
> >>>>>>>
> >>>>>>> 2013-08-05 08:57:28,867 ERROR
> >>>>>>> org.apache.hadoop.security.UserGroupInformation:
> >>>>>>> PriviledgedActionException
> >>>>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
> does
> >>>>>>> not
> >>>>>>> exist: hdfs://isredeng/kishore/kk.ksh
> >>>>>>>
> >>>>>>>
> >>>>>>> This file is there on the machine with name "isredeng", I could do
> ls
> >>>>>>> for
> >>>>>>> that file as below:
> >>>>>>>
> >>>>>>> -bash-4.1$ hadoop fs -ls kishore/kk.ksh
> >>>>>>> 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
> >>>>>>> native-hadoop
> >>>>>>> library for your platform... using builtin-java classes where
> applicable
> >>>>>>> Found 1 items
> >>>>>>> -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
> >>>>>>> kishore/kk.ksh
> >>>>>>>
> >>>>>>> Note: I am using a single node cluster
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Kishore
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com>
> wrote:
> >>>>>>>>
> >>>>>>>> The string for each LocalResource in the map can be anything that
> >>>>>>>> serves as a common identifier name for your application. At
> execution
> >>>>>>>> time, the passed resource filename will be aliased to the name
> you've
> >>>>>>>> mapped it to, so that the application code need not track special
> >>>>>>>> names. The behavior is very similar to how you can, in MR, define
> a
> >>>>>>>> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
> >>>>>>>>
> >>>>>>>> For an example, check out the DistributedShell app sources.
> >>>>>>>>
> >>>>>>>> Over [1], you can see we take a user provided file path to a shell
> >>>>>>>> script. This can be named anything as it is user-supplied.
> >>>>>>>> Onto [2], we define this as a local resource [2.1] and embed it
> with a
> >>>>>>>> different name (the string you ask about) [2.2], as defined at
> [3] as
> >>>>>>>> an application reference-able constant.
> >>>>>>>> Note that in [4], we add to the Container arguments the aliased
> name
> >>>>>>>> we mapped it to (i.e. [3]) and not the original filename we
> received
> >>>>>>>> from the user. The resource is placed on the container with this
> name
> >>>>>>>> instead, so that's what we choose to execute.
> >>>>>>>>
> >>>>>>>> [1] -
> >>>>>>>>
> >>>>>>>>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
> >>>>>>>>
> >>>>>>>> [2] - [2.1]
> >>>>>>>>
> >>>>>>>>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
> >>>>>>>> and [2.2]
> >>>>>>>>
> >>>>>>>>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
> >>>>>>>>
> >>>>>>>> [3] -
> >>>>>>>>
> >>>>>>>>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
> >>>>>>>>
> >>>>>>>> [4] -
> >>>>>>>>
> >>>>>>>>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
> >>>>>>>>
> >>>>>>>> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
> >>>>>>>> <wr...@gmail.com> wrote:
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>>  Can someone please tell me what is the use of calling
> >>>>>>>>> setLocalResources()
> >>>>>>>>> on ContainerLaunchContext?
> >>>>>>>>>
> >>>>>>>>>  And, also an example of how to use this will help...
> >>>>>>>>>
> >>>>>>>>> I couldn't guess what is the String in the map that is passed to
> >>>>>>>>> setLocalResources() like below:
> >>>>>>>>>
> >>>>>>>>>      // Set the local resources
> >>>>>>>>>      Map<String, LocalResource> localResources = new
> HashMap<String,
> >>>>>>>>> LocalResource>();
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Kishore
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Harsh J
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Harsh J
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Harsh J
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
>
>
>
> --
> Harsh J
>

> >>>>>>>
> >>>>>>> 2013-08-05 08:57:28,867 ERROR
> >>>>>>> org.apache.hadoop.security.UserGroupInformation:
> >>>>>>> PriviledgedActionException
> >>>>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
> does
> >>>>>>> not
> >>>>>>> exist: hdfs://isredeng/kishore/kk.ksh
> >>>>>>>
> >>>>>>>
> >>>>>>> This file is there on the machine with name "isredeng", I could do
> ls
> >>>>>>> for
> >>>>>>> that file as below:
> >>>>>>>
> >>>>>>> -bash-4.1$ hadoop fs -ls kishore/kk.ksh
> >>>>>>> 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
> >>>>>>> native-hadoop
> >>>>>>> library for your platform... using builtin-java classes where
> applicable
> >>>>>>> Found 1 items
> >>>>>>> -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
> >>>>>>> kishore/kk.ksh
> >>>>>>>
> >>>>>>> Note: I am using a single node cluster
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Kishore
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com>
> wrote:
> >>>>>>>>
> >>>>>>>> The string for each LocalResource in the map can be anything that
> >>>>>>>> serves as a common identifier name for your application. At
> execution
> >>>>>>>> time, the passed resource filename will be aliased to the name
> you've
> >>>>>>>> mapped it to, so that the application code need not track special
> >>>>>>>> names. The behavior is very similar to how you can, in MR, define
> a
> >>>>>>>> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
> >>>>>>>>
> >>>>>>>> For an example, checkout the DistributedShell app sources.
> >>>>>>>>
> >>>>>>>> Over [1], you can see we take a user provided file path to a shell
> >>>>>>>> script. This can be named anything as it is user-supplied.
> >>>>>>>> Onto [2], we define this as a local resource [2.1] and embed it
> with a
> >>>>>>>> different name (the string you ask about) [2.2], as defined at
> [3] as
> >>>>>>>> an application reference-able constant.
> >>>>>>>> Note that in [4], we add to the Container arguments the aliased
> name
> >>>>>>>> we mapped it to (i.e. [3]) and not the original filename we
> received
> >>>>>>>> from the user. The resource is placed on the container with this
> name
> >>>>>>>> instead, so thats what we choose to execute.
> >>>>>>>>
> >>>>>>>> [1] -
> >>>>>>>>
> >>>>>>>>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
> >>>>>>>>
> >>>>>>>> [2] - [2.1]
> >>>>>>>>
> >>>>>>>>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
> >>>>>>>> and [2.2]
> >>>>>>>>
> >>>>>>>>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
> >>>>>>>>
> >>>>>>>> [3] -
> >>>>>>>>
> >>>>>>>>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
> >>>>>>>>
> >>>>>>>> [4] -
> >>>>>>>>
> >>>>>>>>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
> >>>>>>>>
> >>>>>>>> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
> >>>>>>>> <wr...@gmail.com> wrote:
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>>  Can someone please tell me what is the use of calling
> >>>>>>>>> setLocalResources()
> >>>>>>>>> on ContainerLaunchContext?
> >>>>>>>>>
> >>>>>>>>>  And, also an example of how to use this will help...
> >>>>>>>>>
> >>>>>>>>> I couldn't guess what is the String in the map that is passed to
> >>>>>>>>> setLocalResources() like below:
> >>>>>>>>>
> >>>>>>>>>      // Set the local resources
> >>>>>>>>>      Map<String, LocalResource> localResources = new
> HashMap<String,
> >>>>>>>>> LocalResource>();
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Kishore
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Harsh J
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Harsh J
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Harsh J
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
>
>
>
> --
> Harsh J
>

Re: setLocalResources() on ContainerLaunchContext

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
Hi Harsh, Hitesh & Omkar,

  Thanks for the replies.

I tried getting the last modified timestamp like this, and it works. Is this
the right thing to do?

      File file = new File("/home_/dsadm/kishore/kk.ksh");
      shellRsrc.setTimestamp(file.lastModified());
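
In case it is useful, here is the fuller sketch I plan to try next; it reads
both the size and the timestamp from a FileStatus, so they should match
exactly what the NodeManager verifies at localization time (the path and the
plain Configuration are just placeholders for my single-node setup):

      // needs org.apache.hadoop.conf.Configuration and
      // org.apache.hadoop.fs.{Path, FileSystem, FileStatus}
      Path scriptPath = new Path("file:///home_/dsadm/kishore/kk.ksh");
      FileSystem fs = scriptPath.getFileSystem(new Configuration());
      FileStatus status = fs.getFileStatus(scriptPath);

      LocalResource shellRsrc = Records.newRecord(LocalResource.class);
      shellRsrc.setType(LocalResourceType.FILE);
      shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
      // Register the same URI the status was taken from, so the
      // size/timestamp pair can only mismatch if the file itself changes.
      shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(scriptPath.toUri()));
      shellRsrc.setTimestamp(status.getModificationTime());
      shellRsrc.setSize(status.getLen());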


And when I tried using an HDFS file, qualifying it with both the node name
and the port, it didn't work; I get a similar error as before.

      String shellScriptPath = "hdfs://isredeng:8020//kishore/kk.ksh";


13/08/07 01:36:28 INFO ApplicationMaster: Got container status for
containerID= container_1375853431091_0005_01_000002, state=COMPLETE,
exitStatus=-1000, diagnostics=File does not exist:
hdfs://isredeng:8020/kishore/kk.ksh

13/08/07 01:36:28 INFO ApplicationMaster: Got failure status for a
container : -1000
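
Before I submit again, I will also sanity-check the path from the client side
with something like the below, just to confirm the file is where I think it
is on HDFS:

      Path p = new Path("hdfs://isredeng:8020/kishore/kk.ksh");
      FileSystem fs = p.getFileSystem(new Configuration());
      System.out.println(p + " exists? " + fs.exists(p));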



On Wed, Aug 7, 2013 at 7:45 AM, Harsh J <ha...@cloudera.com> wrote:

> Thanks Hitesh!
>
> P.s. Port isn't a requirement (and with HA URIs, you shouldn't add a
> port), but "isredeng" has to be the authority component.
>
> On Wed, Aug 7, 2013 at 7:37 AM, Hitesh Shah <hi...@apache.org> wrote:
> > @Krishna, your logs showed the file error for
> "hdfs://isredeng/kishore/kk.ksh"
> >
> > I am assuming you have tried dfs -ls /kishore/kk.ksh and confirmed that
> the file exists? Also the qualified path seems to be missing the namenode
> port. I need to go back and check if a path without the port works by
> assuming the default namenode port.
> >
> > @Harsh, adding a helper function seems like a good idea. Let me file a
> jira to have the above added to one of the helper/client libraries.
> >
> > thanks
> > -- Hitesh
> >
> > On Aug 6, 2013, at 6:47 PM, Harsh J wrote:
> >
> >> It is kinda unnecessary to be asking developers to load in timestamps
> and
> >> length themselves. Why not provide a java.io.File, or perhaps a Path
> >> accepting API, that gets it automatically on their behalf using the
> >> FileSystem API internally?
> >>
> >> P.s. A HDFS file gave him a FNF, while a Local file gave him a proper
> >> TS/Len error. I'm guessing there's a bug here w.r.t. handling HDFS
> >> paths.
> >>
> >> On Wed, Aug 7, 2013 at 12:35 AM, Hitesh Shah <hi...@apache.org> wrote:
> >>> Hi Krishna,
> >>>
> >>> YARN downloads a specified local resource on the container's node from
> the url specified. In all situtations, the remote url needs to be a fully
> qualified path. To verify that the file at the remote url is still valid,
> YARN expects you to provide the length and last modified timestamp of that
> file.
> >>>
> >>> If you use an hdfs path such as hdfs://namenode:port/<absolute path to
> file>, you will need to get the length and timestamp from HDFS.
> >>> If you use file:///, the file should exist on all nodes and all nodes
> should have the file with the same length and timestamp for localization to
> work. ( For a single node setup, this works but tougher to get right on a
> multi-node setup - deploying the file via a rpm should likely work).
> >>>
> >>> -- Hitesh
> >>>
> >>> On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> You need to match the timestamp. Probably get the timestamp locally
> before adding it. This is explicitly done to ensure that file is not
> updated after user makes the call to avoid possible errors.
> >>>>
> >>>>
> >>>> Thanks,
> >>>> Omkar Joshi
> >>>> Hortonworks Inc.
> >>>>
> >>>>
> >>>> On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri <
> write2kishore@gmail.com> wrote:
> >>>> I tried the following and it works!
> >>>> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
> >>>>
> >>>> But now getting a timestamp error like below, when I passed 0 to
> setTimestamp()
> >>>>
> >>>> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for
> containerID= container_1375784329048_0017_01_000002, state=COMPLETE,
> exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh
> changed on src filesystem (expected 0, was 1367580580000
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
> >>>> Can you try passing a fully qualified local path? That is, including
> the file:/ scheme
> >>>>
> >>>> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <
> write2kishore@gmail.com> wrote:
> >>>> Hi Harsh,
> >>>>   The setResource() call on LocalResource() is expecting an argument
> of type org.apache.hadoop.yarn.api.records.URL which is converted from a
> string in the form of URI. This happens in the following call of
> Distributed Shell example,
> >>>>
> >>>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
> shellScriptPath)));
> >>>>
> >>>> So, if I give a local file I get a parsing error like below, which is
> when I changed it to an HDFS file thinking that it should be given like
> that only. Could you please give an example of how else it could be used,
> using a local file as you are saying?
> >>>>
> >>>> 2013-08-06 06:23:12,942 WARN
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> Failed to parse resource-request
> >>>> java.net.URISyntaxException: Expected scheme name at index 0:
> :///home_/dsadm/kishore/kk.ksh
> >>>>        at java.net.URI$Parser.fail(URI.java:2820)
> >>>>        at java.net.URI$Parser.failExpecting(URI.java:2826)
> >>>>        at java.net.URI$Parser.parse(URI.java:3015)
> >>>>        at java.net.URI.<init>(URI.java:747)
> >>>>        at
> org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
> >>>>        at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
> >>>> To be honest, I've never tried loading a HDFS file onto the
> >>>> LocalResource this way. I usually just pass a local file and that
> >>>> works just fine. There may be something in the URI transformation
> >>>> possibly breaking a HDFS source, but try passing a local file - does
> >>>> that fail too? The Shell example uses a local file.
> >>>>
> >>>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
> >>>> <wr...@gmail.com> wrote:
> >>>>> Hi Harsh,
> >>>>>
> >>>>>  Please see if this is useful, I got a stack trace after the error
> has
> >>>>> occurred....
> >>>>>
> >>>>> 2013-08-06 00:55:30,559 INFO
> >>>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
> CWD set
> >>>>> to
> /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
> >>>>> =
> >>>>>
> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
> >>>>> 2013-08-06 00:55:31,017 ERROR
> >>>>> org.apache.hadoop.security.UserGroupInformation:
> PriviledgedActionException
> >>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
> does not
> >>>>> exist: hdfs://isredeng/kishore/kk.ksh
> >>>>> 2013-08-06 00:55:31,029 INFO
> >>>>>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
> >>>>> DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null },
> File does
> >>>>> not exist: hdfs://isredeng/kishore/kk.ksh
> >>>>> 2013-08-06 00:55:31,031 INFO
> >>>>>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> >>>>> Resource hdfs://isredeng/kishore/kk.ksh transitioned from
> DOWNLOADING to
> >>>>> FAILED
> >>>>> 2013-08-06 00:55:31,034 INFO
> >>>>>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> >>>>> Container container_1375716148174_0004_01_000002 transitioned from
> >>>>> LOCALIZING to LOCALIZATION_FAILED
> >>>>> 2013-08-06 00:55:31,035 INFO
> >>>>>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
> >>>>> Container container_1375716148174_0004_01_000002 sent RELEASE event
> on a
> >>>>> resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }
> not
> >>>>> present in cache.
> >>>>> 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client:
> interrupted
> >>>>> waiting to send rpc request to server
> >>>>> java.lang.InterruptedException
> >>>>>        at
> >>>>>
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
> >>>>>        at
> >>>>> java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
> >>>>>        at java.util.concurrent.FutureTask.get(FutureTask.java:94)
> >>>>>        at
> >>>>>
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
> >>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1285)
> >>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1264)
> >>>>>        at
> >>>>>
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> >>>>>        at $Proxy22.heartbeat(Unknown Source)
> >>>>>        at
> >>>>>
> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
> >>>>>        at
> >>>>>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
> >>>>>        at
> >>>>>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
> >>>>>        at
> >>>>>
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
> >>>>>        at
> >>>>>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
> >>>>>
> >>>>>
> >>>>>
> >>>>> And here is my code snippet:
> >>>>>
> >>>>>      ContainerLaunchContext ctx =
> >>>>> Records.newRecord(ContainerLaunchContext.class);
> >>>>>
> >>>>>      ctx.setEnvironment(oshEnv);
> >>>>>
> >>>>>      // Set the local resources
> >>>>>      Map<String, LocalResource> localResources = new HashMap<String,
> >>>>> LocalResource>();
> >>>>>
> >>>>>      LocalResource shellRsrc =
> Records.newRecord(LocalResource.class);
> >>>>>      shellRsrc.setType(LocalResourceType.FILE);
> >>>>>      shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
> >>>>>      String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
> >>>>>      try {
> >>>>>        shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
> >>>>> URI(shellScriptPath)));
> >>>>>      } catch (URISyntaxException e) {
> >>>>>        LOG.error("Error when trying to use shell script path
> specified"
> >>>>>            + " in env, path=" + shellScriptPath);
> >>>>>        e.printStackTrace();
> >>>>>      }
> >>>>>
> >>>>>      shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
> >>>>>      shellRsrc.setSize(0/*shellScriptPathLen*/);
> >>>>>      String ExecShellStringPath = "ExecShellScript.sh";
> >>>>>      localResources.put(ExecShellStringPath, shellRsrc);
> >>>>>
> >>>>>      ctx.setLocalResources(localResources);
> >>>>>
> >>>>>
> >>>>> Please let me know if you need anything else.
> >>>>>
> >>>>> Thanks,
> >>>>> Kishore
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
> >>>>>>
> >>>>>> The detail is insufficient to answer why. You should also have
> gotten
> >>>>>> a trace after it, can you post that? If possible, also the relevant
> >>>>>> snippets of code.
> >>>>>>
> >>>>>> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
> >>>>>> <wr...@gmail.com> wrote:
> >>>>>>> Hi Harsh,
> >>>>>>> Thanks for the quick and detailed reply, it really helps. I am
> trying
> >>>>>>> to
> >>>>>>> use it and getting this error in node manager's log:
> >>>>>>>
> >>>>>>> 2013-08-05 08:57:28,867 ERROR
> >>>>>>> org.apache.hadoop.security.UserGroupInformation:
> >>>>>>> PriviledgedActionException
> >>>>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
> does
> >>>>>>> not
> >>>>>>> exist: hdfs://isredeng/kishore/kk.ksh
> >>>>>>>
> >>>>>>>
> >>>>>>> This file is there on the machine with name "isredeng", I could do
> ls
> >>>>>>> for
> >>>>>>> that file as below:
> >>>>>>>
> >>>>>>> -bash-4.1$ hadoop fs -ls kishore/kk.ksh
> >>>>>>> 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
> >>>>>>> native-hadoop
> >>>>>>> library for your platform... using builtin-java classes where
> applicable
> >>>>>>> Found 1 items
> >>>>>>> -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
> >>>>>>> kishore/kk.ksh
> >>>>>>>
> >>>>>>> Note: I am using a single node cluster
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Kishore
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com>
> wrote:
> >>>>>>>>
> >>>>>>>> The string for each LocalResource in the map can be anything that
> >>>>>>>> serves as a common identifier name for your application. At
> execution
> >>>>>>>> time, the passed resource filename will be aliased to the name
> you've
> >>>>>>>> mapped it to, so that the application code need not track special
> >>>>>>>> names. The behavior is very similar to how you can, in MR, define
> a
> >>>>>>>> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
> >>>>>>>>
> >>>>>>>> For an example, checkout the DistributedShell app sources.
> >>>>>>>>
> >>>>>>>> Over [1], you can see we take a user provided file path to a shell
> >>>>>>>> script. This can be named anything as it is user-supplied.
> >>>>>>>> Onto [2], we define this as a local resource [2.1] and embed it
> with a
> >>>>>>>> different name (the string you ask about) [2.2], as defined at
> [3] as
> >>>>>>>> an application reference-able constant.
> >>>>>>>> Note that in [4], we add to the Container arguments the aliased
> name
> >>>>>>>> we mapped it to (i.e. [3]) and not the original filename we
> received
> >>>>>>>> from the user. The resource is placed on the container with this
> name
> >>>>>>>> instead, so thats what we choose to execute.
> >>>>>>>>
> >>>>>>>> [1] -
> >>>>>>>>
> >>>>>>>>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
> >>>>>>>>
> >>>>>>>> [2] - [2.1]
> >>>>>>>>
> >>>>>>>>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
> >>>>>>>> and [2.2]
> >>>>>>>>
> >>>>>>>>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
> >>>>>>>>
> >>>>>>>> [3] -
> >>>>>>>>
> >>>>>>>>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
> >>>>>>>>
> >>>>>>>> [4] -
> >>>>>>>>
> >>>>>>>>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
> >>>>>>>>
> >>>>>>>> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
> >>>>>>>> <wr...@gmail.com> wrote:
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>>  Can someone please tell me what is the use of calling
> >>>>>>>>> setLocalResources()
> >>>>>>>>> on ContainerLaunchContext?
> >>>>>>>>>
> >>>>>>>>>  And, also an example of how to use this will help...
> >>>>>>>>>
> >>>>>>>>> I couldn't guess what is the String in the map that is passed to
> >>>>>>>>> setLocalResources() like below:
> >>>>>>>>>
> >>>>>>>>>      // Set the local resources
> >>>>>>>>>      Map<String, LocalResource> localResources = new
> HashMap<String,
> >>>>>>>>> LocalResource>();
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Kishore
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Harsh J
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Harsh J
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Harsh J
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
>
>
>
> --
> Harsh J
>

Re: setLocalResources() on ContainerLaunchContext

Posted by Harsh J <ha...@cloudera.com>.
Thanks Hitesh!

P.s. Port isn't a requirement (and with HA URIs, you shouldn't add a
port), but "isredeng" has to be the authority component.
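
Untested, but something like the below should produce a fully qualified URI
without hardcoding the authority, assuming fs.defaultFS in the client's
Configuration already points at hdfs://isredeng:

    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // makeQualified fills in the scheme and authority from the default
    // filesystem: /kishore/kk.ksh -> hdfs://isredeng/kishore/kk.ksh
    Path qualified = fs.makeQualified(new Path("/kishore/kk.ksh"));
    shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(qualified.toUri()));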

On Wed, Aug 7, 2013 at 7:37 AM, Hitesh Shah <hi...@apache.org> wrote:
> @Krishna, your logs showed the file error for "hdfs://isredeng/kishore/kk.ksh"
>
> I am assuming you have tried dfs -ls /kishore/kk.ksh and confirmed that the file exists? Also the qualified path seems to be missing the namenode port. I need to go back and check if a path without the port works by assuming the default namenode port.
>
> @Harsh, adding a helper function seems like a good idea. Let me file a jira to have the above added to one of the helper/client libraries.
>
> thanks
> -- Hitesh
>
> On Aug 6, 2013, at 6:47 PM, Harsh J wrote:
>
>> It is kinda unnecessary to be asking developers to load in timestamps and
>> length themselves. Why not provide a java.io.File, or perhaps a Path
>> accepting API, that gets it automatically on their behalf using the
>> FileSystem API internally?
>>
>> P.s. A HDFS file gave him a FNF, while a Local file gave him a proper
>> TS/Len error. I'm guessing there's a bug here w.r.t. handling HDFS
>> paths.
>>
>> On Wed, Aug 7, 2013 at 12:35 AM, Hitesh Shah <hi...@apache.org> wrote:
>>> Hi Krishna,
>>>
>>> YARN downloads a specified local resource on the container's node from the url specified. In all situtations, the remote url needs to be a fully qualified path. To verify that the file at the remote url is still valid, YARN expects you to provide the length and last modified timestamp of that file.
>>>
>>> If you use an hdfs path such as hdfs://namenode:port/<absolute path to file>, you will need to get the length and timestamp from HDFS.
>>> If you use file:///, the file should exist on all nodes and all nodes should have the file with the same length and timestamp for localization to work. ( For a single node setup, this works but tougher to get right on a multi-node setup - deploying the file via a rpm should likely work).
>>>
>>> -- Hitesh
>>>
>>> On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:
>>>
>>>> Hi,
>>>>
>>>> You need to match the timestamp. Probably get the timestamp locally before adding it. This is explicitly done to ensure that file is not updated after user makes the call to avoid possible errors.
>>>>
>>>>
>>>> Thanks,
>>>> Omkar Joshi
>>>> Hortonworks Inc.
>>>>
>>>>
>>>> On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri <wr...@gmail.com> wrote:
>>>> I tried the following and it works!
>>>> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
>>>>
>>>> But now getting a timestamp error like below, when I passed 0 to setTimestamp()
>>>>
>>>> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for containerID= container_1375784329048_0017_01_000002, state=COMPLETE, exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh changed on src filesystem (expected 0, was 1367580580000
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
>>>> Can you try passing a fully qualified local path? That is, including the file:/ scheme
>>>>
>>>> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <wr...@gmail.com> wrote:
>>>> Hi Harsh,
>>>>   The setResource() call on LocalResource() is expecting an argument of type org.apache.hadoop.yarn.api.records.URL which is converted from a string in the form of URI. This happens in the following call of Distributed Shell example,
>>>>
>>>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI( shellScriptPath)));
>>>>
>>>> So, if I give a local file I get a parsing error like below, which is when I changed it to an HDFS file thinking that it should be given like that only. Could you please give an example of how else it could be used, using a local file as you are saying?
>>>>
>>>> 2013-08-06 06:23:12,942 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Failed to parse resource-request
>>>> java.net.URISyntaxException: Expected scheme name at index 0: :///home_/dsadm/kishore/kk.ksh
>>>>        at java.net.URI$Parser.fail(URI.java:2820)
>>>>        at java.net.URI$Parser.failExpecting(URI.java:2826)
>>>>        at java.net.URI$Parser.parse(URI.java:3015)
>>>>        at java.net.URI.<init>(URI.java:747)
>>>>        at org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>>>>        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>>>>
>>>>
>>>>
>>>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>>>> To be honest, I've never tried loading a HDFS file onto the
>>>> LocalResource this way. I usually just pass a local file and that
>>>> works just fine. There may be something in the URI transformation
>>>> possibly breaking a HDFS source, but try passing a local file - does
>>>> that fail too? The Shell example uses a local file.
>>>>
>>>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>>>> <wr...@gmail.com> wrote:
>>>>> Hi Harsh,
>>>>>
>>>>>  Please see if this is useful, I got a stack trace after the error has
>>>>> occurred....
>>>>>
>>>>> 2013-08-06 00:55:30,559 INFO
>>>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
>>>>> to /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>>>> =
>>>>> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>>>> 2013-08-06 00:55:31,017 ERROR
>>>>> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
>>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not
>>>>> exist: hdfs://isredeng/kishore/kk.ksh
>>>>> 2013-08-06 00:55:31,029 INFO
>>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>>>>> DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File does
>>>>> not exist: hdfs://isredeng/kishore/kk.ksh
>>>>> 2013-08-06 00:55:31,031 INFO
>>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>>>>> Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING to
>>>>> FAILED
>>>>> 2013-08-06 00:55:31,034 INFO
>>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>>>>> Container container_1375716148174_0004_01_000002 transitioned from
>>>>> LOCALIZING to LOCALIZATION_FAILED
>>>>> 2013-08-06 00:55:31,035 INFO
>>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>>>>> Container container_1375716148174_0004_01_000002 sent RELEASE event on a
>>>>> resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
>>>>> present in cache.
>>>>> 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
>>>>> waiting to send rpc request to server
>>>>> java.lang.InterruptedException
>>>>>        at
>>>>> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>>>>>        at
>>>>> java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>>>>>        at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>>>>>        at
>>>>> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>>>>>        at
>>>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>>>>        at $Proxy22.heartbeat(Unknown Source)
>>>>>        at
>>>>> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>>>>>        at
>>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>>>>>        at
>>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>>>>>        at
>>>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>>>>>        at
>>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>>>>>
>>>>>
>>>>>
>>>>> And here is my code snippet:
>>>>>
>>>>>      ContainerLaunchContext ctx =
>>>>> Records.newRecord(ContainerLaunchContext.class);
>>>>>
>>>>>      ctx.setEnvironment(oshEnv);
>>>>>
>>>>>      // Set the local resources
>>>>>      Map<String, LocalResource> localResources = new HashMap<String,
>>>>> LocalResource>();
>>>>>
>>>>>      LocalResource shellRsrc = Records.newRecord(LocalResource.class);
>>>>>      shellRsrc.setType(LocalResourceType.FILE);
>>>>>      shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>>>>>      String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>>>>>      try {
>>>>>        shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>>>>> URI(shellScriptPath)));
>>>>>      } catch (URISyntaxException e) {
>>>>>        LOG.error("Error when trying to use shell script path specified"
>>>>>            + " in env, path=" + shellScriptPath);
>>>>>        e.printStackTrace();
>>>>>      }
>>>>>
>>>>>      shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>>>>>      shellRsrc.setSize(0/*shellScriptPathLen*/);
>>>>>      String ExecShellStringPath = "ExecShellScript.sh";
>>>>>      localResources.put(ExecShellStringPath, shellRsrc);
>>>>>
>>>>>      ctx.setLocalResources(localResources);
>>>>>
>>>>>
>>>>> Please let me know if you need anything else.
>>>>>
>>>>> Thanks,
>>>>> Kishore
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
>>>>>>
>>>>>> The detail is insufficient to answer why. You should also have gotten
>>>>>> a trace after it, can you post that? If possible, also the relevant
>>>>>> snippets of code.
>>>>>>
>>>>>> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>>>>>> <wr...@gmail.com> wrote:
>>>>>>> Hi Harsh,
>>>>>>> Thanks for the quick and detailed reply, it really helps. I am trying
>>>>>>> to
>>>>>>> use it and getting this error in node manager's log:
>>>>>>>
>>>>>>> 2013-08-05 08:57:28,867 ERROR
>>>>>>> org.apache.hadoop.security.UserGroupInformation:
>>>>>>> PriviledgedActionException
>>>>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>>>>>>> not
>>>>>>> exist: hdfs://isredeng/kishore/kk.ksh
>>>>>>>
>>>>>>>
>>>>>>> This file is there on the machine with name "isredeng", I could do ls
>>>>>>> for
>>>>>>> that file as below:
>>>>>>>
>>>>>>> -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>>>>>>> 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>>>>>>> native-hadoop
>>>>>>> library for your platform... using builtin-java classes where applicable
>>>>>>> Found 1 items
>>>>>>> -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>>>>>>> kishore/kk.ksh
>>>>>>>
>>>>>>> Note: I am using a single node cluster
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Kishore
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
>>>>>>>>
>>>>>>>> The string for each LocalResource in the map can be anything that
>>>>>>>> serves as a common identifier name for your application. At execution
>>>>>>>> time, the passed resource filename will be aliased to the name you've
>>>>>>>> mapped it to, so that the application code need not track special
>>>>>>>> names. The behavior is very similar to how you can, in MR, define a
>>>>>>>> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>>>>>>>>
>>>>>>>> For an example, checkout the DistributedShell app sources.
>>>>>>>>
>>>>>>>> Over [1], you can see we take a user provided file path to a shell
>>>>>>>> script. This can be named anything as it is user-supplied.
>>>>>>>> Onto [2], we define this as a local resource [2.1] and embed it with a
>>>>>>>> different name (the string you ask about) [2.2], as defined at [3] as
>>>>>>>> an application reference-able constant.
>>>>>>>> Note that in [4], we add to the Container arguments the aliased name
>>>>>>>> we mapped it to (i.e. [3]) and not the original filename we received
>>>>>>>> from the user. The resource is placed on the container with this name
>>>>>>>> instead, so thats what we choose to execute.
>>>>>>>>
>>>>>>>> [1] -
>>>>>>>>
>>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>>>>>>>>
>>>>>>>> [2] - [2.1]
>>>>>>>>
>>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>>>>>>>> and [2.2]
>>>>>>>>
>>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>>>>>>>>
>>>>>>>> [3] -
>>>>>>>>
>>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>>>>>>>>
>>>>>>>> [4] -
>>>>>>>>
>>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>>>>>>>>
>>>>>>>> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>>>>>>>> <wr...@gmail.com> wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>>  Can someone please tell me what is the use of calling
>>>>>>>>> setLocalResources()
>>>>>>>>> on ContainerLaunchContext?
>>>>>>>>>
>>>>>>>>>  And, also an example of how to use this will help...
>>>>>>>>>
>>>>>>>>> I couldn't guess what is the String in the map that is passed to
>>>>>>>>> setLocalResources() like below:
>>>>>>>>>
>>>>>>>>>      // Set the local resources
>>>>>>>>>      Map<String, LocalResource> localResources = new HashMap<String,
>>>>>>>>> LocalResource>();
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Kishore
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Harsh J
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Harsh J
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Harsh J
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Harsh J
>



-- 
Harsh J

Re: setLocalResources() on ContainerLaunchContext

Posted by Harsh J <ha...@cloudera.com>.
Thanks Hitesh!

P.s. Port isn't a requirement (and with HA URIs, you shouldn't add a
port), but "isredeng" has to be the authority component.

On Wed, Aug 7, 2013 at 7:37 AM, Hitesh Shah <hi...@apache.org> wrote:
> @Krishna, your logs showed the file error for "hdfs://isredeng/kishore/kk.ksh"
>
> I am assuming you have tried dfs -ls /kishore/kk.ksh and confirmed that the file exists? Also the qualified path seems to be missing the namenode port. I need to go back and check if a path without the port works by assuming the default namenode port.
>
> @Harsh, adding a helper function seems like a good idea. Let me file a jira to have the above added to one of the helper/client libraries.
>
> thanks
> -- Hitesh
>
> On Aug 6, 2013, at 6:47 PM, Harsh J wrote:
>
>> It is kinda unnecessary to be asking developers to load in timestamps and
>> length themselves. Why not provide a java.io.File, or perhaps a Path
>> accepting API, that gets it automatically on their behalf using the
>> FileSystem API internally?
>>
>> P.s. An HDFS file gave him an FNF, while a local file gave him a proper
>> TS/Len error. I'm guessing there's a bug here w.r.t. handling HDFS
>> paths.
>>
>> On Wed, Aug 7, 2013 at 12:35 AM, Hitesh Shah <hi...@apache.org> wrote:
>>> Hi Krishna,
>>>
>>> YARN downloads a specified local resource onto the container's node from the URL specified. In all situations, the remote URL needs to be a fully qualified path. To verify that the file at the remote URL is still valid, YARN expects you to provide the length and last-modified timestamp of that file.
>>>
>>> If you use an hdfs path such as hdfs://namenode:port/<absolute path to file>, you will need to get the length and timestamp from HDFS.
>>> If you use file:///, the file should exist on all nodes, and all nodes should have the file with the same length and timestamp for localization to work. (For a single-node setup this works, but it is tougher to get right on a multi-node setup; deploying the file via an rpm should likely work.)
>>>
>>> -- Hitesh
>>>
>>> On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:
>>>
>>>> Hi,
>>>>
>>>> You need to match the timestamp. Probably get the timestamp locally before adding it. This is done explicitly to ensure that the file is not updated after the user makes the call, to avoid possible errors.
>>>>
>>>>
>>>> Thanks,
>>>> Omkar Joshi
>>>> Hortonworks Inc.
>>>>
>>>>
>>>> On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri <wr...@gmail.com> wrote:
>>>> I tried the following and it works!
>>>> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
>>>>
>>>> But now getting a timestamp error like below, when I passed 0 to setTimestamp()
>>>>
>>>> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for containerID= container_1375784329048_0017_01_000002, state=COMPLETE, exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh changed on src filesystem (expected 0, was 1367580580000
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
>>>> Can you try passing a fully qualified local path? That is, including the file:/ scheme
>>>>
>>>> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <wr...@gmail.com> wrote:
>>>> Hi Harsh,
>>>>   The setResource() call on LocalResource() is expecting an argument of type org.apache.hadoop.yarn.api.records.URL which is converted from a string in the form of URI. This happens in the following call of Distributed Shell example,
>>>>
>>>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI( shellScriptPath)));
>>>>
>>>> So, when I give a local file I get the parsing error below, which is why I changed it to an HDFS file, thinking it had to be given that way. Could you please give an example of how else it could be used, i.e. using a local file as you are saying?
>>>>
>>>> 2013-08-06 06:23:12,942 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Failed to parse resource-request
>>>> java.net.URISyntaxException: Expected scheme name at index 0: :///home_/dsadm/kishore/kk.ksh
>>>>        at java.net.URI$Parser.fail(URI.java:2820)
>>>>        at java.net.URI$Parser.failExpecting(URI.java:2826)
>>>>        at java.net.URI$Parser.parse(URI.java:3015)
>>>>        at java.net.URI.<init>(URI.java:747)
>>>>        at org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>>>>        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>>>>
>>>>
>>>>
>>>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>>>> To be honest, I've never tried loading a HDFS file onto the
>>>> LocalResource this way. I usually just pass a local file and that
>>>> works just fine. There may be something in the URI transformation
>>>> possibly breaking a HDFS source, but try passing a local file - does
>>>> that fail too? The Shell example uses a local file.
>>>>
>>>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>>>> <wr...@gmail.com> wrote:
>>>>> Hi Harsh,
>>>>>
>>>>>  Please see if this is useful, I got a stack trace after the error has
>>>>> occurred....
>>>>>
>>>>> 2013-08-06 00:55:30,559 INFO
>>>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
>>>>> to /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>>>> =
>>>>> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>>>> 2013-08-06 00:55:31,017 ERROR
>>>>> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
>>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not
>>>>> exist: hdfs://isredeng/kishore/kk.ksh
>>>>> 2013-08-06 00:55:31,029 INFO
>>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>>>>> DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File does
>>>>> not exist: hdfs://isredeng/kishore/kk.ksh
>>>>> 2013-08-06 00:55:31,031 INFO
>>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>>>>> Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING to
>>>>> FAILED
>>>>> 2013-08-06 00:55:31,034 INFO
>>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>>>>> Container container_1375716148174_0004_01_000002 transitioned from
>>>>> LOCALIZING to LOCALIZATION_FAILED
>>>>> 2013-08-06 00:55:31,035 INFO
>>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>>>>> Container container_1375716148174_0004_01_000002 sent RELEASE event on a
>>>>> resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
>>>>> present in cache.
>>>>> 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
>>>>> waiting to send rpc request to server
>>>>> java.lang.InterruptedException
>>>>>        at
>>>>> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>>>>>        at
>>>>> java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>>>>>        at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>>>>>        at
>>>>> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>>>>>        at
>>>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>>>>        at $Proxy22.heartbeat(Unknown Source)
>>>>>        at
>>>>> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>>>>>        at
>>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>>>>>        at
>>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>>>>>        at
>>>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>>>>>        at
>>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>>>>>
>>>>>
>>>>>
>>>>> And here is my code snippet:
>>>>>
>>>>>      ContainerLaunchContext ctx =
>>>>> Records.newRecord(ContainerLaunchContext.class);
>>>>>
>>>>>      ctx.setEnvironment(oshEnv);
>>>>>
>>>>>      // Set the local resources
>>>>>      Map<String, LocalResource> localResources = new HashMap<String,
>>>>> LocalResource>();
>>>>>
>>>>>      LocalResource shellRsrc = Records.newRecord(LocalResource.class);
>>>>>      shellRsrc.setType(LocalResourceType.FILE);
>>>>>      shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>>>>>      String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>>>>>      try {
>>>>>        shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>>>>> URI(shellScriptPath)));
>>>>>      } catch (URISyntaxException e) {
>>>>>        LOG.error("Error when trying to use shell script path specified"
>>>>>            + " in env, path=" + shellScriptPath);
>>>>>        e.printStackTrace();
>>>>>      }
>>>>>
>>>>>      shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>>>>>      shellRsrc.setSize(0/*shellScriptPathLen*/);
>>>>>      String ExecShellStringPath = "ExecShellScript.sh";
>>>>>      localResources.put(ExecShellStringPath, shellRsrc);
>>>>>
>>>>>      ctx.setLocalResources(localResources);
>>>>>
>>>>>
>>>>> Please let me know if you need anything else.
>>>>>
>>>>> Thanks,
>>>>> Kishore
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
>>>>>>
>>>>>> The detail is insufficient to answer why. You should also have gotten
>>>>>> a trace after it, can you post that? If possible, also the relevant
>>>>>> snippets of code.
>>>>>>
>>>>>> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>>>>>> <wr...@gmail.com> wrote:
>>>>>>> Hi Harsh,
>>>>>>> Thanks for the quick and detailed reply, it really helps. I am trying
>>>>>>> to
>>>>>>> use it and getting this error in node manager's log:
>>>>>>>
>>>>>>> 2013-08-05 08:57:28,867 ERROR
>>>>>>> org.apache.hadoop.security.UserGroupInformation:
>>>>>>> PriviledgedActionException
>>>>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>>>>>>> not
>>>>>>> exist: hdfs://isredeng/kishore/kk.ksh
>>>>>>>
>>>>>>>
>>>>>>> This file is there on the machine with name "isredeng", I could do ls
>>>>>>> for
>>>>>>> that file as below:
>>>>>>>
>>>>>>> -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>>>>>>> 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>>>>>>> native-hadoop
>>>>>>> library for your platform... using builtin-java classes where applicable
>>>>>>> Found 1 items
>>>>>>> -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>>>>>>> kishore/kk.ksh
>>>>>>>
>>>>>>> Note: I am using a single node cluster
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Kishore
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
>>>>>>>>
>>>>>>>> The string for each LocalResource in the map can be anything that
>>>>>>>> serves as a common identifier name for your application. At execution
>>>>>>>> time, the passed resource filename will be aliased to the name you've
>>>>>>>> mapped it to, so that the application code need not track special
>>>>>>>> names. The behavior is very similar to how you can, in MR, define a
>>>>>>>> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>>>>>>>>
>>>>>>>> For an example, check out the DistributedShell app sources.
>>>>>>>>
>>>>>>>> At [1], you can see we take a user-provided file path to a shell
>>>>>>>> script. This can be named anything, as it is user-supplied.
>>>>>>>> At [2], we define this as a local resource [2.1] and embed it under a
>>>>>>>> different name (the string you ask about) [2.2], defined at [3] as
>>>>>>>> an application-referenceable constant.
>>>>>>>> Note that in [4], we add to the Container arguments the aliased name
>>>>>>>> we mapped it to (i.e. [3]), not the original filename we received
>>>>>>>> from the user. The resource is placed on the container under this name
>>>>>>>> instead, so that's what we choose to execute.
>>>>>>>>
>>>>>>>> [1] -
>>>>>>>>
>>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>>>>>>>>
>>>>>>>> [2] - [2.1]
>>>>>>>>
>>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>>>>>>>> and [2.2]
>>>>>>>>
>>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>>>>>>>>
>>>>>>>> [3] -
>>>>>>>>
>>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>>>>>>>>
>>>>>>>> [4] -
>>>>>>>>
>>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>>>>>>>>
>>>>>>>> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>>>>>>>> <wr...@gmail.com> wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>>  Can someone please tell me what is the use of calling
>>>>>>>>> setLocalResources()
>>>>>>>>> on ContainerLaunchContext?
>>>>>>>>>
>>>>>>>>>  And, also an example of how to use this will help...
>>>>>>>>>
>>>>>>>>> I couldn't guess what is the String in the map that is passed to
>>>>>>>>> setLocalResources() like below:
>>>>>>>>>
>>>>>>>>>      // Set the local resources
>>>>>>>>>      Map<String, LocalResource> localResources = new HashMap<String,
>>>>>>>>> LocalResource>();
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Kishore
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Harsh J
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Harsh J
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Harsh J
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Harsh J
>



-- 
Harsh J
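
[Editor's note: a small sketch of one way to build the fully qualified
URL discussed above without hand-writing the host and port, assuming
fs.defaultFS points at the intended HDFS instance. One thing worth
noting for readers: "hadoop fs -ls kishore/kk.ksh", as run earlier in
the thread, lists a path relative to the user's home directory (i.e.
/user/dsadm/kishore/kk.ksh), which is not the same location as the
absolute /kishore/kk.ksh referenced in the resource URL; that mismatch
alone could explain the "File does not exist" error.]

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class QualifyPathSketch {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Resolves the relative path against the user's home directory
        // and prepends scheme + authority from fs.defaultFS, yielding
        // e.g. hdfs://isredeng:8020/user/dsadm/kishore/kk.ksh
        Path qualified = fs.makeQualified(new Path("kishore/kk.ksh"));
        System.out.println(qualified);
      }
    }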

Re: setLocalResources() on ContainerLaunchContext

Posted by Hitesh Shah <hi...@apache.org>.
@Krishna, your logs showed the file error for "hdfs://isredeng/kishore/kk.ksh" 

I am assuming you have tried dfs -ls /kishore/kk.ksh and confirmed that the file exists? Also the qualified path seems to be missing the namenode port. I need to go back and check if a path without the port works by assuming the default namenode port.

@Harsh, adding a helper function seems like a good idea. Let me file a jira to have the above added to one of the helper/client libraries.

thanks
-- Hitesh
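
[Editor's note: a sketch of the helper being discussed; hypothetical,
not an API that existed in the YARN client libraries at the time. It
takes a Path and fills in the URL, size, and timestamp via the
FileSystem API, which is the manual bookkeeping this thread keeps
tripping over.]

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.yarn.api.records.LocalResource;
    import org.apache.hadoop.yarn.api.records.LocalResourceType;
    import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
    import org.apache.hadoop.yarn.util.ConverterUtils;
    import org.apache.hadoop.yarn.util.Records;

    public final class LocalResources {
      /** Builds a LocalResource from a path, looking up the size and
       *  modification time so callers cannot get them wrong. */
      public static LocalResource fromPath(Configuration conf, Path path)
          throws IOException {
        FileSystem fs = path.getFileSystem(conf);
        Path qualified = fs.makeQualified(path);
        FileStatus stat = fs.getFileStatus(qualified);

        LocalResource rsrc = Records.newRecord(LocalResource.class);
        rsrc.setResource(ConverterUtils.getYarnUrlFromPath(qualified));
        rsrc.setType(LocalResourceType.FILE);
        rsrc.setVisibility(LocalResourceVisibility.APPLICATION);
        rsrc.setSize(stat.getLen());
        rsrc.setTimestamp(stat.getModificationTime());
        return rsrc;
      }
    }

With something like this, the snippet earlier in the thread reduces to
localResources.put("ExecShellScript.sh", LocalResources.fromPath(conf,
scriptPath)), and the timestamp/size bookkeeping lives in one place.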

On Aug 6, 2013, at 6:47 PM, Harsh J wrote:

> It is kinda unnecessary to be asking developers to load in timestamps and
> length themselves. Why not provide a java.io.File, or perhaps a Path
> accepting API, that gets it automatically on their behalf using the
> FileSystem API internally?
> 
> P.s. An HDFS file gave him an FNF, while a local file gave him a proper
> TS/Len error. I'm guessing there's a bug here w.r.t. handling HDFS
> paths.
> 
> On Wed, Aug 7, 2013 at 12:35 AM, Hitesh Shah <hi...@apache.org> wrote:
>> Hi Krishna,
>> 
>> YARN downloads a specified local resource onto the container's node from the URL specified. In all situations, the remote URL needs to be a fully qualified path. To verify that the file at the remote URL is still valid, YARN expects you to provide the length and last-modified timestamp of that file.
>> 
>> If you use an hdfs path such as hdfs://namenode:port/<absolute path to file>, you will need to get the length and timestamp from HDFS.
>> If you use file:///, the file should exist on all nodes, and all nodes should have the file with the same length and timestamp for localization to work. (For a single-node setup this works, but it is tougher to get right on a multi-node setup; deploying the file via an rpm should likely work.)
>> 
>> -- Hitesh
>> 
>> On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:
>> 
>>> Hi,
>>> 
>>> You need to match the timestamp. Probably get the timestamp locally before adding it. This is done explicitly to ensure that the file is not updated after the user makes the call, to avoid possible errors.
>>> 
>>> 
>>> Thanks,
>>> Omkar Joshi
>>> Hortonworks Inc.
>>> 
>>> 
>>> On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri <wr...@gmail.com> wrote:
>>> I tried the following and it works!
>>> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
>>> 
>>> But now getting a timestamp error like below, when I passed 0 to setTimestamp()
>>> 
>>> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for containerID= container_1375784329048_0017_01_000002, state=COMPLETE, exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh changed on src filesystem (expected 0, was 1367580580000
>>> 
>>> 
>>> 
>>> 
>>> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
>>> Can you try passing a fully qualified local path? That is, including the file:/ scheme
>>> 
>>> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <wr...@gmail.com> wrote:
>>> Hi Harsh,
>>>   The setResource() call on LocalResource() is expecting an argument of type org.apache.hadoop.yarn.api.records.URL which is converted from a string in the form of URI. This happens in the following call of Distributed Shell example,
>>> 
>>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI( shellScriptPath)));
>>> 
>>> So, when I give a local file I get the parsing error below, which is why I changed it to an HDFS file, thinking it had to be given that way. Could you please give an example of how else it could be used, i.e. using a local file as you are saying?
>>> 
>>> 2013-08-06 06:23:12,942 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Failed to parse resource-request
>>> java.net.URISyntaxException: Expected scheme name at index 0: :///home_/dsadm/kishore/kk.ksh
>>>        at java.net.URI$Parser.fail(URI.java:2820)
>>>        at java.net.URI$Parser.failExpecting(URI.java:2826)
>>>        at java.net.URI$Parser.parse(URI.java:3015)
>>>        at java.net.URI.<init>(URI.java:747)
>>>        at org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>>>        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>>> 
>>> 
>>> 
>>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>>> To be honest, I've never tried loading a HDFS file onto the
>>> LocalResource this way. I usually just pass a local file and that
>>> works just fine. There may be something in the URI transformation
>>> possibly breaking a HDFS source, but try passing a local file - does
>>> that fail too? The Shell example uses a local file.
>>> 
>>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>>> <wr...@gmail.com> wrote:
>>>> Hi Harsh,
>>>> 
>>>>  Please see if this is useful, I got a stack trace after the error has
>>>> occurred....
>>>> 
>>>> 2013-08-06 00:55:30,559 INFO
>>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
>>>> to /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>>> =
>>>> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>>> 2013-08-06 00:55:31,017 ERROR
>>>> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not
>>>> exist: hdfs://isredeng/kishore/kk.ksh
>>>> 2013-08-06 00:55:31,029 INFO
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>>>> DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File does
>>>> not exist: hdfs://isredeng/kishore/kk.ksh
>>>> 2013-08-06 00:55:31,031 INFO
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>>>> Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING to
>>>> FAILED
>>>> 2013-08-06 00:55:31,034 INFO
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>>>> Container container_1375716148174_0004_01_000002 transitioned from
>>>> LOCALIZING to LOCALIZATION_FAILED
>>>> 2013-08-06 00:55:31,035 INFO
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>>>> Container container_1375716148174_0004_01_000002 sent RELEASE event on a
>>>> resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
>>>> present in cache.
>>>> 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
>>>> waiting to send rpc request to server
>>>> java.lang.InterruptedException
>>>>        at
>>>> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>>>>        at
>>>> java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>>>>        at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>>>>        at
>>>> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>>>>        at
>>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>>>        at $Proxy22.heartbeat(Unknown Source)
>>>>        at
>>>> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>>>>        at
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>>>>        at
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>>>>        at
>>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>>>>        at
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>>>> 
>>>> 
>>>> 
>>>> And here is my code snippet:
>>>> 
>>>>      ContainerLaunchContext ctx =
>>>> Records.newRecord(ContainerLaunchContext.class);
>>>> 
>>>>      ctx.setEnvironment(oshEnv);
>>>> 
>>>>      // Set the local resources
>>>>      Map<String, LocalResource> localResources = new HashMap<String,
>>>> LocalResource>();
>>>> 
>>>>      LocalResource shellRsrc = Records.newRecord(LocalResource.class);
>>>>      shellRsrc.setType(LocalResourceType.FILE);
>>>>      shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>>>>      String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>>>>      try {
>>>>        shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>>>> URI(shellScriptPath)));
>>>>      } catch (URISyntaxException e) {
>>>>        LOG.error("Error when trying to use shell script path specified"
>>>>            + " in env, path=" + shellScriptPath);
>>>>        e.printStackTrace();
>>>>      }
>>>> 
>>>>      shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>>>>      shellRsrc.setSize(0/*shellScriptPathLen*/);
>>>>      String ExecShellStringPath = "ExecShellScript.sh";
>>>>      localResources.put(ExecShellStringPath, shellRsrc);
>>>> 
>>>>      ctx.setLocalResources(localResources);
>>>> 
>>>> 
>>>> Please let me know if you need anything else.
>>>> 
>>>> Thanks,
>>>> Kishore
>>>> 
>>>> 
>>>> 
>>>> On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
>>>>> 
>>>>> The detail is insufficient to answer why. You should also have gotten
>>>>> a trace after it, can you post that? If possible, also the relevant
>>>>> snippets of code.
>>>>> 
>>>>> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>>>>> <wr...@gmail.com> wrote:
>>>>>> Hi Harsh,
>>>>>> Thanks for the quick and detailed reply, it really helps. I am trying
>>>>>> to
>>>>>> use it and getting this error in node manager's log:
>>>>>> 
>>>>>> 2013-08-05 08:57:28,867 ERROR
>>>>>> org.apache.hadoop.security.UserGroupInformation:
>>>>>> PriviledgedActionException
>>>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>>>>>> not
>>>>>> exist: hdfs://isredeng/kishore/kk.ksh
>>>>>> 
>>>>>> 
>>>>>> This file is there on the machine with name "isredeng", I could do ls
>>>>>> for
>>>>>> that file as below:
>>>>>> 
>>>>>> -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>>>>>> 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>>>>>> native-hadoop
>>>>>> library for your platform... using builtin-java classes where applicable
>>>>>> Found 1 items
>>>>>> -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>>>>>> kishore/kk.ksh
>>>>>> 
>>>>>> Note: I am using a single node cluster
>>>>>> 
>>>>>> Thanks,
>>>>>> Kishore
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
>>>>>>> 
>>>>>>> The string for each LocalResource in the map can be anything that
>>>>>>> serves as a common identifier name for your application. At execution
>>>>>>> time, the passed resource filename will be aliased to the name you've
>>>>>>> mapped it to, so that the application code need not track special
>>>>>>> names. The behavior is very similar to how you can, in MR, define a
>>>>>>> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>>>>>>> 
>>>>>>> For an example, check out the DistributedShell app sources.
>>>>>>> 
>>>>>>> At [1], you can see we take a user-provided file path to a shell
>>>>>>> script. This can be named anything, as it is user-supplied.
>>>>>>> At [2], we define this as a local resource [2.1] and embed it under a
>>>>>>> different name (the string you ask about) [2.2], defined at [3] as
>>>>>>> an application-referenceable constant.
>>>>>>> Note that in [4], we add to the Container arguments the aliased name
>>>>>>> we mapped it to (i.e. [3]), not the original filename we received
>>>>>>> from the user. The resource is placed on the container under this name
>>>>>>> instead, so that's what we choose to execute.
>>>>>>> 
>>>>>>> [1] -
>>>>>>> 
>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>>>>>>> 
>>>>>>> [2] - [2.1]
>>>>>>> 
>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>>>>>>> and [2.2]
>>>>>>> 
>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>>>>>>> 
>>>>>>> [3] -
>>>>>>> 
>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>>>>>>> 
>>>>>>> [4] -
>>>>>>> 
>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>>>>>>> 
>>>>>>> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>>>>>>> <wr...@gmail.com> wrote:
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>>  Can someone please tell me what is the use of calling
>>>>>>>> setLocalResources()
>>>>>>>> on ContainerLaunchContext?
>>>>>>>> 
>>>>>>>>  And, also an example of how to use this will help...
>>>>>>>> 
>>>>>>>> I couldn't guess what is the String in the map that is passed to
>>>>>>>> setLocalResources() like below:
>>>>>>>> 
>>>>>>>>      // Set the local resources
>>>>>>>>      Map<String, LocalResource> localResources = new HashMap<String,
>>>>>>>> LocalResource>();
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Kishore
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Harsh J
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Harsh J
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Harsh J
>>> 
>>> 
>>> 
>> 
> 
> 
> 
> -- 
> Harsh J


Re: setLocalResources() on ContainerLaunchContext

Posted by Hitesh Shah <hi...@apache.org>.
@Krishna, your logs showed the file error for "hdfs://isredeng/kishore/kk.ksh" 

I am assuming you have tried dfs -ls /kishore/kk.ksh and confirmed that the file exists? Also the qualified path seems to be missing the namenode port. I need to go back and check if a path without the port works by assuming the default namenode port.

@Harsh, adding a helper function seems like a good idea. Let me file a jira to have the above added to one of the helper/client libraries.
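
Roughly, the helper I have in mind would pull the length and the last modified timestamp from the FileSystem itself, so callers cannot get them wrong. A quick, untested sketch - the method name createLocalResource is made up; the rest are the standard FileSystem/Records/ConverterUtils calls:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.yarn.api.records.LocalResource;
    import org.apache.hadoop.yarn.api.records.LocalResourceType;
    import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
    import org.apache.hadoop.yarn.util.ConverterUtils;
    import org.apache.hadoop.yarn.util.Records;

    public static LocalResource createLocalResource(Configuration conf,
        String remotePath) throws IOException {
      Path p = new Path(remotePath);
      FileSystem fs = p.getFileSystem(conf);
      // Qualify the path, e.g. /kishore/kk.ksh -> hdfs://host:port/kishore/kk.ksh
      Path qualified = fs.makeQualified(p);
      // Fails fast with FileNotFoundException on the client if the path is
      // wrong, instead of failing later during localization on the NM.
      FileStatus status = fs.getFileStatus(qualified);

      LocalResource rsrc = Records.newRecord(LocalResource.class);
      rsrc.setResource(ConverterUtils.getYarnUrlFromPath(qualified));
      rsrc.setSize(status.getLen());                   // real length
      rsrc.setTimestamp(status.getModificationTime()); // real timestamp
      rsrc.setType(LocalResourceType.FILE);
      rsrc.setVisibility(LocalResourceVisibility.APPLICATION);
      return rsrc;
    }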

thanks
-- Hitesh

On Aug 6, 2013, at 6:47 PM, Harsh J wrote:

> It is kinda unnecessary to be asking developers to load in timestamps and
> length themselves. Why not provide a java.io.File, or perhaps a
> Path-accepting API, that gets them automatically on their behalf using the
> FileSystem API internally?
> 
> P.s. An HDFS file gave him a FNF, while a local file gave him a proper
> TS/Len error. I'm guessing there's a bug here w.r.t. handling HDFS
> paths.
> 
> On Wed, Aug 7, 2013 at 12:35 AM, Hitesh Shah <hi...@apache.org> wrote:
>> Hi Krishna,
>> 
>> YARN downloads a specified local resource onto the container's node from the url specified. In all situations, the remote url needs to be a fully qualified path. To verify that the file at the remote url is still valid, YARN expects you to provide the length and last modified timestamp of that file.
>> 
>> If you use an hdfs path such as hdfs://namenode:port/<absolute path to file>, you will need to get the length and timestamp from HDFS.
>> If you use file:///, the file should exist on all nodes, and all nodes should have the file with the same length and timestamp for localization to work. (For a single-node setup this works, but it is tougher to get right on a multi-node setup - deploying the file via an rpm should likely work.)
>> 
>> -- Hitesh
>> 
>> On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:
>> 
>>> Hi,
>>> 
>>> You need to match the timestamp. Probably get the timestamp locally before adding it. This is explicitly done to ensure that the file is not updated after the user makes the call, to avoid possible errors.
>>> 
>>> 
>>> Thanks,
>>> Omkar Joshi
>>> Hortonworks Inc.
>>> 
>>> 
>>> On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri <wr...@gmail.com> wrote:
>>> I tried the following and it works!
>>> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
>>> 
>>> But now I am getting a timestamp error like the one below, after passing 0 to setTimestamp():
>>> 
>>> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for containerID= container_1375784329048_0017_01_000002, state=COMPLETE, exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh changed on src filesystem (expected 0, was 1367580580000
>>> 
>>> 
>>> 
>>> 
>>> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
>>> Can you try passing a fully qualified local path? That is, including the file:/ scheme
>>> 
>>> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <wr...@gmail.com> wrote:
>>> Hi Harsh,
>>>   The setResource() call on LocalResource is expecting an argument of type org.apache.hadoop.yarn.api.records.URL, which is converted from a string in the form of a URI. This happens in the following call of the Distributed Shell example,
>>> 
>>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI( shellScriptPath)));
>>> 
>>> So, when I gave a local file I got a parsing error like the one below, which is why I changed it to an HDFS file, thinking it had to be given that way. Could you please give an example of how else it could be used, with a local file as you are saying?
>>> 
>>> 2013-08-06 06:23:12,942 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Failed to parse resource-request
>>> java.net.URISyntaxException: Expected scheme name at index 0: :///home_/dsadm/kishore/kk.ksh
>>>        at java.net.URI$Parser.fail(URI.java:2820)
>>>        at java.net.URI$Parser.failExpecting(URI.java:2826)
>>>        at java.net.URI$Parser.parse(URI.java:3015)
>>>        at java.net.URI.<init>(URI.java:747)
>>>        at org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>>>        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>>> 
>>> 
>>> 
>>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>>> To be honest, I've never tried loading an HDFS file onto the
>>> LocalResource this way. I usually just pass a local file and that
>>> works just fine. There may be something in the URI transformation
>>> possibly breaking a HDFS source, but try passing a local file - does
>>> that fail too? The Shell example uses a local file.
>>> 
>>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>>> <wr...@gmail.com> wrote:
>>>> Hi Harsh,
>>>> 
>>>>  Please see if this is useful, I got a stack trace after the error has
>>>> occurred....
>>>> 
>>>> 2013-08-06 00:55:30,559 INFO
>>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
>>>> to /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>>> =
>>>> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>>> 2013-08-06 00:55:31,017 ERROR
>>>> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not
>>>> exist: hdfs://isredeng/kishore/kk.ksh
>>>> 2013-08-06 00:55:31,029 INFO
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>>>> DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File does
>>>> not exist: hdfs://isredeng/kishore/kk.ksh
>>>> 2013-08-06 00:55:31,031 INFO
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>>>> Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING to
>>>> FAILED
>>>> 2013-08-06 00:55:31,034 INFO
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>>>> Container container_1375716148174_0004_01_000002 transitioned from
>>>> LOCALIZING to LOCALIZATION_FAILED
>>>> 2013-08-06 00:55:31,035 INFO
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>>>> Container container_1375716148174_0004_01_000002 sent RELEASE event on a
>>>> resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
>>>> present in cache.
>>>> 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
>>>> waiting to send rpc request to server
>>>> java.lang.InterruptedException
>>>>        at
>>>> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>>>>        at
>>>> java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>>>>        at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>>>>        at
>>>> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>>>>        at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>>>>        at
>>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>>>        at $Proxy22.heartbeat(Unknown Source)
>>>>        at
>>>> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>>>>        at
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>>>>        at
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>>>>        at
>>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>>>>        at
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>>>> 
>>>> 
>>>> 
>>>> And here is my code snippet:
>>>> 
>>>>      ContainerLaunchContext ctx =
>>>> Records.newRecord(ContainerLaunchContext.class);
>>>> 
>>>>      ctx.setEnvironment(oshEnv);
>>>> 
>>>>      // Set the local resources
>>>>      Map<String, LocalResource> localResources = new HashMap<String,
>>>> LocalResource>();
>>>> 
>>>>      LocalResource shellRsrc = Records.newRecord(LocalResource.class);
>>>>      shellRsrc.setType(LocalResourceType.FILE);
>>>>      shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>>>>      String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>>>>      try {
>>>>        shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>>>> URI(shellScriptPath)));
>>>>      } catch (URISyntaxException e) {
>>>>        LOG.error("Error when trying to use shell script path specified"
>>>>            + " in env, path=" + shellScriptPath);
>>>>        e.printStackTrace();
>>>>      }
>>>> 
>>>>      shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>>>>      shellRsrc.setSize(0/*shellScriptPathLen*/);
>>>>      String ExecShellStringPath = "ExecShellScript.sh";
>>>>      localResources.put(ExecShellStringPath, shellRsrc);
>>>> 
>>>>      ctx.setLocalResources(localResources);
>>>> 
>>>> 
>>>> Please let me know if you need anything else.
>>>> 
>>>> Thanks,
>>>> Kishore
>>>> 
>>>> 
>>>> 
>>>> On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
>>>>> 
>>>>> The detail is insufficient to answer why. You should also have gotten
>>>>> a trace after it, can you post that? If possible, also the relevant
>>>>> snippets of code.
>>>>> 
>>>>> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>>>>> <wr...@gmail.com> wrote:
>>>>>> Hi Harsh,
>>>>>> Thanks for the quick and detailed reply, it really helps. I am trying
>>>>>> to
>>>>>> use it and getting this error in node manager's log:
>>>>>> 
>>>>>> 2013-08-05 08:57:28,867 ERROR
>>>>>> org.apache.hadoop.security.UserGroupInformation:
>>>>>> PriviledgedActionException
>>>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>>>>>> not
>>>>>> exist: hdfs://isredeng/kishore/kk.ksh
>>>>>> 
>>>>>> 
>>>>>> This file is there on the machine with name "isredeng", I could do ls
>>>>>> for
>>>>>> that file as below:
>>>>>> 
>>>>>> -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>>>>>> 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>>>>>> native-hadoop
>>>>>> library for your platform... using builtin-java classes where applicable
>>>>>> Found 1 items
>>>>>> -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>>>>>> kishore/kk.ksh
>>>>>> 
>>>>>> Note: I am using a single node cluster
>>>>>> 
>>>>>> Thanks,
>>>>>> Kishore
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
>>>>>>> 
>>>>>>> The string for each LocalResource in the map can be anything that
>>>>>>> serves as a common identifier name for your application. At execution
>>>>>>> time, the passed resource filename will be aliased to the name you've
>>>>>>> mapped it to, so that the application code need not track special
>>>>>>> names. The behavior is very similar to how you can, in MR, define a
>>>>>>> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>>>>>>> 
>>>>>>> For an example, check out the DistributedShell app sources.
>>>>>>> 
>>>>>>> Over [1], you can see we take a user-provided file path to a shell
>>>>>>> script. This can be named anything as it is user-supplied.
>>>>>>> Onto [2], we define this as a local resource [2.1] and embed it with a
>>>>>>> different name (the string you ask about) [2.2], as defined at [3] as
>>>>>>> an application reference-able constant.
>>>>>>> Note that in [4], we add to the Container arguments the aliased name
>>>>>>> we mapped it to (i.e. [3]) and not the original filename we received
>>>>>>> from the user. The resource is placed on the container with this name
>>>>>>> instead, so that's what we choose to execute.
>>>>>>> 
>>>>>>> [1] -
>>>>>>> 
>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>>>>>>> 
>>>>>>> [2] - [2.1]
>>>>>>> 
>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>>>>>>> and [2.2]
>>>>>>> 
>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>>>>>>> 
>>>>>>> [3] -
>>>>>>> 
>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>>>>>>> 
>>>>>>> [4] -
>>>>>>> 
>>>>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>>>>>>> 
>>>>>>> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>>>>>>> <wr...@gmail.com> wrote:
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>>  Can someone please tell me what is the use of calling
>>>>>>>> setLocalResources()
>>>>>>>> on ContainerLaunchContext?
>>>>>>>> 
>>>>>>>>  And, also an example of how to use this will help...
>>>>>>>> 
>>>>>>>> I couldn't guess what is the String in the map that is passed to
>>>>>>>> setLocalResources() like below:
>>>>>>>> 
>>>>>>>>      // Set the local resources
>>>>>>>>      Map<String, LocalResource> localResources = new HashMap<String,
>>>>>>>> LocalResource>();
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Kishore
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Harsh J
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Harsh J
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Harsh J
>>> 
>>> 
>>> 
>> 
> 
> 
> 
> -- 
> Harsh J
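
Putting the points in this thread together - a fully qualified remote path, the real length and timestamp, and the alias as the map key - a corrected version of the snippet under discussion might look like the sketch below. It is untested and assumes a helper such as createLocalResource above (or equivalent inline code); the command string is illustrative only:

      // NOTE: a path shown by 'hadoop fs -ls kishore/kk.ksh' is relative to
      // the user's HDFS home, so the absolute path is likely
      // /user/dsadm/kishore/kk.ksh rather than /kishore/kk.ksh.
      LocalResource shellRsrc = createLocalResource(conf,
          "hdfs://isredeng:8020/user/dsadm/kishore/kk.ksh");

      Map<String, LocalResource> localResources =
          new HashMap<String, LocalResource>();
      // The map key is the alias: the NodeManager links the downloaded file
      // into the container's working directory under this name.
      localResources.put("ExecShellScript.sh", shellRsrc);
      ctx.setLocalResources(localResources);

      // The launch command refers to the alias, not to kk.ksh:
      ctx.setCommands(Collections.singletonList(
          "/bin/sh ExecShellScript.sh 1>stdout 2>stderr"));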


Re: setLocalResources() on ContainerLaunchContext

Posted by Harsh J <ha...@cloudera.com>.
It is kinda unnecessary to be asking developers to load in timestamps and
length themselves. Why not provide a java.io.File, or perhaps a
Path-accepting API, that gets them automatically on their behalf using the
FileSystem API internally?

P.s. An HDFS file gave him a FNF, while a local file gave him a proper
TS/Len error. I'm guessing there's a bug here w.r.t. handling HDFS
paths.
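
To illustrate, the kind of convenience overload I mean is roughly the following - a sketch only, no such method exists in YARN today, and the name fromLocalFile is invented (imports as in the earlier sketch, plus java.io.File and java.io.FileNotFoundException):

    public static LocalResource fromLocalFile(java.io.File file)
        throws IOException {
      if (!file.isFile()) {
        // Fail fast on the client rather than at localization time.
        throw new java.io.FileNotFoundException(file.getAbsolutePath());
      }
      LocalResource rsrc = Records.newRecord(LocalResource.class);
      // File.toURI() yields a fully qualified file:///... URI.
      rsrc.setResource(ConverterUtils.getYarnUrlFromURI(file.toURI()));
      rsrc.setSize(file.length());            // length from the local FS
      rsrc.setTimestamp(file.lastModified()); // mtime from the local FS
      rsrc.setType(LocalResourceType.FILE);
      rsrc.setVisibility(LocalResourceVisibility.APPLICATION);
      return rsrc;
    }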

On Wed, Aug 7, 2013 at 12:35 AM, Hitesh Shah <hi...@apache.org> wrote:
> Hi Krishna,
>
> YARN downloads a specified local resource onto the container's node from the url specified. In all situations, the remote url needs to be a fully qualified path. To verify that the file at the remote url is still valid, YARN expects you to provide the length and last modified timestamp of that file.
>
> If you use an hdfs path such as hdfs://namenode:port/<absolute path to file>, you will need to get the length and timestamp from HDFS.
> If you use file:///, the file should exist on all nodes, and all nodes should have the file with the same length and timestamp for localization to work. (For a single-node setup this works, but it is tougher to get right on a multi-node setup - deploying the file via an rpm should likely work.)
>
> -- Hitesh
>
> On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:
>
>> Hi,
>>
>> You need to match the timestamp. Probably get the timestamp locally before adding it. This is explicitly done to ensure that the file is not updated after the user makes the call, to avoid possible errors.
>>
>>
>> Thanks,
>> Omkar Joshi
>> Hortonworks Inc.
>>
>>
>> On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri <wr...@gmail.com> wrote:
>> I tried the following and it works!
>> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
>>
>> But now I am getting a timestamp error like the one below, after passing 0 to setTimestamp():
>>
>> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for containerID= container_1375784329048_0017_01_000002, state=COMPLETE, exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh changed on src filesystem (expected 0, was 1367580580000
>>
>>
>>
>>
>> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
>> Can you try passing a fully qualified local path? That is, including the file:/ scheme
>>
>> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <wr...@gmail.com> wrote:
>> Hi Harsh,
>>    The setResource() call on LocalResource is expecting an argument of type org.apache.hadoop.yarn.api.records.URL, which is converted from a string in the form of a URI. This happens in the following call of the Distributed Shell example,
>>
>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI( shellScriptPath)));
>>
>> So, when I gave a local file I got a parsing error like the one below, which is why I changed it to an HDFS file, thinking it had to be given that way. Could you please give an example of how else it could be used, with a local file as you are saying?
>>
>> 2013-08-06 06:23:12,942 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Failed to parse resource-request
>> java.net.URISyntaxException: Expected scheme name at index 0: :///home_/dsadm/kishore/kk.ksh
>>         at java.net.URI$Parser.fail(URI.java:2820)
>>         at java.net.URI$Parser.failExpecting(URI.java:2826)
>>         at java.net.URI$Parser.parse(URI.java:3015)
>>         at java.net.URI.<init>(URI.java:747)
>>         at org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>>
>>
>>
>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>> To be honest, I've never tried loading an HDFS file onto the
>> LocalResource this way. I usually just pass a local file and that
>> works just fine. There may be something in the URI transformation
>> possibly breaking a HDFS source, but try passing a local file - does
>> that fail too? The Shell example uses a local file.
>>
>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>> <wr...@gmail.com> wrote:
>> > Hi Harsh,
>> >
>> >   Please see if this is useful, I got a stack trace after the error has
>> > occurred....
>> >
>> > 2013-08-06 00:55:30,559 INFO
>> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
>> > to /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>> > =
>> > file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>> > 2013-08-06 00:55:31,017 ERROR
>> > org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
>> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not
>> > exist: hdfs://isredeng/kishore/kk.ksh
>> > 2013-08-06 00:55:31,029 INFO
>> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>> > DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File does
>> > not exist: hdfs://isredeng/kishore/kk.ksh
>> > 2013-08-06 00:55:31,031 INFO
>> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>> > Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING to
>> > FAILED
>> > 2013-08-06 00:55:31,034 INFO
>> > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>> > Container container_1375716148174_0004_01_000002 transitioned from
>> > LOCALIZING to LOCALIZATION_FAILED
>> > 2013-08-06 00:55:31,035 INFO
>> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>> > Container container_1375716148174_0004_01_000002 sent RELEASE event on a
>> > resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
>> > present in cache.
>> > 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
>> > waiting to send rpc request to server
>> > java.lang.InterruptedException
>> >         at
>> > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>> >         at
>> > java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>> >         at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>> >         at
>> > org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>> >         at
>> > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>> >         at $Proxy22.heartbeat(Unknown Source)
>> >         at
>> > org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>> >         at
>> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>> >         at
>> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>> >         at
>> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>> >         at
>> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>> >
>> >
>> >
>> > And here is my code snippet:
>> >
>> >       ContainerLaunchContext ctx =
>> > Records.newRecord(ContainerLaunchContext.class);
>> >
>> >       ctx.setEnvironment(oshEnv);
>> >
>> >       // Set the local resources
>> >       Map<String, LocalResource> localResources = new HashMap<String,
>> > LocalResource>();
>> >
>> >       LocalResource shellRsrc = Records.newRecord(LocalResource.class);
>> >       shellRsrc.setType(LocalResourceType.FILE);
>> >       shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>> >       String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>> >       try {
>> >         shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>> > URI(shellScriptPath)));
>> >       } catch (URISyntaxException e) {
>> >         LOG.error("Error when trying to use shell script path specified"
>> >             + " in env, path=" + shellScriptPath);
>> >         e.printStackTrace();
>> >       }
>> >
>> >       shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>> >       shellRsrc.setSize(0/*shellScriptPathLen*/);
>> >       String ExecShellStringPath = "ExecShellScript.sh";
>> >       localResources.put(ExecShellStringPath, shellRsrc);
>> >
>> >       ctx.setLocalResources(localResources);
>> >
>> >
>> > Please let me know if you need anything else.
>> >
>> > Thanks,
>> > Kishore
>> >
>> >
>> >
>> > On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
>> >>
>> >> The detail is insufficient to answer why. You should also have gotten
>> >> a trace after it, can you post that? If possible, also the relevant
>> >> snippets of code.
>> >>
>> >> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>> >> <wr...@gmail.com> wrote:
>> >> > Hi Harsh,
>> >> >  Thanks for the quick and detailed reply, it really helps. I am trying
>> >> > to
>> >> > use it and getting this error in node manager's log:
>> >> >
>> >> > 2013-08-05 08:57:28,867 ERROR
>> >> > org.apache.hadoop.security.UserGroupInformation:
>> >> > PriviledgedActionException
>> >> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>> >> > not
>> >> > exist: hdfs://isredeng/kishore/kk.ksh
>> >> >
>> >> >
>> >> > This file is there on the machine with name "isredeng", I could do ls
>> >> > for
>> >> > that file as below:
>> >> >
>> >> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>> >> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>> >> > native-hadoop
>> >> > library for your platform... using builtin-java classes where applicable
>> >> > Found 1 items
>> >> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>> >> > kishore/kk.ksh
>> >> >
>> >> > Note: I am using a single node cluster
>> >> >
>> >> > Thanks,
>> >> > Kishore
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
>> >> >>
>> >> >> The string for each LocalResource in the map can be anything that
>> >> >> serves as a common identifier name for your application. At execution
>> >> >> time, the passed resource filename will be aliased to the name you've
>> >> >> mapped it to, so that the application code need not track special
>> >> >> names. The behavior is very similar to how you can, in MR, define a
>> >> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>> >> >>
>> >> >> For an example, check out the DistributedShell app sources.
>> >> >>
>> >> >> Over [1], you can see we take a user-provided file path to a shell
>> >> >> script. This can be named anything as it is user-supplied.
>> >> >> Onto [2], we define this as a local resource [2.1] and embed it with a
>> >> >> different name (the string you ask about) [2.2], as defined at [3] as
>> >> >> an application reference-able constant.
>> >> >> Note that in [4], we add to the Container arguments the aliased name
>> >> >> we mapped it to (i.e. [3]) and not the original filename we received
>> >> >> from the user. The resource is placed on the container with this name
>> >> >> instead, so that's what we choose to execute.
>> >> >>
>> >> >> [1] -
>> >> >>
>> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>> >> >>
>> >> >> [2] - [2.1]
>> >> >>
>> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>> >> >> and [2.2]
>> >> >>
>> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>> >> >>
>> >> >> [3] -
>> >> >>
>> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>> >> >>
>> >> >> [4] -
>> >> >>
>> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>> >> >>
>> >> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>> >> >> <wr...@gmail.com> wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> >   Can someone please tell me what is the use of calling
>> >> >> > setLocalResources()
>> >> >> > on ContainerLaunchContext?
>> >> >> >
>> >> >> >   And, also an example of how to use this will help...
>> >> >> >
>> >> >> >  I couldn't guess what is the String in the map that is passed to
>> >> >> > setLocalResources() like below:
>> >> >> >
>> >> >> >       // Set the local resources
>> >> >> >       Map<String, LocalResource> localResources = new HashMap<String,
>> >> >> > LocalResource>();
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Kishore
>> >> >> >
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Harsh J
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Harsh J
>> >
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>>
>>
>



-- 
Harsh J
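
To illustrate the aliasing Harsh describes above, here is a minimal sketch
(the alias "runme.sh" and the launch command are hypothetical, not from the
thread; ctx and shellRsrc are the ContainerLaunchContext and LocalResource
built as in the snippets above):

      import java.util.Collections;
      import java.util.HashMap;
      import java.util.Map;
      import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
      import org.apache.hadoop.yarn.api.records.LocalResource;

      // The map key is the alias the container sees; the file behind
      // shellRsrc can have any name (e.g. kk.ksh).
      Map<String, LocalResource> localResources = new HashMap<String, LocalResource>();
      localResources.put("runme.sh", shellRsrc);
      ctx.setLocalResources(localResources);
      // The command refers to the alias, not the original filename:
      ctx.setCommands(Collections.singletonList("/bin/sh runme.sh 1>stdout 2>stderr"));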

Re: setLocalResources() on ContainerLaunchContext

Posted by Hitesh Shah <hi...@apache.org>.
Hi Krishna, 

YARN downloads a specified local resource onto the container's node from the URL specified. In all situations, the remote URL needs to be a fully qualified path. To verify that the file at the remote URL is still valid, YARN expects you to provide the length and last-modified timestamp of that file.

If you use an hdfs path such as hdfs://namenode:port/<absolute path to file>, you will need to get the length and timestamp from HDFS.
If you use file:///, the file should exist on all nodes, and all nodes should have the file with the same length and timestamp for localization to work. (For a single-node setup this works, but it is tougher to get right on a multi-node setup - deploying the file via an rpm should likely work.)
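
For the HDFS case, a minimal sketch (exception handling elided; the namenode
URI is hypothetical) of deriving these two fields with the FileSystem API and
applying them to the LocalResource from the snippet earlier in this thread:

      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.FileStatus;
      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.Path;
      import org.apache.hadoop.yarn.util.ConverterUtils;

      // Ask HDFS for the file's status so the declared size and timestamp
      // match exactly what the NodeManager checks at localization time.
      Configuration conf = new Configuration();
      Path scriptPath = new Path("hdfs://namenode:8020/kishore/kk.ksh");
      FileStatus stat = FileSystem.get(scriptPath.toUri(), conf).getFileStatus(scriptPath);
      shellRsrc.setResource(ConverterUtils.getYarnUrlFromPath(scriptPath));
      shellRsrc.setTimestamp(stat.getModificationTime());
      shellRsrc.setSize(stat.getLen());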

-- Hitesh

On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:

> Hi,
> 
> You need to match the timestamp. Probably get the timestamp locally before adding it. This check is explicitly done to ensure that the file is not updated after the user makes the call, to avoid possible errors.
> 
> 
> Thanks,
> Omkar Joshi
> Hortonworks Inc.
> 
> 
> On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri <wr...@gmail.com> wrote:
> I tried the following and it works!
> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
> 
> But now I am getting a timestamp error like the one below, because I passed 0 to setTimestamp():
> 
> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for containerID= container_1375784329048_0017_01_000002, state=COMPLETE, exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh changed on src filesystem (expected 0, was 1367580580000
> 
> 
> 
> 
> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
> Can you try passing a fully qualified local path? That is, including the file:/ scheme
> 
> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <wr...@gmail.com> wrote:
> Hi Harsh,
>    The setResource() call on LocalResource is expecting an argument of type org.apache.hadoop.yarn.api.records.URL, which is converted from a string in the form of a URI. This happens in the following call from the Distributed Shell example:
> 
> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI( shellScriptPath)));
> 
> So, if I give a local file I get a parsing error like the one below, which is why I changed it to an HDFS file, thinking that it had to be given that way. Could you please give an example of how else it could be used, i.e. using a local file as you are saying?
> 
> 2013-08-06 06:23:12,942 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Failed to parse resource-request
> java.net.URISyntaxException: Expected scheme name at index 0: :///home_/dsadm/kishore/kk.ksh
>         at java.net.URI$Parser.fail(URI.java:2820)
>         at java.net.URI$Parser.failExpecting(URI.java:2826)
>         at java.net.URI$Parser.parse(URI.java:3015)
>         at java.net.URI.<init>(URI.java:747)
>         at org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
> 
> 
> 
> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
> To be honest, I've never tried loading an HDFS file onto the
> LocalResource this way. I usually just pass a local file and that
> works just fine. There may be something in the URI transformation
> possibly breaking an HDFS source, but try passing a local file - does
> that fail too? The Shell example uses a local file.
> 
> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
> <wr...@gmail.com> wrote:
> > Hi Harsh,
> >
> >   Please see if this is useful, I got a stack trace after the error
> > occurred....
> >
> > 2013-08-06 00:55:30,559 INFO
> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
> > to /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
> > =
> > file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
> > 2013-08-06 00:55:31,017 ERROR
> > org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not
> > exist: hdfs://isredeng/kishore/kk.ksh
> > 2013-08-06 00:55:31,029 INFO
> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
> > DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File does
> > not exist: hdfs://isredeng/kishore/kk.ksh
> > 2013-08-06 00:55:31,031 INFO
> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> > Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING to
> > FAILED
> > 2013-08-06 00:55:31,034 INFO
> > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> > Container container_1375716148174_0004_01_000002 transitioned from
> > LOCALIZING to LOCALIZATION_FAILED
> > 2013-08-06 00:55:31,035 INFO
> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
> > Container container_1375716148174_0004_01_000002 sent RELEASE event on a
> > resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
> > present in cache.
> > 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
> > waiting to send rpc request to server
> > java.lang.InterruptedException
> >         at
> > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
> >         at
> > java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
> >         at java.util.concurrent.FutureTask.get(FutureTask.java:94)
> >         at
> > org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
> >         at org.apache.hadoop.ipc.Client.call(Client.java:1285)
> >         at org.apache.hadoop.ipc.Client.call(Client.java:1264)
> >         at
> > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> >         at $Proxy22.heartbeat(Unknown Source)
> >         at
> > org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
> >         at
> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
> >         at
> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
> >         at
> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
> >         at
> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
> >
> >
> >
> > And here is my code snippet:
> >
> >       ContainerLaunchContext ctx =
> > Records.newRecord(ContainerLaunchContext.class);
> >
> >       ctx.setEnvironment(oshEnv);
> >
> >       // Set the local resources
> >       Map<String, LocalResource> localResources = new HashMap<String,
> > LocalResource>();
> >
> >       LocalResource shellRsrc = Records.newRecord(LocalResource.class);
> >       shellRsrc.setType(LocalResourceType.FILE);
> >       shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
> >       String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
> >       try {
> >         shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
> > URI(shellScriptPath)));
> >       } catch (URISyntaxException e) {
> >         LOG.error("Error when trying to use shell script path specified"
> >             + " in env, path=" + shellScriptPath);
> >         e.printStackTrace();
> >       }
> >
> >       shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
> >       shellRsrc.setSize(0/*shellScriptPathLen*/);
> >       String ExecShellStringPath = "ExecShellScript.sh";
> >       localResources.put(ExecShellStringPath, shellRsrc);
> >
> >       ctx.setLocalResources(localResources);
> >
> >
> > Please let me know if you need anything else.
> >
> > Thanks,
> > Kishore
> >
> >
> >
> > On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
> >>
> >> The detail is insufficient to answer why. You should also have gotten
> >> a trace after it, can you post that? If possible, also the relevant
> >> snippets of code.
> >>
> >> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
> >> <wr...@gmail.com> wrote:
> >> > Hi Harsh,
> >> >  Thanks for the quick and detailed reply, it really helps. I am trying
> >> > to use it and getting this error in the node manager's log:
> >> >
> >> > 2013-08-05 08:57:28,867 ERROR
> >> > org.apache.hadoop.security.UserGroupInformation:
> >> > PriviledgedActionException
> >> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
> >> > not
> >> > exist: hdfs://isredeng/kishore/kk.ksh
> >> >
> >> >
> >> > This file is there on the machine named "isredeng"; I could do an ls
> >> > for that file as below:
> >> >
> >> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
> >> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
> >> > native-hadoop
> >> > library for your platform... using builtin-java classes where applicable
> >> > Found 1 items
> >> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
> >> > kishore/kk.ksh
> >> >
> >> > Note: I am using a single-node cluster
> >> >
> >> > Thanks,
> >> > Kishore
> >> >
> >> >
> >> >
> >> >
> >> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
> >> >>
> >> >> The string for each LocalResource in the map can be anything that
> >> >> serves as a common identifier name for your application. At execution
> >> >> time, the passed resource filename will be aliased to the name you've
> >> >> mapped it to, so that the application code need not track special
> >> >> names. The behavior is very similar to how you can, in MR, define a
> >> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
> >> >>
> >> >> For an example, check out the DistributedShell app sources.
> >> >>
> >> >> Over [1], you can see we take a user-provided file path to a shell
> >> >> script. This can be named anything as it is user-supplied.
> >> >> In [2], we define this as a local resource [2.1] and embed it with a
> >> >> different name (the string you ask about) [2.2], as defined at [3] as
> >> >> an application reference-able constant.
> >> >> Note that in [4], we add to the Container arguments the aliased name
> >> >> we mapped it to (i.e. [3]) and not the original filename we received
> >> >> from the user. The resource is placed on the container with this name
> >> >> instead, so that's what we choose to execute.
> >> >>
> >> >> [1] -
> >> >>
> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
> >> >>
> >> >> [2] - [2.1]
> >> >>
> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
> >> >> and [2.2]
> >> >>
> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
> >> >>
> >> >> [3] -
> >> >>
> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
> >> >>
> >> >> [4] -
> >> >>
> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
> >> >>
> >> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
> >> >> <wr...@gmail.com> wrote:
> >> >> > Hi,
> >> >> >
> >> >> >   Can someone please tell me what is the use of calling
> >> >> > setLocalResources()
> >> >> > on ContainerLaunchContext?
> >> >> >
> >> >> >   And, also an example of how to use this will help...
> >> >> >
> >> >> >  I couldn't guess what is the String in the map that is passed to
> >> >> > setLocalResources() like below:
> >> >> >
> >> >> >       // Set the local resources
> >> >> >       Map<String, LocalResource> localResources = new HashMap<String,
> >> >> > LocalResource>();
> >> >> >
> >> >> > Thanks,
> >> >> > Kishore
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Harsh J
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
> >
> 
> 
> 
> --
> Harsh J
> 
> 
> 
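
Pulling these pieces together: the boilerplate above could be wrapped in a
small Path-accepting helper that derives the length and timestamp via the
FileSystem API, so callers need not set them by hand. A sketch only - this
is not an existing YARN API:

      import java.io.IOException;
      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.FileStatus;
      import org.apache.hadoop.fs.Path;
      import org.apache.hadoop.yarn.api.records.LocalResource;
      import org.apache.hadoop.yarn.api.records.LocalResourceType;
      import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
      import org.apache.hadoop.yarn.util.ConverterUtils;
      import org.apache.hadoop.yarn.util.Records;

      /** Hypothetical helper: builds a FILE-type LocalResource from a fully
       *  qualified Path, filling in size and timestamp from the filesystem. */
      public static LocalResource localResourceFor(Path path, Configuration conf)
          throws IOException {
        FileStatus stat = path.getFileSystem(conf).getFileStatus(path);
        LocalResource rsrc = Records.newRecord(LocalResource.class);
        rsrc.setType(LocalResourceType.FILE);
        rsrc.setVisibility(LocalResourceVisibility.APPLICATION);
        rsrc.setResource(ConverterUtils.getYarnUrlFromPath(path));
        rsrc.setTimestamp(stat.getModificationTime()); // what the NM verifies
        rsrc.setSize(stat.getLen());                   // what the NM verifies
        return rsrc;
      }

This works for hdfs:// paths, and for fully qualified file:/// paths as
well - though, as noted above, a file:/// resource must then be present
with identical length and timestamp on every node.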


Re: setLocalResources() on ContainerLaunchContext

Posted by Hitesh Shah <hi...@apache.org>.
Hi Krishna, 

YARN downloads a specified local resource on the container's node from the url specified. In all situtations, the remote url needs to be a fully qualified path. To verify that the file at the remote url is still valid, YARN expects you to provide the length and last modified timestamp of that file.

If you use an HDFS path such as hdfs://namenode:port/<absolute path to file>, you will need to get the length and timestamp from HDFS.
If you use file:///, the file should exist on all nodes, and all nodes should have the file with the same length and timestamp for localization to work. (For a single-node setup this works, but it is tougher to get right on a multi-node setup - deploying the file via an rpm should work.)
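
For example, a minimal sketch of the HDFS case (the namenode address and path are hypothetical, and "conf" is assumed to be a Configuration for the cluster; FileStatus supplies the exact length and modification time that YARN checks at localization time):

      Path scriptPath = new Path("hdfs://namenode:8020/apps/kk.ksh");
      FileSystem fs = FileSystem.get(scriptPath.toUri(), conf);
      FileStatus status = fs.getFileStatus(scriptPath); // throws if absent

      LocalResource rsrc = Records.newRecord(LocalResource.class);
      rsrc.setType(LocalResourceType.FILE);
      rsrc.setVisibility(LocalResourceVisibility.APPLICATION);
      rsrc.setResource(ConverterUtils.getYarnUrlFromPath(status.getPath()));
      rsrc.setSize(status.getLen());                   // real length, not 0
      rsrc.setTimestamp(status.getModificationTime()); // real mtime, not 0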

-- Hitesh

On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:

> Hi,
> 
> You need to match the timestamp. Get the timestamp of the source file locally before adding it. This check is explicitly done to ensure that the file is not updated after the user makes the call, to avoid possible errors.
> 
> 
> Thanks,
> Omkar Joshi
> Hortonworks Inc.
> 
> 
> On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri <wr...@gmail.com> wrote:
> I tried the following and it works!
> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
> 
> But now getting a timestamp error like below, when I passed 0 to setTimestamp()
> 
> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for containerID= container_1375784329048_0017_01_000002, state=COMPLETE, exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh changed on src filesystem (expected 0, was 1367580580000
> 
> 
> 
> 
> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
> Can you try passing a fully qualified local path? That is, including the file:/ scheme
> 
> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <wr...@gmail.com> wrote:
> Hi Harsh,
>    The setResource() call on LocalResource() is expecting an argument of type org.apache.hadoop.yarn.api.records.URL which is converted from a string in the form of URI. This happens in the following call of Distributed Shell example, 
> 
> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI( shellScriptPath)));
> 
> So, if I give a local file I get a parsing error like the one below, which is why I changed it to an HDFS file, thinking it had to be given that way. Could you please give an example of how else it could be used, using a local file as you are saying?
> 
> 2013-08-06 06:23:12,942 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Failed to parse resource-request
> java.net.URISyntaxException: Expected scheme name at index 0: :///home_/dsadm/kishore/kk.ksh
>         at java.net.URI$Parser.fail(URI.java:2820)
>         at java.net.URI$Parser.failExpecting(URI.java:2826)
>         at java.net.URI$Parser.parse(URI.java:3015)
>         at java.net.URI.<init>(URI.java:747)
>         at org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
> 
> 
> 
> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
> To be honest, I've never tried loading a HDFS file onto the
> LocalResource this way. I usually just pass a local file and that
> works just fine. There may be something in the URI transformation
> possibly breaking a HDFS source, but try passing a local file - does
> that fail too? The Shell example uses a local file.
> 
> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
> <wr...@gmail.com> wrote:
> > Hi Harsh,
> >
> >   Please see if this is useful, I got a stack trace after the error has
> > occurred....
> >
> > 2013-08-06 00:55:30,559 INFO
> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
> > to /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
> > =
> > file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
> > 2013-08-06 00:55:31,017 ERROR
> > org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not
> > exist: hdfs://isredeng/kishore/kk.ksh
> > 2013-08-06 00:55:31,029 INFO
> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
> > DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File does
> > not exist: hdfs://isredeng/kishore/kk.ksh
> > 2013-08-06 00:55:31,031 INFO
> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> > Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING to
> > FAILED
> > 2013-08-06 00:55:31,034 INFO
> > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> > Container container_1375716148174_0004_01_000002 transitioned from
> > LOCALIZING to LOCALIZATION_FAILED
> > 2013-08-06 00:55:31,035 INFO
> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
> > Container container_1375716148174_0004_01_000002 sent RELEASE event on a
> > resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
> > present in cache.
> > 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
> > waiting to send rpc request to server
> > java.lang.InterruptedException
> >         at
> > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
> >         at
> > java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
> >         at java.util.concurrent.FutureTask.get(FutureTask.java:94)
> >         at
> > org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
> >         at org.apache.hadoop.ipc.Client.call(Client.java:1285)
> >         at org.apache.hadoop.ipc.Client.call(Client.java:1264)
> >         at
> > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> >         at $Proxy22.heartbeat(Unknown Source)
> >         at
> > org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
> >         at
> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
> >         at
> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
> >         at
> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
> >         at
> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
> >
> >
> >
> > And here is my code snippet:
> >
> >       ContainerLaunchContext ctx =
> > Records.newRecord(ContainerLaunchContext.class);
> >
> >       ctx.setEnvironment(oshEnv);
> >
> >       // Set the local resources
> >       Map<String, LocalResource> localResources = new HashMap<String,
> > LocalResource>();
> >
> >       LocalResource shellRsrc = Records.newRecord(LocalResource.class);
> >       shellRsrc.setType(LocalResourceType.FILE);
> >       shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
> >       String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
> >       try {
> >         shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
> > URI(shellScriptPath)));
> >       } catch (URISyntaxException e) {
> >         LOG.error("Error when trying to use shell script path specified"
> >             + " in env, path=" + shellScriptPath);
> >         e.printStackTrace();
> >       }
> >
> >       shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
> >       shellRsrc.setSize(0/*shellScriptPathLen*/);
> >       String ExecShellStringPath = "ExecShellScript.sh";
> >       localResources.put(ExecShellStringPath, shellRsrc);
> >
> >       ctx.setLocalResources(localResources);
> >
> >
> > Please let me know if you need anything else.
> >
> > Thanks,
> > Kishore
> >
> >
> >
> > On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
> >>
> >> The detail is insufficient to answer why. You should also have gotten
> >> a trace after it, can you post that? If possible, also the relevant
> >> snippets of code.
> >>
> >> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
> >> <wr...@gmail.com> wrote:
> >> > Hi Harsh,
> >> >  Thanks for the quick and detailed reply, it really helps. I am trying
> >> > to
> >> > use it and getting this error in node manager's log:
> >> >
> >> > 2013-08-05 08:57:28,867 ERROR
> >> > org.apache.hadoop.security.UserGroupInformation:
> >> > PriviledgedActionException
> >> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
> >> > not
> >> > exist: hdfs://isredeng/kishore/kk.ksh
> >> >
> >> >
> >> > This file is there on the machine with name "isredeng", I could do ls
> >> > for
> >> > that file as below:
> >> >
> >> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
> >> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
> >> > native-hadoop
> >> > library for your platform... using builtin-java classes where applicable
> >> > Found 1 items
> >> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
> >> > kishore/kk.ksh
> >> >
> >> > Note: I am using a single node cluster
> >> >
> >> > Thanks,
> >> > Kishore
> >> >
> >> >
> >> >
> >> >
> >> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
> >> >>
> >> >> The string for each LocalResource in the map can be anything that
> >> >> serves as a common identifier name for your application. At execution
> >> >> time, the passed resource filename will be aliased to the name you've
> >> >> mapped it to, so that the application code need not track special
> >> >> names. The behavior is very similar to how you can, in MR, define a
> >> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
> >> >>
> >> >> For an example, check out the DistributedShell app sources.
> >> >>
> >> >> Over [1], you can see we take a user provided file path to a shell
> >> >> script. This can be named anything as it is user-supplied.
> >> >> Onto [2], we define this as a local resource [2.1] and embed it with a
> >> >> different name (the string you ask about) [2.2], as defined at [3] as
> >> >> an application reference-able constant.
> >> >> Note that in [4], we add to the Container arguments the aliased name
> >> >> we mapped it to (i.e. [3]) and not the original filename we received
> >> >> from the user. The resource is placed on the container with this name
> >> >> instead, so that's what we choose to execute.
> >> >>
> >> >> [1] -
> >> >>
> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
> >> >>
> >> >> [2] - [2.1]
> >> >>
> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
> >> >> and [2.2]
> >> >>
> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
> >> >>
> >> >> [3] -
> >> >>
> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
> >> >>
> >> >> [4] -
> >> >>
> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
> >> >>
> >> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
> >> >> <wr...@gmail.com> wrote:
> >> >> > Hi,
> >> >> >
> >> >> >   Can someone please tell me what is the use of calling
> >> >> > setLocalResources()
> >> >> > on ContainerLaunchContext?
> >> >> >
> >> >> >   And, also an example of how to use this will help...
> >> >> >
> >> >> >  I couldn't guess what is the String in the map that is passed to
> >> >> > setLocalResources() like below:
> >> >> >
> >> >> >       // Set the local resources
> >> >> >       Map<String, LocalResource> localResources = new HashMap<String,
> >> >> > LocalResource>();
> >> >> >
> >> >> > Thanks,
> >> >> > Kishore
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Harsh J
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
> >
> 
> 
> 
> --
> Harsh J
> 
> 
> 


Re: setLocalResources() on ContainerLaunchContext

Posted by Omkar Joshi <oj...@hortonworks.com>.
Hi,

You need to match the timestamp. Get the timestamp of the source file locally
before adding it. This check is explicitly done to ensure that the file is not
updated after the user makes the call, to avoid possible errors.
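
For a file:/// resource, one way to get matching values is to read them from
the local file itself before registering it. A sketch (the path is the one
from this thread; on a multi-node cluster the file must be identical on every
node, as noted elsewhere in this thread):

      File script = new File("/home_/dsadm/kishore/kk.ksh");
      LocalResource rsrc = Records.newRecord(LocalResource.class);
      rsrc.setType(LocalResourceType.FILE);
      rsrc.setVisibility(LocalResourceVisibility.APPLICATION);
      // File.toURI() yields a fully qualified file:/ URI:
      rsrc.setResource(ConverterUtils.getYarnUrlFromURI(script.toURI()));
      rsrc.setTimestamp(script.lastModified()); // instead of 0
      rsrc.setSize(script.length());            // instead of 0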


Thanks,
Omkar Joshi
*Hortonworks Inc.* <http://www.hortonworks.com>


On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri <
write2kishore@gmail.com> wrote:

> I tried the following and it works!
> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
>
> But now getting a timestamp error like below, when I passed 0 to
> setTimestamp()
>
> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for
> containerID= container_1375784329048_0017_01_000002, state=COMPLETE,
> exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh
> changed on src filesystem (expected 0, was 1367580580000
>
>
>
>
> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> Can you try passing a fully qualified local path? That is, including the
>> file:/ scheme
>>  On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <
>> write2kishore@gmail.com> wrote:
>>
>>> Hi Harsh,
>>>    The setResource() call on LocalResource() is expecting an argument of
>>> type org.apache.hadoop.yarn.api.records.URL which is converted from a
>>> string in the form of URI. This happens in the following call of
>>> Distributed Shell example,
>>>
>>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
>>> shellScriptPath)));
>>>
>>> So, if I give a local file I get a parsing error like the one below,
>>> which is why I changed it to an HDFS file, thinking it had to be given
>>> that way. Could you please give an example of how else it could be used,
>>> using a local file as you are saying?
>>>
>>> 2013-08-06 06:23:12,942 WARN
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>>> Failed to parse resource-request
>>> java.net.URISyntaxException: Expected scheme name at index 0:
>>> :///home_/dsadm/kishore/kk.ksh
>>>         at java.net.URI$Parser.fail(URI.java:2820)
>>>         at java.net.URI$Parser.failExpecting(URI.java:2826)
>>>         at java.net.URI$Parser.parse(URI.java:3015)
>>>         at java.net.URI.<init>(URI.java:747)
>>>         at
>>> org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>>>         at
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>>>
>>>
>>>
>>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>>>
>>>> To be honest, I've never tried loading a HDFS file onto the
>>>> LocalResource this way. I usually just pass a local file and that
>>>> works just fine. There may be something in the URI transformation
>>>> possibly breaking a HDFS source, but try passing a local file - does
>>>> that fail too? The Shell example uses a local file.
>>>>
>>>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>>>> <wr...@gmail.com> wrote:
>>>> > Hi Harsh,
>>>> >
>>>> >   Please see if this is useful, I got a stack trace after the error
>>>> has
>>>> > occurred....
>>>> >
>>>> > 2013-08-06 00:55:30,559 INFO
>>>> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
>>>> CWD set
>>>> > to
>>>> /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>>> > =
>>>> >
>>>> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>>> > 2013-08-06 00:55:31,017 ERROR
>>>> > org.apache.hadoop.security.UserGroupInformation:
>>>> PriviledgedActionException
>>>> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>>>> not
>>>> > exist: hdfs://isredeng/kishore/kk.ksh
>>>> > 2013-08-06 00:55:31,029 INFO
>>>> >
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>>>> > DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File
>>>> does
>>>> > not exist: hdfs://isredeng/kishore/kk.ksh
>>>> > 2013-08-06 00:55:31,031 INFO
>>>> >
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>>>> > Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING
>>>> to
>>>> > FAILED
>>>> > 2013-08-06 00:55:31,034 INFO
>>>> >
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>>>> > Container container_1375716148174_0004_01_000002 transitioned from
>>>> > LOCALIZING to LOCALIZATION_FAILED
>>>> > 2013-08-06 00:55:31,035 INFO
>>>> >
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>>>> > Container container_1375716148174_0004_01_000002 sent RELEASE event
>>>> on a
>>>> > resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
>>>> > present in cache.
>>>> > 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
>>>> > waiting to send rpc request to server
>>>> > java.lang.InterruptedException
>>>> >         at
>>>> >
>>>> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>>>> >         at
>>>> > java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>>>> >         at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>>>> >         at
>>>> >
>>>> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>>>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>>>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>>>> >         at
>>>> >
>>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>>> >         at $Proxy22.heartbeat(Unknown Source)
>>>> >         at
>>>> >
>>>> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>>>> >         at
>>>> >
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>>>> >         at
>>>> >
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>>>> >         at
>>>> >
>>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>>>> >         at
>>>> >
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>>>> >
>>>> >
>>>> >
>>>> > And here is my code snippet:
>>>> >
>>>> >       ContainerLaunchContext ctx =
>>>> > Records.newRecord(ContainerLaunchContext.class);
>>>> >
>>>> >       ctx.setEnvironment(oshEnv);
>>>> >
>>>> >       // Set the local resources
>>>> >       Map<String, LocalResource> localResources = new HashMap<String,
>>>> > LocalResource>();
>>>> >
>>>> >       LocalResource shellRsrc =
>>>> Records.newRecord(LocalResource.class);
>>>> >       shellRsrc.setType(LocalResourceType.FILE);
>>>> >       shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>>>> >       String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>>>> >       try {
>>>> >         shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>>>> > URI(shellScriptPath)));
>>>> >       } catch (URISyntaxException e) {
>>>> >         LOG.error("Error when trying to use shell script path
>>>> specified"
>>>> >             + " in env, path=" + shellScriptPath);
>>>> >         e.printStackTrace();
>>>> >       }
>>>> >
>>>> >       shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>>>> >       shellRsrc.setSize(0/*shellScriptPathLen*/);
>>>> >       String ExecShellStringPath = "ExecShellScript.sh";
>>>> >       localResources.put(ExecShellStringPath, shellRsrc);
>>>> >
>>>> >       ctx.setLocalResources(localResources);
>>>> >
>>>> >
>>>> > Please let me know if you need anything else.
>>>> >
>>>> > Thanks,
>>>> > Kishore
>>>> >
>>>> >
>>>> >
>>>> > On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
>>>> >>
>>>> >> The detail is insufficient to answer why. You should also have gotten
>>>> >> a trace after it, can you post that? If possible, also the relevant
>>>> >> snippets of code.
>>>> >>
>>>> >> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>>>> >> <wr...@gmail.com> wrote:
>>>> >> > Hi Harsh,
>>>> >> >  Thanks for the quick and detailed reply, it really helps. I am
>>>> trying
>>>> >> > to
>>>> >> > use it and getting this error in node manager's log:
>>>> >> >
>>>> >> > 2013-08-05 08:57:28,867 ERROR
>>>> >> > org.apache.hadoop.security.UserGroupInformation:
>>>> >> > PriviledgedActionException
>>>> >> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
>>>> does
>>>> >> > not
>>>> >> > exist: hdfs://isredeng/kishore/kk.ksh
>>>> >> >
>>>> >> >
>>>> >> > This file is there on the machine with name "isredeng", I could do
>>>> ls
>>>> >> > for
>>>> >> > that file as below:
>>>> >> >
>>>> >> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>>>> >> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>>>> >> > native-hadoop
>>>> >> > library for your platform... using builtin-java classes where
>>>> applicable
>>>> >> > Found 1 items
>>>> >> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>>>> >> > kishore/kk.ksh
>>>> >> >
>>>> >> > Note: I am using a single node cluster
>>>> >> >
>>>> >> > Thanks,
>>>> >> > Kishore
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com>
>>>> wrote:
>>>> >> >>
>>>> >> >> The string for each LocalResource in the map can be anything that
>>>> >> >> serves as a common identifier name for your application. At
>>>> execution
>>>> >> >> time, the passed resource filename will be aliased to the name
>>>> you've
>>>> >> >> mapped it to, so that the application code need not track special
>>>> >> >> names. The behavior is very similar to how you can, in MR, define
>>>> a
>>>> >> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>>>> >> >>
>>>> >> >> For an example, check out the DistributedShell app sources.
>>>> >> >>
>>>> >> >> Over [1], you can see we take a user provided file path to a shell
>>>> >> >> script. This can be named anything as it is user-supplied.
>>>> >> >> Onto [2], we define this as a local resource [2.1] and embed it
>>>> with a
>>>> >> >> different name (the string you ask about) [2.2], as defined at
>>>> [3] as
>>>> >> >> an application reference-able constant.
>>>> >> >> Note that in [4], we add to the Container arguments the aliased
>>>> name
>>>> >> >> we mapped it to (i.e. [3]) and not the original filename we
>>>> received
>>>> >> >> from the user. The resource is placed on the container with this
>>>> name
>>>> >> >> instead, so that's what we choose to execute.
>>>> >> >>
>>>> >> >> [1] -
>>>> >> >>
>>>> >> >>
>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>>>> >> >>
>>>> >> >> [2] - [2.1]
>>>> >> >>
>>>> >> >>
>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>>>> >> >> and [2.2]
>>>> >> >>
>>>> >> >>
>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>>>> >> >>
>>>> >> >> [3] -
>>>> >> >>
>>>> >> >>
>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>>>> >> >>
>>>> >> >> [4] -
>>>> >> >>
>>>> >> >>
>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>>>> >> >>
>>>> >> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>>>> >> >> <wr...@gmail.com> wrote:
>>>> >> >> > Hi,
>>>> >> >> >
>>>> >> >> >   Can someone please tell me what is the use of calling
>>>> >> >> > setLocalResources()
>>>> >> >> > on ContainerLaunchContext?
>>>> >> >> >
>>>> >> >> >   And, also an example of how to use this will help...
>>>> >> >> >
>>>> >> >> >  I couldn't guess what is the String in the map that is passed
>>>> to
>>>> >> >> > setLocalResources() like below:
>>>> >> >> >
>>>> >> >> >       // Set the local resources
>>>> >> >> >       Map<String, LocalResource> localResources = new
>>>> HashMap<String,
>>>> >> >> > LocalResource>();
>>>> >> >> >
>>>> >> >> > Thanks,
>>>> >> >> > Kishore
>>>> >> >> >
>>>> >> >>
>>>> >> >>
>>>> >> >>
>>>> >> >> --
>>>> >> >> Harsh J
>>>> >> >
>>>> >> >
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >> Harsh J
>>>> >
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>> Harsh J
>>>>
>>>
>>>
>
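
A closing note: malformed URIs such as the doubled slash in
"hdfs://isredeng//kishore/kk.ksh" above can be sidestepped by letting the
FileSystem qualify a plain path. A sketch, assuming "conf" is a Configuration
whose default filesystem is the cluster in question and "rsrc" is a
LocalResource as in the snippets above:

      FileSystem fs = FileSystem.get(conf); // e.g. hdfs://isredeng:8020
      Path qualified = fs.makeQualified(new Path("/kishore/kk.ksh"));
      FileStatus st = fs.getFileStatus(qualified); // fails fast if absent
      rsrc.setResource(ConverterUtils.getYarnUrlFromPath(qualified));
      rsrc.setSize(st.getLen());
      rsrc.setTimestamp(st.getModificationTime());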

>>>> >>
>>>> >> The detail is insufficient to answer why. You should also have gotten
>>>> >> a trace after it, can you post that? If possible, also the relevant
>>>> >> snippets of code.
>>>> >>
>>>> >> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>>>> >> <wr...@gmail.com> wrote:
>>>> >> > Hi Harsh,
>>>> >> >  Thanks for the quick and detailed reply, it really helps. I am
>>>> trying
>>>> >> > to
>>>> >> > use it and getting this error in node manager's log:
>>>> >> >
>>>> >> > 2013-08-05 08:57:28,867 ERROR
>>>> >> > org.apache.hadoop.security.UserGroupInformation:
>>>> >> > PriviledgedActionException
>>>> >> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
>>>> does
>>>> >> > not
>>>> >> > exist: hdfs://isredeng/kishore/kk.ksh
>>>> >> >
>>>> >> >
>>>> >> > This file is there on the machine with name "isredeng", I could do
>>>> ls
>>>> >> > for
>>>> >> > that file as below:
>>>> >> >
>>>> >> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>>>> >> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>>>> >> > native-hadoop
>>>> >> > library for your platform... using builtin-java classes where
>>>> applicable
>>>> >> > Found 1 items
>>>> >> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>>>> >> > kishore/kk.ksh
>>>> >> >
>>>> >> > Note: I am using a single node cluster
>>>> >> >
>>>> >> > Thanks,
>>>> >> > Kishore
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com>
>>>> wrote:
>>>> >> >>
>>>> >> >> The string for each LocalResource in the map can be anything that
>>>> >> >> serves as a common identifier name for your application. At
>>>> execution
>>>> >> >> time, the passed resource filename will be aliased to the name
>>>> you've
>>>> >> >> mapped it to, so that the application code need not track special
>>>> >> >> names. The behavior is very similar to how you can, in MR, define
>>>> a
>>>> >> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>>>> >> >>
>>>> >> >> For an example, checkout the DistributedShell app sources.
>>>> >> >>
>>>> >> >> Over [1], you can see we take a user provided file path to a shell
>>>> >> >> script. This can be named anything as it is user-supplied.
>>>> >> >> Onto [2], we define this as a local resource [2.1] and embed it
>>>> with a
>>>> >> >> different name (the string you ask about) [2.2], as defined at
>>>> [3] as
>>>> >> >> an application reference-able constant.
>>>> >> >> Note that in [4], we add to the Container arguments the aliased
>>>> name
>>>> >> >> we mapped it to (i.e. [3]) and not the original filename we
>>>> received
>>>> >> >> from the user. The resource is placed on the container with this
>>>> name
>>>> >> >> instead, so thats what we choose to execute.
>>>> >> >>
>>>> >> >> [1] -
>>>> >> >>
>>>> >> >>
>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>>>> >> >>
>>>> >> >> [2] - [2.1]
>>>> >> >>
>>>> >> >>
>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>>>> >> >> and [2.2]
>>>> >> >>
>>>> >> >>
>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>>>> >> >>
>>>> >> >> [3] -
>>>> >> >>
>>>> >> >>
>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>>>> >> >>
>>>> >> >> [4] -
>>>> >> >>
>>>> >> >>
>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>>>> >> >>
>>>> >> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>>>> >> >> <wr...@gmail.com> wrote:
>>>> >> >> > Hi,
>>>> >> >> >
>>>> >> >> >   Can someone please tell me what is the use of calling
>>>> >> >> > setLocalResources()
>>>> >> >> > on ContainerLaunchContext?
>>>> >> >> >
>>>> >> >> >   And, also an example of how to use this will help...
>>>> >> >> >
>>>> >> >> >  I couldn't guess what is the String in the map that is passed
>>>> to
>>>> >> >> > setLocalResources() like below:
>>>> >> >> >
>>>> >> >> >       // Set the local resources
>>>> >> >> >       Map<String, LocalResource> localResources = new
>>>> HashMap<String,
>>>> >> >> > LocalResource>();
>>>> >> >> >
>>>> >> >> > Thanks,
>>>> >> >> > Kishore
>>>> >> >> >
>>>> >> >>
>>>> >> >>
>>>> >> >>
>>>> >> >> --
>>>> >> >> Harsh J
>>>> >> >
>>>> >> >
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >> Harsh J
>>>> >
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>> Harsh J
>>>>
>>>
>>>
>

Re: setLocalResources() on ContainerLaunchContext

Posted by Omkar Joshi <oj...@hortonworks.com>.
Hi,

You need to match the timestamp: read the file's actual modification time
before adding it as a resource, and pass exactly that value to
setTimestamp(). This check is deliberate; it ensures the file was not
updated between the time the request was set up and the time the
NodeManager localizes it, which would otherwise lead to subtle errors.
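
For example, a minimal sketch (assuming "conf" is your cluster
Configuration and "shellScriptPath" is the URI string being localized,
whether file:// or hdfs://; Path, FileStatus and FileSystem come from
org.apache.hadoop.fs):

      Path scriptPath = new Path(shellScriptPath);
      // Ask the filesystem that owns the path for the file's current
      // metadata, and copy both values into the LocalResource unchanged.
      FileStatus status = scriptPath.getFileSystem(conf).getFileStatus(scriptPath);
      shellRsrc.setTimestamp(status.getModificationTime());
      shellRsrc.setSize(status.getLen());

The NodeManager re-checks the timestamp when it downloads the resource,
so it has to describe the file exactly as it exists at the moment
startContainer() is called.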


Thanks,
Omkar Joshi
*Hortonworks Inc.* <http://www.hortonworks.com>


On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri <
write2kishore@gmail.com> wrote:

> I tried the following and it works!
> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
>
> But now getting a timestamp error like below, when I passed 0 to
> setTimestamp()
>
> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for
> containerID= container_1375784329048_0017_01_000002, state=COMPLETE,
> exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh
> changed on src filesystem (expected 0, was 1367580580000
>
>
>
>
> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> Can you try passing a fully qualified local path? That is, including the
>> file:/ scheme
>>  On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <
>> write2kishore@gmail.com> wrote:
>>
>>> Hi Harsh,
>>>    The setResource() call on LocalResource() is expecting an argument of
>>> type org.apache.hadoop.yarn.api.records.URL which is converted from a
>>> string in the form of URI. This happens in the following call of
>>> Distributed Shell example,
>>>
>>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
>>> shellScriptPath)));
>>>
>>> So, if I give a local file I get a parsing error like below, which is
>>> when I changed it to an HDFS file thinking that it should be given like
>>> that only. Could you please give an example of how else it could be used,
>>> using a local file as you are saying?
>>>
>>> 2013-08-06 06:23:12,942 WARN
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>>> Failed to parse resource-request
>>> java.net.URISyntaxException: Expected scheme name at index 0:
>>> :///home_/dsadm/kishore/kk.ksh
>>>         at java.net.URI$Parser.fail(URI.java:2820)
>>>         at java.net.URI$Parser.failExpecting(URI.java:2826)
>>>         at java.net.URI$Parser.parse(URI.java:3015)
>>>         at java.net.URI.<init>(URI.java:747)
>>>         at
>>> org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>>>         at
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>>>
>>>
>>>
>>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>>>
>>>> To be honest, I've never tried loading a HDFS file onto the
>>>> LocalResource this way. I usually just pass a local file and that
>>>> works just fine. There may be something in the URI transformation
>>>> possibly breaking a HDFS source, but try passing a local file - does
>>>> that fail too? The Shell example uses a local file.
>>>>
>>>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>>>> <wr...@gmail.com> wrote:
>>>> > Hi Harsh,
>>>> >
>>>> >   Please see if this is useful, I got a stack trace after the error
>>>> has
>>>> > occurred....
>>>> >
>>>> > 2013-08-06 00:55:30,559 INFO
>>>> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
>>>> CWD set
>>>> > to
>>>> /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>>> > =
>>>> >
>>>> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>>> > 2013-08-06 00:55:31,017 ERROR
>>>> > org.apache.hadoop.security.UserGroupInformation:
>>>> PriviledgedActionException
>>>> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>>>> not
>>>> > exist: hdfs://isredeng/kishore/kk.ksh
>>>> > 2013-08-06 00:55:31,029 INFO
>>>> >
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>>>> > DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File
>>>> does
>>>> > not exist: hdfs://isredeng/kishore/kk.ksh
>>>> > 2013-08-06 00:55:31,031 INFO
>>>> >
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>>>> > Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING
>>>> to
>>>> > FAILED
>>>> > 2013-08-06 00:55:31,034 INFO
>>>> >
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>>>> > Container container_1375716148174_0004_01_000002 transitioned from
>>>> > LOCALIZING to LOCALIZATION_FAILED
>>>> > 2013-08-06 00:55:31,035 INFO
>>>> >
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>>>> > Container container_1375716148174_0004_01_000002 sent RELEASE event
>>>> on a
>>>> > resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
>>>> > present in cache.
>>>> > 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
>>>> > waiting to send rpc request to server
>>>> > java.lang.InterruptedException
>>>> >         at
>>>> >
>>>> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>>>> >         at
>>>> > java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>>>> >         at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>>>> >         at
>>>> >
>>>> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>>>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>>>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>>>> >         at
>>>> >
>>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>>> >         at $Proxy22.heartbeat(Unknown Source)
>>>> >         at
>>>> >
>>>> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>>>> >         at
>>>> >
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>>>> >         at
>>>> >
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>>>> >         at
>>>> >
>>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>>>> >         at
>>>> >
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>>>> >
>>>> >
>>>> >
>>>> > And here is my code snippet:
>>>> >
>>>> >       ContainerLaunchContext ctx =
>>>> > Records.newRecord(ContainerLaunchContext.class);
>>>> >
>>>> >       ctx.setEnvironment(oshEnv);
>>>> >
>>>> >       // Set the local resources
>>>> >       Map<String, LocalResource> localResources = new HashMap<String,
>>>> > LocalResource>();
>>>> >
>>>> >       LocalResource shellRsrc =
>>>> Records.newRecord(LocalResource.class);
>>>> >       shellRsrc.setType(LocalResourceType.FILE);
>>>> >       shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>>>> >       String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>>>> >       try {
>>>> >         shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>>>> > URI(shellScriptPath)));
>>>> >       } catch (URISyntaxException e) {
>>>> >         LOG.error("Error when trying to use shell script path
>>>> specified"
>>>> >             + " in env, path=" + shellScriptPath);
>>>> >         e.printStackTrace();
>>>> >       }
>>>> >
>>>> >       shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>>>> >       shellRsrc.setSize(0/*shellScriptPathLen*/);
>>>> >       String ExecShellStringPath = "ExecShellScript.sh";
>>>> >       localResources.put(ExecShellStringPath, shellRsrc);
>>>> >
>>>> >       ctx.setLocalResources(localResources);
>>>> >
>>>> >
>>>> > Please let me know if you need anything else.
>>>> >
>>>> > Thanks,
>>>> > Kishore
>>>> >
>>>> >
>>>> >
>>>> > On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
>>>> >>
>>>> >> The detail is insufficient to answer why. You should also have gotten
>>>> >> a trace after it, can you post that? If possible, also the relevant
>>>> >> snippets of code.
>>>> >>
>>>> >> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>>>> >> <wr...@gmail.com> wrote:
>>>> >> > Hi Harsh,
>>>> >> >  Thanks for the quick and detailed reply, it really helps. I am
>>>> trying
>>>> >> > to
>>>> >> > use it and getting this error in node manager's log:
>>>> >> >
>>>> >> > 2013-08-05 08:57:28,867 ERROR
>>>> >> > org.apache.hadoop.security.UserGroupInformation:
>>>> >> > PriviledgedActionException
>>>> >> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
>>>> does
>>>> >> > not
>>>> >> > exist: hdfs://isredeng/kishore/kk.ksh
>>>> >> >
>>>> >> >
>>>> >> > This file is there on the machine with name "isredeng", I could do
>>>> ls
>>>> >> > for
>>>> >> > that file as below:
>>>> >> >
>>>> >> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>>>> >> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>>>> >> > native-hadoop
>>>> >> > library for your platform... using builtin-java classes where
>>>> applicable
>>>> >> > Found 1 items
>>>> >> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>>>> >> > kishore/kk.ksh
>>>> >> >
>>>> >> > Note: I am using a single node cluster
>>>> >> >
>>>> >> > Thanks,
>>>> >> > Kishore
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com>
>>>> wrote:
>>>> >> >>
>>>> >> >> The string for each LocalResource in the map can be anything that
>>>> >> >> serves as a common identifier name for your application. At
>>>> execution
>>>> >> >> time, the passed resource filename will be aliased to the name
>>>> you've
>>>> >> >> mapped it to, so that the application code need not track special
>>>> >> >> names. The behavior is very similar to how you can, in MR, define
>>>> a
>>>> >> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>>>> >> >>
>>>> >> >> For an example, checkout the DistributedShell app sources.
>>>> >> >>
>>>> >> >> Over [1], you can see we take a user provided file path to a shell
>>>> >> >> script. This can be named anything as it is user-supplied.
>>>> >> >> Onto [2], we define this as a local resource [2.1] and embed it
>>>> with a
>>>> >> >> different name (the string you ask about) [2.2], as defined at
>>>> [3] as
>>>> >> >> an application reference-able constant.
>>>> >> >> Note that in [4], we add to the Container arguments the aliased
>>>> name
>>>> >> >> we mapped it to (i.e. [3]) and not the original filename we
>>>> received
>>>> >> >> from the user. The resource is placed on the container with this
>>>> name
>>>> >> >> instead, so thats what we choose to execute.
>>>> >> >>
>>>> >> >> [1] -
>>>> >> >>
>>>> >> >>
>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>>>> >> >>
>>>> >> >> [2] - [2.1]
>>>> >> >>
>>>> >> >>
>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>>>> >> >> and [2.2]
>>>> >> >>
>>>> >> >>
>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>>>> >> >>
>>>> >> >> [3] -
>>>> >> >>
>>>> >> >>
>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>>>> >> >>
>>>> >> >> [4] -
>>>> >> >>
>>>> >> >>
>>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>>>> >> >>
>>>> >> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>>>> >> >> <wr...@gmail.com> wrote:
>>>> >> >> > Hi,
>>>> >> >> >
>>>> >> >> >   Can someone please tell me what is the use of calling
>>>> >> >> > setLocalResources()
>>>> >> >> > on ContainerLaunchContext?
>>>> >> >> >
>>>> >> >> >   And, also an example of how to use this will help...
>>>> >> >> >
>>>> >> >> >  I couldn't guess what is the String in the map that is passed
>>>> to
>>>> >> >> > setLocalResources() like below:
>>>> >> >> >
>>>> >> >> >       // Set the local resources
>>>> >> >> >       Map<String, LocalResource> localResources = new
>>>> HashMap<String,
>>>> >> >> > LocalResource>();
>>>> >> >> >
>>>> >> >> > Thanks,
>>>> >> >> > Kishore
>>>> >> >> >
>>>> >> >>
>>>> >> >>
>>>> >> >>
>>>> >> >> --
>>>> >> >> Harsh J
>>>> >> >
>>>> >> >
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >> Harsh J
>>>> >
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>> Harsh J
>>>>
>>>
>>>
>

Re: setLocalResources() on ContainerLaunchContext

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
I tried the following and it works!
String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";

But now I am getting a timestamp error like the one below, because I passed
0 to setTimestamp():

13/08/06 08:23:48 INFO ApplicationMaster: Got container status for
containerID= container_1375784329048_0017_01_000002, state=COMPLETE,
exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh
changed on src filesystem (expected 0, was 1367580580000
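
By the way, for anyone who hits the earlier URISyntaxException: instead
of hard-coding the file:// scheme, the qualified URI can also be built
from plain java.io.File. A small sketch (untested):

      // File.toURI() qualifies the absolute path with the file: scheme
      URI scriptUri = new File("/home_/dsadm/kishore/kk.ksh").toURI();
      shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(scriptUri));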




On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:

> Can you try passing a fully qualified local path? That is, including the
> file:/ scheme
> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <
> write2kishore@gmail.com> wrote:
>
>> Hi Harsh,
>>    The setResource() call on LocalResource() is expecting an argument of
>> type org.apache.hadoop.yarn.api.records.URL which is converted from a
>> string in the form of URI. This happens in the following call of
>> Distributed Shell example,
>>
>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
>> shellScriptPath)));
>>
>> So, if I give a local file I get a parsing error like below, which is
>> when I changed it to an HDFS file thinking that it should be given like
>> that only. Could you please give an example of how else it could be used,
>> using a local file as you are saying?
>>
>> 2013-08-06 06:23:12,942 WARN
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>> Failed to parse resource-request
>> java.net.URISyntaxException: Expected scheme name at index 0:
>> :///home_/dsadm/kishore/kk.ksh
>>         at java.net.URI$Parser.fail(URI.java:2820)
>>         at java.net.URI$Parser.failExpecting(URI.java:2826)
>>         at java.net.URI$Parser.parse(URI.java:3015)
>>         at java.net.URI.<init>(URI.java:747)
>>         at
>> org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>>         at
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>>
>>
>>
>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>>
>>> To be honest, I've never tried loading a HDFS file onto the
>>> LocalResource this way. I usually just pass a local file and that
>>> works just fine. There may be something in the URI transformation
>>> possibly breaking a HDFS source, but try passing a local file - does
>>> that fail too? The Shell example uses a local file.
>>>
>>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>>> <wr...@gmail.com> wrote:
>>> > Hi Harsh,
>>> >
>>> >   Please see if this is useful, I got a stack trace after the error has
>>> > occurred....
>>> >
>>> > 2013-08-06 00:55:30,559 INFO
>>> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
>>> CWD set
>>> > to
>>> /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>> > =
>>> >
>>> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>> > 2013-08-06 00:55:31,017 ERROR
>>> > org.apache.hadoop.security.UserGroupInformation:
>>> PriviledgedActionException
>>> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>>> not
>>> > exist: hdfs://isredeng/kishore/kk.ksh
>>> > 2013-08-06 00:55:31,029 INFO
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>>> > DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File
>>> does
>>> > not exist: hdfs://isredeng/kishore/kk.ksh
>>> > 2013-08-06 00:55:31,031 INFO
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>>> > Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING
>>> to
>>> > FAILED
>>> > 2013-08-06 00:55:31,034 INFO
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>>> > Container container_1375716148174_0004_01_000002 transitioned from
>>> > LOCALIZING to LOCALIZATION_FAILED
>>> > 2013-08-06 00:55:31,035 INFO
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>>> > Container container_1375716148174_0004_01_000002 sent RELEASE event on
>>> a
>>> > resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
>>> > present in cache.
>>> > 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
>>> > waiting to send rpc request to server
>>> > java.lang.InterruptedException
>>> >         at
>>> >
>>> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>>> >         at
>>> > java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>>> >         at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>>> >         at
>>> > org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>>> >         at
>>> >
>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>> >         at $Proxy22.heartbeat(Unknown Source)
>>> >         at
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>>> >         at
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>>> >         at
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>>> >         at
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>>> >         at
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>>> >
>>> >
>>> >
>>> > And here is my code snippet:
>>> >
>>> >       ContainerLaunchContext ctx =
>>> > Records.newRecord(ContainerLaunchContext.class);
>>> >
>>> >       ctx.setEnvironment(oshEnv);
>>> >
>>> >       // Set the local resources
>>> >       Map<String, LocalResource> localResources = new HashMap<String,
>>> > LocalResource>();
>>> >
>>> >       LocalResource shellRsrc = Records.newRecord(LocalResource.class);
>>> >       shellRsrc.setType(LocalResourceType.FILE);
>>> >       shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>>> >       String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>>> >       try {
>>> >         shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>>> > URI(shellScriptPath)));
>>> >       } catch (URISyntaxException e) {
>>> >         LOG.error("Error when trying to use shell script path
>>> specified"
>>> >             + " in env, path=" + shellScriptPath);
>>> >         e.printStackTrace();
>>> >       }
>>> >
>>> >       shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>>> >       shellRsrc.setSize(0/*shellScriptPathLen*/);
>>> >       String ExecShellStringPath = "ExecShellScript.sh";
>>> >       localResources.put(ExecShellStringPath, shellRsrc);
>>> >
>>> >       ctx.setLocalResources(localResources);
>>> >
>>> >
>>> > Please let me know if you need anything else.
>>> >
>>> > Thanks,
>>> > Kishore
>>> >
>>> >
>>> >
>>> > On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
>>> >>
>>> >> The detail is insufficient to answer why. You should also have gotten
>>> >> a trace after it, can you post that? If possible, also the relevant
>>> >> snippets of code.
>>> >>
>>> >> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>>> >> <wr...@gmail.com> wrote:
>>> >> > Hi Harsh,
>>> >> >  Thanks for the quick and detailed reply, it really helps. I am
>>> trying
>>> >> > to
>>> >> > use it and getting this error in node manager's log:
>>> >> >
>>> >> > 2013-08-05 08:57:28,867 ERROR
>>> >> > org.apache.hadoop.security.UserGroupInformation:
>>> >> > PriviledgedActionException
>>> >> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
>>> does
>>> >> > not
>>> >> > exist: hdfs://isredeng/kishore/kk.ksh
>>> >> >
>>> >> >
>>> >> > This file is there on the machine with name "isredeng", I could do
>>> ls
>>> >> > for
>>> >> > that file as below:
>>> >> >
>>> >> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>>> >> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>>> >> > native-hadoop
>>> >> > library for your platform... using builtin-java classes where
>>> applicable
>>> >> > Found 1 items
>>> >> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>>> >> > kishore/kk.ksh
>>> >> >
>>> >> > Note: I am using a single node cluster
>>> >> >
>>> >> > Thanks,
>>> >> > Kishore
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
>>> >> >>
>>> >> >> The string for each LocalResource in the map can be anything that
>>> >> >> serves as a common identifier name for your application. At
>>> execution
>>> >> >> time, the passed resource filename will be aliased to the name
>>> you've
>>> >> >> mapped it to, so that the application code need not track special
>>> >> >> names. The behavior is very similar to how you can, in MR, define a
>>> >> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>>> >> >>
>>> >> >> For an example, checkout the DistributedShell app sources.
>>> >> >>
>>> >> >> Over [1], you can see we take a user provided file path to a shell
>>> >> >> script. This can be named anything as it is user-supplied.
>>> >> >> Onto [2], we define this as a local resource [2.1] and embed it
>>> with a
>>> >> >> different name (the string you ask about) [2.2], as defined at [3]
>>> as
>>> >> >> an application reference-able constant.
>>> >> >> Note that in [4], we add to the Container arguments the aliased
>>> name
>>> >> >> we mapped it to (i.e. [3]) and not the original filename we
>>> received
>>> >> >> from the user. The resource is placed on the container with this
>>> name
>>> >> >> instead, so thats what we choose to execute.
>>> >> >>
>>> >> >> [1] -
>>> >> >>
>>> >> >>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>>> >> >>
>>> >> >> [2] - [2.1]
>>> >> >>
>>> >> >>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>>> >> >> and [2.2]
>>> >> >>
>>> >> >>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>>> >> >>
>>> >> >> [3] -
>>> >> >>
>>> >> >>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>>> >> >>
>>> >> >> [4] -
>>> >> >>
>>> >> >>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>>> >> >>
>>> >> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>>> >> >> <wr...@gmail.com> wrote:
>>> >> >> > Hi,
>>> >> >> >
>>> >> >> >   Can someone please tell me what is the use of calling
>>> >> >> > setLocalResources()
>>> >> >> > on ContainerLaunchContext?
>>> >> >> >
>>> >> >> >   And, also an example of how to use this will help...
>>> >> >> >
>>> >> >> >  I couldn't guess what is the String in the map that is passed to
>>> >> >> > setLocalResources() like below:
>>> >> >> >
>>> >> >> >       // Set the local resources
>>> >> >> >       Map<String, LocalResource> localResources = new
>>> HashMap<String,
>>> >> >> > LocalResource>();
>>> >> >> >
>>> >> >> > Thanks,
>>> >> >> > Kishore
>>> >> >> >
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> --
>>> >> >> Harsh J
>>> >> >
>>> >> >
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Harsh J
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>
>>

Re: setLocalResources() on ContainerLaunchContext

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
I tried the following and it works!
String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";

But now getting a timestamp error like below, when I passed 0 to
setTimestamp()

13/08/06 08:23:48 INFO ApplicationMaster: Got container status for
containerID= container_1375784329048_0017_01_000002, state=COMPLETE,
exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh
changed on src filesystem (expected 0, was 1367580580000




On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:

> Can you try passing a fully qualified local path? That is, including the
> file:/ scheme
> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <
> write2kishore@gmail.com> wrote:
>
>> Hi Harsh,
>>    The setResource() call on LocalResource() is expecting an argument of
>> type org.apache.hadoop.yarn.api.records.URL which is converted from a
>> string in the form of URI. This happens in the following call of
>> Distributed Shell example,
>>
>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
>> shellScriptPath)));
>>
>> So, if I give a local file I get a parsing error like below, which is
>> when I changed it to an HDFS file thinking that it should be given like
>> that only. Could you please give an example of how else it could be used,
>> using a local file as you are saying?
>>
>> 2013-08-06 06:23:12,942 WARN
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>> Failed to parse resource-request
>> java.net.URISyntaxException: Expected scheme name at index 0:
>> :///home_/dsadm/kishore/kk.ksh
>>         at java.net.URI$Parser.fail(URI.java:2820)
>>         at java.net.URI$Parser.failExpecting(URI.java:2826)
>>         at java.net.URI$Parser.parse(URI.java:3015)
>>         at java.net.URI.<init>(URI.java:747)
>>         at
>> org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>>         at
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>>
>>
>>
>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>>
>>> To be honest, I've never tried loading a HDFS file onto the
>>> LocalResource this way. I usually just pass a local file and that
>>> works just fine. There may be something in the URI transformation
>>> possibly breaking a HDFS source, but try passing a local file - does
>>> that fail too? The Shell example uses a local file.
>>>
>>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>>> <wr...@gmail.com> wrote:
>>> > Hi Harsh,
>>> >
>>> >   Please see if this is useful, I got a stack trace after the error has
>>> > occurred....
>>> >
>>> > 2013-08-06 00:55:30,559 INFO
>>> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
>>> CWD set
>>> > to
>>> /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>> > =
>>> >
>>> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>> > 2013-08-06 00:55:31,017 ERROR
>>> > org.apache.hadoop.security.UserGroupInformation:
>>> PriviledgedActionException
>>> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>>> not
>>> > exist: hdfs://isredeng/kishore/kk.ksh
>>> > 2013-08-06 00:55:31,029 INFO
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>>> > DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File
>>> does
>>> > not exist: hdfs://isredeng/kishore/kk.ksh
>>> > 2013-08-06 00:55:31,031 INFO
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>>> > Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING
>>> to
>>> > FAILED
>>> > 2013-08-06 00:55:31,034 INFO
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>>> > Container container_1375716148174_0004_01_000002 transitioned from
>>> > LOCALIZING to LOCALIZATION_FAILED
>>> > 2013-08-06 00:55:31,035 INFO
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>>> > Container container_1375716148174_0004_01_000002 sent RELEASE event on
>>> a
>>> > resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
>>> > present in cache.
>>> > 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
>>> > waiting to send rpc request to server
>>> > java.lang.InterruptedException
>>> >         at
>>> >
>>> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>>> >         at
>>> > java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>>> >         at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>>> >         at
>>> > org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>>> >         at
>>> >
>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>> >         at $Proxy22.heartbeat(Unknown Source)
>>> >         at
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>>> >         at
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>>> >         at
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>>> >         at
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>>> >         at
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>>> >
>>> >
>>> >
>>> > And here is my code snippet:
>>> >
>>> >       ContainerLaunchContext ctx =
>>> > Records.newRecord(ContainerLaunchContext.class);
>>> >
>>> >       ctx.setEnvironment(oshEnv);
>>> >
>>> >       // Set the local resources
>>> >       Map<String, LocalResource> localResources = new HashMap<String,
>>> > LocalResource>();
>>> >
>>> >       LocalResource shellRsrc = Records.newRecord(LocalResource.class);
>>> >       shellRsrc.setType(LocalResourceType.FILE);
>>> >       shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>>> >       String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>>> >       try {
>>> >         shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>>> > URI(shellScriptPath)));
>>> >       } catch (URISyntaxException e) {
>>> >         LOG.error("Error when trying to use shell script path
>>> specified"
>>> >             + " in env, path=" + shellScriptPath);
>>> >         e.printStackTrace();
>>> >       }
>>> >
>>> >       shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>>> >       shellRsrc.setSize(0/*shellScriptPathLen*/);
>>> >       String ExecShellStringPath = "ExecShellScript.sh";
>>> >       localResources.put(ExecShellStringPath, shellRsrc);
>>> >
>>> >       ctx.setLocalResources(localResources);
>>> >
>>> >
>>> > Please let me know if you need anything else.
>>> >
>>> > Thanks,
>>> > Kishore
>>> >
>>> >
>>> >
>>> > On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
>>> >>
>>> >> The detail is insufficient to answer why. You should also have gotten
>>> >> a trace after it, can you post that? If possible, also the relevant
>>> >> snippets of code.
>>> >>
>>> >> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>>> >> <wr...@gmail.com> wrote:
>>> >> > Hi Harsh,
>>> >> >  Thanks for the quick and detailed reply, it really helps. I am
>>> trying
>>> >> > to
>>> >> > use it and getting this error in node manager's log:
>>> >> >
>>> >> > 2013-08-05 08:57:28,867 ERROR
>>> >> > org.apache.hadoop.security.UserGroupInformation:
>>> >> > PriviledgedActionException
>>> >> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
>>> does
>>> >> > not
>>> >> > exist: hdfs://isredeng/kishore/kk.ksh
>>> >> >
>>> >> >
>>> >> > This file is there on the machine with name "isredeng", I could do
>>> ls
>>> >> > for
>>> >> > that file as below:
>>> >> >
>>> >> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>>> >> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>>> >> > native-hadoop
>>> >> > library for your platform... using builtin-java classes where
>>> applicable
>>> >> > Found 1 items
>>> >> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>>> >> > kishore/kk.ksh
>>> >> >
>>> >> > Note: I am using a single node cluster
>>> >> >
>>> >> > Thanks,
>>> >> > Kishore
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
>>> >> >>
>>> >> >> The string for each LocalResource in the map can be anything that
>>> >> >> serves as a common identifier name for your application. At
>>> execution
>>> >> >> time, the passed resource filename will be aliased to the name
>>> you've
>>> >> >> mapped it to, so that the application code need not track special
>>> >> >> names. The behavior is very similar to how you can, in MR, define a
>>> >> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>>> >> >>
>>> >> >> For an example, check out the DistributedShell app sources.
>>> >> >>
>>> >> >> Over [1], you can see we take a user-provided file path to a shell
>>> >> >> script. This can be named anything as it is user-supplied.
>>> >> >> Onto [2], we define this as a local resource [2.1] and embed it
>>> with a
>>> >> >> different name (the string you ask about) [2.2], as defined at [3]
>>> as
>>> >> >> an application reference-able constant.
>>> >> >> Note that in [4], we add to the Container arguments the aliased
>>> name
>>> >> >> we mapped it to (i.e. [3]) and not the original filename we
>>> received
>>> >> >> from the user. The resource is placed on the container with this
>>> name
>>> >> >> instead, so that's what we choose to execute.
>>> >> >>
>>> >> >> [1] -
>>> >> >>
>>> >> >>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>>> >> >>
>>> >> >> [2] - [2.1]
>>> >> >>
>>> >> >>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>>> >> >> and [2.2]
>>> >> >>
>>> >> >>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>>> >> >>
>>> >> >> [3] -
>>> >> >>
>>> >> >>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>>> >> >>
>>> >> >> [4] -
>>> >> >>
>>> >> >>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>>> >> >>
>>> >> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>>> >> >> <wr...@gmail.com> wrote:
>>> >> >> > Hi,
>>> >> >> >
>>> >> >> >   Can someone please tell me what is the use of calling
>>> >> >> > setLocalResources()
>>> >> >> > on ContainerLaunchContext?
>>> >> >> >
>>> >> >> >   And, also an example of how to use this will help...
>>> >> >> >
>>> >> >> >  I couldn't guess what is the String in the map that is passed to
>>> >> >> > setLocalResources() like below:
>>> >> >> >
>>> >> >> >       // Set the local resources
>>> >> >> >       Map<String, LocalResource> localResources = new
>>> HashMap<String,
>>> >> >> > LocalResource>();
>>> >> >> >
>>> >> >> > Thanks,
>>> >> >> > Kishore
>>> >> >> >
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> --
>>> >> >> Harsh J
>>> >> >
>>> >> >
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Harsh J
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>
>>
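
A compact sketch of the aliasing described in the quoted thread above: the
string key in the localResources map becomes the file name the container
sees at launch, so the container command should refer to that alias rather
than the original path. The setCommands() call below is illustrative and
assumes the ctx, shellRsrc, and localResources objects from the quoted
snippet:

      // The map key "ExecShellScript.sh" is the alias; the container finds
      // the localized copy of kk.ksh under that name in its working dir.
      localResources.put("ExecShellScript.sh", shellRsrc);
      ctx.setLocalResources(localResources);
      // Refer to the alias, not the original path, in the launch command:
      ctx.setCommands(java.util.Collections.singletonList(
          "sh ExecShellScript.sh"));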

Re: setLocalResources() on ContainerLaunchContext

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
I tried the following and it works!
String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";

But now I am getting a timestamp error like the one below, since I passed 0
to setTimestamp():

13/08/06 08:23:48 INFO ApplicationMaster: Got container status for
containerID= container_1375784329048_0017_01_000002, state=COMPLETE,
exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh
changed on src filesystem (expected 0, was 1367580580000
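
The localizer compares the timestamp and size recorded on the LocalResource
against the source file's actual metadata before downloading it, so
hard-coded zeros fail that check. A minimal sketch of filling both in from
the file itself, assuming conf is the application's Configuration object
(IOException handling omitted; the same FileStatus calls work for file://
and hdfs:// URIs alike):

      import org.apache.hadoop.fs.FileStatus;
      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.Path;

      // Read size/timestamp from the source file so the localizer's
      // consistency check passes.
      Path scriptPath = new Path(shellScriptPath);
      FileSystem fs = scriptPath.getFileSystem(conf); // conf: assumed Configuration
      FileStatus status = fs.getFileStatus(scriptPath);
      shellRsrc.setTimestamp(status.getModificationTime());
      shellRsrc.setSize(status.getLen());

With matching values, the "changed on src filesystem" error above should go
away.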




On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:

> Can you try passing a fully qualified local path? That is, including the
> file:/ scheme
> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <
> write2kishore@gmail.com> wrote:
>
>> Hi Harsh,
>>    The setResource() call on LocalResource() is expecting an argument of
>> type org.apache.hadoop.yarn.api.records.URL which is converted from a
>> string in the form of URI. This happens in the following call of
>> Distributed Shell example,
>>
>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
>> shellScriptPath)));
>>
>> So, if I give a local file I get a parsing error like below, which is
>> why I changed it to an HDFS file thinking that it should be given like
>> that only. Could you please give an example of how else it could be used,
>> using a local file as you are saying?
>>
>> 2013-08-06 06:23:12,942 WARN
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>> Failed to parse resource-request
>> java.net.URISyntaxException: Expected scheme name at index 0:
>> :///home_/dsadm/kishore/kk.ksh
>>         at java.net.URI$Parser.fail(URI.java:2820)
>>         at java.net.URI$Parser.failExpecting(URI.java:2826)
>>         at java.net.URI$Parser.parse(URI.java:3015)
>>         at java.net.URI.<init>(URI.java:747)
>>         at
>> org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>>         at
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>>
>>
>>
>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>>
>>> To be honest, I've never tried loading a HDFS file onto the
>>> LocalResource this way. I usually just pass a local file and that
>>> works just fine. There may be something in the URI transformation
>>> possibly breaking a HDFS source, but try passing a local file - does
>>> that fail too? The Shell example uses a local file.
>>>
>>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>>> <wr...@gmail.com> wrote:
>>> > Hi Harsh,
>>> >
>>> >   Please see if this is useful, I got a stack trace after the error has
>>> > occurred....
>>> >
>>> > 2013-08-06 00:55:30,559 INFO
>>> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
>>> CWD set
>>> > to
>>> /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>> > =
>>> >
>>> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>>> > 2013-08-06 00:55:31,017 ERROR
>>> > org.apache.hadoop.security.UserGroupInformation:
>>> PriviledgedActionException
>>> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>>> not
>>> > exist: hdfs://isredeng/kishore/kk.ksh
>>> > 2013-08-06 00:55:31,029 INFO
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>>> > DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File
>>> does
>>> > not exist: hdfs://isredeng/kishore/kk.ksh
>>> > 2013-08-06 00:55:31,031 INFO
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>>> > Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING
>>> to
>>> > FAILED
>>> > 2013-08-06 00:55:31,034 INFO
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>>> > Container container_1375716148174_0004_01_000002 transitioned from
>>> > LOCALIZING to LOCALIZATION_FAILED
>>> > 2013-08-06 00:55:31,035 INFO
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>>> > Container container_1375716148174_0004_01_000002 sent RELEASE event on
>>> a
>>> > resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
>>> > present in cache.
>>> > 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
>>> > waiting to send rpc request to server
>>> > java.lang.InterruptedException
>>> >         at
>>> >
>>> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>>> >         at
>>> > java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>>> >         at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>>> >         at
>>> > org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>>> >         at
>>> >
>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>> >         at $Proxy22.heartbeat(Unknown Source)
>>> >         at
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>>> >         at
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>>> >         at
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>>> >         at
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>>> >         at
>>> >
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>>> >
>>> >
>>> >
>>> > And here is my code snippet:
>>> >
>>> >       ContainerLaunchContext ctx =
>>> > Records.newRecord(ContainerLaunchContext.class);
>>> >
>>> >       ctx.setEnvironment(oshEnv);
>>> >
>>> >       // Set the local resources
>>> >       Map<String, LocalResource> localResources = new HashMap<String,
>>> > LocalResource>();
>>> >
>>> >       LocalResource shellRsrc = Records.newRecord(LocalResource.class);
>>> >       shellRsrc.setType(LocalResourceType.FILE);
>>> >       shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>>> >       String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>>> >       try {
>>> >         shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>>> > URI(shellScriptPath)));
>>> >       } catch (URISyntaxException e) {
>>> >         LOG.error("Error when trying to use shell script path
>>> specified"
>>> >             + " in env, path=" + shellScriptPath);
>>> >         e.printStackTrace();
>>> >       }
>>> >
>>> >       shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>>> >       shellRsrc.setSize(0/*shellScriptPathLen*/);
>>> >       String ExecShellStringPath = "ExecShellScript.sh";
>>> >       localResources.put(ExecShellStringPath, shellRsrc);
>>> >
>>> >       ctx.setLocalResources(localResources);
>>> >
>>> >
>>> > Please let me know if you need anything else.
>>> >
>>> > Thanks,
>>> > Kishore
>>> >
>>> >
>>> >
>>> > On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
>>> >>
>>> >> The detail is insufficient to answer why. You should also have gotten
>>> >> a trace after it, can you post that? If possible, also the relevant
>>> >> snippets of code.
>>> >>
>>> >> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>>> >> <wr...@gmail.com> wrote:
>>> >> > Hi Harsh,
>>> >> >  Thanks for the quick and detailed reply, it really helps. I am
>>> trying
>>> >> > to
>>> >> > use it and getting this error in node manager's log:
>>> >> >
>>> >> > 2013-08-05 08:57:28,867 ERROR
>>> >> > org.apache.hadoop.security.UserGroupInformation:
>>> >> > PriviledgedActionException
>>> >> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
>>> does
>>> >> > not
>>> >> > exist: hdfs://isredeng/kishore/kk.ksh
>>> >> >
>>> >> >
>>> >> > This file is there on the machine with name "isredeng", I could do
>>> ls
>>> >> > for
>>> >> > that file as below:
>>> >> >
>>> >> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>>> >> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>>> >> > native-hadoop
>>> >> > library for your platform... using builtin-java classes where
>>> applicable
>>> >> > Found 1 items
>>> >> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>>> >> > kishore/kk.ksh
>>> >> >
>>> >> > Note: I am using a single node cluster
>>> >> >
>>> >> > Thanks,
>>> >> > Kishore
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
>>> >> >>
>>> >> >> The string for each LocalResource in the map can be anything that
>>> >> >> serves as a common identifier name for your application. At
>>> execution
>>> >> >> time, the passed resource filename will be aliased to the name
>>> you've
>>> >> >> mapped it to, so that the application code need not track special
>>> >> >> names. The behavior is very similar to how you can, in MR, define a
>>> >> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>>> >> >>
>>> >> >> For an example, check out the DistributedShell app sources.
>>> >> >>
>>> >> >> Over [1], you can see we take a user-provided file path to a shell
>>> >> >> script. This can be named anything as it is user-supplied.
>>> >> >> Onto [2], we define this as a local resource [2.1] and embed it
>>> with a
>>> >> >> different name (the string you ask about) [2.2], as defined at [3]
>>> as
>>> >> >> an application reference-able constant.
>>> >> >> Note that in [4], we add to the Container arguments the aliased
>>> name
>>> >> >> we mapped it to (i.e. [3]) and not the original filename we
>>> received
>>> >> >> from the user. The resource is placed on the container with this
>>> name
>>> >> >> instead, so that's what we choose to execute.
>>> >> >>
>>> >> >> [1] -
>>> >> >>
>>> >> >>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>>> >> >>
>>> >> >> [2] - [2.1]
>>> >> >>
>>> >> >>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>>> >> >> and [2.2]
>>> >> >>
>>> >> >>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>>> >> >>
>>> >> >> [3] -
>>> >> >>
>>> >> >>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>>> >> >>
>>> >> >> [4] -
>>> >> >>
>>> >> >>
>>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>>> >> >>
>>> >> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>>> >> >> <wr...@gmail.com> wrote:
>>> >> >> > Hi,
>>> >> >> >
>>> >> >> >   Can someone please tell me what is the use of calling
>>> >> >> > setLocalResources()
>>> >> >> > on ContainerLaunchContext?
>>> >> >> >
>>> >> >> >   And, also an example of how to use this will help...
>>> >> >> >
>>> >> >> >  I couldn't guess what is the String in the map that is passed to
>>> >> >> > setLocalResources() like below:
>>> >> >> >
>>> >> >> >       // Set the local resources
>>> >> >> >       Map<String, LocalResource> localResources = new
>>> HashMap<String,
>>> >> >> > LocalResource>();
>>> >> >> >
>>> >> >> > Thanks,
>>> >> >> > Kishore
>>> >> >> >
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> --
>>> >> >> Harsh J
>>> >> >
>>> >> >
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Harsh J
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>
>>

Re: setLocalResources() on ContainerLaunchContext

Posted by Harsh J <ha...@cloudera.com>.
Can you try passing a fully qualified local path? That is, including the
file:/ scheme.
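
A minimal sketch of that suggestion, using the path from this thread:
java.io.File.toURI() produces a fully qualified file:/ URI, which avoids
the "Expected scheme name at index 0" parse error (shellRsrc and
ConverterUtils are the ones from the quoted snippet below):

      import java.io.File;
      import java.net.URI;

      // Build a fully qualified file:/ URI instead of a bare absolute path.
      URI scriptUri = new File("/home_/dsadm/kishore/kk.ksh").toURI();
      shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(scriptUri));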
On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <wr...@gmail.com>
wrote:

> Hi Harsh,
>    The setResource() call on LocalResource() is expecting an argument of
> type org.apache.hadoop.yarn.api.records.URL which is converted from a
> string in the form of URI. This happens in the following call of
> Distributed Shell example,
>
> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
> shellScriptPath)));
>
> So, if I give a local file I get a parsing error like below, which is why
> I changed it to an HDFS file thinking that it should be given like that
> only. Could you please give an example of how else it could be used, using
> a local file as you are saying?
>
> 2013-08-06 06:23:12,942 WARN
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> Failed to parse resource-request
> java.net.URISyntaxException: Expected scheme name at index 0:
> :///home_/dsadm/kishore/kk.ksh
>         at java.net.URI$Parser.fail(URI.java:2820)
>         at java.net.URI$Parser.failExpecting(URI.java:2826)
>         at java.net.URI$Parser.parse(URI.java:3015)
>         at java.net.URI.<init>(URI.java:747)
>         at
> org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>         at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>
>
>
> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> To be honest, I've never tried loading a HDFS file onto the
>> LocalResource this way. I usually just pass a local file and that
>> works just fine. There may be something in the URI transformation
>> possibly breaking a HDFS source, but try passing a local file - does
>> that fail too? The Shell example uses a local file.
>>
>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>> <wr...@gmail.com> wrote:
>> > Hi Harsh,
>> >
>> >   Please see if this is useful, I got a stack trace after the error has
>> > occurred....
>> >
>> > 2013-08-06 00:55:30,559 INFO
>> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD
>> set
>> > to
>> /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>> > =
>> >
>> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>> > 2013-08-06 00:55:31,017 ERROR
>> > org.apache.hadoop.security.UserGroupInformation:
>> PriviledgedActionException
>> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>> not
>> > exist: hdfs://isredeng/kishore/kk.ksh
>> > 2013-08-06 00:55:31,029 INFO
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>> > DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File
>> does
>> > not exist: hdfs://isredeng/kishore/kk.ksh
>> > 2013-08-06 00:55:31,031 INFO
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>> > Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING to
>> > FAILED
>> > 2013-08-06 00:55:31,034 INFO
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>> > Container container_1375716148174_0004_01_000002 transitioned from
>> > LOCALIZING to LOCALIZATION_FAILED
>> > 2013-08-06 00:55:31,035 INFO
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>> > Container container_1375716148174_0004_01_000002 sent RELEASE event on a
>> > resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
>> > present in cache.
>> > 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
>> > waiting to send rpc request to server
>> > java.lang.InterruptedException
>> >         at
>> >
>> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>> >         at
>> > java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>> >         at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>> >         at
>> > org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>> >         at
>> >
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>> >         at $Proxy22.heartbeat(Unknown Source)
>> >         at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>> >         at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>> >         at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>> >         at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>> >         at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>> >
>> >
>> >
>> > And here is my code snippet:
>> >
>> >       ContainerLaunchContext ctx =
>> > Records.newRecord(ContainerLaunchContext.class);
>> >
>> >       ctx.setEnvironment(oshEnv);
>> >
>> >       // Set the local resources
>> >       Map<String, LocalResource> localResources = new HashMap<String,
>> > LocalResource>();
>> >
>> >       LocalResource shellRsrc = Records.newRecord(LocalResource.class);
>> >       shellRsrc.setType(LocalResourceType.FILE);
>> >       shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>> >       String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>> >       try {
>> >         shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>> > URI(shellScriptPath)));
>> >       } catch (URISyntaxException e) {
>> >         LOG.error("Error when trying to use shell script path specified"
>> >             + " in env, path=" + shellScriptPath);
>> >         e.printStackTrace();
>> >       }
>> >
>> >       shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>> >       shellRsrc.setSize(0/*shellScriptPathLen*/);
>> >       String ExecShellStringPath = "ExecShellScript.sh";
>> >       localResources.put(ExecShellStringPath, shellRsrc);
>> >
>> >       ctx.setLocalResources(localResources);
>> >
>> >
>> > Please let me know if you need anything else.
>> >
>> > Thanks,
>> > Kishore
>> >
>> >
>> >
>> > On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
>> >>
>> >> The detail is insufficient to answer why. You should also have gotten
>> >> a trace after it, can you post that? If possible, also the relevant
>> >> snippets of code.
>> >>
>> >> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>> >> <wr...@gmail.com> wrote:
>> >> > Hi Harsh,
>> >> >  Thanks for the quick and detailed reply, it really helps. I am
>> trying
>> >> > to
>> >> > use it and getting this error in node manager's log:
>> >> >
>> >> > 2013-08-05 08:57:28,867 ERROR
>> >> > org.apache.hadoop.security.UserGroupInformation:
>> >> > PriviledgedActionException
>> >> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>> >> > not
>> >> > exist: hdfs://isredeng/kishore/kk.ksh
>> >> >
>> >> >
>> >> > This file is there on the machine with name "isredeng", I could do ls
>> >> > for
>> >> > that file as below:
>> >> >
>> >> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>> >> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>> >> > native-hadoop
>> >> > library for your platform... using builtin-java classes where
>> applicable
>> >> > Found 1 items
>> >> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>> >> > kishore/kk.ksh
>> >> >
>> >> > Note: I am using a single node cluster
>> >> >
>> >> > Thanks,
>> >> > Kishore
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
>> >> >>
>> >> >> The string for each LocalResource in the map can be anything that
>> >> >> serves as a common identifier name for your application. At
>> execution
>> >> >> time, the passed resource filename will be aliased to the name
>> you've
>> >> >> mapped it to, so that the application code need not track special
>> >> >> names. The behavior is very similar to how you can, in MR, define a
>> >> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>> >> >>
>> >> >> For an example, check out the DistributedShell app sources.
>> >> >>
>> >> >> Over [1], you can see we take a user-provided file path to a shell
>> >> >> script. This can be named anything as it is user-supplied.
>> >> >> Onto [2], we define this as a local resource [2.1] and embed it
>> with a
>> >> >> different name (the string you ask about) [2.2], as defined at [3]
>> as
>> >> >> an application reference-able constant.
>> >> >> Note that in [4], we add to the Container arguments the aliased name
>> >> >> we mapped it to (i.e. [3]) and not the original filename we received
>> >> >> from the user. The resource is placed on the container with this
>> name
>> >> >> instead, so that's what we choose to execute.
>> >> >>
>> >> >> [1] -
>> >> >>
>> >> >>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>> >> >>
>> >> >> [2] - [2.1]
>> >> >>
>> >> >>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>> >> >> and [2.2]
>> >> >>
>> >> >>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>> >> >>
>> >> >> [3] -
>> >> >>
>> >> >>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>> >> >>
>> >> >> [4] -
>> >> >>
>> >> >>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>> >> >>
>> >> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>> >> >> <wr...@gmail.com> wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> >   Can someone please tell me what is the use of calling
>> >> >> > setLocalResources()
>> >> >> > on ContainerLaunchContext?
>> >> >> >
>> >> >> >   And, also an example of how to use this will help...
>> >> >> >
>> >> >> >  I couldn't guess what is the String in the map that is passed to
>> >> >> > setLocalResources() like below:
>> >> >> >
>> >> >> >       // Set the local resources
>> >> >> >       Map<String, LocalResource> localResources = new
>> HashMap<String,
>> >> >> > LocalResource>();
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Kishore
>> >> >> >
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Harsh J
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Harsh J
>> >
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>
>

Re: setLocalResources() on ContainerLaunchContext

Posted by Harsh J <ha...@cloudera.com>.
Can you try passing a fully qualified local path? That is, including the
file:/ scheme
On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <wr...@gmail.com>
wrote:

> Hi Harsh,
>    The setResource() call on LocalResource() is expecting an argument of
> type org.apache.hadoop.yarn.api.records.URL which is converted from a
> string in the form of URI. This happens in the following call of
> Distributed Shell example,
>
> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
> shellScriptPath)));
>
> So, if I give a local file I get a parsing error like below, which is when
> I changed it to an HDFS file thinking that it should be given like that
> only. Could you please give an example of how else it could be used, using
> a local file as you are saying?
>
> 2013-08-06 06:23:12,942 WARN
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> Failed to parse resource-request
> java.net.URISyntaxException: Expected scheme name at index 0:
> :///home_/dsadm/kishore/kk.ksh
>         at java.net.URI$Parser.fail(URI.java:2820)
>         at java.net.URI$Parser.failExpecting(URI.java:2826)
>         at java.net.URI$Parser.parse(URI.java:3015)
>         at java.net.URI.<init>(URI.java:747)
>         at
> org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>         at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>
>
>
> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> To be honest, I've never tried loading a HDFS file onto the
>> LocalResource this way. I usually just pass a local file and that
>> works just fine. There may be something in the URI transformation
>> possibly breaking a HDFS source, but try passing a local file - does
>> that fail too? The Shell example uses a local file.
>>
>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>> <wr...@gmail.com> wrote:
>> > Hi Harsh,
>> >
>> >   Please see if this is useful, I got a stack trace after the error has
>> > occurred....
>> >
>> > 2013-08-06 00:55:30,559 INFO
>> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD
>> set
>> > to
>> /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>> > =
>> >
>> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>> > 2013-08-06 00:55:31,017 ERROR
>> > org.apache.hadoop.security.UserGroupInformation:
>> PriviledgedActionException
>> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>> not
>> > exist: hdfs://isredeng/kishore/kk.ksh
>> > 2013-08-06 00:55:31,029 INFO
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>> > DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File
>> does
>> > not exist: hdfs://isredeng/kishore/kk.ksh
>> > 2013-08-06 00:55:31,031 INFO
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>> > Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING to
>> > FAILED
>> > 2013-08-06 00:55:31,034 INFO
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>> > Container container_1375716148174_0004_01_000002 transitioned from
>> > LOCALIZING to LOCALIZATION_FAILED
>> > 2013-08-06 00:55:31,035 INFO
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>> > Container container_1375716148174_0004_01_000002 sent RELEASE event on a
>> > resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
>> > present in cache.
>> > 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
>> > waiting to send rpc request to server
>> > java.lang.InterruptedException
>> >         at
>> >
>> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>> >         at
>> > java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>> >         at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>> >         at
>> > org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>> >         at
>> >
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>> >         at $Proxy22.heartbeat(Unknown Source)
>> >         at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>> >         at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>> >         at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>> >         at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>> >         at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>> >
>> >
>> >
>> > And here is my code snippet:
>> >
>> >       ContainerLaunchContext ctx =
>> > Records.newRecord(ContainerLaunchContext.class);
>> >
>> >       ctx.setEnvironment(oshEnv);
>> >
>> >       // Set the local resources
>> >       Map<String, LocalResource> localResources = new HashMap<String,
>> > LocalResource>();
>> >
>> >       LocalResource shellRsrc = Records.newRecord(LocalResource.class);
>> >       shellRsrc.setType(LocalResourceType.FILE);
>> >       shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>> >       String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>> >       try {
>> >         shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>> > URI(shellScriptPath)));
>> >       } catch (URISyntaxException e) {
>> >         LOG.error("Error when trying to use shell script path specified"
>> >             + " in env, path=" + shellScriptPath);
>> >         e.printStackTrace();
>> >       }
>> >
>> >       shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>> >       shellRsrc.setSize(0/*shellScriptPathLen*/);
>> >       String ExecShellStringPath = "ExecShellScript.sh";
>> >       localResources.put(ExecShellStringPath, shellRsrc);
>> >
>> >       ctx.setLocalResources(localResources);
>> >
>> >
>> > Please let me know if you need anything else.
>> >
>> > Thanks,
>> > Kishore
>> >
>> >
>> >
>> > On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
>> >>
>> >> The detail is insufficient to answer why. You should also have gotten
>> >> a trace after it, can you post that? If possible, also the relevant
>> >> snippets of code.
>> >>
>> >> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>> >> <wr...@gmail.com> wrote:
>> >> > Hi Harsh,
>> >> >  Thanks for the quick and detailed reply; it really helps. I am
>> >> > trying to use it and am getting this error in the NodeManager's log:
>> >> >
>> >> > 2013-08-05 08:57:28,867 ERROR
>> >> > org.apache.hadoop.security.UserGroupInformation:
>> >> > PriviledgedActionException
>> >> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>> >> > not
>> >> > exist: hdfs://isredeng/kishore/kk.ksh
>> >> >
>> >> >
>> >> > This file is there on the machine named "isredeng"; I could do ls
>> >> > on that file as below:
>> >> >
>> >> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>> >> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>> >> > native-hadoop
>> >> > library for your platform... using builtin-java classes where
>> applicable
>> >> > Found 1 items
>> >> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>> >> > kishore/kk.ksh
>> >> >
>> >> > Note: I am using a single-node cluster
>> >> >
>> >> > Thanks,
>> >> > Kishore
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
>> >> >>
>> >> >> The string for each LocalResource in the map can be anything that
>> >> >> serves as a common identifier name for your application. At execution
>> >> >> time, the passed resource filename will be aliased to the name you've
>> >> >> mapped it to, so that the application code need not track special
>> >> >> names. The behavior is very similar to how you can, in MR, define a
>> >> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>> >> >>
>> >> >> For an example, check out the DistributedShell app sources.
>> >> >>
>> >> >> In [1], you can see we take a user-provided file path to a shell
>> >> >> script. This can be named anything as it is user-supplied.
>> >> >> In [2], we define this as a local resource [2.1] and embed it with a
>> >> >> different name (the string you ask about) [2.2], as defined at [3] as
>> >> >> an application referenceable constant.
>> >> >> Note that in [4], we add to the Container arguments the aliased name
>> >> >> we mapped it to (i.e. [3]) and not the original filename we received
>> >> >> from the user. The resource is placed on the container with this name
>> >> >> instead, so that's what we choose to execute.
>> >> >>
>> >> >> [1] -
>> >> >>
>> >> >>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>> >> >>
>> >> >> [2] - [2.1]
>> >> >>
>> >> >>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>> >> >> and [2.2]
>> >> >>
>> >> >>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>> >> >>
>> >> >> [3] -
>> >> >>
>> >> >>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>> >> >>
>> >> >> [4] -
>> >> >>
>> >> >>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>> >> >>
>> >> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>> >> >> <wr...@gmail.com> wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> >   Can someone please tell me what is the use of calling
>> >> >> > setLocalResources()
>> >> >> > on ContainerLaunchContext?
>> >> >> >
>> >> >> >   And, also an example of how to use this will help...
>> >> >> >
>> >> >> >  I couldn't guess what is the String in the map that is passed to
>> >> >> > setLocalResources() like below:
>> >> >> >
>> >> >> >       // Set the local resources
>> >> >> >       Map<String, LocalResource> localResources = new
>> HashMap<String,
>> >> >> > LocalResource>();
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Kishore
>> >> >> >
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Harsh J
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Harsh J
>> >
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>
>
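To make the aliasing described above concrete, here is a minimal sketch
(the alias "run.sh" and the launch command are made up):

      Map<String, LocalResource> localResources =
          new HashMap<String, LocalResource>();
      // the localized copy appears in the container's working directory
      // under the map key, regardless of the source file's name
      localResources.put("run.sh", shellRsrc);
      ctx.setLocalResources(localResources);

      // the launch command refers to the alias, not the original path
      ctx.setCommands(Collections.singletonList("/bin/sh run.sh"));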

Re: setLocalResources() on ContainerLaunchContext

Posted by Harsh J <ha...@cloudera.com>.
Can you try passing a fully qualified local path? That is, including the
file:/ scheme.
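
For example, something like this should produce a parseable URI (the local
path is taken from your log, purely illustrative):

      // fully qualified local path, including the file:/ scheme
      String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
      shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(
          new URI(shellScriptPath)));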
On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <wr...@gmail.com>
wrote:

> Hi Harsh,
>    The setResource() call on LocalResource expects an argument of type
> org.apache.hadoop.yarn.api.records.URL, which is converted from a string
> in URI form. This happens in the following call from the Distributed
> Shell example:
>
> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
> shellScriptPath)));
>
> So, when I give a local file I get the parsing error below, which is why I
> changed it to an HDFS file, thinking it had to be given that way. Could
> you please give an example of how to use it with a local file, as you are
> suggesting?
>
> 2013-08-06 06:23:12,942 WARN
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> Failed to parse resource-request
> java.net.URISyntaxException: Expected scheme name at index 0:
> :///home_/dsadm/kishore/kk.ksh
>         at java.net.URI$Parser.fail(URI.java:2820)
>         at java.net.URI$Parser.failExpecting(URI.java:2826)
>         at java.net.URI$Parser.parse(URI.java:3015)
>         at java.net.URI.<init>(URI.java:747)
>         at
> org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>         at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>
>
>
> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> To be honest, I've never tried loading an HDFS file onto the
>> LocalResource this way. I usually just pass a local file and that
>> works just fine. There may be something in the URI transformation
>> possibly breaking an HDFS source, but try passing a local file - does
>> that fail too? The Shell example uses a local file.
>>
>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>> <wr...@gmail.com> wrote:
>> > Hi Harsh,
>> >
>> >   Please see if this is useful; I got a stack trace after the error
>> > occurred:
>> >
>> > 2013-08-06 00:55:30,559 INFO
>> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD
>> set
>> > to
>> /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>> > =
>> >
>> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>> > 2013-08-06 00:55:31,017 ERROR
>> > org.apache.hadoop.security.UserGroupInformation:
>> PriviledgedActionException
>> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>> not
>> > exist: hdfs://isredeng/kishore/kk.ksh
>> > 2013-08-06 00:55:31,029 INFO
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>> > DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File
>> does
>> > not exist: hdfs://isredeng/kishore/kk.ksh
>> > 2013-08-06 00:55:31,031 INFO
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>> > Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING to
>> > FAILED
>> > 2013-08-06 00:55:31,034 INFO
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>> > Container container_1375716148174_0004_01_000002 transitioned from
>> > LOCALIZING to LOCALIZATION_FAILED
>> > 2013-08-06 00:55:31,035 INFO
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>> > Container container_1375716148174_0004_01_000002 sent RELEASE event on a
>> > resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
>> > present in cache.
>> > 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
>> > waiting to send rpc request to server
>> > java.lang.InterruptedException
>> >         at
>> >
>> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>> >         at
>> > java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>> >         at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>> >         at
>> > org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>> >         at
>> >
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>> >         at $Proxy22.heartbeat(Unknown Source)
>> >         at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>> >         at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>> >         at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>> >         at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>> >         at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>> >
>> >
>> >
>> > And here is my code snippet:
>> >
>> >       ContainerLaunchContext ctx =
>> > Records.newRecord(ContainerLaunchContext.class);
>> >
>> >       ctx.setEnvironment(oshEnv);
>> >
>> >       // Set the local resources
>> >       Map<String, LocalResource> localResources = new HashMap<String,
>> > LocalResource>();
>> >
>> >       LocalResource shellRsrc = Records.newRecord(LocalResource.class);
>> >       shellRsrc.setType(LocalResourceType.FILE);
>> >       shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>> >       String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>> >       try {
>> >         shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>> > URI(shellScriptPath)));
>> >       } catch (URISyntaxException e) {
>> >         LOG.error("Error when trying to use shell script path specified"
>> >             + " in env, path=" + shellScriptPath);
>> >         e.printStackTrace();
>> >       }
>> >
>> >       shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>> >       shellRsrc.setSize(0/*shellScriptPathLen*/);
>> >       String ExecShellStringPath = "ExecShellScript.sh";
>> >       localResources.put(ExecShellStringPath, shellRsrc);
>> >
>> >       ctx.setLocalResources(localResources);
>> >
>> >
>> > Please let me know if you need anything else.
>> >
>> > Thanks,
>> > Kishore
>> >
>> >
>> >
>> > On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
>> >>
>> >> The detail is insufficient to answer why. You should also have gotten
>> >> a trace after it, can you post that? If possible, also the relevant
>> >> snippets of code.
>> >>
>> >> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>> >> <wr...@gmail.com> wrote:
>> >> > Hi Harsh,
>> >> >  Thanks for the quick and detailed reply; it really helps. I am
>> >> > trying to use it and am getting this error in the NodeManager's log:
>> >> >
>> >> > 2013-08-05 08:57:28,867 ERROR
>> >> > org.apache.hadoop.security.UserGroupInformation:
>> >> > PriviledgedActionException
>> >> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>> >> > not
>> >> > exist: hdfs://isredeng/kishore/kk.ksh
>> >> >
>> >> >
>> >> > This file is there on the machine named "isredeng"; I could do ls
>> >> > on that file as below:
>> >> >
>> >> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>> >> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>> >> > native-hadoop
>> >> > library for your platform... using builtin-java classes where
>> applicable
>> >> > Found 1 items
>> >> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>> >> > kishore/kk.ksh
>> >> >
>> >> > Note: I am using a single-node cluster
>> >> >
>> >> > Thanks,
>> >> > Kishore
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
>> >> >>
>> >> >> The string for each LocalResource in the map can be anything that
>> >> >> serves as a common identifier name for your application. At execution
>> >> >> time, the passed resource filename will be aliased to the name you've
>> >> >> mapped it to, so that the application code need not track special
>> >> >> names. The behavior is very similar to how you can, in MR, define a
>> >> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>> >> >>
>> >> >> For an example, check out the DistributedShell app sources.
>> >> >>
>> >> >> In [1], you can see we take a user-provided file path to a shell
>> >> >> script. This can be named anything as it is user-supplied.
>> >> >> In [2], we define this as a local resource [2.1] and embed it with a
>> >> >> different name (the string you ask about) [2.2], as defined at [3] as
>> >> >> an application referenceable constant.
>> >> >> Note that in [4], we add to the Container arguments the aliased name
>> >> >> we mapped it to (i.e. [3]) and not the original filename we received
>> >> >> from the user. The resource is placed on the container with this name
>> >> >> instead, so that's what we choose to execute.
>> >> >>
>> >> >> [1] -
>> >> >>
>> >> >>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>> >> >>
>> >> >> [2] - [2.1]
>> >> >>
>> >> >>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>> >> >> and [2.2]
>> >> >>
>> >> >>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>> >> >>
>> >> >> [3] -
>> >> >>
>> >> >>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>> >> >>
>> >> >> [4] -
>> >> >>
>> >> >>
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>> >> >>
>> >> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>> >> >> <wr...@gmail.com> wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> >   Can someone please tell me what is the use of calling
>> >> >> > setLocalResources()
>> >> >> > on ContainerLaunchContext?
>> >> >> >
>> >> >> >   And, also an example of how to use this will help...
>> >> >> >
>> >> >> >  I couldn't guess what is the String in the map that is passed to
>> >> >> > setLocalResources() like below:
>> >> >> >
>> >> >> >       // Set the local resources
>> >> >> >       Map<String, LocalResource> localResources = new
>> HashMap<String,
>> >> >> > LocalResource>();
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Kishore
>> >> >> >
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Harsh J
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Harsh J
>> >
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>
>

Re: setLocalResources() on ContainerLaunchContext

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
Hi Harsh,
   The setResource() call on LocalResource expects an argument of type
org.apache.hadoop.yarn.api.records.URL, which is converted from a string in
URI form. This happens in the following call from the Distributed Shell
example:

shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
shellScriptPath)));

So, when I give a local file I get the parsing error below, which is why I
changed it to an HDFS file, thinking it had to be given that way. Could you
please give an example of how to use it with a local file, as you are
suggesting?

2013-08-06 06:23:12,942 WARN
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Failed to parse resource-request
java.net.URISyntaxException: Expected scheme name at index 0:
:///home_/dsadm/kishore/kk.ksh
        at java.net.URI$Parser.fail(URI.java:2820)
        at java.net.URI$Parser.failExpecting(URI.java:2826)
        at java.net.URI$Parser.parse(URI.java:3015)
        at java.net.URI.<init>(URI.java:747)
        at
org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
        at
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
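
For comparison, the Distributed Shell client itself appears to upload the
script to HDFS first and then fill in the resource size and timestamp from
the uploaded file's status; a rough sketch of that pattern (the destination
path is made up):

      // FileSystem, FileStatus, Path are from org.apache.hadoop.fs
      FileSystem fs = FileSystem.get(conf);
      Path dst = new Path(fs.getHomeDirectory(), "kk.ksh");
      fs.copyFromLocalFile(new Path("/home_/dsadm/kishore/kk.ksh"), dst);
      FileStatus status = fs.getFileStatus(dst);
      // dst is fully qualified (hdfs://...); size and timestamp match the file
      shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(dst.toUri()));
      shellRsrc.setTimestamp(status.getModificationTime());
      shellRsrc.setSize(status.getLen());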



On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:

> To be honest, I've never tried loading an HDFS file onto the
> LocalResource this way. I usually just pass a local file and that
> works just fine. There may be something in the URI transformation
> possibly breaking an HDFS source, but try passing a local file - does
> that fail too? The Shell example uses a local file.
>
> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
> <wr...@gmail.com> wrote:
> > Hi Harsh,
> >
> >   Please see if this is useful; I got a stack trace after the error
> > occurred:
> >
> > 2013-08-06 00:55:30,559 INFO
> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD
> set
> > to
> /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
> > =
> >
> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
> > 2013-08-06 00:55:31,017 ERROR
> > org.apache.hadoop.security.UserGroupInformation:
> PriviledgedActionException
> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not
> > exist: hdfs://isredeng/kishore/kk.ksh
> > 2013-08-06 00:55:31,029 INFO
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
> > DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File
> does
> > not exist: hdfs://isredeng/kishore/kk.ksh
> > 2013-08-06 00:55:31,031 INFO
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> > Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING to
> > FAILED
> > 2013-08-06 00:55:31,034 INFO
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> > Container container_1375716148174_0004_01_000002 transitioned from
> > LOCALIZING to LOCALIZATION_FAILED
> > 2013-08-06 00:55:31,035 INFO
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
> > Container container_1375716148174_0004_01_000002 sent RELEASE event on a
> > resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
> > present in cache.
> > 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
> > waiting to send rpc request to server
> > java.lang.InterruptedException
> >         at
> >
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
> >         at
> > java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
> >         at java.util.concurrent.FutureTask.get(FutureTask.java:94)
> >         at
> > org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
> >         at org.apache.hadoop.ipc.Client.call(Client.java:1285)
> >         at org.apache.hadoop.ipc.Client.call(Client.java:1264)
> >         at
> >
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> >         at $Proxy22.heartbeat(Unknown Source)
> >         at
> >
> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
> >         at
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
> >         at
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
> >         at
> >
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
> >         at
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
> >
> >
> >
> > And here is my code snippet:
> >
> >       ContainerLaunchContext ctx =
> > Records.newRecord(ContainerLaunchContext.class);
> >
> >       ctx.setEnvironment(oshEnv);
> >
> >       // Set the local resources
> >       Map<String, LocalResource> localResources = new HashMap<String,
> > LocalResource>();
> >
> >       LocalResource shellRsrc = Records.newRecord(LocalResource.class);
> >       shellRsrc.setType(LocalResourceType.FILE);
> >       shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
> >       String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
> >       try {
> >         shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
> > URI(shellScriptPath)));
> >       } catch (URISyntaxException e) {
> >         LOG.error("Error when trying to use shell script path specified"
> >             + " in env, path=" + shellScriptPath);
> >         e.printStackTrace();
> >       }
> >
> >       shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
> >       shellRsrc.setSize(0/*shellScriptPathLen*/);
> >       String ExecShellStringPath = "ExecShellScript.sh";
> >       localResources.put(ExecShellStringPath, shellRsrc);
> >
> >       ctx.setLocalResources(localResources);
> >
> >
> > Please let me know if you need anything else.
> >
> > Thanks,
> > Kishore
> >
> >
> >
> > On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
> >>
> >> The detail is insufficient to answer why. You should also have gotten
> >> a trace after it, can you post that? If possible, also the relevant
> >> snippets of code.
> >>
> >> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
> >> <wr...@gmail.com> wrote:
> >> > Hi Harsh,
> >> >  Thanks for the quick and detailed reply; it really helps. I am trying
> >> > to use it and am getting this error in the NodeManager's log:
> >> >
> >> > 2013-08-05 08:57:28,867 ERROR
> >> > org.apache.hadoop.security.UserGroupInformation:
> >> > PriviledgedActionException
> >> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
> >> > not
> >> > exist: hdfs://isredeng/kishore/kk.ksh
> >> >
> >> >
> >> > This file is there on the machine named "isredeng"; I could do ls
> >> > on that file as below:
> >> >
> >> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
> >> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
> >> > native-hadoop
> >> > library for your platform... using builtin-java classes where
> applicable
> >> > Found 1 items
> >> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
> >> > kishore/kk.ksh
> >> >
> >> > Note: I am using a single-node cluster
> >> >
> >> > Thanks,
> >> > Kishore
> >> >
> >> >
> >> >
> >> >
> >> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
> >> >>
> >> >> The string for each LocalResource in the map can be anything that
> >> >> serves as a common identifier name for your application. At execution
> >> >> time, the passed resource filename will be aliased to the name you've
> >> >> mapped it to, so that the application code need not track special
> >> >> names. The behavior is very similar to how you can, in MR, define a
> >> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
> >> >>
> >> >> For an example, check out the DistributedShell app sources.
> >> >>
> >> >> In [1], you can see we take a user-provided file path to a shell
> >> >> script. This can be named anything as it is user-supplied.
> >> >> In [2], we define this as a local resource [2.1] and embed it with a
> >> >> different name (the string you ask about) [2.2], as defined at [3] as
> >> >> an application referenceable constant.
> >> >> Note that in [4], we add to the Container arguments the aliased name
> >> >> we mapped it to (i.e. [3]) and not the original filename we received
> >> >> from the user. The resource is placed on the container with this name
> >> >> instead, so that's what we choose to execute.
> >> >>
> >> >> [1] -
> >> >>
> >> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
> >> >>
> >> >> [2] - [2.1]
> >> >>
> >> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
> >> >> and [2.2]
> >> >>
> >> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
> >> >>
> >> >> [3] -
> >> >>
> >> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
> >> >>
> >> >> [4] -
> >> >>
> >> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
> >> >>
> >> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
> >> >> <wr...@gmail.com> wrote:
> >> >> > Hi,
> >> >> >
> >> >> >   Can someone please tell me what is the use of calling
> >> >> > setLocalResources()
> >> >> > on ContainerLaunchContext?
> >> >> >
> >> >> >   And, also an example of how to use this will help...
> >> >> >
> >> >> >  I couldn't guess what is the String in the map that is passed to
> >> >> > setLocalResources() like below:
> >> >> >
> >> >> >       // Set the local resources
> >> >> >       Map<String, LocalResource> localResources = new
> HashMap<String,
> >> >> > LocalResource>();
> >> >> >
> >> >> > Thanks,
> >> >> > Kishore
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Harsh J
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
> >
>
>
>
> --
> Harsh J
>

> >> > PriviledgedActionException
> >> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
> >> > not
> >> > exist: hdfs://isredeng/kishore/kk.ksh
> >> >
> >> >
> >> > This file is there on the machine with name "isredeng", I could do ls
> >> > for
> >> > that file as below:
> >> >
> >> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
> >> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
> >> > native-hadoop
> >> > library for your platform... using builtin-java classes where
> applicable
> >> > Found 1 items
> >> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
> >> > kishore/kk.ksh
> >> >
> >> > Note: I am using a single node cluster
> >> >
> >> > Thanks,
> >> > Kishore
> >> >
> >> >
> >> >
> >> >
> >> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
> >> >>
> >> >> The string for each LocalResource in the map can be anything that
> >> >> serves as a common identifier name for your application. At execution
> >> >> time, the passed resource filename will be aliased to the name you've
> >> >> mapped it to, so that the application code need not track special
> >> >> names. The behavior is very similar to how you can, in MR, define a
> >> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
> >> >>
> >> >> For an example, check out the DistributedShell app sources.
> >> >>
> >> >> Over [1], you can see we take a user provided file path to a shell
> >> >> script. This can be named anything as it is user-supplied.
> >> >> Onto [2], we define this as a local resource [2.1] and embed it with
> a
> >> >> different name (the string you ask about) [2.2], as defined at [3] as
> >> >> an application reference-able constant.
> >> >> Note that in [4], we add to the Container arguments the aliased name
> >> >> we mapped it to (i.e. [3]) and not the original filename we received
> >> >> from the user. The resource is placed on the container with this name
> >> >> instead, so that's what we choose to execute.
> >> >>
> >> >> [1] -
> >> >>
> >> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
> >> >>
> >> >> [2] - [2.1]
> >> >>
> >> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
> >> >> and [2.2]
> >> >>
> >> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
> >> >>
> >> >> [3] -
> >> >>
> >> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
> >> >>
> >> >> [4] -
> >> >>
> >> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
> >> >>
> >> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
> >> >> <wr...@gmail.com> wrote:
> >> >> > Hi,
> >> >> >
> >> >> >   Can someone please tell me what is the use of calling
> >> >> > setLocalResources()
> >> >> > on ContainerLaunchContext?
> >> >> >
> >> >> >   And, also an example of how to use this will help...
> >> >> >
> >> >> >  I couldn't guess what is the String in the map that is passed to
> >> >> > setLocalResources() like below:
> >> >> >
> >> >> >       // Set the local resources
> >> >> >       Map<String, LocalResource> localResources = new
> HashMap<String,
> >> >> > LocalResource>();
> >> >> >
> >> >> > Thanks,
> >> >> > Kishore
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Harsh J
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
> >
>
>
>
> --
> Harsh J
>

Re: setLocalResources() on ContainerLaunchContext

Posted by Harsh J <ha...@cloudera.com>.
To be honest, I've never tried loading an HDFS file into the
LocalResource this way. I usually just pass a local file and that
works just fine. There may be something in the URI transformation
breaking an HDFS source, but try passing a local file - does
that fail too? The Shell example uses a local file.
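
Tying that to the aliasing described in the quoted explanation below, a
tiny sketch (the names are illustrative, not from the thread): the map
key becomes the file name in the container's working directory, so the
launch command refers to the key rather than the original path.

      // "ExecShellScript.sh" is the alias the container will see
      localResources.put("ExecShellScript.sh", shellRsrc);
      ctx.setLocalResources(localResources);
      // The command uses the alias, not the path the resource came from
      ctx.setCommands(Collections.singletonList(
          "/bin/sh ./ExecShellScript.sh 1>stdout 2>stderr"));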

On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
<wr...@gmail.com> wrote:
> Hi Harsh,
>
>   Please see if this is useful, I got a stack trace after the error has
> occurred....
>
> 2013-08-06 00:55:30,559 INFO
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
> to /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
> =
> file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
> 2013-08-06 00:55:31,017 ERROR
> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not
> exist: hdfs://isredeng/kishore/kk.ksh
> 2013-08-06 00:55:31,029 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
> DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File does
> not exist: hdfs://isredeng/kishore/kk.ksh
> 2013-08-06 00:55:31,031 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING to
> FAILED
> 2013-08-06 00:55:31,034 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> Container container_1375716148174_0004_01_000002 transitioned from
> LOCALIZING to LOCALIZATION_FAILED
> 2013-08-06 00:55:31,035 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
> Container container_1375716148174_0004_01_000002 sent RELEASE event on a
> resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
> present in cache.
> 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
> waiting to send rpc request to server
> java.lang.InterruptedException
>         at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>         at
> java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>         at
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at $Proxy22.heartbeat(Unknown Source)
>         at
> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>         at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>         at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>         at
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>         at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>
>
>
> And here is my code snippet:
>
>       ContainerLaunchContext ctx =
> Records.newRecord(ContainerLaunchContext.class);
>
>       ctx.setEnvironment(oshEnv);
>
>       // Set the local resources
>       Map<String, LocalResource> localResources = new HashMap<String,
> LocalResource>();
>
>       LocalResource shellRsrc = Records.newRecord(LocalResource.class);
>       shellRsrc.setType(LocalResourceType.FILE);
>       shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>       String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>       try {
>         shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
> URI(shellScriptPath)));
>       } catch (URISyntaxException e) {
>         LOG.error("Error when trying to use shell script path specified"
>             + " in env, path=" + shellScriptPath);
>         e.printStackTrace();
>       }
>
>       shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>       shellRsrc.setSize(0/*shellScriptPathLen*/);
>       String ExecShellStringPath = "ExecShellScript.sh";
>       localResources.put(ExecShellStringPath, shellRsrc);
>
>       ctx.setLocalResources(localResources);
>
>
> Please let me know if you need anything else.
>
> Thanks,
> Kishore
>
>
>
> On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
>>
>> The detail is insufficient to answer why. You should also have gotten
>> a trace after it, can you post that? If possible, also the relevant
>> snippets of code.
>>
>> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>> <wr...@gmail.com> wrote:
>> > Hi Harsh,
>> >  Thanks for the quick and detailed reply, it really helps. I am trying
>> > to
>> > use it and getting this error in node manager's log:
>> >
>> > 2013-08-05 08:57:28,867 ERROR
>> > org.apache.hadoop.security.UserGroupInformation:
>> > PriviledgedActionException
>> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>> > not
>> > exist: hdfs://isredeng/kishore/kk.ksh
>> >
>> >
>> > This file is there on the machine with name "isredeng", I could do ls
>> > for
>> > that file as below:
>> >
>> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>> > native-hadoop
>> > library for your platform... using builtin-java classes where applicable
>> > Found 1 items
>> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>> > kishore/kk.ksh
>> >
>> > Note: I am using a single node cluster
>> >
>> > Thanks,
>> > Kishore
>> >
>> >
>> >
>> >
>> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
>> >>
>> >> The string for each LocalResource in the map can be anything that
>> >> serves as a common identifier name for your application. At execution
>> >> time, the passed resource filename will be aliased to the name you've
>> >> mapped it to, so that the application code need not track special
>> >> names. The behavior is very similar to how you can, in MR, define a
>> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>> >>
>> >> For an example, check out the DistributedShell app sources.
>> >>
>> >> Over [1], you can see we take a user provided file path to a shell
>> >> script. This can be named anything as it is user-supplied.
>> >> Onto [2], we define this as a local resource [2.1] and embed it with a
>> >> different name (the string you ask about) [2.2], as defined at [3] as
>> >> an application reference-able constant.
>> >> Note that in [4], we add to the Container arguments the aliased name
>> >> we mapped it to (i.e. [3]) and not the original filename we received
>> >> from the user. The resource is placed on the container with this name
>> >> instead, so that's what we choose to execute.
>> >>
>> >> [1] -
>> >>
>> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>> >>
>> >> [2] - [2.1]
>> >>
>> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>> >> and [2.2]
>> >>
>> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>> >>
>> >> [3] -
>> >>
>> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>> >>
>> >> [4] -
>> >>
>> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>> >>
>> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>> >> <wr...@gmail.com> wrote:
>> >> > Hi,
>> >> >
>> >> >   Can someone please tell me what is the use of calling
>> >> > setLocalResources()
>> >> > on ContainerLaunchContext?
>> >> >
>> >> >   And, also an example of how to use this will help...
>> >> >
>> >> >  I couldn't guess what is the String in the map that is passed to
>> >> > setLocalResources() like below:
>> >> >
>> >> >       // Set the local resources
>> >> >       Map<String, LocalResource> localResources = new HashMap<String,
>> >> > LocalResource>();
>> >> >
>> >> > Thanks,
>> >> > Kishore
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Harsh J
>> >
>> >
>>
>>
>>
>> --
>> Harsh J
>
>



-- 
Harsh J

Re: setLocalResources() on ContainerLaunchContext

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
Hi Harsh,

  Please see if this is useful; this is the stack trace I got after the
error occurred:

2013-08-06 00:55:30,559 INFO
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
to
/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004 =
file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
2013-08-06 00:55:31,017 ERROR
org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not
exist: hdfs://isredeng/kishore/kk.ksh
2013-08-06 00:55:31,029 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File does
not exist: hdfs://isredeng/kishore/kk.ksh
2013-08-06 00:55:31,031 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING to
FAILED
2013-08-06 00:55:31,034 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1375716148174_0004_01_000002 transitioned from
LOCALIZING to LOCALIZATION_FAILED
2013-08-06 00:55:31,035 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
Container container_1375716148174_0004_01_000002 sent RELEASE event on a
resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
present in cache.
2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
waiting to send rpc request to server
java.lang.InterruptedException
        at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
        at
java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
        at java.util.concurrent.FutureTask.get(FutureTask.java:94)
        at
org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
        at org.apache.hadoop.ipc.Client.call(Client.java:1285)
        at org.apache.hadoop.ipc.Client.call(Client.java:1264)
        at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
        at $Proxy22.heartbeat(Unknown Source)
        at
org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
        at
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
        at
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
        at
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
        at
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)



And here is my code snippet:

      ContainerLaunchContext ctx =
Records.newRecord(ContainerLaunchContext.class);

      ctx.setEnvironment(oshEnv);

      // Set the local resources
      Map<String, LocalResource> localResources = new HashMap<String,
LocalResource>();

      LocalResource shellRsrc = Records.newRecord(LocalResource.class);
      shellRsrc.setType(LocalResourceType.FILE);
      shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
      String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
      try {
        shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
URI(shellScriptPath)));
      } catch (URISyntaxException e) {
        LOG.error("Error when trying to use shell script path specified"
            + " in env, path=" + shellScriptPath);
        e.printStackTrace();
      }

      shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
      shellRsrc.setSize(0/*shellScriptPathLen*/);
      String ExecShellStringPath = "ExecShellScript.sh";
      localResources.put(ExecShellStringPath, shellRsrc);

      ctx.setLocalResources(localResources);
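
For reference, a minimal corrected sketch (pieced together from the rest
of this thread, not code from it): fully qualify the path against the
cluster FileSystem and set the real size and timestamp, which the
NodeManager checks during localization. It assumes a Configuration named
conf pointing at the cluster and the usual org.apache.hadoop.fs imports.

      FileSystem fs = FileSystem.get(conf);
      // Qualifies /kishore/kk.ksh against the default FS, e.g.
      // hdfs://isredeng:8020/kishore/kk.ksh
      Path scriptPath = fs.makeQualified(new Path("/kishore/kk.ksh"));
      FileStatus scriptStatus = fs.getFileStatus(scriptPath);
      shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(
          scriptPath.toUri()));
      // Real length and modification time, instead of the 0s above
      shellRsrc.setSize(scriptStatus.getLen());
      shellRsrc.setTimestamp(scriptStatus.getModificationTime());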


Please let me know if you need anything else.

Thanks,
Kishore



On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:

> The detail is insufficient to answer why. You should also have gotten
> a trace after it, can you post that? If possible, also the relevant
> snippets of code.
>
> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
> <wr...@gmail.com> wrote:
> > Hi Harsh,
> >  Thanks for the quick and detailed reply, it really helps. I am trying to
> > use it and getting this error in node manager's log:
> >
> > 2013-08-05 08:57:28,867 ERROR
> > org.apache.hadoop.security.UserGroupInformation:
> PriviledgedActionException
> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not
> > exist: hdfs://isredeng/kishore/kk.ksh
> >
> >
> > This file is there on the machine with name "isredeng", I could do ls for
> > that file as below:
> >
> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
> native-hadoop
> > library for your platform... using builtin-java classes where applicable
> > Found 1 items
> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
> kishore/kk.ksh
> >
> > Note: I am using a single node cluster
> >
> > Thanks,
> > Kishore
> >
> >
> >
> >
> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
> >>
> >> The string for each LocalResource in the map can be anything that
> >> serves as a common identifier name for your application. At execution
> >> time, the passed resource filename will be aliased to the name you've
> >> mapped it to, so that the application code need not track special
> >> names. The behavior is very similar to how you can, in MR, define a
> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
> >>
> >> For an example, check out the DistributedShell app sources.
> >>
> >> Over [1], you can see we take a user provided file path to a shell
> >> script. This can be named anything as it is user-supplied.
> >> Onto [2], we define this as a local resource [2.1] and embed it with a
> >> different name (the string you ask about) [2.2], as defined at [3] as
> >> an application reference-able constant.
> >> Note that in [4], we add to the Container arguments the aliased name
> >> we mapped it to (i.e. [3]) and not the original filename we received
> >> from the user. The resource is placed on the container with this name
> >> instead, so that's what we choose to execute.
> >>
> >> [1] -
> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
> >>
> >> [2] - [2.1]
> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
> >> and [2.2]
> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
> >>
> >> [3] -
> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
> >>
> >> [4] -
> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
> >>
> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
> >> <wr...@gmail.com> wrote:
> >> > Hi,
> >> >
> >> >   Can someone please tell me what is the use of calling
> >> > setLocalResources()
> >> > on ContainerLaunchContext?
> >> >
> >> >   And, also an example of how to use this will help...
> >> >
> >> >  I couldn't guess what is the String in the map that is passed to
> >> > setLocalResources() like below:
> >> >
> >> >       // Set the local resources
> >> >       Map<String, LocalResource> localResources = new HashMap<String,
> >> > LocalResource>();
> >> >
> >> > Thanks,
> >> > Kishore
> >> >
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
> >
>
>
>
> --
> Harsh J
>

Re: setLocalResources() on ContainerLaunchContext

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
Hi Harsh,

  Please see if this is useful, I got a stack trace after the error has
occurred....

2013-08-06 00:55:30,559 INFO
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
to
/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004 =
file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
2013-08-06 00:55:31,017 ERROR
org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not
exist: hdfs://isredeng/kishore/kk.ksh
2013-08-06 00:55:31,029 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File does
not exist: hdfs://isredeng/kishore/kk.ksh
2013-08-06 00:55:31,031 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING to
FAILED
2013-08-06 00:55:31,034 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1375716148174_0004_01_000002 transitioned from
LOCALIZING to LOCALIZATION_FAILED
2013-08-06 00:55:31,035 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
Container container_1375716148174_0004_01_000002 sent RELEASE event on a
resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
present in cache.
2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
waiting to send rpc request to server
java.lang.InterruptedException
        at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
        at
java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
        at java.util.concurrent.FutureTask.get(FutureTask.java:94)
        at
org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
        at org.apache.hadoop.ipc.Client.call(Client.java:1285)
        at org.apache.hadoop.ipc.Client.call(Client.java:1264)
        at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
        at $Proxy22.heartbeat(Unknown Source)
        at
org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
        at
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
        at
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
        at
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
        at
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)



And here is my code snippet:

      ContainerLaunchContext ctx = Records.newRecord(ContainerLaunchContext.class);

      ctx.setEnvironment(oshEnv);

      // Set the local resources
      Map<String, LocalResource> localResources = new HashMap<String, LocalResource>();

      LocalResource shellRsrc = Records.newRecord(LocalResource.class);
      shellRsrc.setType(LocalResourceType.FILE);
      shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
      String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
      try {
        shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(shellScriptPath)));
      } catch (URISyntaxException e) {
        LOG.error("Error when trying to use shell script path specified"
            + " in env, path=" + shellScriptPath);
        e.printStackTrace();
      }

      shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
      shellRsrc.setSize(0/*shellScriptPathLen*/);
      String ExecShellStringPath = "ExecShellScript.sh";
      localResources.put(ExecShellStringPath, shellRsrc);

      ctx.setLocalResources(localResources);


Please let me know if you need anything else.
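
Incidentally, I suspect the timestamp and size are not meant to stay 0 but
should match the file's actual metadata on HDFS, the way the
DistributedShell example fills them in. A minimal sketch of that (untested;
assumes the script's URI resolves through the usual org.apache.hadoop.fs
classes):

      // look up the real modification time and length of the script on HDFS
      Path scriptPath = new Path(shellScriptPath);
      FileSystem fs = scriptPath.getFileSystem(new Configuration());
      FileStatus status = fs.getFileStatus(scriptPath);
      shellRsrc.setTimestamp(status.getModificationTime());
      shellRsrc.setSize(status.getLen());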

Thanks,
Kishore



On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:

> The detail is insufficient to answer why. You should also have gotten
> a trace after it, can you post that? If possible, also the relevant
> snippets of code.
>
> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
> <wr...@gmail.com> wrote:
> > Hi Harsh,
> >  Thanks for the quick and detailed reply, it really helps. I am trying to
> > use it and getting this error in node manager's log:
> >
> > 2013-08-05 08:57:28,867 ERROR
> > org.apache.hadoop.security.UserGroupInformation:
> PriviledgedActionException
> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not
> > exist: hdfs://isredeng/kishore/kk.ksh
> >
> >
> > This file is there on the machine with name "isredeng", I could do ls for
> > that file as below:
> >
> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
> native-hadoop
> > library for your platform... using builtin-java classes where applicable
> > Found 1 items
> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
> kishore/kk.ksh
> >
> > Note: I am using a single node cluster
> >
> > Thanks,
> > Kishore
> >
> >
> >
> >
> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
> >>
> >> The string for each LocalResource in the map can be anything that
> >> serves as a common identifier name for your application. At execution
> >> time, the passed resource filename will be aliased to the name you've
> >> mapped it to, so that the application code need not track special
> >> names. The behavior is very similar to how you can, in MR, define a
> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
> >>
> >> For an example, checkout the DistributedShell app sources.
> >>
> >> Over [1], you can see we take a user provided file path to a shell
> >> script. This can be named anything as it is user-supplied.
> >> Onto [2], we define this as a local resource [2.1] and embed it with a
> >> different name (the string you ask about) [2.2], as defined at [3] as
> >> an application reference-able constant.
> >> Note that in [4], we add to the Container arguments the aliased name
> >> we mapped it to (i.e. [3]) and not the original filename we received
> >> from the user. The resource is placed on the container with this name
> >> instead, so thats what we choose to execute.
> >>
> >> [1] -
> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
> >>
> >> [2] - [2.1]
> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
> >> and [2.2]
> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
> >>
> >> [3] -
> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
> >>
> >> [4] -
> >>
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
> >>
> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
> >> <wr...@gmail.com> wrote:
> >> > Hi,
> >> >
> >> >   Can someone please tell me what is the use of calling
> >> > setLocalResources()
> >> > on ContainerLaunchContext?
> >> >
> >> >   And, also an example of how to use this will help...
> >> >
> >> >  I couldn't guess what is the String in the map that is passed to
> >> > setLocalResources() like below:
> >> >
> >> >       // Set the local resources
> >> >       Map<String, LocalResource> localResources = new HashMap<String,
> >> > LocalResource>();
> >> >
> >> > Thanks,
> >> > Kishore
> >> >
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
> >
>
>
>
> --
> Harsh J
>

Re: setLocalResources() on ContainerLaunchContext

Posted by Harsh J <ha...@cloudera.com>.
The detail is insufficient to answer why. You should also have gotten
a trace after it; can you post that? If possible, also share the relevant
snippets of code.

On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
<wr...@gmail.com> wrote:
> Hi Harsh,
>  Thanks for the quick and detailed reply, it really helps. I am trying to
> use it and getting this error in node manager's log:
>
> 2013-08-05 08:57:28,867 ERROR
> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not
> exist: hdfs://isredeng/kishore/kk.ksh
>
>
> This file is there on the machine with name "isredeng", I could do ls for
> that file as below:
>
> -bash-4.1$ hadoop fs -ls kishore/kk.ksh
> 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> Found 1 items
> -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48 kishore/kk.ksh
>
> Note: I am using a single node cluster
>
> Thanks,
> Kishore
>
>
>
>
> On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
>>
>> The string for each LocalResource in the map can be anything that
>> serves as a common identifier name for your application. At execution
>> time, the passed resource filename will be aliased to the name you've
>> mapped it to, so that the application code need not track special
>> names. The behavior is very similar to how you can, in MR, define a
>> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>>
>> For an example, checkout the DistributedShell app sources.
>>
>> Over [1], you can see we take a user provided file path to a shell
>> script. This can be named anything as it is user-supplied.
>> Onto [2], we define this as a local resource [2.1] and embed it with a
>> different name (the string you ask about) [2.2], as defined at [3] as
>> an application reference-able constant.
>> Note that in [4], we add to the Container arguments the aliased name
>> we mapped it to (i.e. [3]) and not the original filename we received
>> from the user. The resource is placed on the container with this name
>> instead, so thats what we choose to execute.
>>
>> [1] -
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>>
>> [2] - [2.1]
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>> and [2.2]
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>>
>> [3] -
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>>
>> [4] -
>> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>>
>> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>> <wr...@gmail.com> wrote:
>> > Hi,
>> >
>> >   Can someone please tell me what is the use of calling
>> > setLocalResources()
>> > on ContainerLaunchContext?
>> >
>> >   And, also an example of how to use this will help...
>> >
>> >  I couldn't guess what is the String in the map that is passed to
>> > setLocalResources() like below:
>> >
>> >       // Set the local resources
>> >       Map<String, LocalResource> localResources = new HashMap<String,
>> > LocalResource>();
>> >
>> > Thanks,
>> > Kishore
>> >
>>
>>
>>
>> --
>> Harsh J
>
>



-- 
Harsh J

Re: setLocalResources() on ContainerLaunchContext

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
Hi Harsh,
 Thanks for the quick and detailed reply; it really helps. I am trying to
use it and am getting this error in the node manager's log:

2013-08-05 08:57:28,867 ERROR
org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not
exist: hdfs://isredeng/kishore/kk.ksh


This file is there on the machine named "isredeng"; I could do an ls on
that file, as shown below:

-bash-4.1$ hadoop fs -ls kishore/kk.ksh
13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48 kishore/kk.ksh

Note: I am using a single node cluster
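
One thing I am not sure of: the ls above uses a relative path, which I
believe resolves under my HDFS home directory (/user/dsadm), whereas the
hdfs:// URI I pass resolves from the root. To rule that out I plan to
compare both absolute paths (guesses on my part):

-bash-4.1$ hadoop fs -ls /user/dsadm/kishore/kk.ksh
-bash-4.1$ hadoop fs -ls /kishore/kk.ksh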

Thanks,
Kishore




On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:

> The string for each LocalResource in the map can be anything that
> serves as a common identifier name for your application. At execution
> time, the passed resource filename will be aliased to the name you've
> mapped it to, so that the application code need not track special
> names. The behavior is very similar to how you can, in MR, define a
> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>
> For an example, checkout the DistributedShell app sources.
>
> Over [1], you can see we take a user provided file path to a shell
> script. This can be named anything as it is user-supplied.
> Onto [2], we define this as a local resource [2.1] and embed it with a
> different name (the string you ask about) [2.2], as defined at [3] as
> an application reference-able constant.
> Note that in [4], we add to the Container arguments the aliased name
> we mapped it to (i.e. [3]) and not the original filename we received
> from the user. The resource is placed on the container with this name
> instead, so thats what we choose to execute.
>
> [1] -
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>
> [2] - [2.1]
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
> and [2.2]
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>
> [3] -
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>
> [4] -
> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>
> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
> <wr...@gmail.com> wrote:
> > Hi,
> >
> >   Can someone please tell me what is the use of calling
> setLocalResources()
> > on ContainerLaunchContext?
> >
> >   And, also an example of how to use this will help...
> >
> >  I couldn't guess what is the String in the map that is passed to
> > setLocalResources() like below:
> >
> >       // Set the local resources
> >       Map<String, LocalResource> localResources = new HashMap<String,
> > LocalResource>();
> >
> > Thanks,
> > Kishore
> >
>
>
>
> --
> Harsh J
>

Re: setLocalResources() on ContainerLaunchContext

Posted by Harsh J <ha...@cloudera.com>.
The string for each LocalResource in the map can be anything that
serves as a common identifier name for your application. At execution
time, the passed resource filename will be aliased to the name you've
mapped it to, so that the application code need not track special
names. The behavior is very similar to how you can, in MR, define a
symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).

For an example, check out the DistributedShell app sources.

At [1], you can see we take a user-provided file path to a shell
script. This can be named anything, as it is user-supplied.
At [2], we define this as a local resource [2.1] and embed it under a
different name (the string you ask about) [2.2], which is defined at [3]
as an application-referenceable constant.
Note that in [4], we add to the Container arguments the aliased name
we mapped it to (i.e. [3]) and not the original filename we received
from the user. The resource is placed on the container with this name
instead, so that's what we choose to execute.

[1] - https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390

[2] - [2.1] https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
and [2.2] https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780

[3] - https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205

[4] - https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
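
To tie this together, here is a minimal sketch of building such a map
(assuming the Hadoop 2.x APIs; the alias "exec.sh" and the HDFS path
below are illustrative, not taken from the DistributedShell source):

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
    import org.apache.hadoop.yarn.api.records.LocalResource;
    import org.apache.hadoop.yarn.api.records.LocalResourceType;
    import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
    import org.apache.hadoop.yarn.util.ConverterUtils;
    import org.apache.hadoop.yarn.util.Records;

    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Illustrative HDFS path; point this at your uploaded script
    Path scriptPath = fs.makeQualified(new Path("/user/kishore/kk.ksh"));
    // Take size and timestamp from the HDFS file itself; a mismatch
    // with the actual file fails localization
    FileStatus status = fs.getFileStatus(scriptPath);

    LocalResource scriptRsrc = Records.newRecord(LocalResource.class);
    scriptRsrc.setResource(ConverterUtils.getYarnUrlFromPath(scriptPath));
    scriptRsrc.setSize(status.getLen());
    scriptRsrc.setTimestamp(status.getModificationTime());
    scriptRsrc.setType(LocalResourceType.FILE);
    scriptRsrc.setVisibility(LocalResourceVisibility.APPLICATION);

    // The map key is the alias the container sees the file under
    Map<String, LocalResource> localResources =
        new HashMap<String, LocalResource>();
    localResources.put("exec.sh", scriptRsrc);

    ContainerLaunchContext ctx =
        Records.newRecord(ContainerLaunchContext.class);
    ctx.setLocalResources(localResources);
    // Reference the alias, not the original path, in the command
    ctx.setCommands(Collections.singletonList("sh exec.sh"));

Since the size and timestamp come from a FileStatus on the HDFS copy
rather than from a local java.io.File, this also sidesteps the
timestamp-mismatch failures discussed earlier in the thread.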

On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
<wr...@gmail.com> wrote:
> Hi,
>
>   Can someone please tell me what is the use of calling setLocalResources()
> on ContainerLaunchContext?
>
>   And, also an example of how to use this will help...
>
>  I couldn't guess what is the String in the map that is passed to
> setLocalResources() like below:
>
>       // Set the local resources
>       Map<String, LocalResource> localResources = new HashMap<String,
> LocalResource>();
>
> Thanks,
> Kishore
>



-- 
Harsh J
