Posted to users@apex.apache.org by vikram patil <pa...@gmail.com> on 2017/03/17 07:49:06 UTC

Need suggestion about temporary file usage on container node

Hello All,

I am working on an operator that downloads a file from HDFS and stores
it on the local machine at run time. To copy the files from the client
machine to HDFS, I will use the LIB_JARS or FILES configuration. Once the
operator fails or shuts down, I would like to clean up these files
automatically, if possible.

Thanks & Regards,
Vikram

Re: Need suggestion about temporary file usage on container node

Posted by Thomas Weise <th...@apache.org>.
Vikram,

The same files should be localized for the worker container as well. This
looks like a bug; please create a JIRA.

Thomas

On Thu, Mar 23, 2017 at 2:03 AM, vikram patil <pa...@gmail.com> wrote:

> Thank you very much, Thomas, for the suggestion. It worked for jar files,
> but I still faced some issues with non-jar files.
> I have a *.tar.gz file available on the edge node and I am using the FILES
> variable to localize it on the container node.
>
> *Files transferred as:*
>
> Configuration configuration = this.getConf();
>
> configuration.set(StramAppLauncher.FILES_CONF_KEY_NAME,
>     APEX_RUNTIME_FILES_PATH + "/runtime/py4j-0.10.4.tar.gz");
>
> appLauncher = new StramAppLauncher("App-Name", getConf());
>
> After a few more experiments, it seems the .tar.gz file is only available
> in the directory created for the STRAM container; it is missing from the
> directory created for the container used for operator deployment.
>
> *Files localized for STRAM:*
> /tmp/hadoop-vikram/nm-local-dir/usercache/vikram/appcache/application_1490062699498_0013/container_1490062699498_0013_01_000001
>
> *Files localized for Operator:*
> ls: cannot access '/tmp/hadoop-vikram/nm-local-dir/usercache/vikram/appcache/application_1490062699498_0013/container_1490062699498_0013_01_000003/py4j-0.10.4.tar.gz': No such file or directory
>
> Thanks & Regards,
> Vikram

Re: Need suggestion about temporary file usage on container node

Posted by vikram patil <pa...@gmail.com>.
Thank you very much, Thomas, for the suggestion. It worked for jar files,
but I still faced some issues with non-jar files.
I have a *.tar.gz file available on the edge node and I am using the FILES
variable to localize it on the container node.

*Files transferred as:*

Configuration configuration = this.getConf();

configuration.set(StramAppLauncher.FILES_CONF_KEY_NAME,
    APEX_RUNTIME_FILES_PATH + "/runtime/py4j-0.10.4.tar.gz");

appLauncher = new StramAppLauncher("App-Name", getConf());

After a few more experiments, it seems the .tar.gz file is only available
in the directory created for the STRAM container; it is missing from the
directory created for the container used for operator deployment.

*Files localized for STRAM:*
/tmp/hadoop-vikram/nm-local-dir/usercache/vikram/appcache/application_1490062699498_0013/container_1490062699498_0013_01_000001

*Files localized for Operator:*
ls: cannot access '/tmp/hadoop-vikram/nm-local-dir/usercache/vikram/appcache/application_1490062699498_0013/container_1490062699498_0013_01_000003/py4j-0.10.4.tar.gz': No such file or directory

Thanks & Regards,
Vikram
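
One way to check which files actually reached a given container is to list
its working directory from inside the operator. The sketch below is plain
Java and assumes the JVM's current directory is the YARN container
directory, as described elsewhere in this thread. WorkingDirListing and its
methods are hypothetical names, not part of Apex or YARN.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical diagnostic helper; not part of the Apex or YARN APIs.
class WorkingDirListing {

    /** Returns the sorted names of all entries directly inside the given directory. */
    static List<String> entryNames(Path dir) {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .map(p -> p.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Lists the JVM's current working directory (assumed to be the container directory). */
    static List<String> workingDirEntries() {
        return entryNames(Paths.get("").toAbsolutePath());
    }
}
```

Logging the result of WorkingDirListing.workingDirEntries() from the
operator, for example in its setup method, and comparing it with the same
listing for the STRAM container should show whether py4j-0.10.4.tar.gz was
localized for both.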

On Sun, Mar 19, 2017 at 10:11 AM, Thomas Weise <th...@apache.org> wrote:

> The localized files will be in the YARN container's working directory and
> on the JVM's classpath, so as long as you know the original file name, you
> can access them via Class.getResourceAsStream or Class.getResource.
>
> Thomas

Re: Need suggestion about temporary file usage on container node

Posted by Thomas Weise <th...@apache.org>.
The localized files will be in the YARN container's working directory and
on the JVM's classpath, so as long as you know the original file name, you
can access them via Class.getResourceAsStream or Class.getResource.

Thomas
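
A minimal sketch of that lookup, assuming the file keeps its original name
after localization. It checks the working directory first and then falls
back to Class.getResource. LocalizedFiles and locate are hypothetical names
introduced for illustration; they are not part of the Apex API, and no
OperatorContext is involved.

```java
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper; not part of the Apex API.
class LocalizedFiles {

    /** Tries to resolve a localized file to a local path by its original name. */
    static Optional<Path> locate(String fileName) {
        // 1. Look in the JVM's current directory (assumed to be the container directory).
        Path inWorkDir = Paths.get("").toAbsolutePath().resolve(fileName);
        if (Files.exists(inWorkDir)) {
            return Optional.of(inWorkDir);
        }
        // 2. Fall back to the classpath root; only file: URLs map to a local path.
        URL url = LocalizedFiles.class.getResource("/" + fileName);
        if (url != null && "file".equals(url.getProtocol())) {
            try {
                return Optional.of(Paths.get(url.toURI()));
            } catch (URISyntaxException e) {
                return Optional.empty();
            }
        }
        return Optional.empty();
    }
}
```

For example, LocalizedFiles.locate("py4j-0.10.4.tar.gz") returns an empty
Optional when the file is in neither place, which is the situation reported
for the operator container in this thread.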

On Fri, Mar 17, 2017 at 8:54 AM, Vikram Patil <vi...@datatorrent.com>
wrote:

> Thanks, Thomas. Could you suggest a way to find the path to a particular
> localized file on the container node, preferably using OperatorContext?

Re: Need suggestion about temporary file usage on container node

Posted by Vikram Patil <vi...@datatorrent.com>.
Thanks, Thomas. Could you suggest a way to find the path to a particular
localized file on the container node, preferably using OperatorContext?


On Fri, Mar 17, 2017 at 8:06 PM, Thomas Weise <th...@apache.org> wrote:

> If you use LIB_JARS or FILES, then those files will be localized by YARN
> on the container node; you don't need to copy them from HDFS manually or
> write cleanup code for them.
>
> Thomas

Re: Need suggestion about temporary file usage on container node

Posted by Thomas Weise <th...@apache.org>.
If you use LIB_JARS or FILES, then those files will be localized by YARN on
the container node; you don't need to copy them from HDFS manually or write
cleanup code for them.

Thomas
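
For scratch files that the operator creates itself (as opposed to the files
localized through LIB_JARS or FILES), cleanup can be sketched with a small
AutoCloseable wrapper around a temporary directory. This is plain Java under
stated assumptions: ScratchDir is a hypothetical name, and the directory is
created wherever java.io.tmpdir points in the container JVM.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper for operator-created scratch files; not part of Apex.
class ScratchDir implements AutoCloseable {
    private final Path dir;

    /** Creates a fresh temporary directory whose name starts with the given prefix. */
    ScratchDir(String prefix) {
        try {
            this.dir = Files.createTempDirectory(prefix);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    Path path() {
        return dir;
    }

    /** Deletes the directory and everything below it; does nothing if it is already gone. */
    @Override
    public void close() {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order visits children before their parent directories.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.deleteIfExists(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

An operator could create the ScratchDir when it starts and call close() when
it shuts down. Explicit cleanup like this only runs on an orderly shutdown,
so it does not cover a container that is killed abruptly.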

On Fri, Mar 17, 2017 at 12:49 AM, vikram patil <pa...@gmail.com>
wrote:

> Hello All,
>
> I am working on an operator that downloads a file from HDFS and stores
> it on the local machine at run time. To copy the files from the client
> machine to HDFS, I will use the LIB_JARS or FILES configuration. Once the
> operator fails or shuts down, I would like to clean up these files
> automatically, if possible.
>
> Thanks & Regards,
> Vikram
>