Posted to user@hadoop.apache.org by John Lilley <jo...@redpoint.net> on 2013/07/01 22:37:48 UTC

temporary folders for YARN tasks

When a YARN app and its tasks want to write temporary files, how do they know where to write them?
I am assuming that each task has some temporary space available, and I hope it is available across multiple disk volumes for parallel performance.
Are those files cleaned up automatically after task exit?
If I want to give lifetime control of the files to an auxiliary service (along the lines of MR shuffle passing files to the aux service), how would I do that, and would that entail different file locations?
Thanks
John



Re: temporary folders for YARN tasks

Posted by Sandy Ryza <sa...@cloudera.com>.
LocalDirAllocator should help with this.  You can look through MapReduce
code to see how it's used.
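
To make the idea concrete, here is a minimal, dependency-free sketch of what a directory allocator does: it takes a comma-separated list of local directories and spreads new files across them. This is not the Hadoop class. RoundRobinDirAllocator and pathForWrite are hypothetical names, and the plain round-robin policy is a simplifying assumption; as far as I know the real org.apache.hadoop.fs.LocalDirAllocator also takes free space on each volume into account.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not part of Hadoop): rotates new files across several local directories. */
final class RoundRobinDirAllocator {
    private final List<Path> roots = new ArrayList<>();
    private final AtomicInteger next = new AtomicInteger();

    /** @param commaSeparatedDirs for example "/disk1/tmp,/disk2/tmp" */
    RoundRobinDirAllocator(String commaSeparatedDirs) {
        for (String dir : commaSeparatedDirs.split(",")) {
            String trimmed = dir.trim();
            if (!trimmed.isEmpty()) {
                roots.add(Paths.get(trimmed));
            }
        }
        if (roots.isEmpty()) {
            throw new IllegalArgumentException("no directories configured");
        }
    }

    /** Returns a path for relativeName under the next root, creating its parent directories. */
    Path pathForWrite(String relativeName) throws IOException {
        // floorMod keeps the index valid even if the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), roots.size());
        Path target = roots.get(index).resolve(relativeName);
        Files.createDirectories(target.getParent());
        return target;
    }
}
```

A task would build one allocator from its list of local directories and call pathForWrite("spill/part-0.tmp") each time it needs a new temporary file. Where that list comes from depends on the Hadoop version; I believe recent 2.x NodeManagers expose it to containers in a LOCAL_DIRS environment variable, but please verify that against your release.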

-Sandy


On Mon, Jul 1, 2013 at 11:01 PM, Devaraj k <de...@huawei.com> wrote:

> You can make use of this configuration to do the same.
>
> <property>
>     <description>List of directories to store localized files in. An
>       application's localized file directory will be found in:
>       ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}.
>       Individual containers' work directories, called container_${contid}, will
>       be subdirectories of this.
>    </description>
>     <name>yarn.nodemanager.local-dirs</name>
>     <value>${hadoop.tmp.dir}/nm-local-dir</value>
>   </property>
>
> Thanks
> Devaraj k
>
> From: John Lilley [mailto:john.lilley@redpoint.net]
> Sent: 02 July 2013 02:08
> To: user@hadoop.apache.org
> Subject: temporary folders for YARN tasks
>
> When a YARN app and its tasks wants to write temporary files, how does it
> know where to write the files?
> I am assuming that each task has some temporary space available, and I
> hope it is available across multiple disk volumes for parallel performance.
> Are those files cleaned up automatically after task exit?
> If I want to give lifetime control of the files to an auxiliary service
> (along the lines of MR shuffle passing files to the aux service), how would
> I do that, and would that entail different file locations?
>
> Thanks
> John
>

RE: temporary folders for YARN tasks

Posted by Devaraj k <de...@huawei.com>.
You can make use of this configuration to do the same.

<property>
    <description>List of directories to store localized files in. An
      application's localized file directory will be found in:
      ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}.
      Individual containers' work directories, called container_${contid}, will
      be subdirectories of this.
   </description>
    <name>yarn.nodemanager.local-dirs</name>
    <value>${hadoop.tmp.dir}/nm-local-dir</value>
  </property>
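
As an illustration of the layout in that description, the sketch below expands a comma-separated yarn.nodemanager.local-dirs value into the per-container work directory paths. YarnLocalDirLayout and containerWorkDirs are hypothetical names, and the user and IDs in the usage note are made up. Whether a given container actually gets a directory under every configured root is an assumption here, not something the description states.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Hadoop): builds paths following the documented layout. */
final class YarnLocalDirLayout {
    /**
     * Returns, for each configured root:
     * root/usercache/user/appcache/application_appId/container_containerId
     */
    static List<String> containerWorkDirs(String localDirs, String user,
                                          String appId, String containerId) {
        List<String> result = new ArrayList<>();
        for (String root : localDirs.split(",")) {
            String trimmed = root.trim();
            if (trimmed.isEmpty()) {
                continue; // skip empty entries such as a trailing comma
            }
            result.add(trimmed
                    + "/usercache/" + user
                    + "/appcache/application_" + appId
                    + "/container_" + containerId);
        }
        return result;
    }
}
```

For example, containerWorkDirs("/data1/nm-local-dir,/data2/nm-local-dir", "alice", "1372710000000_0001", "1372710000000_0001_01_000002") yields one path per disk, each ending in application_1372710000000_0001/container_1372710000000_0001_01_000002.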

Thanks
Devaraj k

From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: 02 July 2013 02:08
To: user@hadoop.apache.org
Subject: temporary folders for YARN tasks

When a YARN app and its tasks wants to write temporary files, how does it know where to write the files?
I am assuming that each task has some temporary space available, and I hope it is available across multiple disk volumes for parallel performance.
Are those files cleaned up automatically after task exit?
If I want to give lifetime control of the files to an auxiliary service (along the lines of MR shuffle passing files to the aux service), how would I do that, and would that entail different file locations?
Thanks
John


