Posted to common-user@hadoop.apache.org by John Lilley <jo...@redpoint.net> on 2013/07/01 22:37:48 UTC
temporary folders for YARN tasks
When a YARN app and its tasks want to write temporary files, how do they know where to write them?
I am assuming that each task has some temporary space available, and I hope it is available across multiple disk volumes for parallel performance.
Are those files cleaned up automatically after task exit?
If I want to give lifetime control of the files to an auxiliary service (along the lines of MR shuffle passing files to the aux service), how would I do that, and would that entail different file locations?
Thanks
John
Re: temporary folders for YARN tasks
Posted by Sandy Ryza <sa...@cloudera.com>.
LocalDirAllocator should help with this. You can look through MapReduce
code to see how it's used.
-Sandy
On Mon, Jul 1, 2013 at 11:01 PM, Devaraj k <de...@huawei.com> wrote:
> You can make use of this configuration to do the same.
>
> <property>
> <description>List of directories to store localized files in. An
> application's localized file directory will be found in:
> ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}.
> Individual containers' work directories, called container_${contid}, will
> be subdirectories of this.
> </description>
> <name>yarn.nodemanager.local-dirs</name>
> <value>${hadoop.tmp.dir}/nm-local-dir</value>
> </property>
>
> Thanks
> Devaraj k
>
> From: John Lilley [mailto:john.lilley@redpoint.net]
> Sent: 02 July 2013 02:08
> To: user@hadoop.apache.org
> Subject: temporary folders for YARN tasks
>
> When a YARN app and its tasks wants to write temporary files, how does it
> know where to write the files?
>
> I am assuming that each task has some temporary space available, and I
> hope it is available across multiple disk volumes for parallel performance.
>
> Are those files cleaned up automatically after task exit?
>
> If I want to give lifetime control of the files to an auxiliary service
> (along the lines of MR shuffle passing files to the aux service), how would
> I do that, and would that entail different file locations?
>
> Thanks
> John
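To illustrate the idea behind spreading temporary files over several disks, here is a minimal sketch in plain Java. It is not LocalDirAllocator and does not reproduce its selection policy; it only rotates over a comma-separated list of directories. The class name RoundRobinLocalDirs and the file names are hypothetical, and reading the list from a LOCAL_DIRS environment variable is an assumption that should be checked against the Hadoop version in use.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: rotates over the directories in a comma-separated list. */
final class RoundRobinLocalDirs {
    private final List<Path> dirs = new ArrayList<>();
    private int next = 0;

    RoundRobinLocalDirs(String commaSeparatedDirs) {
        if (commaSeparatedDirs != null) {
            for (String d : commaSeparatedDirs.split(",")) {
                String trimmed = d.trim();
                if (!trimmed.isEmpty()) {
                    dirs.add(Paths.get(trimmed));
                }
            }
        }
        if (dirs.isEmpty()) {
            // Fallback when no list is supplied, e.g. when run outside a container.
            dirs.add(Paths.get(System.getProperty("java.io.tmpdir")));
        }
    }

    /** Returns a path for fileName under the next directory in the rotation. */
    synchronized Path pathForWrite(String fileName) {
        Path dir = dirs.get(next);
        next = (next + 1) % dirs.size();
        return dir.resolve(fileName);
    }

    int size() {
        return dirs.size();
    }

    public static void main(String[] args) {
        // Assumption: the container environment exposes its local dirs as LOCAL_DIRS.
        RoundRobinLocalDirs picker = new RoundRobinLocalDirs(System.getenv("LOCAL_DIRS"));
        for (int i = 0; i < 4; i++) {
            System.out.println(picker.pathForWrite("spill_" + i + ".tmp"));
        }
    }
}
```

Consecutive files land on different directories, so if each directory sits on its own volume the writes can proceed in parallel. A real allocator would presumably also check free space and directory health, which this sketch leaves out.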
RE: temporary folders for YARN tasks
Posted by Devaraj k <de...@huawei.com>.
You can make use of this configuration to do the same.
<property>
  <description>List of directories to store localized files in. An
  application's localized file directory will be found in:
  ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}.
  Individual containers' work directories, called container_${contid}, will
  be subdirectories of this.
  </description>
  <name>yarn.nodemanager.local-dirs</name>
  <value>${hadoop.tmp.dir}/nm-local-dir</value>
</property>
Thanks
Devaraj k
From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: 02 July 2013 02:08
To: user@hadoop.apache.org
Subject: temporary folders for YARN tasks
When a YARN app and its tasks wants to write temporary files, how does it know where to write the files?
I am assuming that each task has some temporary space available, and I hope it is available across multiple disk volumes for parallel performance.
Are those files cleaned up automatically after task exit?
If I want to give lifetime control of the files to an auxiliary service (along the lines of MR shuffle passing files to the aux service), how would I do that, and would that entail different file locations?
Thanks
John
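The directory layout given in the property description above can be sketched as a small path-building helper. This is an illustration only: the class and method names are hypothetical, the user and ID values in the usage note are placeholders, and the layout is taken literally from the description rather than from the NodeManager source.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that expands the layout from the property description. */
final class ContainerWorkDirs {
    /**
     * For each entry in the comma-separated localDirs value, builds
     * <dir>/usercache/<user>/appcache/application_<appId>/container_<contId>.
     */
    static List<Path> containerDirs(String localDirs, String user,
                                    String appId, String contId) {
        List<Path> result = new ArrayList<>();
        for (String root : localDirs.split(",")) {
            String trimmed = root.trim();
            if (trimmed.isEmpty()) {
                continue; // skip empty entries such as a trailing comma
            }
            result.add(Paths.get(trimmed, "usercache", user, "appcache",
                    "application_" + appId, "container_" + contId));
        }
        return result;
    }
}
```

For example, calling ContainerWorkDirs.containerDirs("/disk1/nm,/disk2/nm", "john", "0001", "0001_01") yields one container directory per configured local dir, which shows why listing several volumes in yarn.nodemanager.local-dirs gives each container a work location on each of them.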