Posted to user@spark.apache.org by hutashan <hu...@gmail.com> on 2014/12/29 21:01:24 UTC

Clean up app folders in worker nodes

Hello All,

I need to clean up the app folders (including the downloaded application
jars) under Spark's work folder.
I have tried setting the configuration below in spark-env.sh, but it is not
working as expected.

SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true
-Dspark.worker.cleanup.interval=10" 

I cannot clear the work folder blindly, because other jobs may be running at
the same time.

Is there any way to do the following:
1) Clear only my app folder under the work folder.
2) Set work directories at the executor level, so that I can clean up that
folder after my job completes.
3) Find the app id of my application. I tried to get the master object in
code, but it throws an actor error.
4) Is any correction required in the configuration below?
SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true
-Dspark.worker.cleanup.interval=10"


Thanks in advance
Hutashan.



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Clean-up-app-folders-in-worker-nodes-tp20889.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Clean up app folders in worker nodes

Posted by Michael Quinlan <mq...@gmail.com>.
I'm also interested in the solution to this.

Thanks,
Mike


Re: Clean up app folders in worker nodes

Posted by markjgreene <ma...@evertrue.com>.
I think the setting you are missing is 'spark.worker.cleanup.appDataTtl'.
It controls how old an application's files must be before they are deleted.
More info here:
https://spark.apache.org/docs/1.0.1/spark-standalone.html.

Also, the 'spark.worker.cleanup.interval' you have configured is pretty
aggressive at 10 seconds. Looking at the code, I would be willing to bet
that these cleanup threads would be kicked off in close proximity to one
another, and if your system were under load, you could end up thrashing
your CPU. You may want to use something a little more reasonable, like 30
minutes or an hour.
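Putting the two settings together, a spark-env.sh entry might look like the
following sketch. The values are illustrative, not a recommendation; both
'spark.worker.cleanup.interval' and 'spark.worker.cleanup.appDataTtl' are in
seconds:

```shell
# conf/spark-env.sh -- sketch with illustrative values
# Enable periodic cleanup of finished applications' work directories.
SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true \
 -Dspark.worker.cleanup.interval=1800 \
 -Dspark.worker.cleanup.appDataTtl=604800"
# interval=1800     -> run the cleanup check every 30 minutes
# appDataTtl=604800 -> remove app directories older than 7 days (the default)
```

Note that this cleanup only removes directories of applications that have
already stopped, so it should not touch jobs that are still running.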





Re: Clean up app folders in worker nodes

Posted by pbirsinger <pb...@gmail.com>.
This works. I needed to restart the master and the slaves for the changes to
take effect. Plus 1 million to you, sir.
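For reference, a typical restart sequence on a standalone cluster, assuming
the bundled launch scripts and a conf/slaves file listing the workers, is:

```shell
# Run on the master node; assumes SPARK_HOME points at the Spark install.
$SPARK_HOME/sbin/stop-all.sh    # stop the master and all workers
$SPARK_HOME/sbin/start-all.sh   # start them again, re-reading conf/spark-env.sh
```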


