Posted to user@mesos.apache.org by Reinis <me...@orbit-x.de> on 2015/05/22 13:24:41 UTC

Garbage-collecting completed tasks

Hi,

I am using Mesos 0.22.1 with Spark 1.3.1-hadoop2.4.

I submit 9 Spark jobs sequentially every hour with spark-submit, and 
Mesos runs each of them as its own framework.

The challenge is that my own application jar containing the Spark jobs 
is 160 MB, the spark-1.3.1-bin-hadoop2.4.tgz executor archive is 241 MB, 
and the unpacked Spark distribution is another 200 MB.

Thus every hour the sandboxes of terminated frameworks grow by another 
9 * (160 + 240 + 200) MB, roughly 5 GB!

I am running out of disk space every night, so I am trying to garbage 
collect those large jar and tgz files while KEEPING the log files 
(stdout, stderr).

Is it possible to selectively garbage collect files stored in the 
sandboxes of terminated frameworks?

thx
reinis

Re: Garbage-collecting completed tasks

Posted by Adam Bordelon <ad...@mesosphere.io>.
Reinis, you could try wrapping your Spark executor in a bash script that
removes the Spark job/executor binaries once the job is complete.
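
Something along these lines, as an untested sketch (the file names and
patterns below are only examples matching your 160 MB jar and the Spark
tgz, adjust them to your actual layout):

    #!/usr/bin/env bash
    # Usage: run-and-clean.sh <original executor command and args>
    # Run the real executor command first.
    "$@"
    status=$?
    # The executor's working directory is its sandbox, so remove the
    # large fetched artifacts in place and leave stdout/stderr untouched.
    rm -rf ./*.jar ./*.tgz ./spark-1.3.1-bin-hadoop2.4
    exit $status

How you splice the wrapper into the executor command depends on how you
launch the executor, so treat it as an illustration of the idea rather
than a drop-in solution.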

Re: Garbage-collecting completed tasks

Posted by tommy xiao <xi...@gmail.com>.
Reinis,

Give --gc_delay a try.
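
As a sketch only (double-check the flags against your Mesos version):
--gc_delay is a slave flag that sets how long the sandboxes of completed
executors are kept before they are garbage collected, e.g.

    mesos-slave --work_dir=/var/lib/mesos --gc_delay=6hrs

Note that, as far as I know, garbage collection removes the whole
sandbox including stdout/stderr, and the slave may also collect sooner
when disk usage is high, so on its own this will not keep the logs
around.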

-- 
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com