Posted to user@spark.apache.org by Abhishek Anand <ab...@gmail.com> on 2016/02/14 08:04:53 UTC

Re: Worker's BlockManager Folder not getting cleared

Hi All,

Any ideas on this one?

The size of this directory keeps growing.

I can see that many files from a day earlier are still there.

Cheers !!
Abhi

On Tue, Jan 26, 2016 at 7:13 PM, Abhishek Anand <ab...@gmail.com>
wrote:

> Hi Adrian,
>
> I am running Spark in standalone mode.
>
> The Spark version I am using is 1.4.0.
>
> Thanks,
> Abhi
>
> On Tue, Jan 26, 2016 at 4:10 PM, Adrian Bridgett <ad...@opensignal.com>
> wrote:
>
>> Hi Abhi - are you running on Mesos perchance?
>>
>> If so, then with Spark < 1.6 you will be hitting:
>> https://issues.apache.org/jira/browse/SPARK-10975
>> With Spark >= 1.6:
>> https://issues.apache.org/jira/browse/SPARK-12430
>> and also be aware of:
>> https://issues.apache.org/jira/browse/SPARK-12583
>>
>>
>> On 25/01/2016 07:14, Abhishek Anand wrote:
>>
>> Hi All,
>>
>> How long are the shuffle files and data files kept in the block manager
>> folder of the workers?
>>
>> I have a spark streaming job with window duration of 2 hours and slide
>> interval of 15 minutes.
>>
>> When I execute the following command in my block manager path:
>>
>> find . -type f -cmin +150 -name "shuffle*" -exec ls {} \;
>>
>> I see a lot of files, which means they are not getting cleared, although
>> I expected them to be.
>>
>> As a result, the directory keeps growing and takes up space on the disk.
>>
>> Please suggest how to get rid of these files and help me understand this
>> behaviour.
>>
>>
>>
>> Thanks !!!
>> Abhi
>>
>>
>> --
>> *Adrian Bridgett* |  Sysadmin Engineer, OpenSignal
>> <http://www.opensignal.com>
>> _____________________________________________________
>> Office: 3rd Floor, The Angel Office, 2 Angel Square, London, EC1V 1NY
>> Phone #: +44 777-377-8251
>> Skype: abridgett  |  @adrianbridgett <http://twitter.com/adrianbridgett>
>>   |  LinkedIn link  <https://uk.linkedin.com/in/abridgett>
>> _____________________________________________________
>>
>
>

Re: Worker's BlockManager Folder not getting cleared

Posted by Abhishek Anand <ab...@gmail.com>.
Still looking for an answer to this.

Is it safe to delete the older files using:

find . -type f -cmin +200 -name "shuffle*" -exec rm -rf {} \;

For a window duration of 2 hours, how old must files be before we can delete them?

Thanks.
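
A cautious way to try the cleanup above is to list the candidates first and delete only after reviewing them. The following is a minimal sketch, not a confirmed answer from the thread: the function name `stale_shuffle_files` and its arguments are hypothetical, and it assumes that files older than the chosen age are no longer needed, which file age alone does not prove.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): list, and optionally delete,
# shuffle files under DIR whose status last changed more than MIN_AGE
# minutes ago.
# Usage: stale_shuffle_files DIR MIN_AGE [delete]
stale_shuffle_files() {
    dir=$1
    min_age=$2
    mode=${3:-list}   # default is a dry run that only lists files

    if [ "$mode" = "delete" ]; then
        # -print is evaluated before -exec, so every file handed to rm is
        # also logged. rm -f is enough here because -type f matches only
        # regular files, never directories.
        find "$dir" -type f -cmin +"$min_age" -name 'shuffle*' -print -exec rm -f {} +
    else
        find "$dir" -type f -cmin +"$min_age" -name 'shuffle*' -print
    fi
}
```

For example, `stale_shuffle_files . 200` run from the block manager path prints the same files the `find` command above would remove, and `stale_shuffle_files . 200 delete` removes them. The 200-minute threshold is taken from the question and is an assumption, not a verified safe value for a 2-hour window.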
