Posted to user@spark.apache.org by Patrick Wendell <pw...@gmail.com> on 2014/09/04 07:45:24 UTC

Re: memory size for caching RDD

Changing this is not supported; it is immutable, similar to other Spark
configuration settings.
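
For reference, a minimal sketch of where such a value is supplied, using the
Spark 1.x SparkConf API (the fraction values below are only illustrative and
the app name is a placeholder):

import org.apache.spark.{SparkConf, SparkContext}

// The fractions are read when the executors start; there is no supported way
// to change them on a running SparkContext.
val conf = new SparkConf()
  .setAppName("cache-sizing-example")
  .set("spark.storage.memoryFraction", "0.6")  // share of heap for cached RDD blocks
  .set("spark.shuffle.memoryFraction", "0.2")  // share of heap for shuffle buffers
val sc = new SparkContext(conf)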

On Wed, Sep 3, 2014 at 8:13 PM, 牛兆捷 <nz...@gmail.com> wrote:
> Dear all:
>
> Spark uses memory to cache RDDs, and the memory size is specified by
> "spark.storage.memoryFraction".
>
> Once the Executor starts, does Spark support adjusting/resizing the memory
> size of this part dynamically?
>
> Thanks.
>
> --
> *Regards,*
> *Zhaojie*

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: memory size for caching RDD

Posted by 牛兆捷 <nz...@gmail.com>.
Thanks, Raymond.

I duplicated the question. Please see the reply here.


2014-09-04 14:27 GMT+08:00 牛兆捷 <nz...@gmail.com>:

> But is it possible to make it resizable? When we don't have many RDDs to
> cache, we can give some memory to others.
>
>
> 2014-09-04 13:45 GMT+08:00 Patrick Wendell <pw...@gmail.com>:
>
> Changing this is not supported; it is immutable, similar to other Spark
>> configuration settings.
>>
>> On Wed, Sep 3, 2014 at 8:13 PM, 牛兆捷 <nz...@gmail.com> wrote:
>> > Dear all:
>> >
>> > Spark uses memory to cache RDDs, and the memory size is specified by
>> > "spark.storage.memoryFraction".
>> >
>> > Once the Executor starts, does Spark support adjusting/resizing the
>> > memory size of this part dynamically?
>> >
>> > Thanks.
>> >
>> > --
>> > *Regards,*
>> > *Zhaojie*
>>
>
>
>
> --
> *Regards,*
> *Zhaojie*
>
>


-- 
*Regards,*
*Zhaojie*

Re: memory size for caching RDD

Posted by Tom Hubregtsen <th...@gmail.com>.
Use unpersist(); it works even if the RDD was not persisted before.
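
A minimal sketch of that call, assuming an RDD named cached that may or may
not have been persisted (the name is a placeholder):

// Drop any cached blocks for this RDD; calling it on a never-persisted RDD is harmless.
// blocking = true waits until the blocks are actually removed (the 1.x default).
cached.unpersist(blocking = true)

// The RDD stays usable; later actions simply recompute it from its lineage.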



--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/memory-size-for-caching-RDD-tp8256p8579.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: memory size for caching RDD

Posted by 牛兆捷 <nz...@gmail.com>.
OK. So can I use logic similar to what the block manager does when space
fills up?


2014-09-04 15:05 GMT+08:00 Liu, Raymond <ra...@intel.com>:

> I think there is no public API available to do this. In this case, the
> best you can do might be to unpersist some RDDs manually. The problem is
> that this is done per RDD, not per block. Also, if the storage level
> includes disk, the data on disk will be removed too.
>
> Best Regards,
> Raymond Liu
>
> From: 牛兆捷 [mailto:nzjemail@gmail.com]
> Sent: Thursday, September 04, 2014 2:57 PM
> To: Liu, Raymond
> Cc: Patrick Wendell; user@spark.apache.org; dev@spark.apache.org
> Subject: Re: memory size for caching RDD
>
> Oh, I see.
>
> I want to implement something like this: sometimes I need to release some
> memory for other usage even when it is occupied by some RDDs (they can be
> recomputed from lineage when needed). Does Spark provide interfaces to
> force it to release some memory?
>
> 2014-09-04 14:32 GMT+08:00 Liu, Raymond <ra...@intel.com>:
> You don’t need to. The memory is not statically allocated to the RDD
> cache; it is just an upper limit. If the RDD cache does not use up the
> memory, it remains available for other usage, except for the parts also
> controlled by other memoryFraction settings, e.g.
> spark.shuffle.memoryFraction, which likewise sets an upper limit.
>
> Best Regards,
> Raymond Liu
>
> From: 牛兆捷 [mailto:nzjemail@gmail.com]
> Sent: Thursday, September 04, 2014 2:27 PM
> To: Patrick Wendell
> Cc: user@spark.apache.org; dev@spark.apache.org
> Subject: Re: memory size for caching RDD
>
> But is it possible to make it resizable? When we don't have many RDDs to
> cache, we can give some memory to others.
>
> 2014-09-04 13:45 GMT+08:00 Patrick Wendell <pw...@gmail.com>:
> Changing this is not supported; it is immutable, similar to other Spark
> configuration settings.
>
> On Wed, Sep 3, 2014 at 8:13 PM, 牛兆捷 <nz...@gmail.com> wrote:
> > Dear all:
> >
> > Spark uses memory to cache RDDs, and the memory size is specified by
> > "spark.storage.memoryFraction".
> >
> > Once the Executor starts, does Spark support adjusting/resizing the
> > memory size of this part dynamically?
> >
> > Thanks.
> >
> > --
> > *Regards,*
> > *Zhaojie*
>
>
>
> --
> Regards,
> Zhaojie
>
>
>
>
> --
> Regards,
> Zhaojie
>
>


-- 
*Regards,*
*Zhaojie*

RE: memory size for caching RDD

Posted by "Liu, Raymond" <ra...@intel.com>.
I think there is no public API available to do this. In this case, the best you can do might be to unpersist some RDDs manually. The problem is that this is done per RDD, not per block. Also, if the storage level includes disk, the data on disk will be removed too.
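
A minimal sketch of that manual route, assuming a hypothetical RDD named
warmData persisted with MEMORY_AND_DISK (path and names are placeholders);
note that the call drops every block of the RDD, including the ones spilled
to disk:

import org.apache.spark.storage.StorageLevel

val warmData = sc.textFile("hdfs:///data/events")
  .persist(StorageLevel.MEMORY_AND_DISK)
warmData.count()                   // materialize the cached blocks

// Whole-RDD granularity: frees the memory *and* the on-disk copies at once.
warmData.unpersist(blocking = true)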

Best Regards,
Raymond Liu

From: 牛兆捷 [mailto:nzjemail@gmail.com] 
Sent: Thursday, September 04, 2014 2:57 PM
To: Liu, Raymond
Cc: Patrick Wendell; user@spark.apache.org; dev@spark.apache.org
Subject: Re: memory size for caching RDD

Oh, I see.

I want to implement something like this: sometimes I need to release some memory for other usage even when it is occupied by some RDDs (they can be recomputed from lineage when needed). Does Spark provide interfaces to force it to release some memory?

2014-09-04 14:32 GMT+08:00 Liu, Raymond <ra...@intel.com>:
You don’t need to. The memory is not statically allocated to the RDD cache; it is just an upper limit.
If the RDD cache does not use up the memory, it remains available for other usage, except for the parts also controlled by other memoryFraction settings, e.g. spark.shuffle.memoryFraction, which likewise sets an upper limit.
 
Best Regards,
Raymond Liu
 
From: 牛兆捷 [mailto:nzjemail@gmail.com] 
Sent: Thursday, September 04, 2014 2:27 PM
To: Patrick Wendell
Cc: user@spark.apache.org; dev@spark.apache.org
Subject: Re: memory size for caching RDD
 
But is it possible to make it resizable? When we don't have many RDDs to cache, we can give some memory to others.
 
2014-09-04 13:45 GMT+08:00 Patrick Wendell <pw...@gmail.com>:
Changing this is not supported; it is immutable, similar to other Spark
configuration settings.

On Wed, Sep 3, 2014 at 8:13 PM, 牛兆捷 <nz...@gmail.com> wrote:
> Dear all:
>
> Spark uses memory to cache RDDs, and the memory size is specified by
> "spark.storage.memoryFraction".
>
> Once the Executor starts, does Spark support adjusting/resizing the memory
> size of this part dynamically?
>
> Thanks.
>
> --
> *Regards,*
> *Zhaojie*



-- 
Regards,
Zhaojie
 



-- 
Regards,
Zhaojie


Re: memory size for caching RDD

Posted by 牛兆捷 <nz...@gmail.com>.
Oh, I see.

I want to implement something like this: sometimes I need to release some
memory for other usage even when it is occupied by some RDDs (they can be
recomputed from lineage when needed). Does Spark provide interfaces to force
it to release some memory?
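
For what it's worth, a minimal sketch of that behaviour with a hypothetical
RDD named features (the path is a placeholder and sc is an existing
SparkContext):

val features = sc.textFile("hdfs:///data/raw").map(_.split(",")).cache()
features.count()       // first action: computes the RDD and caches its blocks

features.unpersist()   // hands the memory back for other use on the executors

features.count()       // still valid: recomputed from lineage, left uncached afterwards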


2014-09-04 14:32 GMT+08:00 Liu, Raymond <ra...@intel.com>:

>  You don’t need to. The memory is not statically allocated to the RDD
> cache; it is just an upper limit.
>
> If the RDD cache does not use up the memory, it remains available for
> other usage, except for the parts also controlled by other memoryFraction
> settings, e.g. spark.shuffle.memoryFraction, which likewise sets an upper limit.
>
>
>
> Best Regards,
>
> *Raymond Liu*
>
>
>
> *From:* 牛兆捷 [mailto:nzjemail@gmail.com]
> *Sent:* Thursday, September 04, 2014 2:27 PM
> *To:* Patrick Wendell
> *Cc:* user@spark.apache.org; dev@spark.apache.org
> *Subject:* Re: memory size for caching RDD
>
>
>
> But is it possible to make it resizable? When we don't have many RDDs to
> cache, we can give some memory to others.
>
>
>
> 2014-09-04 13:45 GMT+08:00 Patrick Wendell <pw...@gmail.com>:
>
> Changing this is not supported; it is immutable, similar to other Spark
> configuration settings.
>
>
> On Wed, Sep 3, 2014 at 8:13 PM, 牛兆捷 <nz...@gmail.com> wrote:
> > Dear all:
> >
> > Spark uses memory to cache RDDs, and the memory size is specified by
> > "spark.storage.memoryFraction".
> >
> > Once the Executor starts, does Spark support adjusting/resizing the
> > memory size of this part dynamically?
> >
> > Thanks.
> >
> > --
>
> > *Regards,*
> > *Zhaojie*
>
>
>
>
> --
>
> *Regards,*
>
> *Zhaojie*
>
>
>



-- 
*Regards,*
*Zhaojie*

RE: memory size for caching RDD

Posted by "Liu, Raymond" <ra...@intel.com>.
You don’t need to. The memory is not statically allocated to the RDD cache; it is just an upper limit.
If the RDD cache does not use up the memory, it remains available for other usage, except for the parts also controlled by other memoryFraction settings, e.g. spark.shuffle.memoryFraction, which likewise sets an upper limit.
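
Roughly, in the Spark 1.x memory model the cache cap works out to
heap * spark.storage.memoryFraction * spark.storage.safetyFraction (0.6 and
0.9 were the documented defaults of that era). A small sketch assuming a 4 GB
executor heap; the exact constants may differ by version:

// Illustrative arithmetic only.
val executorHeapBytes = 4L * 1024 * 1024 * 1024        // e.g. --executor-memory 4g
val storageFraction   = 0.6                            // spark.storage.memoryFraction
val safetyFraction    = 0.9                            // spark.storage.safetyFraction
val cacheCapBytes     = (executorHeapBytes * storageFraction * safetyFraction).toLong
// roughly 2.2 GB upper limit for cached RDD blocks; memory the cache does not
// use remains available to running tasks.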

Best Regards,
Raymond Liu

From: 牛兆捷 [mailto:nzjemail@gmail.com]
Sent: Thursday, September 04, 2014 2:27 PM
To: Patrick Wendell
Cc: user@spark.apache.org; dev@spark.apache.org
Subject: Re: memory size for caching RDD

But is it possible to make it resizable? When we don't have many RDDs to cache, we can give some memory to others.

2014-09-04 13:45 GMT+08:00 Patrick Wendell <pw...@gmail.com>:
Changing this is not supported; it is immutable, similar to other Spark
configuration settings.

On Wed, Sep 3, 2014 at 8:13 PM, 牛兆捷 <nz...@gmail.com> wrote:
> Dear all:
>
> Spark uses memory to cache RDDs, and the memory size is specified by
> "spark.storage.memoryFraction".
>
> Once the Executor starts, does Spark support adjusting/resizing the memory
> size of this part dynamically?
>
> Thanks.
>
> --
> *Regards,*
> *Zhaojie*



--
Regards,
Zhaojie


Re: memory size for caching RDD

Posted by 牛兆捷 <nz...@gmail.com>.
But is it possible to make it resizable? When we don't have many RDDs to
cache, we can give some memory to others.


2014-09-04 13:45 GMT+08:00 Patrick Wendell <pw...@gmail.com>:

> Changing this is not supported; it is immutable, similar to other Spark
> configuration settings.
>
> On Wed, Sep 3, 2014 at 8:13 PM, 牛兆捷 <nz...@gmail.com> wrote:
> > Dear all:
> >
> > Spark uses memory to cache RDDs, and the memory size is specified by
> > "spark.storage.memoryFraction".
> >
> > Once the Executor starts, does Spark support adjusting/resizing the
> > memory size of this part dynamically?
> >
> > Thanks.
> >
> > --
> > *Regards,*
> > *Zhaojie*
>



-- 
*Regards,*
*Zhaojie*
