Posted to user@spark.apache.org by Jem Tucker <je...@gmail.com> on 2015/07/01 16:18:42 UTC

Making Unpersist Lazy

Hi,

rdd.unpersist() does not appear to be executed lazily and therefore must
be placed after an action. Is there any way to emulate lazy execution of
this function so that it is added to the task queue?

Thanks,

Jem
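An illustration of the constraint described above, as a minimal sketch (the app name, values, and variable names are made up, not from the thread):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object UnpersistOrdering {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("unpersist-demo"))

    val cached = sc.parallelize(1 to 1000).map(_ * 2).cache()

    // The action below materializes and caches the RDD's blocks.
    val total = cached.reduce(_ + _)

    // unpersist() takes effect as soon as this line runs, not lazily
    // when a later action is scheduled, so it has to come after every
    // action that should benefit from the cache.
    cached.unpersist()

    println(total)
    sc.stop()
  }
}
```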

RE: Making Unpersist Lazy

Posted by "Ganelin, Ilya" <Il...@capitalone.com>.
You may pass an optional parameter (blocking = false) to make the call non-blocking.
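For reference, RDD.unpersist is declared roughly as `def unpersist(blocking: Boolean = true)`. Passing `blocking = false` makes the call asynchronous (it returns immediately while executors drop the blocks in the background) rather than deferring it until a later action. A minimal hedged sketch, assuming a SparkContext named `sc` already exists:

```scala
// Hedged sketch: assumes an existing SparkContext named sc.
val cached = sc.parallelize(1 to 1000).cache()
cached.count() // action: materializes and caches the blocks

// Returns immediately; block removal proceeds in the background.
// Note this is non-blocking, not lazy: removal is still initiated
// here, not when the next action runs.
cached.unpersist(blocking = false)
```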



Thank you,
Ilya Ganelin



-----Original Message-----
From: Jem Tucker [jem.tucker@gmail.com]
Sent: Thursday, July 02, 2015 04:06 AM Eastern Standard Time
To: Akhil Das
Cc: user
Subject: Re: Making Unpersist Lazy

Hi,

After running some tests, it appears unpersist is executed as soon as it is reached, so any tasks that use this RDD later on will have to recalculate it. This is fine for simple programs, but when an RDD is created within a function and its reference is then lost while children of it continue to be used, persist/unpersist does not work effectively.

Thanks

Jem
On Thu, 2 Jul 2015 at 08:18, Akhil Das <ak...@sigmoidanalytics.com> wrote:
RDDs which are no longer required will be removed from memory by Spark itself (which you could consider lazy?).

Thanks
Best Regards

On Wed, Jul 1, 2015 at 7:48 PM, Jem Tucker <je...@gmail.com> wrote:
Hi,

rdd.unpersist() does not appear to be executed lazily and therefore must be placed after an action. Is there any way to emulate lazy execution of this function so that it is added to the task queue?

Thanks,

Jem


Re: Making Unpersist Lazy

Posted by Jem Tucker <je...@gmail.com>.
Hi,

After running some tests, it appears unpersist is executed as soon as it
is reached, so any tasks that use this RDD later on will have to
recalculate it. This is fine for simple programs, but when an RDD is
created within a function and its reference is then lost while children
of it continue to be used, persist/unpersist does not work effectively.

Thanks

Jem
On Thu, 2 Jul 2015 at 08:18, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> RDDs which are no longer required will be removed from memory by Spark
> itself (which you could consider lazy?).
>
> Thanks
> Best Regards
>
> On Wed, Jul 1, 2015 at 7:48 PM, Jem Tucker <je...@gmail.com> wrote:
>
>> Hi,
>>
>> rdd.unpersist() does not appear to be executed lazily and therefore must
>> be placed after an action. Is there any way to emulate lazy execution of
>> this function so that it is added to the task queue?
>>
>> Thanks,
>>
>> Jem
>>
>
>
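The scenario described in this post, where an RDD is persisted inside a function whose reference is lost while derived RDDs live on, can be sketched as follows (a hedged illustration; the function and variable names are made up):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object ScopedPersist {
  // The parent is persisted, used to build a child, and then
  // unpersisted before the caller ever runs an action, so the
  // child must recompute the parent from its lineage.
  def buildChild(sc: SparkContext): RDD[Int] = {
    val parent = sc.parallelize(1 to 1000).cache()
    val child  = parent.map(_ + 1)
    parent.unpersist() // executed eagerly: the cache is gone already
    child
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("scoped-persist"))
    val child = buildChild(sc)
    // By the time this action runs, parent is no longer cached,
    // so its entire lineage is recomputed here.
    println(child.count())
    sc.stop()
  }
}
```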

Re: Making Unpersist Lazy

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
RDDs which are no longer required will be removed from memory by Spark
itself (which you could consider lazy?).

Thanks
Best Regards

On Wed, Jul 1, 2015 at 7:48 PM, Jem Tucker <je...@gmail.com> wrote:

> Hi,
>
> rdd.unpersist() does not appear to be executed lazily and therefore must
> be placed after an action. Is there any way to emulate lazy execution of
> this function so that it is added to the task queue?
>
> Thanks,
>
> Jem
>