Posted to user@spark.apache.org by pseudo oduesp <ps...@gmail.com> on 2016/06/16 12:17:21 UTC
cache dataframe
Hi,
If I cache a DataFrame, then transform it and add columns, should I cache it
a second time?
df.cache()
transformation
add new columns
df.cache()
?
Re: cache dataframe
Posted by Jacek Laskowski <ja...@japila.pl>.
Yes. Yes.
What's the use case?
Jacek
Re: cache dataframe
Posted by Alexey Pechorin <al...@taboola.com>.
What's the reason for your first cache call? It looks like you've used the
data only once to transform it without reusing the data, so there's no
reason for the first cache call, and you need only the second call (and
that also depends on the rest of your code).