Posted to user@spark.apache.org by Koert Kuipers <ko...@tresata.com> on 2014/07/26 22:44:56 UTC

GraphX cached partitions won't go away

I have GraphX queries running inside a service. I collect the results to
the driver and do not hold any references to the RDDs involved in the
queries. My assumption was that, with the references gone, Spark would
remove the cached RDDs from memory (note that I did not cache them; GraphX
did).

Yet they hang around...

Is my understanding of how the ContextCleaner works incorrect? Or could it
be that GraphX holds some references to the RDDs internally, preventing
garbage collection? Maybe even circular references?

Re: GraphX cached partitions won't go away

Posted by Koert Kuipers <ko...@tresata.com>.
Never mind, I think it's just the GC taking its time. Meanwhile I have many
gigabytes of unused cached RDDs that I cannot easily get rid of.
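
For anyone hitting the same thing: the ContextCleaner only acts once the
driver JVM has garbage-collected the RDD objects, so cached blocks can
linger for a long time on a driver with a large, mostly idle heap. A minimal
sketch of one way to release them explicitly is below. It assumes the
service is happy to drop every persisted RDD the context still tracks,
including any it cached on purpose. The `CacheCleanup` object and
`unpersistAll` helper are hypothetical names for illustration;
`SparkContext.getPersistentRDDs` and `RDD.unpersist` are the only Spark
calls used.

```scala
import org.apache.spark.SparkContext

// Hypothetical helper, not part of Spark or GraphX.
object CacheCleanup {

  /** Unpersist every RDD the context still tracks as persistent,
    * including RDDs cached internally by GraphX.
    * Returns the number of RDDs that unpersist was called on.
    */
  def unpersistAll(sc: SparkContext, blocking: Boolean = false): Int = {
    // getPersistentRDDs returns a map from RDD id to RDD for all
    // RDDs that are currently marked as persisted on this context.
    val persistent = sc.getPersistentRDDs.values.toSeq
    persistent.foreach(_.unpersist(blocking))
    persistent.size
  }
}
```

Calling `CacheCleanup.unpersistAll(sc)` after a query has been collected
would free the blocks without waiting for the driver GC. A less drastic
option, assuming no references are still held, is to call `System.gc()` on
the driver now and then, which only requests a collection but usually lets
the ContextCleaner catch up.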