Posted to issues@spark.apache.org by "Joseph K. Bradley (JIRA)" <ji...@apache.org> on 2016/12/02 01:02:58 UTC

[jira] [Commented] (SPARK-17822) JVMObjectTracker.objMap may leak JVM objects

    [ https://issues.apache.org/jira/browse/SPARK-17822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15713604#comment-15713604 ] 

Joseph K. Bradley commented on SPARK-17822:
-------------------------------------------

I've been able to observe something like this bug by creating a DataFrame in SparkR and calling SQL queries on it repeatedly.  Java objects from these duplicate queries accumulate in JVMObjectTracker.  But those Java objects do get GCed periodically, and calling gc() in R cleans them up completely.

The periodic GC I saw only occurred when I ran R commands, so perhaps it is not triggered as frequently as we'd like.  I'm not that familiar with SparkR internals, but is there a good way to trigger this cleanup more often?

> JVMObjectTracker.objMap may leak JVM objects
> --------------------------------------------
>
>                 Key: SPARK-17822
>                 URL: https://issues.apache.org/jira/browse/SPARK-17822
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>            Reporter: Yin Huai
>         Attachments: screenshot-1.png
>
>
> JVMObjectTracker.objMap is used to track JVM objects for SparkR. However, we observed that JVM objects that are no longer used remain trapped in this map, which prevents them from being GCed. 
> It seems to make sense to use weak references (as persistentRdds does in SparkContext). 
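The weak-reference idea suggested above can be sketched roughly as follows. This is a minimal illustration, not Spark's actual JVMObjectTracker API; the class and method names are hypothetical. The key point is that the tracker holds each object through a java.lang.ref.WeakReference, so once the R side drops its handle and nothing else references the object, the JVM garbage collector is free to reclaim it, and a periodic prune can then drop the dead map entries themselves.

```scala
import java.lang.ref.WeakReference
import scala.collection.mutable

// Hypothetical sketch of a weak-reference-based object tracker.
object WeakObjectTracker {
  // Map from object id to a weak reference, so the map entry alone
  // does not keep the tracked object alive.
  private val objMap = mutable.HashMap.empty[String, WeakReference[AnyRef]]

  def put(id: String, obj: AnyRef): Unit = synchronized {
    objMap(id) = new WeakReference(obj)
  }

  // Returns None if the id is unknown or the object was already collected.
  def get(id: String): Option[AnyRef] = synchronized {
    objMap.get(id).flatMap(ref => Option(ref.get()))
  }

  def remove(id: String): Unit = synchronized {
    objMap.remove(id)
  }

  // Drop entries whose referents have been garbage collected, so the
  // map itself does not grow without bound. Returns how many were pruned.
  def pruneCollected(): Int = synchronized {
    val stale = objMap.collect { case (id, ref) if ref.get() == null => id }
    stale.foreach(objMap.remove)
    stale.size
  }
}
```

Note the trade-off: with weak references, the JVM object can be collected as soon as no strong reference remains, so the R side must keep its own handle alive for as long as the object is needed, which is presumably why the real fix needs coordination with SparkR's handle lifecycle.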



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org