You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by SemMulder <gi...@git.apache.org> on 2015/10/23 10:15:15 UTC

[GitHub] spark pull request: [SPARK-11276][CORE] SizeEstimator prevents cla...

GitHub user SemMulder opened a pull request:

    https://github.com/apache/spark/pull/9244

    [SPARK-11276][CORE] SizeEstimator prevents class unloading

    The SizeEstimator keeps a cache of ClassInfos but this cache uses Class objects as keys.
    Which results in strong references to the Class objects. If these classes are dynamically created
    this prevents the corresponding ClassLoader from being GCed. Leading to PermGen exhaustion.
    
    We use a Map with WeakKeys to prevent this issue.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Site2Mobile/spark fix-sizeestimator-classunloading

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9244.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9244
    
----
commit 22a02a22401128cb249417ff38ef8f9d24face32
Author: Sem Mulder <se...@site2mobile.com>
Date:   2015-10-22T16:26:16Z

    [SPARK-11276][CORE] SizeEstimator prevents class unloading
    
    The SizeEstimator keeps a cache of ClassInfos but this cache uses Class objects as keys.
    Which results in strong references to the Class objects. If these classes are dynamically created
    this prevents the corresponding ClassLoader from being GCed. Leading to PermGen exhaustion.
    
    We use a Map with WeakKeys to prevent this issue.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11276][CORE] SizeEstimator prevents cla...

Posted by SemMulder <gi...@git.apache.org>.
Github user SemMulder commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9244#discussion_r42932252
  
    --- Diff: core/src/main/scala/org/apache/spark/util/SizeEstimator.scala ---
    @@ -73,7 +74,8 @@ object SizeEstimator extends Logging {
       private val ALIGN_SIZE = 8
     
       // A cache of ClassInfo objects for each class
    -  private val classInfos = new ConcurrentHashMap[Class[_], ClassInfo]
    +  // We use weakKeys to allow GC of dynamically created classes
    +  private val classInfos = new MapMaker().weakKeys().makeMap[Class[_], ClassInfo]()
    --- End diff --
    
    A WeakHashMap would need to be synchronized and we would therefore lose concurrency.
    I don't know if the SizeEstimator is used concurrently and hence felt it was safer to keep the concurrency of the original code despite the added complexity of a Guava Map.
    
    Is it safe to drop the concurrency requirement?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11276][CORE] SizeEstimator prevents cla...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9244#discussion_r42936482
  
    --- Diff: core/src/main/scala/org/apache/spark/util/SizeEstimator.scala ---
    @@ -73,7 +74,8 @@ object SizeEstimator extends Logging {
       private val ALIGN_SIZE = 8
     
       // A cache of ClassInfo objects for each class
    -  private val classInfos = new ConcurrentHashMap[Class[_], ClassInfo]
    +  // We use weakKeys to allow GC of dynamically created classes
    +  private val classInfos = new MapMaker().weakKeys().makeMap[Class[_], ClassInfo]()
    --- End diff --
    
    Good call, I see why you chose this. I agree with it and LGTM.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11276][CORE] SizeEstimator prevents cla...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9244#issuecomment-150847233
  
    **[Test build #1951 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1951/consoleFull)** for PR 9244 at commit [`22a02a2`](https://github.com/apache/spark/commit/22a02a22401128cb249417ff38ef8f9d24face32).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11276][CORE] SizeEstimator prevents cla...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/9244#issuecomment-151079068
  
    If there are no objections, I'll merge to master. This looks like a good targeted fix.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11276][CORE] SizeEstimator prevents cla...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9244#issuecomment-150838392
  
    **[Test build #1951 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1951/consoleFull)** for PR 9244 at commit [`22a02a2`](https://github.com/apache/spark/commit/22a02a22401128cb249417ff38ef8f9d24face32).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11276][CORE] SizeEstimator prevents cla...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/9244#issuecomment-151405796
  
    merged to master


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11276][CORE] SizeEstimator prevents cla...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9244#discussion_r42853708
  
    --- Diff: core/src/main/scala/org/apache/spark/util/SizeEstimator.scala ---
    @@ -73,7 +74,8 @@ object SizeEstimator extends Logging {
       private val ALIGN_SIZE = 8
     
       // A cache of ClassInfo objects for each class
    -  private val classInfos = new ConcurrentHashMap[Class[_], ClassInfo]
    +  // We use weakKeys to allow GC of dynamically created classes
    +  private val classInfos = new MapMaker().weakKeys().makeMap[Class[_], ClassInfo]()
    --- End diff --
    
    Rather than involve Guava, this can just be a `WeakHashMap`? it has weak keys.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11276][CORE] SizeEstimator prevents cla...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/9244


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11276][CORE] SizeEstimator prevents cla...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9244#issuecomment-150506052
  
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org