You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by attilapiros <gi...@git.apache.org> on 2018/02/08 14:21:09 UTC

[GitHub] spark pull request #20546: [SPARK-20659][Core] Remove StorageStatus, or make...

GitHub user attilapiros opened a pull request:

    https://github.com/apache/spark/pull/20546

    [SPARK-20659][Core] Remove StorageStatus, or make it private.

    ## What changes were proposed in this pull request?
    
    In this PR StorageStatus is made to private and simplified a bit moreover SparkContext.getExecutorStorageStatus method is removed. The reason of keeping StorageStatus is that it is usage from SparkContext.getRDDStorageInfo.
    
    Instead of the method SparkContext.getExecutorStorageStatus executor infos are extended with additional memory metrics such as usedOnHeapStorageMemory, usedOffHeapStorageMemory, totalOnHeapStorageMemory, totalOffHeapStorageMemory.
    
    ## How was this patch tested?
    
    By running existing unit tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/attilapiros/spark SPARK-20659

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20546.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20546
    
----
commit 3f5dba62cfa8b07d602639e9f88f3fd8cec92831
Author: “attilapiros” <pi...@...>
Date:   2018-02-07T13:41:31Z

    initial version

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    **[Test build #87345 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87345/testReport)** for PR 20546 at commit [`049a065`](https://github.com/apache/spark/commit/049a0659b7494387ec265121e4cf4e24c5d3caa0).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by attilapiros <gi...@git.apache.org>.
Github user attilapiros commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r167514294
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1715,7 +1715,7 @@ class SparkContext(config: SparkConf) extends Logging {
       private[spark] def getRDDStorageInfo(filter: RDD[_] => Boolean): Array[RDDInfo] = {
    --- End diff --
    
    The old code considered the replication factor. I have created a separate Jira issue: https://issues.apache.org/jira/browse/SPARK-23394. 



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by attilapiros <gi...@git.apache.org>.
Github user attilapiros commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r167272305
  
    --- Diff: core/src/test/scala/org/apache/spark/deploy/StandaloneDynamicAllocationSuite.scala ---
    @@ -610,7 +610,7 @@ class StandaloneDynamicAllocationSuite
        * we submit a request to kill them. This must be called before each kill request.
        */
       private def syncExecutors(sc: SparkContext): Unit = {
    -    val driverExecutors = sc.getExecutorStorageStatus
    --- End diff --
    
    I have tried to use "sc.statusStore.executorList(true)" instead of sc.env.blockManager.master.getStorageStatus but the test failed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r168136979
  
    --- Diff: core/src/main/java/org/apache/spark/SparkExecutorInfo.java ---
    @@ -30,4 +30,8 @@
       int port();
       long cacheSize();
       int numRunningTasks();
    +  long usedOnHeapStorageMemory();
    --- End diff --
    
    Maybe I'm missing something here, but do we already have a real use case for the added memory metrics here?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87378/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r168147073
  
    --- Diff: core/src/main/java/org/apache/spark/SparkExecutorInfo.java ---
    @@ -30,4 +30,8 @@
       int port();
       long cacheSize();
       int numRunningTasks();
    +  long usedOnHeapStorageMemory();
    --- End diff --
    
    Agreed, the changes should be a minor issue. Thanks for explanations!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    **[Test build #87232 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87232/testReport)** for PR 20546 at commit [`8544380`](https://github.com/apache/spark/commit/8544380e91f5bfa7c95cf613d49bc9144fedee9f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87265/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r168144704
  
    --- Diff: core/src/main/java/org/apache/spark/SparkExecutorInfo.java ---
    @@ -30,4 +30,8 @@
       int port();
       long cacheSize();
       int numRunningTasks();
    +  long usedOnHeapStorageMemory();
    --- End diff --
    
    I'm not disagreeing with you on making such changes, but I'm also worrying about users could have to change their code a lot because of the changes we made. If you don't mind, may I submit a follow up PR to minimize the gap between the `SparkExecutorInfo` and `StorageStatus`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    LGTM, merging to master.
    
    I filed SPARK-23411 to deprecate the other API I missed before.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    **[Test build #87345 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87345/testReport)** for PR 20546 at commit [`049a065`](https://github.com/apache/spark/commit/049a0659b7494387ec265121e4cf4e24c5d3caa0).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    **[Test build #87361 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87361/testReport)** for PR 20546 at commit [`543caf8`](https://github.com/apache/spark/commit/543caf879468a3ade8350934716443207d2eaeca).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/20546


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r167303770
  
    --- Diff: core/src/test/scala/org/apache/spark/deploy/StandaloneDynamicAllocationSuite.scala ---
    @@ -610,7 +610,7 @@ class StandaloneDynamicAllocationSuite
        * we submit a request to kill them. This must be called before each kill request.
        */
       private def syncExecutors(sc: SparkContext): Unit = {
    -    val driverExecutors = sc.getExecutorStorageStatus
    --- End diff --
    
    Failed how? The list kept by the block manager and by the status store should be the same, so if they differ, there's a problem somewhere.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Remove StorageStatus, or make it pri...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Remove StorageStatus, or make it pri...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    This won't go into 2.3.
    
    Also, please don't copy & paste the bug title in your PR. Explain what you're doing instead. The current title does not explain what the change does.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    **[Test build #87216 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87216/testReport)** for PR 20546 at commit [`3f5dba6`](https://github.com/apache/spark/commit/3f5dba62cfa8b07d602639e9f88f3fd8cec92831).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    **[Test build #87265 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87265/testReport)** for PR 20546 at commit [`5505235`](https://github.com/apache/spark/commit/5505235d1b310d44d57f13781036dbbdc6bec679).
     * This patch **fails MiMa tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    **[Test build #87378 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87378/testReport)** for PR 20546 at commit [`543caf8`](https://github.com/apache/spark/commit/543caf879468a3ade8350934716443207d2eaeca).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r168146172
  
    --- Diff: core/src/main/java/org/apache/spark/SparkExecutorInfo.java ---
    @@ -30,4 +30,8 @@
       int port();
       long cacheSize();
       int numRunningTasks();
    +  long usedOnHeapStorageMemory();
    --- End diff --
    
    Sure. But users put themselves at those kind of risks by using @DeveloperApi methods, especially ones that have been deprecated.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    **[Test build #87265 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87265/testReport)** for PR 20546 at commit [`5505235`](https://github.com/apache/spark/commit/5505235d1b310d44d57f13781036dbbdc6bec679).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r168143242
  
    --- Diff: core/src/main/java/org/apache/spark/SparkExecutorInfo.java ---
    @@ -30,4 +30,8 @@
       int port();
       long cacheSize();
       int numRunningTasks();
    +  long usedOnHeapStorageMemory();
    --- End diff --
    
    1. That follows the names in the public `MemoryMetrics` class from the REST API.
    2. We could add that, just as we could add a whole lot of other things. At some point we should look at exposing the REST API types directly through `SparkStatusTracker` instead of having these mirror types.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    **[Test build #87266 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87266/testReport)** for PR 20546 at commit [`a797e04`](https://github.com/apache/spark/commit/a797e04de471f4794f785e7c9f81bf83294ce4b8).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r167071122
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/StorageUtils.scala ---
    @@ -24,19 +24,15 @@ import scala.collection.mutable
     
     import sun.nio.ch.DirectBuffer
     
    -import org.apache.spark.annotation.DeveloperApi
     import org.apache.spark.internal.Logging
     
     /**
    - * :: DeveloperApi ::
      * Storage information for each BlockManager.
      *
      * This class assumes BlockId and BlockStatus are immutable, such that the consumers of this
      * class cannot mutate the source of the information. Accesses are not thread-safe.
      */
    -@DeveloperApi
    -@deprecated("This class may be removed or made private in a future release.", "2.2.0")
    -class StorageStatus(
    +private [spark] class StorageStatus(
    --- End diff --
    
    nit: no space after `private`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r167303527
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1715,7 +1715,7 @@ class SparkContext(config: SparkConf) extends Logging {
       private[spark] def getRDDStorageInfo(filter: RDD[_] => Boolean): Array[RDDInfo] = {
    --- End diff --
    
    If those values differ then it's probably a bug in the new code. Or maybe a bug in the old code, although that's less likely. It would be good to investigate why they differ.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87336/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by attilapiros <gi...@git.apache.org>.
Github user attilapiros commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r167270757
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1715,7 +1715,7 @@ class SparkContext(config: SparkConf) extends Logging {
       private[spark] def getRDDStorageInfo(filter: RDD[_] => Boolean): Array[RDDInfo] = {
    --- End diff --
    
    I have found something and I am not sure whether is it a bug. 
    Using rddStorageInfo.numCachedPartitions gives back a different value then the old storage util computed/updated rddInfo.numCachedPartitions.
    
    This is why I changed the assert in org.apache.spark.repl.SingletonReplSuite at "replicating blocks of object with class defined in repl".



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r167070392
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1715,7 +1715,7 @@ class SparkContext(config: SparkConf) extends Logging {
       private[spark] def getRDDStorageInfo(filter: RDD[_] => Boolean): Array[RDDInfo] = {
    --- End diff --
    
    There's a single call to this method outside of tests, in `RDD.toDebugString`. That to me makes it another candidate to go away and be replaced with information from the `AppStatusStore`. Then maybe you can remove more code from `StorageStatus`.
    
    Have you taken a look at that?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by attilapiros <gi...@git.apache.org>.
Github user attilapiros commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r167612371
  
    --- Diff: core/src/test/scala/org/apache/spark/deploy/StandaloneDynamicAllocationSuite.scala ---
    @@ -610,7 +610,7 @@ class StandaloneDynamicAllocationSuite
        * we submit a request to kill them. This must be called before each kill request.
        */
       private def syncExecutors(sc: SparkContext): Unit = {
    -    val driverExecutors = sc.getExecutorStorageStatus
    --- End diff --
    
    As only registered executors can be killed this part synchronised the executors known by the master and the driver (the missing executors from the driver was registered with some mock data). The reason behind was some performance. 
    
    I have run tried to change the code to wait for the executors and it get very slow (one test took even 50 seconds long).   


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    **[Test build #87361 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87361/testReport)** for PR 20546 at commit [`543caf8`](https://github.com/apache/spark/commit/543caf879468a3ade8350934716443207d2eaeca).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r167072049
  
    --- Diff: core/src/test/scala/org/apache/spark/DistributedSuite.scala ---
    @@ -160,10 +160,6 @@ class DistributedSuite extends SparkFunSuite with Matchers with LocalSparkContex
         val data = sc.parallelize(1 to 1000, 10)
         val cachedData = data.persist(storageLevel)
         assert(cachedData.count === 1000)
    -    assert(sc.getExecutorStorageStatus.map(_.rddBlocksById(cachedData.id).size).sum ===
    --- End diff --
    
    You could replace these with code based on `sc.statusStore`.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87216/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by attilapiros <gi...@git.apache.org>.
Github user attilapiros commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r167276875
  
    --- Diff: core/src/test/scala/org/apache/spark/DistributedSuite.scala ---
    @@ -160,10 +160,6 @@ class DistributedSuite extends SparkFunSuite with Matchers with LocalSparkContex
         val data = sc.parallelize(1 to 1000, 10)
         val cachedData = data.persist(storageLevel)
         assert(cachedData.count === 1000)
    -    assert(sc.getExecutorStorageStatus.map(_.rddBlocksById(cachedData.id).size).sum ===
    --- End diff --
    
    Using sc.statusStore here would also cause the bug I mentioned above (rddStorageInfo.numCachedPartitions difference). This is why I left this part untouched.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    **[Test build #87378 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87378/testReport)** for PR 20546 at commit [`543caf8`](https://github.com/apache/spark/commit/543caf879468a3ade8350934716443207d2eaeca).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r168142034
  
    --- Diff: core/src/main/java/org/apache/spark/SparkExecutorInfo.java ---
    @@ -30,4 +30,8 @@
       int port();
       long cacheSize();
       int numRunningTasks();
    +  long usedOnHeapStorageMemory();
    --- End diff --
    
    If you are referring to `StorageStatus`.`onHeapMemUsed`/`offHeapMemUsed`/`onHeapMemRemaining`/`offHeapMemRemaining`, it makes great sense then. But I still don't quite get a few things:
    1. Why do we use a different name format, while we could have let them be the same as in `StorageStatus`?
    2. Are these information all we need to add to `SparkExecutorInfo` ? After this change, how do we expose the information of disk usage?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r168137296
  
    --- Diff: core/src/main/java/org/apache/spark/SparkExecutorInfo.java ---
    @@ -30,4 +30,8 @@
       int port();
       long cacheSize();
       int numRunningTasks();
    +  long usedOnHeapStorageMemory();
    --- End diff --
    
    This information was exposed by the public method being removed by this PR, so it makes sense to add these so that people still have a way to get that data.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87361/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Remove StorageStatus, or make it pri...

Posted by attilapiros <gi...@git.apache.org>.
Github user attilapiros commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    If this change goes into the 2.3 branch then MimaExcludes.scala should be changed accordingly.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    **[Test build #87266 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87266/testReport)** for PR 20546 at commit [`a797e04`](https://github.com/apache/spark/commit/a797e04de471f4794f785e7c9f81bf83294ce4b8).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r167071776
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/StorageUtils.scala ---
    @@ -46,7 +42,7 @@ class StorageStatus(
        * Internal representation of the blocks stored in this block manager.
        *
        * We store RDD blocks and non-RDD blocks separately to allow quick retrievals of RDD blocks.
    -   * These collections should only be mutated through the add/update/removeBlock methods.
    +   * These collections should only be mutated through the addBlock method.
    --- End diff --
    
    I think this is pretty out of date now. I don't see any calls to `addBlock` outside of this class.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20546: [SPARK-20659][Core] Removing sc.getExecutorStorag...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20546#discussion_r167072238
  
    --- Diff: core/src/test/scala/org/apache/spark/deploy/StandaloneDynamicAllocationSuite.scala ---
    @@ -610,7 +610,7 @@ class StandaloneDynamicAllocationSuite
        * we submit a request to kill them. This must be called before each kill request.
        */
       private def syncExecutors(sc: SparkContext): Unit = {
    -    val driverExecutors = sc.getExecutorStorageStatus
    --- End diff --
    
    You could replace this with code based on `sc.statusStore`.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Remove StorageStatus, or make it pri...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    **[Test build #87216 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87216/testReport)** for PR 20546 at commit [`3f5dba6`](https://github.com/apache/spark/commit/3f5dba62cfa8b07d602639e9f88f3fd8cec92831).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by attilapiros <gi...@git.apache.org>.
Github user attilapiros commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    @vanzin In the org.apache.spark.repl.SingletonReplSuite there is a test named "should clone and clean line object in ClosureCleaner" which uses getExecutorStorageStatus to calculate memory consumption of different RDDs. I do not see how to keep this test after my modification. Could you please provide me some help regarding this?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    **[Test build #87232 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87232/testReport)** for PR 20546 at commit [`8544380`](https://github.com/apache/spark/commit/8544380e91f5bfa7c95cf613d49bc9144fedee9f).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    **[Test build #87336 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87336/testReport)** for PR 20546 at commit [`9194acf`](https://github.com/apache/spark/commit/9194acf8afc4c0f2af20c1163a1c1bbd41d8accd).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87232/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87345/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87266/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20546: [SPARK-20659][Core] Removing sc.getExecutorStorageStatus...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20546
  
    **[Test build #87336 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87336/testReport)** for PR 20546 at commit [`9194acf`](https://github.com/apache/spark/commit/9194acf8afc4c0f2af20c1163a1c1bbd41d8accd).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org