You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by CodingCat <gi...@git.apache.org> on 2014/10/20 05:12:51 UTC

[GitHub] spark pull request: [WIP]SPARK-3957: show broadcast variable resou...

GitHub user CodingCat opened a pull request:

    https://github.com/apache/spark/pull/2851

    [WIP]SPARK-3957: show broadcast variable resource usage info in UI

    WIP, finish the most of logic, need to reorganize the pages
    
    ![image](https://cloud.githubusercontent.com/assets/678008/4696076/6bb23818-57e5-11e4-83b9-a59ced0d1bc4.png)


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CodingCat/spark SPARK-3957

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2851.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2851
    
----
commit a8fa6b1980a724315ffdca4d80869b4a9f73abde
Author: Nan Zhu <na...@nans-macbook-pro.local>
Date:   2014-10-17T02:58:40Z

    update page

commit 8f75476cd144b5e6c2b486ec0afccd20f6283644
Author: Nan Zhu <na...@nans-macbook-pro.local>
Date:   2014-10-20T02:58:30Z

    send broadcastInfo in blockManager heartbeat

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-80383022
  
    Hi, @squito , thanks for the comments, I made some revision on the patch


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61421199
  
      [Test build #22767 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22767/consoleFull) for   PR 2851 at commit [`c079b4c`](https://github.com/apache/spark/commit/c079b4ca693f522b99975f662024fe32d32d9495).
     * This patch **does not merge cleanly**.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-70644964
  
      [Test build #25817 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25817/consoleFull) for   PR 2851 at commit [`1f53135`](https://github.com/apache/spark/commit/1f5313515f6151ee4d8d24cefb4ae1f870703f7e).
     * This patch **fails to build**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74151627
  
      [Test build #27377 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27377/consoleFull) for   PR 2851 at commit [`3a0aa25`](https://github.com/apache/spark/commit/3a0aa258a430217cc2dc69b11531230d078b16a7).
     * This patch **fails to build**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-70084764
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25600/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WIP]SPARK-3957: show broadcast variable resou...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-59679560
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21903/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60042976
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22022/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60314638
  
    Hi @CodingCat - This is a great & handy feature, but what I meant was that we should NOT have custom code that tracks broadcast blocks. We can have special UIs for reporting broadcast blocks for convenience, but the underlying reporting mechanism should be the same across all blocks. BlockManager and the associated code are complicated enough (pretty terribly designed by us already), and we should not further complicate it by adding custom code for x, y, and z. We should make this reporting general enough (or use existing general reporting code path) so we can easily report other blocks in the future. 



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60316298
  
    @rxin - Here is a simpler design -- What if we report all broadcast blocks to the master when they are added to a block manager as well (tellMaster = true instead of false in https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala#L95 )
    
    It might be the same network traffic as any reporting mechanism we come up with -- The only concern then is to make sure that we don't wait for the master RPC, and we could do that if `tellMaster` is asynchronous ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r26391837
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
    @@ -181,6 +181,12 @@ private[spark] class EventLoggingListener(
         logEvent(event, flushLogger = true)
       override def onExecutorRemoved(event: SparkListenerExecutorRemoved) =
         logEvent(event, flushLogger = true)
    +  override def onBlockUpdate(event: SparkListenerBlockUpdate) = {
    +    // we only log Broadcast block update for now
    --- End diff --
    
    can you add to this comment that we don't need to log RDD blocks, b/c they are already logged as part of `StageCompleted` events


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74746516
  
      [Test build #27638 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27638/consoleFull) for   PR 2851 at commit [`564f7d2`](https://github.com/apache/spark/commit/564f7d26de6337bfd1fb6453c6344d73c9dd0ed0).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-70644262
  
    ping


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-78560997
  
      [Test build #28529 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28529/consoleFull) for   PR 2851 at commit [`2ea8ee3`](https://github.com/apache/spark/commit/2ea8ee341a40ae543284875b3e6ba333fc07ca9d).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61434872
  
      [Test build #22791 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22791/consoleFull) for   PR 2851 at commit [`84ba580`](https://github.com/apache/spark/commit/84ba5806ac857e48a1bb3d04f8be849410888db0).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-79377540
  
      [Test build #28585 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28585/consoleFull) for   PR 2851 at commit [`49ee118`](https://github.com/apache/spark/commit/49ee1188989422673203eece83d67b590b05bb27).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61420991
  
      [Test build #22766 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22766/consoleFull) for   PR 2851 at commit [`6e4915d`](https://github.com/apache/spark/commit/6e4915dcfaf57a9b7363dce85135b675f2294981).
     * This patch **does not merge cleanly**.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-80476703
  
    thanks for the updates!  just one small style comment.
    FWIW, after thinking about it more, I am in favor of your approach of sending the block updates, just want to get some confirmation.
    
    lgtm (with the minor style update)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-70077701
  
      [Test build #25600 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25600/consoleFull) for   PR 2851 at commit [`bf7095c`](https://github.com/apache/spark/commit/bf7095cc31d4434daabba3bfcfecec98c9389dbb).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74599587
  
    weird...I can pass the test cases in my laptop without any problem....


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74190412
  
      [Test build #27405 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27405/consoleFull) for   PR 2851 at commit [`08ad98d`](https://github.com/apache/spark/commit/08ad98d5a544225c733411fb895de5fa32431c24).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61434380
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22782/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61424135
  
      [Test build #22773 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22773/consoleFull) for   PR 2851 at commit [`5fed94b`](https://github.com/apache/spark/commit/5fed94b076dc5d2e1532bc7e9396f72e2b2074ef).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r26393202
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/StorageStatusListener.scala ---
    @@ -88,4 +88,19 @@ class StorageStatusListener extends SparkListener {
         }
       }
     
    +  override def onBlockUpdate(blockUpdateEvent: SparkListenerBlockUpdate) = synchronized {
    +    val executorId = blockUpdateEvent.blockManagerId.executorId
    +    updateStorageStatus(executorId, Seq((blockUpdateEvent.blockId,
    --- End diff --
    
    If I'm following things correctly, this will also get called for RDD blocks, which are already getting updated by `TaskEnd`.  I suppose its not a problem for them to get updated again with the same info, but you could probably skip the update for non-broadcast blocks, and include a comment as in `EventLoggingListener`.  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61423512
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22770/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-70644966
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25817/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74152876
  
      [Test build #27380 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27380/consoleFull) for   PR 2851 at commit [`b65d49e`](https://github.com/apache/spark/commit/b65d49e1fa4e00e7353851197b9fce92a3e911e1).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r26392169
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
    @@ -357,7 +357,13 @@ private[spark] class BlockManager(
           info: BlockInfo,
           status: BlockStatus,
           droppedMemorySize: Long = 0L): Unit = {
    -    val needReregister = !tryToReportBlockStatus(blockId, info, status, droppedMemorySize)
    +    def ifAsync: Boolean = {
    +      blockId.isBroadcast
    +    }
    +    // asynchronously report broadcast variables, in future we only need to change the
    +    // implementation of ifAsync
    --- End diff --
    
    I find this comment a little bit confusing -- it sounds like you are saying there is future work still to do.  Do you mean something like "Broadcast variables are currently reported asynchronously, but if we later decide they should be reported synchronously, we just need to update ifAsync`"


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61484584
  
    so, this one is ready for the further review
    
    @shivaram @rxin @andrewor14 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24683430
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/StorageUtils.scala ---
    @@ -118,8 +140,20 @@ class StorageStatus(val blockManagerId: BlockManagerId, val maxMem: Long) {
             } else {
               None
             }
    +      case BroadcastBlockId(broadcastId, _) =>
    +        // Actually remove the block, if it exists
    +        if (_broadcastBlocks.contains(broadcastId)) {
    +          val removed = _broadcastBlocks(broadcastId).remove(blockId)
    +          // If the given RDD has no more blocks left, remove the RDD
    --- End diff --
    
    Please, don't do that. It is very hard to understand these kinds of extractors, especially when RddOrBlockId itself doesn't have a real class.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r26395613
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/storage/BroadcastPage.scala ---
    @@ -0,0 +1,86 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.ui.storage
    +
    +import org.apache.spark.storage.StorageUtils
    +import org.apache.spark.ui.UIUtils
    +import org.apache.spark.util.Utils
    +
    +import scala.xml.Node
    +
    +private[ui] class BroadcastPage(parent: StorageTab) extends StorageDetailPage("broadcast", parent){
    +
    +  protected override val workerTableID: String = "broadcast-storage-by-block-table"
    +  
    +  protected override def objectList = listener.broadcastInfoList
    +
    +  protected override def getBlockTableAndSize(objectId: Any): (Seq[Node], Int) = {
    +    val blockLocations = StorageUtils.getBroadcastBlockLocation(objectId.asInstanceOf[Long],
    +      storageStatusList)
    +    val blocks = listener.storageStatusList
    +      .flatMap(_.broadcastBlocksById(objectId.asInstanceOf[Long]))
    +      .sortWith(_._1.name < _._1.name)
    +      .map { case (blockId, status) =>
    +      (blockId, status, blockLocations.get(blockId).getOrElse(Seq[String]("Unknown")))
    +    }
    +    (UIUtils.listingTable(blockHeader, blockRow, blocks, id = Some("rdd-storage-by-block-table")),
    --- End diff --
    
    broadcast, not rdd


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-70084758
  
      [Test build #25600 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25600/consoleFull) for   PR 2851 at commit [`bf7095c`](https://github.com/apache/spark/commit/bf7095cc31d4434daabba3bfcfecec98c9389dbb).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60043790
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22023/consoleFull) for   PR 2851 at commit [`1eff69e`](https://github.com/apache/spark/commit/1eff69eeb19594e6d757a952a7ee82f7d7571fcb).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `          throw new SparkException("Failed to load class to register with Kryo", e)`
      * `sealed abstract class BlockId extends Serializable `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24672255
  
    --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
    @@ -91,6 +91,7 @@ private[spark] object JsonProtocol {
             executorRemovedToJson(executorRemoved)
           // These aren't used, but keeps compiler happy
           case SparkListenerExecutorMetricsUpdate(_, _) => JNothing
    +      case SparkListenerBlockUpdate(_, _, _) => JNothing
    --- End diff --
    
    This event should be converted to json so it can be seen in the UI for completed apps.  You need to implement serialization here, and also deserialization in `sparkEventFromJson`, as that is used by `ReplayListenerBus` to show the UI for completed apps.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-81021347
  
    thanks @squito 
    
    I updated that again


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74151633
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27377/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74731158
  
      [Test build #27638 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27638/consoleFull) for   PR 2851 at commit [`564f7d2`](https://github.com/apache/spark/commit/564f7d26de6337bfd1fb6453c6344d73c9dd0ed0).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74728276
  
    ![image](https://cloud.githubusercontent.com/assets/678008/6235037/622218b4-b6ad-11e4-8cd6-bd291fd4e58a.png)



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24642522
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/StorageUtils.scala ---
    @@ -81,19 +84,38 @@ class StorageStatus(val blockManagerId: BlockManagerId, val maxMem: Long) {
        */
       def rddBlocks: Map[BlockId, BlockStatus] = _rddBlocks.flatMap { case (_, blocks) => blocks }
     
    +
    +  /**
    +   * Return the broadcast blocks stored in this block manager.
    +   *
    +   * Note that this is somewhat expensive, as it involves cloning the underlying maps and then
    +   * concatenating them together. Much faster alternatives exist for common operations such as
    +   * getting the memory, disk, and off-heap memory sizes occupied by this RDD.
    --- End diff --
    
    should be "broadcast variable" instead of "RDD"


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24831712
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/RDDInfo.scala ---
    @@ -21,13 +21,14 @@ import org.apache.spark.annotation.DeveloperApi
     import org.apache.spark.rdd.RDD
     import org.apache.spark.util.Utils
     
    +trait InMemoryObjectInfo
    --- End diff --
    
    you don't really need this trait at all


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-70738809
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25837/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-148821576
  
    Hey @CodingCat, it looks like this PR is significantly out of date. Do you mind closing it for now? You can always re-open / re-submit once we're ready to work on this feature again. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74587826
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27571/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r26394719
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/storage/StorageTab.scala ---
    @@ -79,4 +87,19 @@ class StorageListener(storageStatusListener: StorageStatusListener) extends Spar
       override def onUnpersistRDD(unpersistRDD: SparkListenerUnpersistRDD) = synchronized {
         _rddInfoMap.remove(unpersistRDD.rddId)
       }
    +
    +  override def onBlockUpdate(blockUpdateEvent: SparkListenerBlockUpdate) = synchronized {
    +    // only update broadcast for now, need to be modified if want to track other blocks
    --- End diff --
    
    this comment is slightly misleading -- as in other places, I would instead point out that RDD blocks are handled in `onStageCompleted`, but for now we are ignoring all other blocks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r26393526
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/StorageUtils.scala ---
    @@ -101,25 +122,30 @@ class StorageStatus(val blockManagerId: BlockManagerId, val maxMem: Long) {
       private[spark] def updateBlock(blockId: BlockId, blockStatus: BlockStatus): Unit = {
         addBlock(blockId, blockStatus)
       }
    +  
    +  private def getIdAndBlockMap(blockId: BlockId) = blockId match { 
    +    case RDDBlockId(rddId, _) => (rddId.toLong, _rddBlocks)
    +    case BroadcastBlockId(broadcastId, _) => (broadcastId, _broadcastBlocks)
    +  }
     
       /** Remove the given block from this storage status. */
       private[spark] def removeBlock(blockId: BlockId): Option[BlockStatus] = {
         updateStorageInfo(blockId, BlockStatus.empty)
         blockId match {
    -      case RDDBlockId(rddId, _) =>
    -        // Actually remove the block, if it exists
    -        if (_rddBlocks.contains(rddId)) {
    -          val removed = _rddBlocks(rddId).remove(blockId)
    -          // If the given RDD has no more blocks left, remove the RDD
    -          if (_rddBlocks(rddId).isEmpty) {
    -            _rddBlocks.remove(rddId)
    +      case rddOrBroadcastBlockId @ (_: BroadcastBlockId | _: RDDBlockId) =>
    +        val (id, blockMap) = getIdAndBlockMap(rddOrBroadcastBlockId)
    +        var removed: Option[BlockStatus] = None
    +        if (blockMap.contains(id)) {
    +          removed = blockMap(id).remove(blockId)
    --- End diff --
    
    you don't need the `var` here, you can keep the `val` like before:
    
    ```
    if (blockMap.contains(id)) {
      val removed = blockMap(id).remove(blockId)
      ...
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60085470
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22035/consoleFull) for   PR 2851 at commit [`23b1819`](https://github.com/apache/spark/commit/23b1819d39f21295ef954c95d72c0f8a733ca915).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `sealed abstract class BlockId extends Serializable `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r19747011
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala ---
    @@ -56,6 +56,11 @@ case class SparkListenerTaskEnd(
       extends SparkListenerEvent
     
     @DeveloperApi
    +case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,
    --- End diff --
    
    en...I'm not sure... e.g. when the user created sc.broadcast(1), it should be shown on the UI
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61569172
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22829/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WIP]SPARK-3957: show broadcast variable resou...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r19096569
  
    --- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
    @@ -30,7 +30,8 @@ import org.apache.spark.util.ActorLogReceive
     private[spark] case class Heartbeat(
         executorId: String,
         taskMetrics: Array[(Long, TaskMetrics)], // taskId -> TaskMetrics
    -    blockManagerId: BlockManagerId)
    +    blockManagerId: BlockManagerId,
    +    broadcastBlocks: Map[BlockId, Option[BlockStatus]])
    --- End diff --
    
    Would this send a BlockStatus for each broadcast variable on a heartbeat ? If we have hundreds or thousands of broadcast variables I wonder if the message size will become huge. Could we send deltas somehow ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74151253
  
      [Test build #27377 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27377/consoleFull) for   PR 2851 at commit [`3a0aa25`](https://github.com/apache/spark/commit/3a0aa258a430217cc2dc69b11531230d078b16a7).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat closed the pull request at:

    https://github.com/apache/spark/pull/2851


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60042970
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22022/consoleFull) for   PR 2851 at commit [`c07c306`](https://github.com/apache/spark/commit/c07c306017ffce24a4b21d14f71a9e6734b21575).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `sealed abstract class BlockId extends Serializable `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61429585
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22774/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61429584
  
    **[Test build #22774 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22774/consoleFull)**     for PR 2851 at commit [`78b7461`](https://github.com/apache/spark/commit/78b7461865b7706f9f249a9e7054e51a50c29dc1)     after a configured wait of `120m`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74698831
  
    Hi @CodingCat 
    
    thanks for making all the updates.  Sorry I hadn't realized the subtlety w/ `Int` vs `Long` on the `RDDBlockId` and `BroadcastBlockId`.  Still, I think its still a good change for simplifying the code, glad you figured a way to make it work.   And on issue (2), the memory usage of a worker, I think what you have is correct, its supposed to be memory usage of the particular object on the worker.  This is on a UI page in the context of a particular object -- there is a separate page to summarize the memory usage of the executor overall.  (though I agree the UI is a little confusing ...)
    
    I'll make a few more minor comments on the code, but mostly I just want to get another opinion on the right events to pass the block added events around, as I have mentioned above.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60048094
  
    Can't we just have a general block reporting mechanism rather than special casing for broadcast? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24833690
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/storage/InMemoryObjectPage.scala ---
    @@ -0,0 +1,123 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.ui.storage
    +
    +import javax.servlet.http.HttpServletRequest
    +
    +import org.apache.spark.storage._
    +import org.apache.spark.ui.{UIUtils, WebUIPage}
    +import org.apache.spark.util.Utils
    +
    +import scala.xml.Node
    +
    +private[ui] abstract class InMemoryObjectPage(pageName: String, parent: StorageTab)
    --- End diff --
    
    RDD aren't necessarily in memory ... maybe a better name would be `StorageDetailPage`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74575571
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27570/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61877783
  
    ping


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61425778
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22773/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24833566
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/StorageUtils.scala ---
    @@ -271,4 +368,19 @@ private[spark] object StorageUtils {
         blockLocations
       }
     
    +
    +  /**
    +   * Return a mapping from block ID to its locations for each block that belongs to the given RDD.
    --- End diff --
    
    broadcast block, not RDD


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-79262603
  
      [Test build #28577 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28577/consoleFull) for   PR 2851 at commit [`26977c9`](https://github.com/apache/spark/commit/26977c97add7d47d798985753411645e6cdcac29).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-79050227
  
    Hi @CodingCat , I took another look through everything.  My comments are all very minor.  But I think I'd still like to get some more thoughts from others on how the broadcast block info is passed along.  As I mentioned earlier, it is also possible to get it into `TaskEnd` events, like the RDD blocks.  I'm torn, so I'd like to hear what others think.  Sorry it is taking so long to get feedback ...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-79705666
  
      [Test build #28601 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28601/consoleFull) for   PR 2851 at commit [`9b1bccf`](https://github.com/apache/spark/commit/9b1bccf89c59e95ba7132034149b0d0e657b35b7).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60308342
  
    @andrewor14 do you want to take a look at this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60074404
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22034/consoleFull) for   PR 2851 at commit [`473519a`](https://github.com/apache/spark/commit/473519a4c60f3ecd415d569fc32694ff8b744a22).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60041575
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22023/consoleFull) for   PR 2851 at commit [`1eff69e`](https://github.com/apache/spark/commit/1eff69eeb19594e6d757a952a7ee82f7d7571fcb).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61421467
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22768/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24668891
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/StorageUtils.scala ---
    @@ -118,8 +140,20 @@ class StorageStatus(val blockManagerId: BlockManagerId, val maxMem: Long) {
             } else {
               None
             }
    +      case BroadcastBlockId(broadcastId, _) =>
    +        // Actually remove the block, if it exists
    +        if (_broadcastBlocks.contains(broadcastId)) {
    +          val removed = _broadcastBlocks(broadcastId).remove(blockId)
    +          // If the given RDD has no more blocks left, remove the RDD
    --- End diff --
    
    you could tighten it up even more if you moved that helper method to an extractor:
    
    ```
    object RddOrBlockId {
      def unapply(blockId: BlockId): Option[(Int, Map[BlockId,BlockStatus])] = {
        case RDDBlockId(rddId, _) => Some((rddId, _rddBlocks))
        case BroadcastBlockId(broadcastId, _) => Some((broadcastId, _broadcastBlocks))
        case _ => None
      }
    }
    ```
    
    then the actual matching in each of these blocks would simplify to:
    
    ```
      case RddOrBlockId(id, blockMap) =>
        val removed = blockMap(id).remove(blockId)
        ...
    ```
    (the point is, the helper function to get the id & blockMap goes right into the pattern match.)  But maybe this is a bit of scala that is too esoteric for our code base for saving one line in each of these cases.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74595231
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27588/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60077874
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22034/consoleFull) for   PR 2851 at commit [`473519a`](https://github.com/apache/spark/commit/473519a4c60f3ecd415d569fc32694ff8b744a22).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `sealed abstract class BlockId extends Serializable `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60098610
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22036/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60075293
  
    @rxin I'm not sure, maybe we don't need that, because currently all RDD blocks are not reported via network, but by calling post(...) from the driver


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61421992
  
      [Test build #22770 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22770/consoleFull) for   PR 2851 at commit [`8f2253e`](https://github.com/apache/spark/commit/8f2253e70111f0eb0caa83508c2bb6b1c6d271fa).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61437845
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22791/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74798631
  
    Hi, @squito , thanks for the further comments
    
    I have addressed them and also replied under some of them
    
    The snapshots was uploaded, 
    
    I will not be here tomorrow but will be back after that.... 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61422283
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22769/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61421361
  
      [Test build #22767 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22767/consoleFull) for   PR 2851 at commit [`c079b4c`](https://github.com/apache/spark/commit/c079b4ca693f522b99975f662024fe32d32d9495).
     * This patch **fails Scala style tests**.
     * This patch **does not merge cleanly**.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `sealed abstract class BlockId extends Serializable `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-79608399
  
      [Test build #28601 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28601/consoleFull) for   PR 2851 at commit [`9b1bccf`](https://github.com/apache/spark/commit/9b1bccf89c59e95ba7132034149b0d0e657b35b7).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-78615124
  
    Hi, @squito , would you mind taking further review when you have time?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74575563
  
      [Test build #27570 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27570/consoleFull) for   PR 2851 at commit [`b05196a`](https://github.com/apache/spark/commit/b05196af57e2c9ad8504b7ab12f1d3b2ea454067).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60085483
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22035/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74568258
  
      [Test build #27569 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27569/consoleFull) for   PR 2851 at commit [`57d3f46`](https://github.com/apache/spark/commit/57d3f4609d069c8891cda3f0658d1eb1bb93e9f1).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74510287
  
    well I was leaning towards using a `ThreadLocal` to get the broadcast blocks into the task end event ... but I forgot that broadcast blocks are also created by the driver (eg. [here](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala#L98) or [here](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/broadcast/HttpBroadcast.scala#L52)). It doesn't make any sense to stick those into a task end event.  I suppose they could go into a task **start** event, but that is still a little weird, maybe we really do need a new event.
    
    @rxin @andrewor14 thoughts?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-70727414
  
      [Test build #25837 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25837/consoleFull) for   PR 2851 at commit [`6b25507`](https://github.com/apache/spark/commit/6b255078375b1861fd252e3db365c73d38ac1945).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-78607932
  
      [Test build #28534 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28534/consoleFull) for   PR 2851 at commit [`223cdfb`](https://github.com/apache/spark/commit/223cdfb6c02412f6bafbbbe32e160f08a96e984f).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r26395589
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/storage/RDDPage.scala ---
    @@ -17,134 +17,84 @@
     
     package org.apache.spark.ui.storage
     
    -import javax.servlet.http.HttpServletRequest
    -
     import scala.xml.Node
     
    -import org.apache.spark.storage.{BlockId, BlockStatus, StorageStatus, StorageUtils}
    -import org.apache.spark.ui.{WebUIPage, UIUtils}
    +import org.apache.spark.storage.StorageUtils
    +import org.apache.spark.ui.UIUtils
     import org.apache.spark.util.Utils
     
     /** Page showing storage details for a given RDD */
    -private[ui] class RDDPage(parent: StorageTab) extends WebUIPage("rdd") {
    -  private val listener = parent.listener
    -
    -  def render(request: HttpServletRequest): Seq[Node] = {
    -    val parameterId = request.getParameter("id")
    -    require(parameterId != null && parameterId.nonEmpty, "Missing id parameter")
    -
    -    val rddId = parameterId.toInt
    -    val storageStatusList = listener.storageStatusList
    -    val rddInfo = listener.rddInfoList.find(_.id == rddId).getOrElse {
    -      // Rather than crashing, render an "RDD Not Found" page
    -      return UIUtils.headerSparkPage("RDD Not Found", Seq[Node](), parent)
    -    }
    +private[ui] class RDDPage(parent: StorageTab) extends StorageDetailPage("rdd", parent) {
     
    -    // Worker table
    -    val workers = storageStatusList.map((rddId, _))
    -    val workerTable = UIUtils.listingTable(workerHeader, workerRow, workers,
    -      id = Some("rdd-storage-by-worker-table"))
    +  protected override val workerTableID: String = "rdd-storage-by-block-table"
    --- End diff --
    
    by *worker*, not by *block*


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-78577370
  
      [Test build #28529 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28529/consoleFull) for   PR 2851 at commit [`2ea8ee3`](https://github.com/apache/spark/commit/2ea8ee341a40ae543284875b3e6ba333fc07ca9d).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60080906
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22035/consoleFull) for   PR 2851 at commit [`23b1819`](https://github.com/apache/spark/commit/23b1819d39f21295ef954c95d72c0f8a733ca915).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74151412
  
    Hey, @andrewor14 
    
    Sorry for the late response, I just cleaned a congested task queue in my schedule...I will address your comments within today


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-71409994
  
    Hey @CodingCat this seems like a very complicated change for showing broadcast variables. In particular I do not understand why we need to introduce a new listener event for this. Can you summarize how the broadcast blocks info is sent to the driver before and after this PR?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-70738795
  
      [Test build #25837 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25837/consoleFull) for   PR 2851 at commit [`6b25507`](https://github.com/apache/spark/commit/6b255078375b1861fd252e3db365c73d38ac1945).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74189206
  
      [Test build #27404 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27404/consoleFull) for   PR 2851 at commit [`5c60a34`](https://github.com/apache/spark/commit/5c60a345e9b6e34f51403df1bf721d00fac7bffa).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61437843
  
      [Test build #22791 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22791/consoleFull) for   PR 2851 at commit [`84ba580`](https://github.com/apache/spark/commit/84ba5806ac857e48a1bb3d04f8be849410888db0).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `sealed abstract class BlockId extends Serializable `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r26393712
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/StorageUtils.scala ---
    @@ -166,28 +195,69 @@ class StorageStatus(val blockManagerId: BlockManagerId, val maxMem: Long) {
        * Note that this is much faster than `this.rddBlocksById(rddId).size`, which is
        * O(blocks in this RDD) time.
        */
    -  def numRddBlocksById(rddId: Int): Int = _rddBlocks.get(rddId).map(_.size).getOrElse(0)
    +  def numRddBlocksById(rddId: Long): Int = _rddBlocks.get(rddId).map(_.size).getOrElse(0)
    +
    +
    +  /**
    +   * Return the number of Broadcast blocks stored in this block manager in O(RDDs) time.
    +   * Note that this is much faster than `this.rddBlocks.size`, which is O(RDD blocks) time.
    +   */
    +  def numBroadcastBlocks: Int = _broadcastBlocks.values.map(_.size).sum
    +
    +  /**
    +   * Return the number of blocks that belong to the given Broadcast variable in O(1) time.
    +   * Note that this is much faster than `this.rddBlocksById(rddId).size`, which is
    +   * O(blocks in this RDD) time.
    --- End diff --
    
    same here


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24833573
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/RDDInfo.scala ---
    @@ -49,9 +50,40 @@ class RDDInfo(
       }
     }
     
    +
     private[spark] object RDDInfo {
       def fromRdd(rdd: RDD[_]): RDDInfo = {
         val rddName = Option(rdd.name).getOrElse(rdd.id.toString)
         new RDDInfo(rdd.id, rddName, rdd.partitions.size, rdd.getStorageLevel)
       }
     }
    +
    +@DeveloperApi
    +class BroadcastInfo(
    +    val id: Long,
    +    val name: String,
    +    val numPartitions: Int) extends Ordered[BroadcastInfo] with InMemoryObjectInfo {
    +
    +  var memSize = 0L
    +  var diskSize = 0L
    +  var tachyonSize = 0L
    +
    +  override def toString = {
    +    import Utils.bytesToString
    +    ("%s\" (%d) ; " +
    +      "MemorySize: %s; TachyonSize: %s; DiskSize: %s").format(
    +        name, id, bytesToString(memSize), bytesToString(tachyonSize), bytesToString(diskSize))
    +  }
    +
    +  override def compare(that: BroadcastInfo): Int = {
    +    if (this.id > that.id) {
    +      1
    +    } else {
    +      if (this.id == that.id) {
    +        return 0
    +      }
    +      -1
    --- End diff --
    
    If you really want a suggestion here :-):
    
        com.google.common.primitives.Longs.compare(this.id, that.id)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61434377
  
      [Test build #22782 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22782/consoleFull) for   PR 2851 at commit [`7aa68b0`](https://github.com/apache/spark/commit/7aa68b090d399e03ee1919f8c54689863a546f4d).
     * This patch **fails MiMa tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `sealed abstract class BlockId extends Serializable `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74169209
  
      [Test build #27387 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27387/consoleFull) for   PR 2851 at commit [`bf03271`](https://github.com/apache/spark/commit/bf032716ed7f7d8d57e44c936e70385ab775c6c8).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-79320145
  
      [Test build #28577 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28577/consoleFull) for   PR 2851 at commit [`26977c9`](https://github.com/apache/spark/commit/26977c97add7d47d798985753411645e6cdcac29).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61424962
  
      [Test build #22774 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22774/consoleFull) for   PR 2851 at commit [`78b7461`](https://github.com/apache/spark/commit/78b7461865b7706f9f249a9e7054e51a50c29dc1).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24840983
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala ---
    @@ -522,7 +523,9 @@ private[spark] class BlockManagerInfo(
             logInfo("Removed %s on %s on tachyon (size: %s)".format(
               blockId, blockManagerId.hostPort, Utils.bytesToString(blockStatus.tachyonSize)))
           }
    +      return BlockStatus(storageLevel, 0, 0, 0)
         }
    +    null
    --- End diff --
    
    actually the code should not enter into this line...unless something wrong happened in updating https://github.com/CodingCat/spark/blob/SPARK-3957/core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala#L457
    
    maybe I can change https://github.com/CodingCat/spark/blob/SPARK-3957/core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala#L373 to an Option, or add a if guard in https://github.com/CodingCat/spark/blob/SPARK-3957/core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala#L373


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61425776
  
      [Test build #22773 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22773/consoleFull) for   PR 2851 at commit [`5fed94b`](https://github.com/apache/spark/commit/5fed94b076dc5d2e1532bc7e9396f72e2b2074ef).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `sealed abstract class BlockId extends Serializable `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r19746366
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockId.scala ---
    @@ -30,12 +30,13 @@ import org.apache.spark.annotation.DeveloperApi
      * If your BlockId should be serializable, be sure to add it to the BlockId.apply() method.
      */
     @DeveloperApi
    -sealed abstract class BlockId {
    +sealed abstract class BlockId extends Serializable {
    --- End diff --
    
    Is there a reason to make this serializable ? There are subclasses which are explicitly marked as non serializable https://github.com/CodingCat/spark/blob/SPARK-3957/core/src/main/scala/org/apache/spark/storage/BlockId.scala#L89 @aarondav can probably comment ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-79705758
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28601/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24840220
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/RDDInfo.scala ---
    @@ -21,13 +21,14 @@ import org.apache.spark.annotation.DeveloperApi
     import org.apache.spark.rdd.RDD
     import org.apache.spark.util.Utils
     
    +trait InMemoryObjectInfo
    --- End diff --
    
    Actually I used it in https://github.com/apache/spark/pull/2851/files#diff-dc63d4ab5b2ed5a2135c054be78820eeR47
    
    I need a common base class to save RDD and Broadcast Info ...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74185015
  
    Hi, @andrewor14, thanks for the review.
    
    To answer your question:
    
    > how the broadcast blocks info is sent to the driver before and after this PR?
    
    before this PR, the broadcast blocks was not sent to the driver from executors, 
    
    in this PR, I am exactly using the strategy mentione by @shivaram, i.e. setting tellMaster=true
    
    > Would all the new logic just go away if we simply set tellMaster = true for all broadcast blocks,
    
    No...the complexity comes from 1) updating the page, and 2) as mentioned by @rxin , we need to avoid introduce more complexity to BlockManagers (at the beginning I simply piggyback the broadcast info in the BlockManager heartbeat)
    
    to address 2), I use the listenerBus to notify the StorageStatusListener and StorageListener that some of the blocks are updated, they need to apply the operation respectively (StorageStatusListener => update executorIdToStorageStatus, StorageListener => update internal data structure)
    
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24642582
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/StorageUtils.scala ---
    @@ -118,8 +140,20 @@ class StorageStatus(val blockManagerId: BlockManagerId, val maxMem: Long) {
             } else {
               None
             }
    +      case BroadcastBlockId(broadcastId, _) =>
    +        // Actually remove the block, if it exists
    +        if (_broadcastBlocks.contains(broadcastId)) {
    +          val removed = _broadcastBlocks(broadcastId).remove(blockId)
    +          // If the given RDD has no more blocks left, remove the RDD
    --- End diff --
    
    again, broadcast variable instead of RDD


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-148867896
  
    sure, I will close this temporarily 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60319128
  
    Hi, @shivaram, do you mean we send the report with tell instead of askDriverWithReply....hmmm...
    
    what's the original motivation to send BlockInfo "synchronously"?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r26394350
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/storage/StorageTab.scala ---
    @@ -39,11 +40,18 @@ private[ui] class StorageTab(parent: SparkUI) extends SparkUITab(parent, "storag
     @DeveloperApi
     class StorageListener(storageStatusListener: StorageStatusListener) extends SparkListener {
       private[ui] val _rddInfoMap = mutable.Map[Int, RDDInfo]() // exposed for testing
    +  private[ui] val _broadcastInfoMap = mutable.Map[Long, BroadcastInfo]()
     
       def storageStatusList = storageStatusListener.storageStatusList
     
       /** Filter RDD info to include only those with cached partitions */
    -  def rddInfoList = _rddInfoMap.values.filter(_.numCachedPartitions > 0).toSeq
    +  def rddInfoList = {
    +    _rddInfoMap.values.filter(_.numCachedPartitions > 0).toSeq
    +  }
    +    
    +
    +  /** Filter broadcast info to include only those with cached partitions */
    --- End diff --
    
    delete this comment


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61569165
  
      [Test build #22829 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22829/consoleFull) for   PR 2851 at commit [`a279ef8`](https://github.com/apache/spark/commit/a279ef8ed315765587ee5212ee7ac5975cba4272).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-79466606
  
      [Test build #28585 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28585/consoleFull) for   PR 2851 at commit [`49ee118`](https://github.com/apache/spark/commit/49ee1188989422673203eece83d67b590b05bb27).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61421126
  
      [Test build #22766 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22766/consoleFull) for   PR 2851 at commit [`6e4915d`](https://github.com/apache/spark/commit/6e4915dcfaf57a9b7363dce85135b675f2294981).
     * This patch **fails Scala style tests**.
     * This patch **does not merge cleanly**.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `sealed abstract class BlockId extends Serializable `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74573745
  
      [Test build #27571 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27571/consoleFull) for   PR 2851 at commit [`34aed25`](https://github.com/apache/spark/commit/34aed25737ef896356505e4ca2f50dd2d788b98c).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74602430
  
      [Test build #27595 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27595/consoleFull) for   PR 2851 at commit [`0d854b8`](https://github.com/apache/spark/commit/0d854b835db225baa3b2ee2271d03afa87d9a3e8).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24830276
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListenerBus.scala ---
    @@ -24,7 +24,13 @@ import org.apache.spark.util.ListenerBus
      */
     private[spark] trait SparkListenerBus extends ListenerBus[SparkListener, SparkListenerEvent] {
     
    +  private[spark] var filter: DefaultSparkListenerEventFilter = new DefaultSparkListenerEventFilter
    +
       override def onPostEvent(listener: SparkListener, event: SparkListenerEvent): Unit = {
    +    if (!filter.validate(event)) {
    +      return  
    --- End diff --
    
    also, I think there is still one missing piece to get the json into the event logging for the history server.  You need to implement to put in the implementation for `onBlockUpdate` in `EventLoggingListener`.  I am suggesting that you just put this filter into that implementation, and get rid of the `Filter` abstraction.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r19755610
  
    --- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
    @@ -19,7 +19,7 @@ package org.apache.spark
     
     import akka.actor.Actor
     import org.apache.spark.executor.TaskMetrics
    -import org.apache.spark.storage.BlockManagerId
    +import org.apache.spark.storage.{BlockStatus, BlockId, BlockManagerId}
    --- End diff --
    
    en...my bad, something from heartbeat piggyback version...will fix it together with other issues


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r19746633
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala ---
    @@ -56,6 +56,11 @@ case class SparkListenerTaskEnd(
       extends SparkListenerEvent
     
     @DeveloperApi
    +case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,
    --- End diff --
    
    From what I can see RDD blocks are updated at the end of a stage ? That should be fine for broadcast blocks too if that is the case -- I am just not sure we need a new listener event, filter etc for just this use case. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61421418
  
      [Test build #22768 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22768/consoleFull) for   PR 2851 at commit [`0fd22c9`](https://github.com/apache/spark/commit/0fd22c9f3f2bbb82be70c8a0279facde9f74d096).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r19746018
  
    --- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
    @@ -19,7 +19,7 @@ package org.apache.spark
     
     import akka.actor.Actor
     import org.apache.spark.executor.TaskMetrics
    -import org.apache.spark.storage.BlockManagerId
    +import org.apache.spark.storage.{BlockStatus, BlockId, BlockManagerId}
    --- End diff --
    
    import seems unnecessary ? There are similar diffs below with imports which don't look like they are required


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61423509
  
      [Test build #22770 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22770/consoleFull) for   PR 2851 at commit [`8f2253e`](https://github.com/apache/spark/commit/8f2253e70111f0eb0caa83508c2bb6b1c6d271fa).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `sealed abstract class BlockId extends Serializable `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r19752142
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala ---
    @@ -56,6 +56,11 @@ case class SparkListenerTaskEnd(
       extends SparkListenerEvent
     
     @DeveloperApi
    +case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,
    --- End diff --
    
    Well when the user does sc.broadcast(1), the variable is only created on the driver. Its only when the variable is used that it gets fetched on the executors.  While the driver memory / disk reporting is definitely required, I think it might be okay for it only show up after some stage has run.
    
    @rxin - any thoughts ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r19769048
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala ---
    @@ -56,6 +56,11 @@ case class SparkListenerTaskEnd(
       extends SparkListenerEvent
     
     @DeveloperApi
    +case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,
    --- End diff --
    
    I just looked at the possible solution to update broadcast var info upon the stage finished (just attempted to implement this in a single commit so that I can easily revert to shorten the RTT), the problem is, how to associate a broadcast variable with the stage....I didn't find a very reasonable solution on this, because for RDD, we need that information to create a stage object, so everything works straightforwardly...for broadcast...I'm not sure if we can really do that....maybe I missed something 
    
    @shivaram @rxin ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74169221
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27387/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60098599
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22036/consoleFull) for   PR 2851 at commit [`2db9dd9`](https://github.com/apache/spark/commit/2db9dd994dcf4b0c9248488d06698b9c01d9b746).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `sealed abstract class BlockId extends Serializable `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WIP]SPARK-3957: show broadcast variable resou...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-59679500
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21903/consoleFull) for   PR 2851 at commit [`8f75476`](https://github.com/apache/spark/commit/8f75476cd144b5e6c2b486ec0afccd20f6283644).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61431627
  
      [Test build #22782 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22782/consoleFull) for   PR 2851 at commit [`7aa68b0`](https://github.com/apache/spark/commit/7aa68b090d399e03ee1919f8c54689863a546f4d).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74583601
  
    Potential problems in the current patch
    
    1. to de-duplicate the code in StorageUtils, I save RDD ID with **Int** type in a **Long** Field (https://github.com/apache/spark/pull/2851/files#diff-21027f5c826cd378daaae5f7c3eea2b5R41), otherwise it's hard to get a BlockMap according to the blockId (https://github.com/apache/spark/pull/2851/files#diff-21027f5c826cd378daaae5f7c3eea2b5R126) (you will see a lot of type imcompatibility from the compiler)
    
    2. InMemoryObjectPage.scala (https://github.com/apache/spark/pull/2851/files#diff-dc63d4ab5b2ed5a2135c054be78820eeR97), I'm not sure if the semantics of (memory usage of a worker) is correct...it should be the memory used by the current InMemoryObject or the total memory usage?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by shivaram <gi...@git.apache.org>.
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60316582
  
    Sorry that link should have been https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala#L181


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74189213
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27404/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74581075
  
      [Test build #27588 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27588/consoleFull) for   PR 2851 at commit [`04726c4`](https://github.com/apache/spark/commit/04726c410909144581a308651837c2039b841276).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74153428
  
      [Test build #27380 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27380/consoleFull) for   PR 2851 at commit [`b65d49e`](https://github.com/apache/spark/commit/b65d49e1fa4e00e7353851197b9fce92a3e911e1).
     * This patch **fails to build**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74190415
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27405/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-79320222
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28577/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61180867
  
    haven't forgot this, I will make it done tomorrow


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74713997
  
    can you also post a screenshot of the detailed page for a broadcast var?  Ideally involving a broadcast var that gets turned into multiple blocks by `TorrentBroadcast`, I think just needs to be over 4MB by default.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60077880
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22034/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60315631
  
    @rxin, I see
    
    then I will try to refactor the reporting mechanism (currently piggyback in heartbeat) to make it more general....


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74593574
  
      [Test build #27595 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27595/consoleFull) for   PR 2851 at commit [`0d854b8`](https://github.com/apache/spark/commit/0d854b835db225baa3b2ee2271d03afa87d9a3e8).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61421363
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22767/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74568272
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27569/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74599888
  
    oh...the failed one is the previous submission.....


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74595226
  
    **[Test build #27588 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27588/consoleFull)**     for PR 2851 at commit [`04726c4`](https://github.com/apache/spark/commit/04726c410909144581a308651837c2039b841276)     after a configured wait of `120m`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r26391448
  
    --- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
    @@ -67,10 +67,8 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, scheduler: TaskSchedule
       
       override def receiveWithLogging = {
         case Heartbeat(executorId, taskMetrics, blockManagerId) =>
    -      val unknownExecutor = !scheduler.executorHeartbeatReceived(
    -        executorId, taskMetrics, blockManagerId)
    -      val response = HeartbeatResponse(reregisterBlockManager = unknownExecutor)
    -      executorLastSeen(executorId) = System.currentTimeMillis()
    +      val response = HeartbeatResponse(
    +        !scheduler.executorHeartbeatReceived(executorId, taskMetrics, blockManagerId))
    --- End diff --
    
    did you mean to remove the update of `executorLastSeen`?  seems like maybe a mistake in the merge?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74602432
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27595/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WIP]SPARK-3957: show broadcast variable resou...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r19185382
  
    --- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
    @@ -30,7 +30,8 @@ import org.apache.spark.util.ActorLogReceive
     private[spark] case class Heartbeat(
         executorId: String,
         taskMetrics: Array[(Long, TaskMetrics)], // taskId -> TaskMetrics
    -    blockManagerId: BlockManagerId)
    +    blockManagerId: BlockManagerId,
    +    broadcastBlocks: Map[BlockId, Option[BlockStatus]])
    --- End diff --
    
    good point, I will make this PR this evening 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61421464
  
      [Test build #22768 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22768/consoleFull) for   PR 2851 at commit [`0fd22c9`](https://github.com/apache/spark/commit/0fd22c9f3f2bbb82be70c8a0279facde9f74d096).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class ExecutorLostFailure(execId: String) extends TaskFailedReason `
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `sealed abstract class BlockId extends Serializable `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r26394598
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/storage/StorageTab.scala ---
    @@ -79,4 +87,19 @@ class StorageListener(storageStatusListener: StorageStatusListener) extends Spar
       override def onUnpersistRDD(unpersistRDD: SparkListenerUnpersistRDD) = synchronized {
         _rddInfoMap.remove(unpersistRDD.rddId)
       }
    +
    +  override def onBlockUpdate(blockUpdateEvent: SparkListenerBlockUpdate) = synchronized {
    +    // only update broadcast for now, need to be modified if want to track other blocks
    +    val broadcastIdOpt = blockUpdateEvent.blockId.asBroadcastId
    +    if (broadcastIdOpt.isDefined) {
    --- End diff --
    
    I don't think it particularly matters here, but fyi a more "scala" way of doing this is:
    
    ```
    blockUpdateEvent.blockId.asBroadcastId.foreach{broadcastBlock =>
      val broadcastId = broadcastBlock.broadcastId
      ...
    }
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-70644705
  
      [Test build #25817 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25817/consoleFull) for   PR 2851 at commit [`1f53135`](https://github.com/apache/spark/commit/1f5313515f6151ee4d8d24cefb4ae1f870703f7e).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WIP]SPARK-3957: show broadcast variable resou...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-59679558
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21903/consoleFull) for   PR 2851 at commit [`8f75476`](https://github.com/apache/spark/commit/8f75476cd144b5e6c2b486ec0afccd20f6283644).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `sealed abstract class BlockId extends Serializable `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24833405
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/RDDInfo.scala ---
    @@ -49,9 +50,40 @@ class RDDInfo(
       }
     }
     
    +
     private[spark] object RDDInfo {
       def fromRdd(rdd: RDD[_]): RDDInfo = {
         val rddName = Option(rdd.name).getOrElse(rdd.id.toString)
         new RDDInfo(rdd.id, rddName, rdd.partitions.size, rdd.getStorageLevel)
       }
     }
    +
    +@DeveloperApi
    +class BroadcastInfo(
    +    val id: Long,
    +    val name: String,
    +    val numPartitions: Int) extends Ordered[BroadcastInfo] with InMemoryObjectInfo {
    +
    +  var memSize = 0L
    +  var diskSize = 0L
    +  var tachyonSize = 0L
    +
    +  override def toString = {
    +    import Utils.bytesToString
    +    ("%s\" (%d) ; " +
    +      "MemorySize: %s; TachyonSize: %s; DiskSize: %s").format(
    +        name, id, bytesToString(memSize), bytesToString(tachyonSize), bytesToString(diskSize))
    +  }
    +
    +  override def compare(that: BroadcastInfo): Int = {
    +    if (this.id > that.id) {
    +      1
    +    } else {
    +      if (this.id == that.id) {
    +        return 0
    +      }
    +      -1
    --- End diff --
    
    super minor:
    ```
    if (this.id > that.id) {
      1
    }  else if (this.is == that.id) {
      0
    }  else {
      -1
    }
    ```
    (sorry I was wrong w/ the earlier suggestion)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61557580
  
    just addressed the other comments, thanks @shivaram 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74447987
  
    working on this now


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74293360
  
    Hi @CodingCat ,
    
    thanks for working on this, sorry the review has been dragging out.  This will be a nice addition, but I have some concerns similar to the other reviewers -- but lemme explain in a little more detail.  I don't necessarily see anything wrong with your approach for getting the Broadcast block info by itself, but it seems like it ought to share a similar code path to tracking the RDDBlocks.  RDDBlocks get updated in the `SparkListenerTaskEnd` event, through `event.taskMetrics.updatedBlocks`.  this makes its way to the current `StorageTab` in [`StorageListener`](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ui/storage/StorageTab.scala#L59), which then [filters down to the RDDBlocks](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ui/storage/StorageTab.scala#L50) via `BlockId.asRDDId`.
    
    So, this makes me wonder about two possible alternatives for unifying the code paths:
    
    1) Can you get the Broadcast block ids into the same `SparkListenerTaskEnd` event, which would eliminate the need to create a new event?  (In fact, maybe those block ids are already there?  I am still looking into the code for how those events get created ...)
    
    2) If we can't get Broadcast block ids into the same `SparkListenerTaskEnd`, does it make sense to change the RDD block tracking logic to do the same thing?  Yes this will be a bigger change, but it seems strange to have these two blocks tracked by such different mechanisms so it will lead to simpler code overall.  (But really that pushes me to option #1)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60319425
  
    I mean Akka's tell, not 
    
    ```
     private def tell(message: Any) {
        if (!askDriverWithReply[Boolean](message)) {
          throw new SparkException("BlockManagerMasterActor returned false, expected true.")
        }
      }
    ```
    
    (.....why we have such a method....)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60040650
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22022/consoleFull) for   PR 2851 at commit [`c07c306`](https://github.com/apache/spark/commit/c07c306017ffce24a4b21d14f71a9e6734b21575).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74746530
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27638/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24696753
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BroadcastInfo.scala ---
    @@ -0,0 +1,52 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.storage
    +
    +import org.apache.spark.annotation.DeveloperApi
    +import org.apache.spark.broadcast.{TorrentBroadcast, Broadcast}
    +import org.apache.spark.util.Utils
    +import org.apache.spark.util.Utils._
    +
    +@DeveloperApi
    +class BroadcastInfo(
    +    val id: Long,
    +    val name: String,
    +    val numPartitions: Int) extends Ordered[BroadcastInfo] {
    +
    +  var memSize = 0L
    +  var diskSize = 0L
    +  var tachyonSize = 0L
    +
    +  override def toString = {
    +    import Utils.bytesToString
    +    ("%s\" (%d) ; " +
    +      "MemorySize: %s; TachyonSize: %s; DiskSize: %s").format(
    +        name, id, bytesToString(memSize), bytesToString(tachyonSize), bytesToString(diskSize))
    +  }
    +
    +  override def compare(that: BroadcastInfo) = {
    +    if (this.id > that.id) {
    +      1
    +    } else {
    +      if (this.id == that.id) {
    +        0
    +      }
    +      -1
    --- End diff --
    
    what type does `compare` return? It's generally an `int`, and `id` is a long. (BTW, *hint*hint*, make the method's return type explicit.)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60087123
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22036/consoleFull) for   PR 2851 at commit [`2db9dd9`](https://github.com/apache/spark/commit/2db9dd994dcf4b0c9248488d06698b9c01d9b746).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24671936
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/storage/BroadcastPage.scala ---
    @@ -0,0 +1,152 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.ui.storage
    +
    +import javax.servlet.http.HttpServletRequest
    +
    +import org.apache.spark.storage.{BlockStatus, BlockId, StorageStatus, StorageUtils}
    +import org.apache.spark.ui.{UIUtils, WebUIPage}
    +import org.apache.spark.util.Utils
    +
    +import scala.xml.Node
    +
    +private[ui] class BroadcastPage(parent: StorageTab) extends WebUIPage("broadcast"){
    --- End diff --
    
    this is just a minor variation of `RDDPage`, right?  better to make a common superclass with most of the code, then `RDDPage` and `BroadcastPage` would just need a couple of tweaks (eg., labels in the UI, which block map to look at etc.)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-78577401
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28529/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74184738
  
      [Test build #27405 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27405/consoleFull) for   PR 2851 at commit [`08ad98d`](https://github.com/apache/spark/commit/08ad98d5a544225c733411fb895de5fa32431c24).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24673367
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala ---
    @@ -466,10 +467,11 @@ private[spark] class BlockManagerInfo(
           storageLevel: StorageLevel,
           memSize: Long,
           diskSize: Long,
    -      tachyonSize: Long) {
    +      tachyonSize: Long): BlockStatus =  {
     
    -    updateLastSeenMs()
    +    var blockStatus: BlockStatus = null
    --- End diff --
    
    we can get rid of this var -- just have each branch of the if/else finish w/ the block status you want to return. 
    
    ```
    if (_blocks.containsKey(blockId)) {
      ...
      if (storageLevel.isValid) {
        ...
       _blocks.get(blockId)
      } else if (_blocks.containsKey(blockId)) {//I realize you aren't modifying this part, but isn't this condition redundant?
        val blockStatus: BlockStatus = _blocks.get(blockId)
        ...
        BlockStatus(storageLevel, 0, 0, 0)
      }
    } else {
      null
    }
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r26392543
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
    @@ -357,7 +357,13 @@ private[spark] class BlockManager(
           info: BlockInfo,
           status: BlockStatus,
           droppedMemorySize: Long = 0L): Unit = {
    -    val needReregister = !tryToReportBlockStatus(blockId, info, status, droppedMemorySize)
    +    def ifAsync: Boolean = {
    +      blockId.isBroadcast
    +    }
    +    // asynchronously report broadcast variables, in future we only need to change the
    +    // implementation of ifAsync
    --- End diff --
    
    hmmm...I mean, for reporting broadcast, we do it asynchronously, but in future, if we want to report other blocks, we can make ifAsync logic more complicate, e.g.
    
    ```
    
    if (blockId.isBroadcast) true
    else false
    
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61421127
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22766/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r19752830
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockId.scala ---
    @@ -30,12 +30,13 @@ import org.apache.spark.annotation.DeveloperApi
      * If your BlockId should be serializable, be sure to add it to the BlockId.apply() method.
      */
     @DeveloperApi
    -sealed abstract class BlockId {
    +sealed abstract class BlockId extends Serializable {
    --- End diff --
    
    oops, my fault....it's something left by last version......I will fix it after we get feedback about whether we need to add another sparklistenerevent....


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24642955
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/StorageUtils.scala ---
    @@ -118,8 +140,20 @@ class StorageStatus(val blockManagerId: BlockManagerId, val maxMem: Long) {
             } else {
               None
             }
    +      case BroadcastBlockId(broadcastId, _) =>
    +        // Actually remove the block, if it exists
    +        if (_broadcastBlocks.contains(broadcastId)) {
    +          val removed = _broadcastBlocks(broadcastId).remove(blockId)
    +          // If the given RDD has no more blocks left, remove the RDD
    --- End diff --
    
    actually now I am thinking there is a lot of copy-pasting that could be cleaned up.  you could mostly merge this w/ the block above if you did something like:
    
    ```
    case rddOrBlockId @ (_: BroadcastBlockId | _: RDDBlockId) =>
      val (id, blockMap) = getIdAndBlockMap(rddOrBlockId)
      val removed = blockMap(id).remove(blockId)
      ...
    ```
    
    where the helper function `getIdAndBlockMap` is something like:
    
    ```
    def getIdAndBlockMap(blockId: BlockId): (Int, Map[BlockId, BlockStatus]) = blockId match {
      case RDDBlockId(rddId, _) => (rddId, _rddBlocks)
      case BroadcastBlockId(broadcastId, _) => (broadcastId, _broadcastBlocks)
    }
    ```
    
    and then you could do a similar thing in a few other places.  You could also take this a step further, and even merge `_rddBlocks` and `_broadcastBlocks` into a `EnumMap[BlockType, Map[BlockId, BlockStatus]]` if you made a new `public enum BlockType{ RDD,Broadcast}`, but that might not really help much since at the end of the day you do want separate getter methods for the RDD and Broadcast stuff


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-81020599
  
      [Test build #28626 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28626/consoleFull) for   PR 2851 at commit [`1aec3a8`](https://github.com/apache/spark/commit/1aec3a8073a241d16953a0adad7a17cf3ac2bba4).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74298037
  
    hmm, ok now I see that it is hard to get the BroadcastBlockIds into the task end metrics, since when broadcast blocks get created, its a side effect of task-deserialization, so the broadcast var doesn't really know which task its being created for.  The only other option I can think of is to create a `ThreadLocal` with the current task id, before deserializing the task [here](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L177).  Then when the broadcast var is actually put in the block manager, ([eg.  for `TorrentBroadcast`](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala#L181)) and finally when the task is finished you'd [update the metrics](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L214).
    
    Honestly I am somewhat torn between (a) leaving it as you have it, (b) what I've outlined in this comment, and (c) having RDD block info also tracked by your new mechansim (option 2 above)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74255969
  
    Hi, @squito , thanks for the review, I will handle it in the weekend


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60043794
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22023/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-81020626
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28626/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74582634
  
    Hi, @squito, thank you very much for your patient review
     
     Here is the summary on what I have done on addressing your comments
     
     1. regarding the duplication in RDDPage and BroadcastPage
     
        Now, both of them inherit from InMemoryObjectPage, which defines the common parts of these two classes, e.g. Worker Header, etc.
     
     2. Serialization and Deserialization of SparkListenerBlockUpdate event
     
        I implemented the serialization and deserialization methods (https://github.com/apache/spark/pull/2851/files#diff-4f6ba18259eb4c31ac930e18f1ba6f88R99) and (https://github.com/apache/spark/pull/2851/files#diff-4f6ba18259eb4c31ac930e18f1ba6f88R480)
    
      	To prevent the large volume of events being logged, there is an event filter existing in SparkListenerBus (https://github.com/apache/spark/pull/2851/files#diff-01185688f339238fc6689baa6106df63R27). Now this filter ensure that only the broadcast block update event is logged (https://github.com/apache/spark/pull/2851/files#diff-fbe8f967070627c8dc155237e77c7314R130) (actually this was implemented long time ago)
      	
     3. get rid of the mutable variable in BlockManagerMasterActor 
     	
     	Done (https://github.com/apache/spark/pull/2851/files#diff-6da9360efd5e37dcd0edcc651db1e9cbR508)
     	
     4. shorten the compare() in BroadcastInfo (https://github.com/apache/spark/pull/2851/files#diff-6da9360efd5e37dcd0edcc651db1e9cbR508)
     
        Unfortunately, I probably cannot do that, since compare() returns Int, but broadcastId is Long
      	
     5. tighten the methods in StorageUtils
     
        Done (https://github.com/apache/spark/pull/2851/files#diff-6da9360efd5e37dcd0edcc651db1e9cbR508)
      


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r26393960
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/StorageUtils.scala ---
    @@ -224,8 +299,14 @@ class StorageStatus(val blockManagerId: BlockManagerId, val maxMem: Long) {
             } else {
               _rddStorageInfo(rddId) = (newMem, newDisk, newTachyon, level)
             }
    +      case BroadcastBlockId(broadcastId, _) =>
    +        if (newMem + newDisk + newTachyon == 0) {
    +          _broadcastStorageInfo.remove(broadcastId)
    +        } else {
    +          _broadcastStorageInfo(broadcastId) = (newMem, newDisk, newTachyon)
    +        }
    --- End diff --
    
    I think you could use the same `case rddOrBroadcastBlockId @ (_: BroadcastBlockId | _: RDDBlockId)` pattern here to combine both matches in this method ... its subjective whether its worth it or not, but I think it would be cleaner.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24831053
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala ---
    @@ -522,7 +523,9 @@ private[spark] class BlockManagerInfo(
             logInfo("Removed %s on %s on tachyon (size: %s)".format(
               blockId, blockManagerId.hostPort, Utils.bytesToString(blockStatus.tachyonSize)))
           }
    +      return BlockStatus(storageLevel, 0, 0, 0)
         }
    +    null
    --- End diff --
    
    you don't need `return` here -- the last value of each block is its return value.  so this could be:
    
    ```
    if (storageLevel.isValid) {
     ...
      _blocks.get(blockId)
    } else if (_blocks.containsKey(blockId)) {
      ...
      BlockStatus(storageLevel, 0, 0, 0)
    } else {
      null
    }
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74568269
  
      [Test build #27569 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27569/consoleFull) for   PR 2851 at commit [`57d3f46`](https://github.com/apache/spark/commit/57d3f4609d069c8891cda3f0658d1eb1bb93e9f1).
     * This patch **fails RAT tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerBlockUpdate(blockManagerId: BlockManagerId, blockId: BlockId,`
      * `class DefaultSparkListenerEventFilter `
      * `class BroadcastInfo(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24835658
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/storage/BroadcastPage.scala ---
    @@ -0,0 +1,90 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.ui.storage
    +
    +import org.apache.spark.storage.StorageUtils
    +import org.apache.spark.ui.UIUtils
    +import org.apache.spark.util.Utils
    +
    +import scala.xml.Node
    +
    +private[ui] class BroadcastPage(parent: StorageTab) extends InMemoryObjectPage("broadcast", parent){
    +
    +  protected override val workerTableID: String = "broadcast-storage-by-block-table"
    +  
    +  protected override def objectList = listener.broadcastInfoList
    +
    +  protected override def getBlockTableAndSize(objectId: Any): (Seq[Node], Int) = {
    +    val blockLocations = StorageUtils.getBroadcastBlockLocation(objectId.asInstanceOf[Long],
    +      storageStatusList)
    +    val blocks = listener.storageStatusList
    +      .flatMap(_.broadcastBlocksById(objectId.asInstanceOf[Long]))
    +      .sortWith(_._1.name < _._1.name)
    +      .map { case (blockId, status) =>
    +      (blockId, status, blockLocations.get(blockId).getOrElse(Seq[String]("Unknown")))
    +    }
    +    (UIUtils.listingTable(blockHeader, blockRow, blocks, id = Some("rdd-storage-by-block-table")),
    +      blocks.size)
    +  }
    +
    +  protected override def generateContent(objectId: Long): (String, Seq[Node]) = {
    +    val objectInfo = objectList.find(_.id == objectId).getOrElse {
    +      // Rather than crashing, render an "Not Found" page
    +      return (objectId.toString, nonFoundErrorInfo)
    +    }
    +    val (workerTable, workerCount) = getWorkerTableAndSize(objectId)
    +
    +    val (blockTable, blockCount) = getBlockTableAndSize(objectId)
    +
    +    val content =
    +      <div class="row-fluid">
    +        <div class="span12">
    +          <ul class="unstyled">
    +            <li>
    +              <strong>Total Partitions:</strong>
    +              {objectInfo.numPartitions}
    --- End diff --
    
    `numPartitions` doesn't mean anything for a Broadcast var.  I think its always `0` in your code, unless I've missed something.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74183512
  
      [Test build #27404 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27404/consoleFull) for   PR 2851 at commit [`5c60a34`](https://github.com/apache/spark/commit/5c60a345e9b6e34f51403df1bf721d00fac7bffa).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WIP]SPARK-3957: show broadcast variable resou...

Posted by CodingCat <gi...@git.apache.org>.
Github user CodingCat commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60039787
  
    ![image](https://cloud.githubusercontent.com/assets/678008/4731666/589ce496-59af-11e4-99fd-01e4b37d7fef.png)
    
    
    ![image](https://cloud.githubusercontent.com/assets/678008/4731669/6f70127e-59af-11e4-8e23-10a18facc2e0.png)



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r26393024
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
    @@ -181,6 +181,12 @@ private[spark] class EventLoggingListener(
         logEvent(event, flushLogger = true)
       override def onExecutorRemoved(event: SparkListenerExecutorRemoved) =
         logEvent(event, flushLogger = true)
    +  override def onBlockUpdate(event: SparkListenerBlockUpdate) = {
    +    // we only log Broadcast block update for now
    --- End diff --
    
    er, I guess its also happening in `TaskEnd` events


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-79466620
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28585/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r26393684
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/StorageUtils.scala ---
    @@ -166,28 +195,69 @@ class StorageStatus(val blockManagerId: BlockManagerId, val maxMem: Long) {
        * Note that this is much faster than `this.rddBlocksById(rddId).size`, which is
        * O(blocks in this RDD) time.
        */
    -  def numRddBlocksById(rddId: Int): Int = _rddBlocks.get(rddId).map(_.size).getOrElse(0)
    +  def numRddBlocksById(rddId: Long): Int = _rddBlocks.get(rddId).map(_.size).getOrElse(0)
    +
    +
    +  /**
    +   * Return the number of Broadcast blocks stored in this block manager in O(RDDs) time.
    +   * Note that this is much faster than `this.rddBlocks.size`, which is O(RDD blocks) time.
    --- End diff --
    
    RDD should be Broadcast


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r26395579
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/storage/BroadcastPage.scala ---
    @@ -0,0 +1,86 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.ui.storage
    +
    +import org.apache.spark.storage.StorageUtils
    +import org.apache.spark.ui.UIUtils
    +import org.apache.spark.util.Utils
    +
    +import scala.xml.Node
    +
    +private[ui] class BroadcastPage(parent: StorageTab) extends StorageDetailPage("broadcast", parent){
    +
    +  protected override val workerTableID: String = "broadcast-storage-by-block-table"
    --- End diff --
    
    by *worker*, not by *block*


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74587814
  
    **[Test build #27571 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27571/consoleFull)**     for PR 2851 at commit [`34aed25`](https://github.com/apache/spark/commit/34aed25737ef896356505e4ca2f50dd2d788b98c)     after a configured wait of `120m`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74160865
  
      [Test build #27387 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27387/consoleFull) for   PR 2851 at commit [`bf03271`](https://github.com/apache/spark/commit/bf032716ed7f7d8d57e44c936e70385ab775c6c8).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-60041142
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22021/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-80998056
  
      [Test build #28626 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28626/consoleFull) for   PR 2851 at commit [`1aec3a8`](https://github.com/apache/spark/commit/1aec3a8073a241d16953a0adad7a17cf3ac2bba4).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-61558803
  
      [Test build #22829 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22829/consoleFull) for   PR 2851 at commit [`a279ef8`](https://github.com/apache/spark/commit/a279ef8ed315765587ee5212ee7ac5975cba4272).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74153431
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27380/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24673735
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BroadcastInfo.scala ---
    @@ -0,0 +1,52 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.storage
    +
    +import org.apache.spark.annotation.DeveloperApi
    +import org.apache.spark.broadcast.{TorrentBroadcast, Broadcast}
    +import org.apache.spark.util.Utils
    +import org.apache.spark.util.Utils._
    +
    +@DeveloperApi
    +class BroadcastInfo(
    +    val id: Long,
    +    val name: String,
    +    val numPartitions: Int) extends Ordered[BroadcastInfo] {
    +
    +  var memSize = 0L
    +  var diskSize = 0L
    +  var tachyonSize = 0L
    +
    +  override def toString = {
    +    import Utils.bytesToString
    +    ("%s\" (%d) ; " +
    +      "MemorySize: %s; TachyonSize: %s; DiskSize: %s").format(
    +        name, id, bytesToString(memSize), bytesToString(tachyonSize), bytesToString(diskSize))
    +  }
    +
    +  override def compare(that: BroadcastInfo) = {
    +    if (this.id > that.id) {
    +      1
    +    } else {
    +      if (this.id == that.id) {
    +        0
    +      }
    +      -1
    --- End diff --
    
    this can be just `this.id - that.id`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-74568860
  
      [Test build #27570 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27570/consoleFull) for   PR 2851 at commit [`b05196a`](https://github.com/apache/spark/commit/b05196af57e2c9ad8504b7ab12f1d3b2ea454067).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24829369
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListenerBus.scala ---
    @@ -24,7 +24,13 @@ import org.apache.spark.util.ListenerBus
      */
     private[spark] trait SparkListenerBus extends ListenerBus[SparkListener, SparkListenerEvent] {
     
    +  private[spark] var filter: DefaultSparkListenerEventFilter = new DefaultSparkListenerEventFilter
    +
       override def onPostEvent(listener: SparkListener, event: SparkListenerEvent): Unit = {
    +    if (!filter.validate(event)) {
    +      return  
    --- End diff --
    
    I think the `DefaultSparkListenerEventFilter` is probably adding an abstraction without any really good reason.  If we do stick w/ the new `SparkListenerBlockUpdate` events, I think its better if you just put the filter into the right method on the event logging listener and remove `DefaultSparkListenerEventFilter`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-78582270
  
      [Test build #28534 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28534/consoleFull) for   PR 2851 at commit [`223cdfb`](https://github.com/apache/spark/commit/223cdfb6c02412f6bafbbbe32e160f08a96e984f).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24831483
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala ---
    @@ -522,7 +523,9 @@ private[spark] class BlockManagerInfo(
             logInfo("Removed %s on %s on tachyon (size: %s)".format(
               blockId, blockManagerId.hostPort, Utils.bytesToString(blockStatus.tachyonSize)))
           }
    +      return BlockStatus(storageLevel, 0, 0, 0)
         }
    +    null
    --- End diff --
    
    actually, can you explain the `null` case?  how does that happen, and won't that result in an NPE when it gets to your code in `StorageStatusListener` and in `JsonProtocol`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2851#discussion_r24696584
  
    --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
    @@ -91,6 +91,7 @@ private[spark] object JsonProtocol {
             executorRemovedToJson(executorRemoved)
           // These aren't used, but keeps compiler happy
           case SparkListenerExecutorMetricsUpdate(_, _) => JNothing
    +      case SparkListenerBlockUpdate(_, _, _) => JNothing
    --- End diff --
    
    how often is this event posted, is it once per cached partition? If so there might be too many such events to log them.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3957]: show broadcast variable resource...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2851#issuecomment-78607951
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28534/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org