You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by maropu <gi...@git.apache.org> on 2015/11/05 01:35:25 UTC

[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

GitHub user maropu opened a pull request:

    https://github.com/apache/spark/pull/9478

    [SPARK-6521][Core] Bypass unnecessary network access if block managers share an identical host

    Refactored #5178 and added unit tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maropu/spark LocalReadBlockManager

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9478.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9478
    
----
commit 849eaa75bb57540083324e66a467cafdcc52222a
Author: Takeshi YAMAMURO <li...@gmail.com>
Date:   2015-11-04T17:58:56Z

    Bypass unnecessary network access if block managers share an identical host

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-153982306
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154232445
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-160806983
  
    @maropu thanks for running the benchmarks. It seems that the gains are not really significant enough to warrant all the complexity this patch adds. This is to a certain extent as expected since network is generally pretty fast, and Spark tends to bottleneck on CPU rather than I/O or network.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154265383
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45170/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-153934112
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45071/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44487614
  
    --- Diff: core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala ---
    @@ -47,12 +47,35 @@ private[spark] class IndexShuffleBlockResolver(conf: SparkConf) extends ShuffleB
     
       private val transportConf = SparkTransportConf.fromSparkConf(conf)
     
    -  def getDataFile(shuffleId: Int, mapId: Int): File = {
    -    blockManager.diskBlockManager.getFile(ShuffleDataBlockId(shuffleId, mapId, NOOP_REDUCE_ID))
    +  private def getDataFile(
    +      shuffleId: Int,
    +      mapId: Int,
    +      blockManagerId: BlockManagerId = blockManager.blockManagerId)
    +    : File = {
    +    if (blockManager.blockManagerId != blockManagerId) {
    +      blockManager.diskBlockManager.getShuffleFileBypassNetworkAccess(
    +        ShuffleDataBlockId(shuffleId, mapId, NOOP_REDUCE_ID), blockManagerId)
    +    } else {
    +      blockManager.diskBlockManager.getFile(
    +        ShuffleDataBlockId(shuffleId, mapId, NOOP_REDUCE_ID))
    +    }
       }
     
    -  private def getIndexFile(shuffleId: Int, mapId: Int): File = {
    -    blockManager.diskBlockManager.getFile(ShuffleIndexBlockId(shuffleId, mapId, NOOP_REDUCE_ID))
    +  def getDataFile(shuffleId: Int, mapId: Int): File =
    +    getDataFile(shuffleId, mapId, blockManager.blockManagerId)
    --- End diff --
    
    this method doesn't seem necessary.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44487656
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
    @@ -302,6 +303,12 @@ private[spark] class BlockManager(
         }
       }
     
    +  override def getShuffleBlockData(
    +    blockId: ShuffleBlockId, blockManagerId: BlockManagerId)
    +    : ManagedBuffer = {
    --- End diff --
    
    style:
    ```
    def getShuffleBlockData(
        blockId: ShuffleBlockId,
        blockManagerId: BlockManagerId): ManagedBuffer = {
      ...
    }
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-155615442
  
    By the way, have you done some benchmarking to measure how much this actually saves? I wonder if the gains are actually all that significant. Maybe it actually matters if we're fetching many small blocks, but even then I'm not 100% sure.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-153982287
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44488463
  
    --- Diff: core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala ---
    @@ -27,16 +27,24 @@ import org.mockito.Matchers.{any, eq => meq}
     import org.mockito.Mockito._
     import org.mockito.invocation.InvocationOnMock
     import org.mockito.stubbing.Answer
    -import org.scalatest.PrivateMethodTester
    +import org.scalatest.{BeforeAndAfterAll, PrivateMethodTester}
     
    -import org.apache.spark.{SparkFunSuite, TaskContext}
    +import org.apache.spark.{SparkConf, SparkFunSuite, TaskContext}
     import org.apache.spark.network._
     import org.apache.spark.network.buffer.ManagedBuffer
     import org.apache.spark.network.shuffle.BlockFetchingListener
     import org.apache.spark.shuffle.FetchFailedException
     
     
    -class ShuffleBlockFetcherIteratorSuite extends SparkFunSuite with PrivateMethodTester {
    +class ShuffleBlockFetcherIteratorSuite extends SparkFunSuite
    +    with BeforeAndAfterAll with PrivateMethodTester {
    --- End diff --
    
    unindent this, here and other similar places


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154009895
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-155614988
  
    @maropu Thanks for taking this over. I took a pass and it seems that the abstractions can be simplified a little. Please also fix the style issues that I pointed out. I believe the high level approach is correct, however.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44070245
  
    --- Diff: core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala ---
    @@ -95,10 +118,12 @@ private[spark] class IndexShuffleBlockResolver(conf: SparkConf) extends ShuffleB
         }
       }
     
    -  override def getBlockData(blockId: ShuffleBlockId): ManagedBuffer = {
    +  override def getBlockData(
    +      blockId: ShuffleBlockId, blockManagerId: BlockManagerId)
    +    : ManagedBuffer = {
    --- End diff --
    
    nit: each arg on its own line, with return value on same line as last arg


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-153916074
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44488406
  
    --- Diff: core/src/test/scala/org/apache/spark/storage/DiskBlockManagerSuite.scala ---
    @@ -17,21 +17,21 @@
     
     package org.apache.spark.storage
     
    -import java.io.{File, FileWriter}
    +import java.io.{File, FileWriter, IOException}
     
    -import scala.language.reflectiveCalls
    -
    -import org.mockito.Mockito.{mock, when}
    -import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
    -
    -import org.apache.spark.{SparkConf, SparkFunSuite}
     import org.apache.spark.util.Utils
    +import org.apache.spark.{SparkConf, SparkFunSuite}
    +import org.mockito.Matchers.{eq => meq}
    +import org.mockito.Mockito.{mock, times, verify, when}
    +import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach, PrivateMethodTester}
     
    -class DiskBlockManagerSuite extends SparkFunSuite with BeforeAndAfterEach with BeforeAndAfterAll {
    +import scala.language.reflectiveCalls
    --- End diff --
    
    please reorder this


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154095960
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-153982455
  
    **[Test build #45106 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45106/consoleFull)** for PR 9478 at commit [`5c00a5a`](https://github.com/apache/spark/commit/5c00a5a2e847468c03517343f941de251dd9d5a7).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-160842618
  
    @andrewor14 Okay and thanks. Also, can you close SPARK-6521?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-155655286
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44487535
  
    --- Diff: core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala ---
    @@ -47,12 +47,35 @@ private[spark] class IndexShuffleBlockResolver(conf: SparkConf) extends ShuffleB
     
       private val transportConf = SparkTransportConf.fromSparkConf(conf)
     
    -  def getDataFile(shuffleId: Int, mapId: Int): File = {
    -    blockManager.diskBlockManager.getFile(ShuffleDataBlockId(shuffleId, mapId, NOOP_REDUCE_ID))
    +  private def getDataFile(
    +      shuffleId: Int,
    +      mapId: Int,
    +      blockManagerId: BlockManagerId = blockManager.blockManagerId)
    +    : File = {
    +    if (blockManager.blockManagerId != blockManagerId) {
    +      blockManager.diskBlockManager.getShuffleFileBypassNetworkAccess(
    +        ShuffleDataBlockId(shuffleId, mapId, NOOP_REDUCE_ID), blockManagerId)
    +    } else {
    +      blockManager.diskBlockManager.getFile(
    +        ShuffleDataBlockId(shuffleId, mapId, NOOP_REDUCE_ID))
    +    }
       }
     
    -  private def getIndexFile(shuffleId: Int, mapId: Int): File = {
    -    blockManager.diskBlockManager.getFile(ShuffleIndexBlockId(shuffleId, mapId, NOOP_REDUCE_ID))
    +  def getDataFile(shuffleId: Int, mapId: Int): File =
    +    getDataFile(shuffleId, mapId, blockManager.blockManagerId)
    +
    +  private def getIndexFile(
    +      shuffleId: Int,
    +      mapId: Int,
    +      blockManagerId: BlockManagerId = blockManager.blockManagerId)
    +    : File = {
    --- End diff --
    
    style nit: put this on the previous line (here and other places)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154238591
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-155652857
  
    Thx for your reviews.
    I'll fix it and also do benchmarks .


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-155673033
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-153917270
  
    **[Test build #45071 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45071/consoleFull)** for PR 9478 at commit [`849eaa7`](https://github.com/apache/spark/commit/849eaa75bb57540083324e66a467cafdcc52222a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154257149
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45163/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-153934111
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154130913
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44488520
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala ---
    @@ -299,7 +310,11 @@ class BlockManagerMasterEndpoint(
         ).map(_.flatten.toSeq)
       }
     
    -  private def register(id: BlockManagerId, maxMemSize: Long, slaveEndpoint: RpcEndpointRef) {
    +  private def register(
    +      id: BlockManagerId,
    +      maxMemSize: Long,
    +      localDirsPath: Array[String],
    +      slaveEndpoint: RpcEndpointRef) {
    --- End diff --
    
    please add `Unit` return type


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154009796
  
    **[Test build #45106 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45106/consoleFull)** for PR 9478 at commit [`5c00a5a`](https://github.com/apache/spark/commit/5c00a5a2e847468c03517343f941de251dd9d5a7).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `  case class GetLocalDirsPath(blockManagerId: BlockManagerId) extends ToBlockManagerMaster`\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154261775
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154098687
  
    **[Test build #45117 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45117/consoleFull)** for PR 9478 at commit [`8d0a2f0`](https://github.com/apache/spark/commit/8d0a2f006f895a53c029b29ad82e33d14c796da6).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154234475
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-153934042
  
    **[Test build #45071 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45071/consoleFull)** for PR 9478 at commit [`849eaa7`](https://github.com/apache/spark/commit/849eaa75bb57540083324e66a467cafdcc52222a).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `  case class GetLocalDirsPath(blockManagerId: BlockManagerId) extends ToBlockManagerMaster`\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154238607
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-155672972
  
    **[Test build #45591 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45591/consoleFull)** for PR 9478 at commit [`6b8f7bf`](https://github.com/apache/spark/commit/6b8f7bf4c438cb8717be081256f10a70e2a063ed).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `  case class GetLocalDirsPath(blockManagerId: BlockManagerId) extends ToBlockManagerMaster`\n  * `public class JavaLBFGSExample `\n  * `class LDA @Since(\"1.6.0\") (`\n  * `  case class Metadata(`\n  * `      require(className == expectedClassName, s\"Error loading metadata: Expected class name\" +`\n  * `class SlidingRDDPartition[T](val idx: Int, val prev: Partition, val tail: Seq[T], val offset: Int)`\n  * `class SlidingRDD[T: ClassTag](@transient val parent: RDD[T], val windowSize: Int, val step: Int)`\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154270938
  
    @andrewor14 Could you review this and give some suggestions?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-153916097
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154261668
  
    **[Test build #45166 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45166/consoleFull)** for PR 9478 at commit [`211f0b9`](https://github.com/apache/spark/commit/211f0b9a96a66e3fe3b887b9114d58f9ea10fdc9).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `  case class GetLocalDirsPath(blockManagerId: BlockManagerId) extends ToBlockManagerMaster`\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154232458
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44488201
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
    @@ -302,6 +303,12 @@ private[spark] class BlockManager(
         }
       }
     
    +  override def getShuffleBlockData(
    +    blockId: ShuffleBlockId, blockManagerId: BlockManagerId)
    +    : ManagedBuffer = {
    --- End diff --
    
    actually, you probably don't need this method


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154009898
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45106/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-155655819
  
    **[Test build #45591 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45591/consoleFull)** for PR 9478 at commit [`6b8f7bf`](https://github.com/apache/spark/commit/6b8f7bf4c438cb8717be081256f10a70e2a063ed).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44070440
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMaster.scala ---
    @@ -17,14 +17,14 @@
     
     package org.apache.spark.storage
     
    -import scala.collection.Iterable
    -import scala.collection.generic.CanBuildFrom
    -import scala.concurrent.{Await, Future}
    -
     import org.apache.spark.rpc.RpcEndpointRef
    -import org.apache.spark.{Logging, SparkConf, SparkException}
     import org.apache.spark.storage.BlockManagerMessages._
    -import org.apache.spark.util.{ThreadUtils, RpcUtils}
    +import org.apache.spark.util.{RpcUtils, ThreadUtils}
    +import org.apache.spark.{Logging, SparkConf, SparkException}
    +
    +import scala.collection.Iterable
    +import scala.collection.generic.CanBuildFrom
    +import scala.concurrent.Future
    --- End diff --
    
    nit: ordering (old ordering was *almost* right ...)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154233268
  
    **[Test build #45163 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45163/consoleFull)** for PR 9478 at commit [`8d0a2f0`](https://github.com/apache/spark/commit/8d0a2f006f895a53c029b29ad82e33d14c796da6).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44488213
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerId.scala ---
    @@ -65,6 +65,10 @@ class BlockManagerId private (
           executorId == SparkContext.LEGACY_DRIVER_IDENTIFIER
       }
     
    +  def shareHost(other: BlockManagerId): Boolean = {
    +    host == other.host
    +  }
    --- End diff --
    
    you don't need this method


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-155673034
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/45591/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154130815
  
    **[Test build #45117 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45117/consoleFull)** for PR 9478 at commit [`8d0a2f0`](https://github.com/apache/spark/commit/8d0a2f006f895a53c029b29ad82e33d14c796da6).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `  case class GetLocalDirsPath(blockManagerId: BlockManagerId) extends ToBlockManagerMaster`\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44488058
  
    --- Diff: core/src/main/scala/org/apache/spark/shuffle/FileShuffleBlockResolver.scala ---
    @@ -107,8 +107,13 @@ private[spark] class FileShuffleBlockResolver(conf: SparkConf)
         }
       }
     
    -  override def getBlockData(blockId: ShuffleBlockId): ManagedBuffer = {
    -    val file = blockManager.diskBlockManager.getFile(blockId)
    +  override def getBlockData(blockId: ShuffleBlockId, blockManagerId: BlockManagerId)
    +    : ManagedBuffer = {
    +    val file = if (blockManager.blockManagerId != blockManagerId) {
    +      blockManager.diskBlockManager.getShuffleFileBypassNetworkAccess(blockId, blockManagerId)
    --- End diff --
    
    This seems like the wrong abstraction. Ideally we should just call `diskBlockManager.getFile(blockId, blockManagerId)` and it will take care of the rest for us. We shouldn't expose the details of how the shuffle files are retrieved here. i.e. I would just do this:
    ```
    override def getBlockData(
        blockId: ShuffleBlockId,
        blockManagerId: BlockManagerId): ManagedBuffer = {
      val file = blockManager.diskBlockManager.getFile(blockId, blockManagerId)
      ...
    }
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154130916
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45117/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44488357
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ---
    @@ -233,19 +250,28 @@ final class ShuffleBlockFetcherIterator(
       private[this] def fetchLocalBlocks() {
         val iter = localBlocks.iterator
         while (iter.hasNext) {
    -      val blockId = iter.next()
    -      try {
    -        val buf = blockManager.getBlockData(blockId)
    -        shuffleMetrics.incLocalBlocksFetched(1)
    -        shuffleMetrics.incLocalBytesRead(buf.size)
    -        buf.retain()
    -        results.put(new SuccessFetchResult(blockId, blockManager.blockManagerId, 0, buf))
    -      } catch {
    -        case e: Exception =>
    -          // If we see an exception, stop immediately.
    -          logError(s"Error occurred while fetching local blocks", e)
    -          results.put(new FailureFetchResult(blockId, blockManager.blockManagerId, e))
    -          return
    +      val (blockManagerId, blockIds) = iter.next()
    +      val blockIter = blockIds.iterator
    +      while (blockIter.hasNext) {
    +        val blockId = blockIter.next()
    +        assert(blockId.isShuffle)
    +        try {
    +          val buf = if (!enableExternalShuffleService) {
    +            blockManager.getShuffleBlockData(blockId.asInstanceOf[ShuffleBlockId], blockManagerId)
    +          } else {
    +            blockManager.getBlockData(blockId.asInstanceOf[ShuffleBlockId])
    +          }
    +          shuffleMetrics.incLocalBlocksFetched(1)
    +          shuffleMetrics.incLocalBytesRead(buf.size)
    +          buf.retain()
    +          results.put(new SuccessFetchResult(blockId, blockManager.blockManagerId, 0, buf))
    +        } catch {
    +          case e: Exception =>
    --- End diff --
    
    should use
    ```
    case NonFatal(e) =>
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154257041
  
    **[Test build #45163 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45163/consoleFull)** for PR 9478 at commit [`8d0a2f0`](https://github.com/apache/spark/commit/8d0a2f006f895a53c029b29ad82e33d14c796da6).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `  case class GetLocalDirsPath(blockManagerId: BlockManagerId) extends ToBlockManagerMaster`\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-160807059
  
    Can you close this PR?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154240184
  
    **[Test build #45170 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45170/consoleFull)** for PR 9478 at commit [`cebc895`](https://github.com/apache/spark/commit/cebc8958f5fa2d3a7d27f128c16cbcef483863ad).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154231203
  
    retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154261778
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45166/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44487407
  
    --- Diff: core/src/main/scala/org/apache/spark/network/BlockDataManager.scala ---
    @@ -30,6 +30,13 @@ trait BlockDataManager {
       def getBlockData(blockId: BlockId): ManagedBuffer
     
       /**
    +   * Interface to get the shuffle block data that block manager with given blockManagerId
    +   * holds in a local host. Throws an exception if the block cannot be found or
    +   * cannot be read successfully.
    +   */
    +  def getShuffleBlockData(blockId: ShuffleBlockId, blockManagerId: BlockManagerId): ManagedBuffer
    --- End diff --
    
    this doesn't belong here. `BlockDataManager` should know nothing about shuffles.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154257146
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154234656
  
    **[Test build #45166 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45166/consoleFull)** for PR 9478 at commit [`211f0b9`](https://github.com/apache/spark/commit/211f0b9a96a66e3fe3b887b9114d58f9ea10fdc9).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154231492
  
    retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44487704
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala ---
    @@ -19,17 +19,16 @@ package org.apache.spark.storage
     
     import java.util.{HashMap => JHashMap}
     
    -import scala.collection.immutable.HashSet
    -import scala.collection.mutable
    -import scala.collection.JavaConverters._
    -import scala.concurrent.{ExecutionContext, Future}
    -
    -import org.apache.spark.rpc.{RpcEndpointRef, RpcEnv, RpcCallContext, ThreadSafeRpcEndpoint}
    -import org.apache.spark.{Logging, SparkConf}
     import org.apache.spark.annotation.DeveloperApi
    +import org.apache.spark.rpc.{RpcCallContext, RpcEndpointRef, RpcEnv, ThreadSafeRpcEndpoint}
     import org.apache.spark.scheduler._
     import org.apache.spark.storage.BlockManagerMessages._
     import org.apache.spark.util.{ThreadUtils, Utils}
    +import org.apache.spark.{Logging, SparkConf}
    +
    +import scala.collection.JavaConverters._
    +import scala.collection.mutable
    +import scala.concurrent.{ExecutionContext, Future}
    --- End diff --
    
    these were in the right place. The import ordering should be java -> scala -> 3rd party -> Spark.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by maropu <gi...@git.apache.org>.
Github user maropu closed the pull request at:

    https://github.com/apache/spark/pull/9478


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44488245
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ---
    @@ -58,6 +58,17 @@ final class ShuffleBlockFetcherIterator(
     
       import ShuffleBlockFetcherIterator._
     
    +  private[this] val enableExternalShuffleService =
    +    blockManager.conf.getBoolean("spark.shuffle.service.enabled", false)
    +
    +  /**
    +   * If this option enabled, bypass unnecessary network interaction
    +   * if multiple block managers work in a single host.
    +   */
    +  private[this] val enableBypassNetworkAccess =
    +    blockManager.conf.getBoolean("spark.shuffle.bypassNetworkAccess", false) &&
    --- End diff --
    
    I wouldn't bother making this configurable. I can't imagine a case when the user would want to disable this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44488161
  
    --- Diff: core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala ---
    @@ -47,12 +47,35 @@ private[spark] class IndexShuffleBlockResolver(conf: SparkConf) extends ShuffleB
     
       private val transportConf = SparkTransportConf.fromSparkConf(conf)
     
    -  def getDataFile(shuffleId: Int, mapId: Int): File = {
    -    blockManager.diskBlockManager.getFile(ShuffleDataBlockId(shuffleId, mapId, NOOP_REDUCE_ID))
    +  private def getDataFile(
    +      shuffleId: Int,
    +      mapId: Int,
    +      blockManagerId: BlockManagerId = blockManager.blockManagerId)
    +    : File = {
    +    if (blockManager.blockManagerId != blockManagerId) {
    +      blockManager.diskBlockManager.getShuffleFileBypassNetworkAccess(
    +        ShuffleDataBlockId(shuffleId, mapId, NOOP_REDUCE_ID), blockManagerId)
    +    } else {
    +      blockManager.diskBlockManager.getFile(
    +        ShuffleDataBlockId(shuffleId, mapId, NOOP_REDUCE_ID))
    +    }
       }
     
    -  private def getIndexFile(shuffleId: Int, mapId: Int): File = {
    -    blockManager.diskBlockManager.getFile(ShuffleIndexBlockId(shuffleId, mapId, NOOP_REDUCE_ID))
    +  def getDataFile(shuffleId: Int, mapId: Int): File =
    +    getDataFile(shuffleId, mapId, blockManager.blockManagerId)
    +
    +  private def getIndexFile(
    +      shuffleId: Int,
    +      mapId: Int,
    +      blockManagerId: BlockManagerId = blockManager.blockManagerId)
    +    : File = {
    +    if (blockManager.blockManagerId != blockManagerId) {
    +      blockManager.diskBlockManager.getShuffleFileBypassNetworkAccess(
    +        ShuffleIndexBlockId(shuffleId, mapId, NOOP_REDUCE_ID), blockManagerId)
    +    } else {
    +      blockManager.diskBlockManager.getFile(
    +        ShuffleIndexBlockId(shuffleId, mapId, NOOP_REDUCE_ID))
    --- End diff --
    
    actually, if you get rid of `getShuffleFileBypassNetworkAccess` as I suggested elsewhere, then you don't need to abstract this since it's just 1 line.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154265380
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154265172
  
    **[Test build #45170 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45170/consoleFull)** for PR 9478 at commit [`cebc895`](https://github.com/apache/spark/commit/cebc8958f5fa2d3a7d27f128c16cbcef483863ad).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `  case class GetLocalDirsPath(blockManagerId: BlockManagerId) extends ToBlockManagerMaster`\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154095987
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-154234463
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44070286
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
    @@ -38,10 +30,16 @@ import org.apache.spark.network.netty.SparkTransportConf
     import org.apache.spark.network.shuffle.ExternalShuffleClient
     import org.apache.spark.network.shuffle.protocol.ExecutorShuffleInfo
     import org.apache.spark.rpc.RpcEnv
    -import org.apache.spark.serializer.{SerializerInstance, Serializer}
    +import org.apache.spark.serializer.{Serializer, SerializerInstance}
     import org.apache.spark.shuffle.ShuffleManager
    -import org.apache.spark.shuffle.hash.HashShuffleManager
     import org.apache.spark.util._
    +import sun.nio.ch.DirectBuffer
    +
    +import scala.collection.mutable.{ArrayBuffer, HashMap}
    +import scala.concurrent.duration._
    +import scala.concurrent.{Await, ExecutionContext, Future}
    +import scala.util.Random
    +import scala.util.control.NonFatal
    --- End diff --
    
    nit: old grouping was correct


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44487505
  
    --- Diff: core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala ---
    @@ -47,12 +47,35 @@ private[spark] class IndexShuffleBlockResolver(conf: SparkConf) extends ShuffleB
     
       private val transportConf = SparkTransportConf.fromSparkConf(conf)
     
    -  def getDataFile(shuffleId: Int, mapId: Int): File = {
    -    blockManager.diskBlockManager.getFile(ShuffleDataBlockId(shuffleId, mapId, NOOP_REDUCE_ID))
    +  private def getDataFile(
    +      shuffleId: Int,
    +      mapId: Int,
    +      blockManagerId: BlockManagerId = blockManager.blockManagerId)
    +    : File = {
    +    if (blockManager.blockManagerId != blockManagerId) {
    +      blockManager.diskBlockManager.getShuffleFileBypassNetworkAccess(
    +        ShuffleDataBlockId(shuffleId, mapId, NOOP_REDUCE_ID), blockManagerId)
    +    } else {
    +      blockManager.diskBlockManager.getFile(
    +        ShuffleDataBlockId(shuffleId, mapId, NOOP_REDUCE_ID))
    +    }
       }
     
    -  private def getIndexFile(shuffleId: Int, mapId: Int): File = {
    -    blockManager.diskBlockManager.getFile(ShuffleIndexBlockId(shuffleId, mapId, NOOP_REDUCE_ID))
    +  def getDataFile(shuffleId: Int, mapId: Int): File =
    +    getDataFile(shuffleId, mapId, blockManager.blockManagerId)
    +
    +  private def getIndexFile(
    +      shuffleId: Int,
    +      mapId: Int,
    +      blockManagerId: BlockManagerId = blockManager.blockManagerId)
    +    : File = {
    +    if (blockManager.blockManagerId != blockManagerId) {
    +      blockManager.diskBlockManager.getShuffleFileBypassNetworkAccess(
    +        ShuffleIndexBlockId(shuffleId, mapId, NOOP_REDUCE_ID), blockManagerId)
    +    } else {
    +      blockManager.diskBlockManager.getFile(
    +        ShuffleIndexBlockId(shuffleId, mapId, NOOP_REDUCE_ID))
    --- End diff --
    
    a large part of this method is duplicated in `getDataFile`. We should have a common method that takes in a `BlockId`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9478#issuecomment-155655274
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9478#discussion_r44487716
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala ---
    @@ -240,6 +242,15 @@ class BlockManagerMasterEndpoint(
         }.toMap
       }
     
    +  // Return the local dirs of a block manager with the given blockManagerId
    +  private def getLocalDirsPath(blockManagerId: BlockManagerId)
    +    : Map[BlockManagerId, Array[String]] = {
    +    blockManagerInfo
    +      .filter { case(id, _) => id != blockManagerId && id.host == blockManagerId.host }
    --- End diff --
    
    space after case


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org