You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by vincent-grosbois <gi...@git.apache.org> on 2018/08/07 13:02:00 UTC

[GitHub] spark pull request #22024: [SPARK-25034][CORE] Remove allocations in onBlock...

GitHub user vincent-grosbois opened a pull request:

    https://github.com/apache/spark/pull/22024

    [SPARK-25034][CORE] Remove allocations in onBlockFetchSuccess

    This method is only transferring a ManagedBuffer to the caller,
    so there is no reason why it should allocate 2 (!) intermediate data
    buffers in order to do so.
    
    In this commit I'm removing the conversion from any kind of managed buffer
    besides FileSegment to a NioManagedBuffer.
    However if you check the only calling method getRemoteBytes(), you will
    see that here we either:
     - do a memory-map if we have a FileSegmentManagedBuffer
     - try again to call the nioByteBuffer() method otherwise
    
    So in any case the conversion will occur later.
    
    ## What changes were proposed in this pull request?
    Remove needless temporary allocations
    
    ## How was this patch tested?
    Tested this change with a few jobs


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vincent-grosbois/spark no-alloc-onfetchsuccess

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22024.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22024
    
----
commit 2c182b3c93c7cc70f042d7dcd82520ac2adece1c
Author: Vincent Grosbois <v....@...>
Date:   2018-08-07T12:34:32Z

    [SPARK-25034][CORE] Remove allocations in onBlockFetchSuccess
    
    This method is only transferring a ManagedBuffer to the caller,
    so there is no reason why it should allocate 2 (!) intermediate data
    buffers in order to do so.
    
    In this commit I'm removing the conversion from any kind of managed buffer
    besides FileSegment to a NioManagedBuffer.
    However if you check the only calling method getRemoteBytes(), you will
    see that here we either:
     - do a memory-map if we have a FileSegmentManagedBuffer
     - try again to call the nioByteBuffer() method otherwise
    
    So in any case the conversion will occur later.

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    @vincent-grosbois I don't look into this change yet. Do you have reply for @dbtsai's comment https://github.com/apache/spark/pull/22024#issuecomment-412659532?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by vincent-grosbois <gi...@git.apache.org>.
Github user vincent-grosbois commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    Hi,
    sorry I haven't done any benchmark on this.
    But this solved an Out Of Memory issue for:
    - before this fix with Spark 2.3 we had OOM issues in specific jobs when the Partition size was big (1.9 GB) and made of few big objects, without using the spark.maxRemoteBlockSizeFetchToMem trick
    - with this fix our OOM dissapeared, without having to use spark.maxRemoteBlockSizeFetchToMem (ie everything is done on memory, no disk spill)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22024: [SPARK-25034][CORE] Remove allocations in onBlock...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22024#discussion_r213015245
  
    --- Diff: core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala ---
    @@ -160,7 +160,13 @@ private[spark] class TorrentBroadcast[T: ClassTag](obj: T, id: Long)
               releaseLock(pieceId)
             case None =>
               bm.getRemoteBytes(pieceId) match {
    -            case Some(b) =>
    +            case Some(splitB) =>
    +
    +              // Checksum computation and further computations require the data
    +              // from the ChunkedByteBuffer to be merged, so we we merge it now.
    --- End diff --
    
    nit of the comment.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by vincent-grosbois <gi...@git.apache.org>.
Github user vincent-grosbois commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    Actually I spoke too soon! 
    We spotted some regression in TorrentBroadcast: 
    https://github.com/apache/spark/blob/46110a589f4e91cd7605c5a2c34c3db6b2635830/core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala#L164
    
    it seems that the code in charge of checking the CheckSum is assuming that we only receive 1 block in the chunkedByteBuffer (cf checksum of only chunk(0)). Does anyone know why ?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by vincent-grosbois <gi...@git.apache.org>.
Github user vincent-grosbois commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    Hello ! is there anybody interested in merging this ?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by vincent-grosbois <gi...@git.apache.org>.
Github user vincent-grosbois commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    I updated my branch to take into account the TorrentBroadcast issue, can anyone have a look?
    In general some parts of the source code are unclear, it seems that some cases are requiring the ChunkedByteBuffer instances to only have 1 component but I don't know why and I don't know if it was intended


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by dbtsai <gi...@git.apache.org>.
Github user dbtsai commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    @vincent-grosbois do you have some benchmark or profiling on this change?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by dbtsai <gi...@git.apache.org>.
Github user dbtsai commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    test this please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by dbtsai <gi...@git.apache.org>.
Github user dbtsai commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    test this please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by vincent-grosbois <gi...@git.apache.org>.
Github user vincent-grosbois commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    Hi @viirya , any news on this? do you need me to do something?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    retest this please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    **[Test build #94708 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94708/testReport)** for PR 22024 at commit [`6b29a12`](https://github.com/apache/spark/commit/6b29a122189ffe1cf75f1b84f181db2abf7ee337).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94708/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    retest this please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94696/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    retest this please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    **[Test build #94696 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94696/testReport)** for PR 22024 at commit [`6b29a12`](https://github.com/apache/spark/commit/6b29a122189ffe1cf75f1b84f181db2abf7ee337).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22024: [SPARK-25034][CORE] Remove allocations in onBlock...

Posted by vincent-grosbois <gi...@git.apache.org>.
Github user vincent-grosbois commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22024#discussion_r213037747
  
    --- Diff: core/src/main/scala/org/apache/spark/network/BlockTransferService.scala ---
    @@ -101,15 +101,7 @@ abstract class BlockTransferService extends ShuffleClient with Closeable with Lo
               result.failure(exception)
             }
             override def onBlockFetchSuccess(blockId: String, data: ManagedBuffer): Unit = {
    -          data match {
    -            case f: FileSegmentManagedBuffer =>
    -              result.success(f)
    -            case _ =>
    -              val ret = ByteBuffer.allocate(data.size.toInt)
    --- End diff --
    
    I don't really understand the point of this initial commit tbh, was there ever a rationale for it ? (I can't find any comments).
    
    I made sure it works by testing it on our dataset (it will indeed crash if the ref count is not incremented).
    
    All 69f5d0a does is transforming a ManagedBuffer (abstract) into the concrete sub-type NioManagedBuffer. There is no real reason to copy the data, as long as you keep track of the reference count



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    **[Test build #94708 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94708/testReport)** for PR 22024 at commit [`6b29a12`](https://github.com/apache/spark/commit/6b29a122189ffe1cf75f1b84f181db2abf7ee337).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22024
  
    **[Test build #94696 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94696/testReport)** for PR 22024 at commit [`6b29a12`](https://github.com/apache/spark/commit/6b29a122189ffe1cf75f1b84f181db2abf7ee337).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22024: [SPARK-25034][CORE] Remove allocations in onBlock...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22024#discussion_r213015113
  
    --- Diff: core/src/main/scala/org/apache/spark/network/BlockTransferService.scala ---
    @@ -101,15 +101,7 @@ abstract class BlockTransferService extends ShuffleClient with Closeable with Lo
               result.failure(exception)
             }
             override def onBlockFetchSuccess(blockId: String, data: ManagedBuffer): Unit = {
    -          data match {
    -            case f: FileSegmentManagedBuffer =>
    -              result.success(f)
    -            case _ =>
    -              val ret = ByteBuffer.allocate(data.size.toInt)
    --- End diff --
    
    The copy behavior was introduced by : https://github.com/apache/spark/pull/2330/commits/69f5d0a2434396abbbd98886e047bc08a9e65565. How can you make sure this can be replaced by increasing the reference count?



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org