You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by 10110346 <gi...@git.apache.org> on 2017/08/29 08:52:46 UTC

[GitHub] spark pull request #19077: [SPARK-21860][core]Optimize heap memory allocator

GitHub user 10110346 opened a pull request:

    https://github.com/apache/spark/pull/19077

    [SPARK-21860][core]Optimize heap memory allocator

    ## What changes were proposed in this pull request?
    In `HeapMemoryAllocator`, when allocating memory from pool, and the key of pool is memory size.
    Actually some size of memory ,such as 1025bytes,1026bytes,......1032bytes, we can think they are the same,because we allocate memory in multiples of 8 bytes.
    In this case, we can improve memory reuse.
    
    ## How was this patch tested?
    Unit test exists

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/10110346/spark headmemoptimize

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19077.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19077
    
----
commit f168325717cab6ad97dd909618a0891a9d5cccbc
Author: liuxian <li...@zte.com.cn>
Date:   2017-08-29T08:21:08Z

    fix

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r137441814
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
    @@ -48,6 +48,13 @@ public long size() {
       }
     
       /**
    +   * Reset the size of the memory block.
    +   */
    --- End diff --
    
    It is dangerous to reset to a invalid size. We should add a check here or put a WARNING in the method comment. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r144038007
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala ---
    @@ -116,9 +116,10 @@ private [sql] object GenArrayData {
            s"final ArrayData $arrayDataName = new $genericArrayClass($arrayName);",
            arrayDataName)
         } else {
    +      val numBytes = elementType.defaultSize * numElements
           val unsafeArraySizeInBytes =
             UnsafeArrayData.calculateHeaderPortionInBytes(numElements) +
    -        ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize * numElements)
    +        ByteArrayMethods.roundNumberOfBytesToNearestWord(numBytes).toInt
    --- End diff --
    
    We should really inline that.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r166280889
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
    @@ -20,6 +20,7 @@
     import javax.annotation.Nullable;
     
     import org.apache.spark.unsafe.Platform;
    +import org.apache.spark.unsafe.array.ByteArrayMethods;
    --- End diff --
    
    unnecessary change


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r144216267
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
    @@ -48,6 +49,15 @@ public long size() {
       }
     
       /**
    +   * Reset the size of the memory block.
    +   */
    +  public void resetSize(long len) {
    +    assert (ByteArrayMethods.roundNumberOfBytesToNearestWord(length) ==
    --- End diff --
    
    It seems that `require`  can not be used in the Java code @jiangxb1987 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #83102 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83102/testReport)** for PR 19077 at commit [`ff091a8`](https://github.com/apache/spark/commit/ff091a898fbd74de5017986aca59abc71bbc4cf5).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #81492 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81492/testReport)** for PR 19077 at commit [`0c6647c`](https://github.com/apache/spark/commit/0c6647cca3868a24f07c077bd9e37d436b49f5e8).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    @cloud-fan IIUC, this PR does not change alignment policy for page allocation. Alignment policy still depends on JVM. This PR changes reuse policy in memory pool.
    
    If you are asking about round policy (i.e. 8-byte in this PR) for memory reuse, I do not have strong opinion. 8-byte is also fine.
    If you are asking about allocation policy, it would be good to use 8-byte boundaries for an allocation as HotSpot and IBM JVMs do.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r166834595
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
    @@ -46,9 +46,12 @@ private boolean shouldPool(long size) {
     
       @Override
       public MemoryBlock allocate(long size) throws OutOfMemoryError {
    -    if (shouldPool(size)) {
    +    int numWords = (int) ((size + 7) / 8);
    +    long alignedSize = numWords * 8;
    --- End diff --
    
    `numWords * 8L`, to avoid overflow


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #82666 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82666/testReport)** for PR 19077 at commit [`b4d6b4e`](https://github.com/apache/spark/commit/b4d6b4e16027745ae512d722ad95b34b29b39bfa).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83102/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #81273 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81273/testReport)** for PR 19077 at commit [`ba5717e`](https://github.com/apache/spark/commit/ba5717eedb8f7b7427630f118c69f9b2b4abb5fb).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83686/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    This PR generally looks fine to me, my concern is that will this change bring in subtle impact on the code which leverage it.
    
    CC @JoshRosen to take a review.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #81212 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81212/testReport)** for PR 19077 at commit [`6862403`](https://github.com/apache/spark/commit/6862403566e5c811bc6efa8a52409bcefb253749).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Optimize heap memory allocator

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    This and the JIRA need a better title


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r166280387
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/array/ByteArrayMethods.java ---
    @@ -40,6 +40,15 @@ public static int roundNumberOfBytesToNearestWord(int numBytes) {
         }
       }
     
    +  public static long roundNumberOfBytesToNearestWord(long numBytes) {
    +    long remainder = numBytes & 0x07;  // This is equivalent to `numBytes % 8`
    --- End diff --
    
    is it just `return (numBytes + 7) / 8;`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    ping @cloud-fan shall we continue with this PR?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #81273 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81273/testReport)** for PR 19077 at commit [`ba5717e`](https://github.com/apache/spark/commit/ba5717eedb8f7b7427630f118c69f9b2b4abb5fb).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #87192 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87192/testReport)** for PR 19077 at commit [`78ede3f`](https://github.com/apache/spark/commit/78ede3fae243c5379eaea1c86584e200ce697c19).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r166832418
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
    @@ -46,9 +47,10 @@ private boolean shouldPool(long size) {
     
       @Override
       public MemoryBlock allocate(long size) throws OutOfMemoryError {
    -    if (shouldPool(size)) {
    --- End diff --
    
    I aggree with you .
    I hava updated,thanks


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    LGTM, we can also update L71 `long[] array = new long[(int) ((size + 7) / 8)];`, to use the `alignedSize`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    thanks, merging to master!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #81251 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81251/testReport)** for PR 19077 at commit [`b00b685`](https://github.com/apache/spark/commit/b00b6859d36c5e9a0c3c9e0a77cab8d0681a5644).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    What about the unsafe allocation Spark does? Is it also better to use 8-byte boundaries?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83695/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/692/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #87106 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87106/testReport)** for PR 19077 at commit [`f745144`](https://github.com/apache/spark/commit/f745144e982aeee8ffe79dacca1559cc1a01f0c8).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81251/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81508/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r143380706
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala ---
    @@ -116,9 +116,10 @@ private [sql] object GenArrayData {
            s"final ArrayData $arrayDataName = new $genericArrayClass($arrayName);",
            arrayDataName)
         } else {
    +      val numBytes = elementType.defaultSize * numElements
           val unsafeArraySizeInBytes =
             UnsafeArrayData.calculateHeaderPortionInBytes(numElements) +
    -        ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize * numElements)
    +        ByteArrayMethods.roundNumberOfBytesToNearestWord(numBytes).toInt
    --- End diff --
    
    Minor: why don't we inline this instead of creating a new variable?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r166815274
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
    @@ -20,6 +20,7 @@
     import javax.annotation.Nullable;
     
     import org.apache.spark.unsafe.Platform;
    +import org.apache.spark.unsafe.array.ByteArrayMethods;
    --- End diff --
    
    ok,thanks


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #81251 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81251/testReport)** for PR 19077 at commit [`b00b685`](https://github.com/apache/spark/commit/b00b6859d36c5e9a0c3c9e0a77cab8d0681a5644).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #81272 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81272/testReport)** for PR 19077 at commit [`fc8b895`](https://github.com/apache/spark/commit/fc8b895e12bda7f9e5e76dbb7cbd798c95e3ec9f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81273/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r144037771
  
    --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeArrayWriter.java ---
    @@ -57,7 +57,7 @@ public void initialize(BufferHolder holder, int numElements, int elementSize) {
     
         // Grows the global buffer ahead for header and fixed size data.
         int fixedPartInBytes =
    -      ByteArrayMethods.roundNumberOfBytesToNearestWord(elementSize * numElements);
    +      (int)ByteArrayMethods.roundNumberOfBytesToNearestWord(elementSize * numElements);
    --- End diff --
    
    nit: extra space after `(int) `, also please update the other similar changes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r137442821
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
    @@ -48,6 +48,13 @@ public long size() {
       }
     
       /**
    +   * Reset the size of the memory block.
    +   */
    --- End diff --
    
    Thanks,i will  add a check.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r166368336
  
    --- Diff: common/unsafe/src/test/java/org/apache/spark/unsafe/PlatformUtilSuite.java ---
    @@ -134,4 +135,24 @@ public void memoryDebugFillEnabledInTest() {
           MemoryAllocator.MEMORY_DEBUG_FILL_CLEAN_VALUE);
         MemoryAllocator.UNSAFE.free(offheap);
       }
    +
    +  @Test
    +  public void heapMemoryReuse() {
    +    MemoryAllocator heapMem = new HeapMemoryAllocator();
    +    // The size is less than 1M,allocate new memory every time.
    +    MemoryBlock onheap1 = heapMem.allocate(513);
    +    Object obj1 = onheap1.getBaseObject();
    +    heapMem.free(onheap1);
    +    MemoryBlock onheap2 = heapMem.allocate(514);
    +    Assert.assertNotEquals(obj1, onheap2.getBaseObject());
    +
    +    // The size is greater than 1M,reuse the previous memory which has released.
    --- End diff --
    
    ditto


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83715/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r166280829
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
    @@ -98,12 +100,13 @@ public void free(MemoryBlock memory) {
         long[] array = (long[]) memory.obj;
         memory.setObjAndOffset(null, 0);
     
    -    if (shouldPool(size)) {
    +    long alignedSize = ByteArrayMethods.roundNumberOfBytesToNearestWord(size);
    --- End diff --
    
    do we need to do it again here? I think the `size` is guaranteed to be aligned with word.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r166836613
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
    @@ -46,9 +46,12 @@ private boolean shouldPool(long size) {
     
       @Override
       public MemoryBlock allocate(long size) throws OutOfMemoryError {
    -    if (shouldPool(size)) {
    +    int numWords = (int) ((size + 7) / 8);
    +    long alignedSize = numWords * 8;
    --- End diff --
    
    yeah,thanks


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r166817332
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/array/ByteArrayMethods.java ---
    @@ -40,6 +40,15 @@ public static int roundNumberOfBytesToNearestWord(int numBytes) {
         }
       }
     
    +  public static long roundNumberOfBytesToNearestWord(long numBytes) {
    +    long remainder = numBytes & 0x07;  // This is equivalent to `numBytes % 8`
    --- End diff --
    
    Yes, it's 8 * ((numBytes + 7) / 8).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #87183 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87183/testReport)** for PR 19077 at commit [`acfe551`](https://github.com/apache/spark/commit/acfe5513f0d83d83aed6899de2bb42262442e3a1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r166834552
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
    @@ -46,9 +46,12 @@ private boolean shouldPool(long size) {
     
       @Override
       public MemoryBlock allocate(long size) throws OutOfMemoryError {
    -    if (shouldPool(size)) {
    +    int numWords = (int) ((size + 7) / 8);
    --- End diff --
    
    let's add some check to make sure `(size + 7) / 8` doesn't exceed max int.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r166368258
  
    --- Diff: common/unsafe/src/test/java/org/apache/spark/unsafe/PlatformUtilSuite.java ---
    @@ -134,4 +135,24 @@ public void memoryDebugFillEnabledInTest() {
           MemoryAllocator.MEMORY_DEBUG_FILL_CLEAN_VALUE);
         MemoryAllocator.UNSAFE.free(offheap);
       }
    +
    +  @Test
    +  public void heapMemoryReuse() {
    +    MemoryAllocator heapMem = new HeapMemoryAllocator();
    +    // The size is less than 1M,allocate new memory every time.
    --- End diff --
    
    Is it better to refer to `HeapMemoryAllocator.POOLING_THRESHOLD_BYTES` at `1M`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #83102 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83102/testReport)** for PR 19077 at commit [`ff091a8`](https://github.com/apache/spark/commit/ff091a898fbd74de5017986aca59abc71bbc4cf5).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Can you please add some unit test to verify your changes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    @jerryshao @JoshRosen yes, it would not generally be arbitrary sized allocations. Basically, we allocate memory in multiples of 4 or 8 bytes,even so, I think this change is also beneficial .
    Also,I think this change  will not  impact on the code which leverage it, because `MemoryBlock` is not changed


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    ping @cloud-fan 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r136489896
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
    @@ -47,23 +47,29 @@ private boolean shouldPool(long size) {
     
       @Override
       public MemoryBlock allocate(long size) throws OutOfMemoryError {
    -    if (shouldPool(size)) {
    +    int arraySize = (int)((size + 7) / 8);
    --- End diff --
    
    But the type of input parameter for `roundNumberOfBytesToNearestWord` is `int`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #82668 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82668/testReport)** for PR 19077 at commit [`62073e3`](https://github.com/apache/spark/commit/62073e361e2c5287ab57cbfb1300db0c99bfcb62).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r150442437
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
    @@ -48,6 +49,15 @@ public long size() {
       }
     
       /**
    +   * Reset the size of the memory block.
    --- End diff --
    
    The size of memory block is not aligned size,resetting the size is just to keep the behavior consistent with before.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Optimize heap memory allocator

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #81208 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81208/testReport)** for PR 19077 at commit [`f168325`](https://github.com/apache/spark/commit/f168325717cab6ad97dd909618a0891a9d5cccbc).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Just curious: do you know where are we allocating these close-in-size chunks of memory? I understand the motivation, but just curious to know what's causing this pattern. I think the original idea here was that most allocations would come from a small set of sizes (usually the page size, or a configurable buffer size) and would not generally be arbitrary sized allocations.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/710/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r150240149
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/array/ByteArrayMethods.java ---
    @@ -31,8 +31,8 @@ public static long nextPowerOf2(long num) {
         return (highBit == num) ? num : highBit << 1;
       }
     
    -  public static int roundNumberOfBytesToNearestWord(int numBytes) {
    -    int remainder = numBytes & 0x07;  // This is equivalent to `numBytes % 8`
    +  public static long roundNumberOfBytesToNearestWord(long numBytes) {
    --- End diff --
    
    I think we should create a new method instead of changing this method. The added int cast looks scaring.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r166829018
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
    @@ -46,9 +47,10 @@ private boolean shouldPool(long size) {
     
       @Override
       public MemoryBlock allocate(long size) throws OutOfMemoryError {
    -    if (shouldPool(size)) {
    --- End diff --
    
    I think we don't need to create a new method:
    ```
    int numWords = (int) ((size + 7) / 8); // add some overflow check.
    int alignedSize = numWords * 8;
    if (shouldPool(alignedSize)) {
      ...
    }
    long[] array = new long[numWords];
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #81508 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81508/testReport)** for PR 19077 at commit [`0c6647c`](https://github.com/apache/spark/commit/0c6647cca3868a24f07c077bd9e37d436b49f5e8).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r166836868
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
    @@ -46,9 +46,12 @@ private boolean shouldPool(long size) {
     
       @Override
       public MemoryBlock allocate(long size) throws OutOfMemoryError {
    -    if (shouldPool(size)) {
    +    int numWords = (int) ((size + 7) / 8);
    --- End diff --
    
    L51: assert (alignedSize >= size); 
    I think `assert (alignedSize >= size) ` can make sure `(size + 7) / 8` doesn't exceed max int.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r136332974
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
    @@ -47,23 +47,29 @@ private boolean shouldPool(long size) {
     
       @Override
       public MemoryBlock allocate(long size) throws OutOfMemoryError {
    -    if (shouldPool(size)) {
    +    int arraySize = (int)((size + 7) / 8);
    +    long alignedSize = arraySize * 8;
    +    if (shouldPool(alignedSize)) {
           synchronized (this) {
    -        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(size);
    +        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(alignedSize);
             if (pool != null) {
               while (!pool.isEmpty()) {
                 final WeakReference<MemoryBlock> blockReference = pool.pop();
                 final MemoryBlock memory = blockReference.get();
                 if (memory != null) {
    -              assert (memory.size() == size);
    +              assert ((int)((memory.size() + 7) / 8) == arraySize);
    +              memory.resetSize(size);
    --- End diff --
    
    I got it, thanks for the explanation.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r137439954
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
    @@ -47,23 +48,29 @@ private boolean shouldPool(long size) {
     
       @Override
       public MemoryBlock allocate(long size) throws OutOfMemoryError {
    -    if (shouldPool(size)) {
    +    long alignedSize = ByteArrayMethods.roundNumberOfBytesToNearestWord(size);
    --- End diff --
    
    Maybe minor but some small allocations will be counted for pooling mechanism but they are not before, e.g. `POOLING_THRESHOLD_BYTES` - 1.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    @jerryshao  Thanks,i will add unit tests.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    shall we always do word alignment for page allocation? cc @kiszk too


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Optimize heap memory allocator

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r135742822
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
    @@ -47,24 +47,25 @@ private boolean shouldPool(long size) {
     
       @Override
       public MemoryBlock allocate(long size) throws OutOfMemoryError {
    -    if (shouldPool(size)) {
    +    int arraySize = (int)((size + 7) / 8);
    +    if (shouldPool(arraySize * 8)) {
    --- End diff --
    
    Factor out `arraySize * 8` as `alignedSize` or something


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81272/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #81272 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81272/testReport)** for PR 19077 at commit [`fc8b895`](https://github.com/apache/spark/commit/fc8b895e12bda7f9e5e76dbb7cbd798c95e3ec9f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #87190 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87190/testReport)** for PR 19077 at commit [`ad211b8`](https://github.com/apache/spark/commit/ad211b860ae4869aca8be747924578c944ba8974).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #87183 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87183/testReport)** for PR 19077 at commit [`acfe551`](https://github.com/apache/spark/commit/acfe5513f0d83d83aed6899de2bb42262442e3a1).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r166300538
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/array/ByteArrayMethods.java ---
    @@ -40,6 +40,15 @@ public static int roundNumberOfBytesToNearestWord(int numBytes) {
         }
       }
     
    +  public static long roundNumberOfBytesToNearestWord(long numBytes) {
    +    long remainder = numBytes & 0x07;  // This is equivalent to `numBytes % 8`
    --- End diff --
    
    It's `8 * ((numBytes + 7) / 8)` but yeah, and the bit operation is unnecessarily complex IMHO. This is inherently a numeric operation, not bit-wise.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #83686 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83686/testReport)** for PR 19077 at commit [`ff091a8`](https://github.com/apache/spark/commit/ff091a898fbd74de5017986aca59abc71bbc4cf5).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/626/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #83715 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83715/testReport)** for PR 19077 at commit [`967c73a`](https://github.com/apache/spark/commit/967c73a2462f0948029a37a84f6912983f03c126).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #87106 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87106/testReport)** for PR 19077 at commit [`f745144`](https://github.com/apache/spark/commit/f745144e982aeee8ffe79dacca1559cc1a01f0c8).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r150381892
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
    @@ -48,6 +49,15 @@ public long size() {
       }
     
       /**
    +   * Reset the size of the memory block.
    --- End diff --
    
    I think it's a really strange and dangerous interface. Why do we have to set the size? Does the memory consumer complain when he gets a memory block larger than he requested?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #81491 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81491/testReport)** for PR 19077 at commit [`729df24`](https://github.com/apache/spark/commit/729df248bf44818202d1ca61b30ab43daf8aea8d).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r144037069
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
    @@ -48,6 +49,15 @@ public long size() {
       }
     
       /**
    +   * Reset the size of the memory block.
    +   */
    +  public void resetSize(long len) {
    +    assert (ByteArrayMethods.roundNumberOfBytesToNearestWord(length) ==
    --- End diff --
    
    We'd better use `require` instead of `assert`.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    cc @cloud-fan for review


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82666/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #83695 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83695/testReport)** for PR 19077 at commit [`ff091a8`](https://github.com/apache/spark/commit/ff091a898fbd74de5017986aca59abc71bbc4cf5).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #81491 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81491/testReport)** for PR 19077 at commit [`729df24`](https://github.com/apache/spark/commit/729df248bf44818202d1ca61b30ab43daf8aea8d).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #81208 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81208/testReport)** for PR 19077 at commit [`f168325`](https://github.com/apache/spark/commit/f168325717cab6ad97dd909618a0891a9d5cccbc).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81491/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81208/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r136069800
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
    @@ -47,23 +47,29 @@ private boolean shouldPool(long size) {
     
       @Override
       public MemoryBlock allocate(long size) throws OutOfMemoryError {
    -    if (shouldPool(size)) {
    +    int arraySize = (int)((size + 7) / 8);
    +    long alignedSize = arraySize * 8;
    +    if (shouldPool(alignedSize)) {
           synchronized (this) {
    -        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(size);
    +        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(alignedSize);
             if (pool != null) {
               while (!pool.isEmpty()) {
                 final WeakReference<MemoryBlock> blockReference = pool.pop();
                 final MemoryBlock memory = blockReference.get();
                 if (memory != null) {
    -              assert (memory.size() == size);
    +              assert ((int)((memory.size() + 7) / 8) == arraySize);
    +              memory.resetSize(size);
    --- End diff --
    
    Hmm, from my understanding the size of `MemoryBlock` is always the actual size, not the aligned size, so looks like we dont need to reset the size here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    gentle ping @jerryshao for review


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r147179694
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala ---
    @@ -118,7 +118,8 @@ private [sql] object GenArrayData {
         } else {
           val unsafeArraySizeInBytes =
             UnsafeArrayData.calculateHeaderPortionInBytes(numElements) +
    -        ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize * numElements)
    +        ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize *
    --- End diff --
    
    nit:
    ```
    ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize * numElements)
    .toInt
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r144037194
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
    @@ -48,6 +49,15 @@ public long size() {
       }
     
       /**
    +   * Reset the size of the memory block.
    +   */
    +  public void resetSize(long len) {
    +    assert (ByteArrayMethods.roundNumberOfBytesToNearestWord(length) ==
    --- End diff --
    
    Also leave some message if the check fails.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #83068 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83068/testReport)** for PR 19077 at commit [`c312fde`](https://github.com/apache/spark/commit/c312fdecbd33eb038c56ce47c3b34753df551c8f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #82666 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82666/testReport)** for PR 19077 at commit [`b4d6b4e`](https://github.com/apache/spark/commit/b4d6b4e16027745ae512d722ad95b34b29b39bfa).
     * This patch **fails to build**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r143439369
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala ---
    @@ -116,9 +116,10 @@ private [sql] object GenArrayData {
            s"final ArrayData $arrayDataName = new $genericArrayClass($arrayName);",
            arrayDataName)
         } else {
    +      val numBytes = elementType.defaultSize * numElements
           val unsafeArraySizeInBytes =
             UnsafeArrayData.calculateHeaderPortionInBytes(numElements) +
    -        ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize * numElements)
    +        ByteArrayMethods.roundNumberOfBytesToNearestWord(numBytes).toInt
    --- End diff --
    
     The size of this line is larger than 200 bytes


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    According to [this material](http://s3-eu-west-1.amazonaws.com/presentations2013/10_presentation.pdf#pages=42), `Unsafe.allocateMemory` uses 8-byte boundary.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #83686 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83686/testReport)** for PR 19077 at commit [`ff091a8`](https://github.com/apache/spark/commit/ff091a898fbd74de5017986aca59abc71bbc4cf5).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81492/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #83695 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83695/testReport)** for PR 19077 at commit [`ff091a8`](https://github.com/apache/spark/commit/ff091a898fbd74de5017986aca59abc71bbc4cf5).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #83715 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83715/testReport)** for PR 19077 at commit [`967c73a`](https://github.com/apache/spark/commit/967c73a2462f0948029a37a84f6912983f03c126).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    I have updated this PR, just a little bit of improvement, please help review it again,thanks @jiangxb1987 @cloud-fan


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r166815823
  
    --- Diff: common/unsafe/src/test/java/org/apache/spark/unsafe/PlatformUtilSuite.java ---
    @@ -134,4 +135,24 @@ public void memoryDebugFillEnabledInTest() {
           MemoryAllocator.MEMORY_DEBUG_FILL_CLEAN_VALUE);
         MemoryAllocator.UNSAFE.free(offheap);
       }
    +
    +  @Test
    +  public void heapMemoryReuse() {
    +    MemoryAllocator heapMem = new HeapMemoryAllocator();
    +    // The size is less than 1M,allocate new memory every time.
    --- End diff --
    
    Yeah,i agree with you,thanks


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Optimize heap memory allocator

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #81212 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81212/testReport)** for PR 19077 at commit [`6862403`](https://github.com/apache/spark/commit/6862403566e5c811bc6efa8a52409bcefb253749).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81212/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #87212 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87212/testReport)** for PR 19077 at commit [`78ede3f`](https://github.com/apache/spark/commit/78ede3fae243c5379eaea1c86584e200ce697c19).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #81508 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81508/testReport)** for PR 19077 at commit [`0c6647c`](https://github.com/apache/spark/commit/0c6647cca3868a24f07c077bd9e37d436b49f5e8).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87183/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r136487281
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
    @@ -47,23 +47,29 @@ private boolean shouldPool(long size) {
     
       @Override
       public MemoryBlock allocate(long size) throws OutOfMemoryError {
    -    if (shouldPool(size)) {
    +    int arraySize = (int)((size + 7) / 8);
    --- End diff --
    
    You might be able to use `ByteAraryMethods.roundNumberOfBytesToNearestWord` for this, which we'e done for similar rounding elsewhere. Makes it a bit easier to spot what's happening.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87106/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87190/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87212/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #87190 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87190/testReport)** for PR 19077 at commit [`ad211b8`](https://github.com/apache/spark/commit/ad211b860ae4869aca8be747924578c944ba8974).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/683/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    seems jenkins is very tired now, let's try later :)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/690/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #87212 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87212/testReport)** for PR 19077 at commit [`78ede3f`](https://github.com/apache/spark/commit/78ede3fae243c5379eaea1c86584e200ce697c19).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87192/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r136223648
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
    @@ -47,23 +47,29 @@ private boolean shouldPool(long size) {
     
       @Override
       public MemoryBlock allocate(long size) throws OutOfMemoryError {
    -    if (shouldPool(size)) {
    +    int arraySize = (int)((size + 7) / 8);
    +    long alignedSize = arraySize * 8;
    +    if (shouldPool(alignedSize)) {
           synchronized (this) {
    -        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(size);
    +        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(alignedSize);
             if (pool != null) {
               while (!pool.isEmpty()) {
                 final WeakReference<MemoryBlock> blockReference = pool.pop();
                 final MemoryBlock memory = blockReference.get();
                 if (memory != null) {
    -              assert (memory.size() == size);
    +              assert ((int)((memory.size() + 7) / 8) == arraySize);
    +              memory.resetSize(size);
    --- End diff --
    
    yes, `MemoryBlock` is always the actual size, if we reuse the previous memory,we should modify the  size of `MemoryBlock`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #87192 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87192/testReport)** for PR 19077 at commit [`78ede3f`](https://github.com/apache/spark/commit/78ede3fae243c5379eaea1c86584e200ce697c19).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r137442763
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
    @@ -47,23 +48,29 @@ private boolean shouldPool(long size) {
     
       @Override
       public MemoryBlock allocate(long size) throws OutOfMemoryError {
    -    if (shouldPool(size)) {
    +    long alignedSize = ByteArrayMethods.roundNumberOfBytesToNearestWord(size);
    --- End diff --
    
    yeah,I think it's acceptable


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #82668 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82668/testReport)** for PR 19077 at commit [`62073e3`](https://github.com/apache/spark/commit/62073e361e2c5287ab57cbfb1300db0c99bfcb62).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #81492 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81492/testReport)** for PR 19077 at commit [`0c6647c`](https://github.com/apache/spark/commit/0c6647cca3868a24f07c077bd9e37d436b49f5e8).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r137361142
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
    @@ -47,23 +47,29 @@ private boolean shouldPool(long size) {
     
       @Override
       public MemoryBlock allocate(long size) throws OutOfMemoryError {
    -    if (shouldPool(size)) {
    +    int arraySize = (int)((size + 7) / 8);
    --- End diff --
    
    Maybe we should make the method to tackle long values.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    restest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    The idea LGTM, but I think we can simplify the implementation to allow the memory allocator to return a larger memory than requested.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Optimize heap memory allocator

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19077#discussion_r135745255
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
    @@ -47,24 +47,25 @@ private boolean shouldPool(long size) {
     
       @Override
       public MemoryBlock allocate(long size) throws OutOfMemoryError {
    -    if (shouldPool(size)) {
    +    int arraySize = (int)((size + 7) / 8);
    +    if (shouldPool(arraySize * 8)) {
    --- End diff --
    
    Always assign long integer arrays
    `long[] array = new long[arraySize];`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by 10110346 <gi...@git.apache.org>.
Github user 10110346 commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83068/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    **[Test build #83068 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83068/testReport)** for PR 19077 at commit [`c312fde`](https://github.com/apache/spark/commit/c312fdecbd33eb038c56ce47c3b34753df551c8f).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19077


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19077
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82668/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org