Posted to reviews@spark.apache.org by davies <gi...@git.apache.org> on 2015/11/05 01:08:07 UTC

[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

GitHub user davies opened a pull request:

    https://github.com/apache/spark/pull/9477

    [SPARK-7542] [SQL] Support off-heap index/sort buffer

    This adds off-heap memory support for the arrays inside BytesToBytesMap and InMemorySorter, so that all execution memory can be allocated off-heap.
    
    This PR includes #9383; see https://github.com/apache/spark/commit/2b3277781c21d0efb20275bd5632a2d4f7f171c3 for the actual changes.
    
    Closes #8068 
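
As a rough illustration of what keeping an index array off-heap means, here is a minimal sketch that stores `long` entries in a direct `ByteBuffer`, which the JVM allocates outside the Java heap. The class name `OffHeapLongArraySketch` and its methods are hypothetical and are not taken from the PR; the PR's own array type may be implemented quite differently.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.LongBuffer;

/** Hypothetical fixed-size array of longs backed by memory outside the Java heap. */
public class OffHeapLongArraySketch {
    private final LongBuffer words;

    public OffHeapLongArraySketch(int length) {
        // allocateDirect reserves zero-filled memory outside the Java heap;
        // multiplyExact guards against int overflow in the byte count.
        int bytes = Math.multiplyExact(length, Long.BYTES);
        this.words = ByteBuffer.allocateDirect(bytes)
                .order(ByteOrder.nativeOrder())
                .asLongBuffer();
    }

    /** Stores value at the given index (absolute put, no position change). */
    public void set(int index, long value) {
        words.put(index, value);
    }

    /** Reads the value at the given index. */
    public long get(int index) {
        return words.get(index);
    }

    /** Number of long entries the array can hold. */
    public int size() {
        return words.capacity();
    }

    /** True when the backing buffer is a direct (off-heap) buffer. */
    public boolean isOffHeap() {
        return words.isDirect();
    }

    public static void main(String[] args) {
        OffHeapLongArraySketch array = new OffHeapLongArraySketch(4);
        array.set(2, 42L);
        System.out.println(array.get(2) + " offHeap=" + array.isOffHeap());
    }
}
```

A direct buffer is still indexed by `int`, so this sketch only shows where the memory lives, not how a larger-than-`int` array would be addressed.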

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/davies/spark unsafe_timsort

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9477.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9477
    
----
commit 4e09081050633bc8baba4cebc5ffba6e9d900bae
Author: Davies Liu <da...@databricks.com>
Date:   2015-10-30T19:08:29Z

    Do hash-based aggregation for all records before switch to sort-based

commit 53dbdf2d4c8c547e6bd50a589bf0223e7ce95e84
Author: Davies Liu <da...@databricks.com>
Date:   2015-10-30T19:24:08Z

    merge the last map

commit 2e341f50b656d0effe36004b6abc68898a119f35
Author: Davies Liu <da...@databricks.com>
Date:   2015-11-02T18:56:59Z

    update tests

commit df44fc64ed1495a1c0f6f51a7014327b6a8750b7
Author: Davies Liu <da...@databricks.com>
Date:   2015-11-02T20:57:18Z

    fix bug

commit 6f3bb15b19cd326f677f15860cf215f57fd3671a
Author: Davies Liu <da...@databricks.com>
Date:   2015-11-03T21:27:35Z

    address comments, add regression test

commit fc5e052ff17560d02ef7cdeec91a4a30605c65f0
Author: Davies Liu <da...@databricks.com>
Date:   2015-11-04T04:51:13Z

    free the array after spilling

commit d89e03463b99e56fd25e5ada8f7d146b6749082f
Author: Davies Liu <da...@databricks.com>
Date:   2015-11-04T06:11:58Z

    refactor

commit 1c0c6c36a5a16c33ceb4cd43534ce02ec3c2b286
Author: Davies Liu <da...@databricks.com>
Date:   2015-11-04T06:20:02Z

    cleanup

commit d8422e15e70a1fab2535ca13ac01c0d3a7be19e9
Author: Davies Liu <da...@databricks.com>
Date:   2015-11-04T06:27:11Z

    throw better exception

commit cbeaedf1cc47365ea90db6478819ca02db5acaea
Author: Davies Liu <da...@databricks.com>
Date:   2015-11-04T23:00:16Z

    add more comments

commit fbce6fe74b8dccd0aefa98a1183ba1321b500a56
Author: Davies Liu <da...@databricks.com>
Date:   2015-11-04T23:08:46Z

    Merge branch 'master' of github.com:apache/spark into fix_switch

commit 10d71694ae07af68265bb36a957b4ff5320d8e72
Author: Davies Liu <da...@databricks.com>
Date:   2015-11-04T23:09:16Z

    fix conflict

commit 8a20e569fdb43e26804e1c71439fd2d02c5f5a69
Author: Davies Liu <da...@databricks.com>
Date:   2015-11-04T23:21:08Z

    Update UnsafeFixedWidthAggregationMap.java

commit f6a5f0629c0b462fa45c8da209f724c158fba078
Author: Davies Liu <da...@databricks.com>
Date:   2015-11-04T23:46:46Z

    fix build

commit 2b3277781c21d0efb20275bd5632a2d4f7f171c3
Author: Davies Liu <da...@databricks.com>
Date:   2015-11-05T00:04:11Z

    support off-heap index/sort buffer

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154134910
  
    Jenkins, retest this please.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153917623
  
    Merged build started.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154140207
  
    **[Test build #45127 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45127/consoleFull)** for PR 9477 at commit [`c35f512`](https://github.com/apache/spark/commit/c35f5124fcb2746a12faad280063a8424bfff821).




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153950136
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154212662
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153953921
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45070/




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153911937
  
    Merged build started.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153917595
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9477#discussion_r43983047
  
    --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/ShuffleInMemorySorter.java ---
    @@ -32,24 +37,39 @@ public int compare(PackedRecordPointer left, PackedRecordPointer right) {
       }
       private static final SortComparator SORT_COMPARATOR = new SortComparator();
     
    +  private final MemoryConsumer consumer;
    +  private final TaskMemoryManager memoryManager;
    --- End diff --
    
    Mind calling this `taskMemoryManager` so that it's clear that's what it is?




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153914518
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154210332
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153941334
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153978941
  
    LGTM overall, but I'd like to address one concern before merging: I'm worried that passing both the `MemoryConsumer` and `TaskMemoryManager` to the sorter components will open the potential for bugs; I feel that those classes should allocate their memory using only the public `MemoryConsumer` methods.
    
    Note that the `MemoryConsumer` methods actually end up calling the `TaskMemoryManager` methods which your code calls directly: https://github.com/apache/spark/blob/81498dd5c86ca51d2fb351c8ef52cbb28e6844f4/core/src/main/java/org/apache/spark/memory/MemoryConsumer.java#L81
    
    I think that we should mark those TaskMemoryManager methods as methods that are only supposed to be called _by_ the memory consumer and not directly by developers. The problem with calling them directly is that the bookkeeping in the MemoryConsumer itself won't have been updated. This current API has such a high potential for this type of misuse that I think we should look at fancy Java-isms to restrict the visibility / callability of those methods such that they can only be called from MemoryConsumer.
    
    If you fix this by having those classes only allocate through their MemoryConsumers then feel free to merge as soon as this passes tests. I can take care of a followup to fix the documentation / code to avoid this type of misuse.
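
The bookkeeping concern above can be sketched in a few lines. This is a simplified model under stated assumptions, not Spark's actual `MemoryConsumer` or `TaskMemoryManager`: the names `SketchTaskManager`, `SketchConsumer`, and `drift` are hypothetical. It shows that allocating through the consumer keeps the consumer's own counter in step with the manager's view, while calling the manager directly leaves the two out of sync.

```java
import java.util.HashMap;
import java.util.Map;

public class ConsumerBookkeepingSketch {

    /** Hypothetical task-level manager: records bytes granted to each consumer. */
    public static final class SketchTaskManager {
        private final Map<SketchConsumer, Long> granted = new HashMap<>();

        public long acquire(long bytes, SketchConsumer consumer) {
            granted.merge(consumer, bytes, Long::sum);
            return bytes;
        }

        public void release(long bytes, SketchConsumer consumer) {
            granted.merge(consumer, -bytes, Long::sum);
        }

        public long grantedTo(SketchConsumer consumer) {
            return granted.getOrDefault(consumer, 0L);
        }
    }

    /** Hypothetical consumer: keeps its own 'used' counter next to the manager's. */
    public static final class SketchConsumer {
        private final SketchTaskManager manager;
        private long used = 0L;

        public SketchConsumer(SketchTaskManager manager) {
            this.manager = manager;
        }

        /** Allocates through the manager and updates the consumer's counter. */
        public long[] allocateArray(int size) {
            used += manager.acquire(size * 8L, this);
            return new long[size];
        }

        /** Releases through the manager and updates the consumer's counter. */
        public void freeArray(long[] array) {
            long bytes = array.length * 8L;
            manager.release(bytes, this);
            used -= bytes;
        }

        public long getUsed() {
            return used;
        }
    }

    /** Manager's view minus the consumer's view; 0 means the two agree. */
    public static long drift(SketchTaskManager manager, SketchConsumer consumer) {
        return manager.grantedTo(consumer) - consumer.getUsed();
    }

    /** Allocation goes through the consumer, so both counters move together. */
    public static long driftViaConsumer() {
        SketchTaskManager manager = new SketchTaskManager();
        SketchConsumer consumer = new SketchConsumer(manager);
        consumer.allocateArray(16);
        return drift(manager, consumer);
    }

    /** Allocation bypasses the consumer, so its counter is never updated. */
    public static long driftViaManager() {
        SketchTaskManager manager = new SketchTaskManager();
        SketchConsumer consumer = new SketchConsumer(manager);
        manager.acquire(16 * 8L, consumer);
        return drift(manager, consumer);
    }

    public static void main(String[] args) {
        System.out.println("drift via consumer: " + driftViaConsumer());
        System.out.println("drift via manager:  " + driftViaManager());
    }
}
```

In this model the direct call leaves the manager believing the consumer holds 128 bytes that the consumer itself does not know about, which is the kind of mismatch that restricting the manager's methods to the consumer would rule out.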




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153962196
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154248203
  
    **[Test build #1994 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1994/consoleFull)** for PR 9477 at commit [`b367daf`](https://github.com/apache/spark/commit/b367dafd1fcf29367776a42b3ef3086506c70605).




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153953782
  
    **[Test build #45070 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45070/consoleFull)** for PR 9477 at commit [`862b38f`](https://github.com/apache/spark/commit/862b38f9d58c51e5d06b15c311c91842aa183475).
     * This patch **fails from timeout after a configured wait of `250m`**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153996532
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154139243
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154261887
  
    **[Test build #1993 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1993/consoleFull)** for PR 9477 at commit [`b367daf`](https://github.com/apache/spark/commit/b367dafd1fcf29367776a42b3ef3086506c70605).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `final class ShuffleSortDataFormat extends SortDataFormat<PackedRecordPointer, LongArray>`
       * `final class UnsafeSortDataFormat extends SortDataFormat<RecordPointerAndKeyPrefix, LongArray>`




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154170426
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9477#discussion_r43983565
  
    --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/ShuffleInMemorySorter.java ---
    @@ -96,14 +111,12 @@ public long getMemoryUsage() {
        */
       public void insertRecord(long recordPointer, int partitionId) {
         if (!hasSpaceForAnotherRecord()) {
    -      if (array.length == Integer.MAX_VALUE) {
    -        throw new IllegalStateException("Sort pointer array has reached maximum size");
    --- End diff --
    
    Well, technically only if we're running in off-heap mode.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9477#discussion_r43983728
  
    --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java ---
    @@ -78,22 +81,33 @@ public int compare(RecordPointerAndKeyPrefix r1, RecordPointerAndKeyPrefix r2) {
       private int pos = 0;
     
       public UnsafeInMemorySorter(
    +    final MemoryConsumer consumer,
    --- End diff --
    
    Similar question here: why not pass _just_ the consumer?




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153924523
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154135406
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153961923
  
    @JoshRosen This is ready for review




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154270321
  
    LGTM, so I'm going to merge this into master and 1.6. Thanks!




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154048664
  
    **[Test build #45108 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45108/consoleFull)** for PR 9477 at commit [`b42e7db`](https://github.com/apache/spark/commit/b42e7db68067d64b6b0e19bd00b3371ffde4b174).
     * This patch **fails from timeout after a configured wait of `250m`**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `final class ShuffleSortDataFormat extends SortDataFormat<PackedRecordPointer, LongArray>`
       * `final class UnsafeSortDataFormat extends SortDataFormat<RecordPointerAndKeyPrefix, LongArray>`




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9477#discussion_r43983106
  
    --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/ShuffleInMemorySorter.java ---
    @@ -96,14 +111,12 @@ public long getMemoryUsage() {
        */
       public void insertRecord(long recordPointer, int partitionId) {
         if (!hasSpaceForAnotherRecord()) {
    -      if (array.length == Integer.MAX_VALUE) {
    -        throw new IllegalStateException("Sort pointer array has reached maximum size");
    --- End diff --
    
    I guess this lifts the size limit. Nice!
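
For context on the removed check: a Java array is indexed by `int`, so an on-heap pointer array cannot be asked to hold more than `Integer.MAX_VALUE` entries, whereas a region addressed by `long` byte offsets has no such index ceiling. The sketch below only illustrates that arithmetic; `PointerArrayLimitSketch` and its methods are hypothetical names, not code from the PR.

```java
public class PointerArrayLimitSketch {
    /** Upper bound on the entry count of any Java array, since indices are ints. */
    public static final long MAX_ON_HEAP_ENTRIES = Integer.MAX_VALUE;

    /** True if a pointer array with this many entries could be a plain long[]. */
    public static boolean fitsOnHeap(long entries) {
        return entries >= 0 && entries <= MAX_ON_HEAP_ENTRIES;
    }

    /**
     * Byte offset of entry 'index' in a region addressed by long offsets.
     * The exact-arithmetic helpers throw ArithmeticException on overflow.
     */
    public static long byteOffset(long baseOffset, long index) {
        return Math.addExact(baseOffset, Math.multiplyExact(index, (long) Long.BYTES));
    }

    public static void main(String[] args) {
        long entries = 1L << 31; // one more than Integer.MAX_VALUE
        System.out.println("fits on heap: " + fitsOnHeap(entries));
        System.out.println("byte offset:  " + byteOffset(16L, entries));
    }
}
```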




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153914523
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45067/
    Test FAILed.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154054716
  
    **[Test build #45110 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45110/consoleFull)** for PR 9477 at commit [`854a99f`](https://github.com/apache/spark/commit/854a99f6339e831efccf9868c0bacbace2e1f75d).
     * This patch **fails from timeout after a configured wait of `250m`**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `final class ShuffleSortDataFormat extends SortDataFormat<PackedRecordPointer, LongArray>`
       * `final class UnsafeSortDataFormat extends SortDataFormat<RecordPointerAndKeyPrefix, LongArray>`




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154273968
  
    **[Test build #1994 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1994/consoleFull)** for PR 9477 at commit [`b367daf`](https://github.com/apache/spark/commit/b367dafd1fcf29367776a42b3ef3086506c70605).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `final class ShuffleSortDataFormat extends SortDataFormat<PackedRecordPointer, LongArray>`
       * `final class UnsafeSortDataFormat extends SortDataFormat<RecordPointerAndKeyPrefix, LongArray>`




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154210258
  
    **[Test build #45125 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45125/consoleFull)** for PR 9477 at commit [`854a99f`](https://github.com/apache/spark/commit/854a99f6339e831efccf9868c0bacbace2e1f75d).
     * This patch **fails from timeout after a configured wait of `250m`**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `final class ShuffleSortDataFormat extends SortDataFormat<PackedRecordPointer, LongArray>`
       * `final class UnsafeSortDataFormat extends SortDataFormat<RecordPointerAndKeyPrefix, LongArray>`




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153987350
  
    Merged build started.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154212555
  
    **[Test build #45127 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45127/consoleFull)** for PR 9477 at commit [`c35f512`](https://github.com/apache/spark/commit/c35f5124fcb2746a12faad280063a8424bfff821).
     * This patch **fails from timeout after a configured wait of `250m`**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `final class ShuffleSortDataFormat extends SortDataFormat<PackedRecordPointer, LongArray>`
       * `final class UnsafeSortDataFormat extends SortDataFormat<RecordPointerAndKeyPrefix, LongArray>`




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154212667
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45127/
    Test FAILed.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153992590
  
    **[Test build #45110 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45110/consoleFull)** for PR 9477 at commit [`854a99f`](https://github.com/apache/spark/commit/854a99f6339e831efccf9868c0bacbace2e1f75d).




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154054815
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154135434
  
    Merged build started.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153953917
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153925753
  
    **[Test build #45076 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45076/consoleFull)** for PR 9477 at commit [`3cb22d4`](https://github.com/apache/spark/commit/3cb22d4422e33a8dfafb0b181309a666ebd6d369).




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154188619
  
    **[Test build #1989 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1989/consoleFull)** for PR 9477 at commit [`b367daf`](https://github.com/apache/spark/commit/b367dafd1fcf29367776a42b3ef3086506c70605).




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154228222
  
    The failed test is not related, will re-run it.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153924545
  
    Merged build started.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153912804
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9477#discussion_r43982895
  
    --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/ShuffleExternalSorter.java ---
    @@ -321,9 +320,10 @@ private void growPointerArrayIfNecessary() throws IOException {
         assert(inMemSorter != null);
         if (!inMemSorter.hasSpaceForAnotherRecord()) {
           long used = inMemSorter.getMemoryUsage();
    -      long needed = used + inMemSorter.getMemoryToExpand();
    +      MemoryBlock page;
           try {
    -        acquireMemory(needed);  // could trigger spilling
    +        // could trigger spilling
    +        page = taskMemoryManager.allocatePage(used * 2, this);
    --- End diff --
    
    Implicit here is the fact that the in memory sorter's only source of memory usage is the pointer array itself. That's fine, though.
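Editor's sketch of the point above: when a sorter's only memory is its pointer array, doubling the reported usage is the same as doubling the array. This is a minimal illustration under that assumption, not the Spark implementation. The class `PointerArraySketch` and its `expandTo` method are hypothetical, and a plain `long[]` stands in for the page that `taskMemoryManager.allocatePage` returns in the diff.

```java
import java.util.Arrays;

/** Hypothetical stand-in for an in-memory sorter that owns nothing but a pointer array. */
final class PointerArraySketch {
  private long[] pointers = new long[4];
  private int count = 0;

  boolean hasSpaceForAnotherRecord() {
    return count < pointers.length;
  }

  /** Bytes in use: only the pointer array, at 8 bytes per slot. */
  long getMemoryUsage() {
    return pointers.length * 8L;
  }

  /** Replace the array with one of the requested byte size, keeping existing pointers. */
  void expandTo(long newSizeBytes) {
    pointers = Arrays.copyOf(pointers, Math.toIntExact(newSizeBytes / 8));
  }

  void insert(long recordPointer) {
    if (!hasSpaceForAnotherRecord()) {
      // Mirrors the "used * 2" request in the diff: valid only because
      // getMemoryUsage() counts the pointer array and nothing else.
      long used = getMemoryUsage();
      expandTo(used * 2);
    }
    pointers[count++] = recordPointer;
  }

  int size() {
    return count;
  }

  long get(int index) {
    return pointers[index];
  }
}
```

If the sorter ever tracked additional buffers in `getMemoryUsage()`, a request sized as `used * 2` would over-allocate, which is the implicit coupling the comment calls out.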




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153911915
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154228478
  
    **[Test build #1993 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1993/consoleFull)** for PR 9477 at commit [`b367daf`](https://github.com/apache/spark/commit/b367dafd1fcf29367776a42b3ef3086506c70605).




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153991110
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153950138
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45076/
    Test FAILed.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153950106
  
    **[Test build #45076 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45076/consoleFull)** for PR 9477 at commit [`3cb22d4`](https://github.com/apache/spark/commit/3cb22d4422e33a8dfafb0b181309a666ebd6d369).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `final class ShuffleSortDataFormat extends SortDataFormat<PackedRecordPointer, LongArray>`
       * `final class UnsafeSortDataFormat extends SortDataFormat<RecordPointerAndKeyPrefix, LongArray>`




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9477#discussion_r43983319
  
    --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSortDataFormat.java ---
    @@ -44,37 +47,43 @@ public RecordPointerAndKeyPrefix newKey() {
       }
     
       @Override
    -  public RecordPointerAndKeyPrefix getKey(long[] data, int pos, RecordPointerAndKeyPrefix reuse) {
    -    reuse.recordPointer = data[pos * 2];
    -    reuse.keyPrefix = data[pos * 2 + 1];
    +  public RecordPointerAndKeyPrefix getKey(LongArray data, int pos, RecordPointerAndKeyPrefix reuse) {
    +    reuse.recordPointer = data.get(pos * 2);
    +    reuse.keyPrefix = data.get(pos * 2 + 1);
         return reuse;
       }
     
       @Override
    -  public void swap(long[] data, int pos0, int pos1) {
    -    long tempPointer = data[pos0 * 2];
    -    long tempKeyPrefix = data[pos0 * 2 + 1];
    -    data[pos0 * 2] = data[pos1 * 2];
    -    data[pos0 * 2 + 1] = data[pos1 * 2 + 1];
    -    data[pos1 * 2] = tempPointer;
    -    data[pos1 * 2 + 1] = tempKeyPrefix;
    +  public void swap(LongArray data, int pos0, int pos1) {
    +    long tempPointer = data.get(pos0 * 2);
    +    long tempKeyPrefix = data.get(pos0 * 2 + 1);
    +    data.set(pos0 * 2, data.get(pos1 * 2));
    +    data.set(pos0 * 2 + 1, data.get(pos1 * 2 + 1));
    +    data.set(pos1 * 2, tempPointer);
    +    data.set(pos1 * 2 + 1, tempKeyPrefix);
       }
     
       @Override
    -  public void copyElement(long[] src, int srcPos, long[] dst, int dstPos) {
    -    dst[dstPos * 2] = src[srcPos * 2];
    -    dst[dstPos * 2 + 1] = src[srcPos * 2 + 1];
    +  public void copyElement(LongArray src, int srcPos, LongArray dst, int dstPos) {
    +    dst.set(dstPos * 2, src.get(srcPos * 2));
    +    dst.set(dstPos * 2 + 1, src.get(srcPos * 2 + 1));
       }
     
       @Override
    -  public void copyRange(long[] src, int srcPos, long[] dst, int dstPos, int length) {
    -    System.arraycopy(src, srcPos * 2, dst, dstPos * 2, length * 2);
    +  public void copyRange(LongArray src, int srcPos, LongArray dst, int dstPos, int length) {
    +    Platform.copyMemory(
    +      src.getBaseObject(),
    +      src.getBaseOffset() + srcPos * 16,
    +      dst.getBaseObject(),
    +      dst.getBaseOffset() + dstPos * 16,
    +      length * 16);
       }
     
       @Override
    -  public long[] allocate(int length) {
    +  public LongArray allocate(int length) {
         assert (length < Integer.MAX_VALUE / 2) : "Length " + length + " is too large";
    -    return new long[length * 2];
    +    // This is used as temporary buffer, it's fine to allocate from JVM heap.
    +    return new LongArray(MemoryBlock.fromLongArray(new long[length * 2]));
    --- End diff --
    
    Do we need to zero-out this array?
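Editor's note on the `copyRange` hunk quoted above: each record occupies two 8-byte longs, so record position `pos` maps to byte offset `pos * 16` from the base offset. The sketch below shows that arithmetic in isolation. `RecordOffsets` and both method names are hypothetical. It also illustrates a general Java fact: `int * int` is evaluated in 32 bits, so the product wraps once `pos` reaches 2^27. Whether positions that large occur in practice is not established in this thread.

```java
/** Hypothetical helper: maps a record position to a byte offset, 16 bytes per record. */
final class RecordOffsets {
  // A record is a (recordPointer, keyPrefix) pair of 8-byte longs.
  static final long RECORD_BYTES = 16L;

  /** Multiplies in 64-bit arithmetic, so large positions do not wrap. */
  static long byteOffset(long baseOffset, int pos) {
    return baseOffset + pos * RECORD_BYTES;
  }

  /** Same formula with an int literal: the product is computed in 32 bits first. */
  static long byteOffsetIntMath(long baseOffset, int pos) {
    return baseOffset + pos * 16;
  }
}
```

For small positions the two methods agree; they diverge only when `pos * 16` no longer fits in an `int`.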




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154137615
  
    **[Test build #45125 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45125/consoleFull)** for PR 9477 at commit [`854a99f`](https://github.com/apache/spark/commit/854a99f6339e831efccf9868c0bacbace2e1f75d).




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153913274
  
    **[Test build #45070 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45070/consoleFull)** for PR 9477 at commit [`862b38f`](https://github.com/apache/spark/commit/862b38f9d58c51e5d06b15c311c91842aa183475).




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153991140
  
    Merged build started.




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9477#discussion_r43983322
  
    --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSortDataFormat.java ---
    @@ -44,37 +47,43 @@ public RecordPointerAndKeyPrefix newKey() {
       }
     
       @Override
    -  public RecordPointerAndKeyPrefix getKey(long[] data, int pos, RecordPointerAndKeyPrefix reuse) {
    -    reuse.recordPointer = data[pos * 2];
    -    reuse.keyPrefix = data[pos * 2 + 1];
    +  public RecordPointerAndKeyPrefix getKey(LongArray data, int pos, RecordPointerAndKeyPrefix reuse) {
    +    reuse.recordPointer = data.get(pos * 2);
    +    reuse.keyPrefix = data.get(pos * 2 + 1);
         return reuse;
       }
     
       @Override
    -  public void swap(long[] data, int pos0, int pos1) {
    -    long tempPointer = data[pos0 * 2];
    -    long tempKeyPrefix = data[pos0 * 2 + 1];
    -    data[pos0 * 2] = data[pos1 * 2];
    -    data[pos0 * 2 + 1] = data[pos1 * 2 + 1];
    -    data[pos1 * 2] = tempPointer;
    -    data[pos1 * 2 + 1] = tempKeyPrefix;
    +  public void swap(LongArray data, int pos0, int pos1) {
    +    long tempPointer = data.get(pos0 * 2);
    +    long tempKeyPrefix = data.get(pos0 * 2 + 1);
    +    data.set(pos0 * 2, data.get(pos1 * 2));
    +    data.set(pos0 * 2 + 1, data.get(pos1 * 2 + 1));
    +    data.set(pos1 * 2, tempPointer);
    +    data.set(pos1 * 2 + 1, tempKeyPrefix);
       }
     
       @Override
    -  public void copyElement(long[] src, int srcPos, long[] dst, int dstPos) {
    -    dst[dstPos * 2] = src[srcPos * 2];
    -    dst[dstPos * 2 + 1] = src[srcPos * 2 + 1];
    +  public void copyElement(LongArray src, int srcPos, LongArray dst, int dstPos) {
    +    dst.set(dstPos * 2, src.get(srcPos * 2));
    +    dst.set(dstPos * 2 + 1, src.get(srcPos * 2 + 1));
       }
     
       @Override
    -  public void copyRange(long[] src, int srcPos, long[] dst, int dstPos, int length) {
    -    System.arraycopy(src, srcPos * 2, dst, dstPos * 2, length * 2);
    +  public void copyRange(LongArray src, int srcPos, LongArray dst, int dstPos, int length) {
    +    Platform.copyMemory(
    +      src.getBaseObject(),
    +      src.getBaseOffset() + srcPos * 16,
    +      dst.getBaseObject(),
    +      dst.getBaseOffset() + dstPos * 16,
    +      length * 16);
       }
     
       @Override
    -  public long[] allocate(int length) {
    +  public LongArray allocate(int length) {
         assert (length < Integer.MAX_VALUE / 2) : "Length " + length + " is too large";
    -    return new long[length * 2];
    +    // This is used as temporary buffer, it's fine to allocate from JVM heap.
    +    return new LongArray(MemoryBlock.fromLongArray(new long[length * 2]));
    --- End diff --
    
    Wait, nevermind: it's on-heap.
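Editor's sketch of why on-heap allocation settles the question: the Java language guarantees that the components of a newly created array hold the default value, which is `0L` for `long`. No explicit clearing is needed for `new long[length * 2]`. Memory obtained off-heap (for example through `sun.misc.Unsafe.allocateMemory`) carries no such guarantee. The class `HeapZeroInit` and its methods are hypothetical names used only for this illustration.

```java
/** Hypothetical demo: a fresh on-heap long[] is already zero-filled. */
final class HeapZeroInit {
  /** Same shape as the temporary buffer in the diff: two longs per record. */
  static long[] allocate(int length) {
    return new long[length * 2];
  }

  /** Returns true when every slot holds 0L. */
  static boolean allZero(long[] data) {
    for (long value : data) {
      if (value != 0L) {
        return false;
      }
    }
    return true;
  }
}
```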




[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154048716
  
    Merged build finished. Test FAILed.


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153990700
  
    @JoshRosen I should have addressed your comments; I will merge this once it passes the tests.


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153941271
  
    **[Test build #45072 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45072/consoleFull)** for PR 9477 at commit [`77555e1`](https://github.com/apache/spark/commit/77555e10664cf4a468a5a6ebd530a0b470a56356).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154048718
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45108/
    Test FAILed.


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154165565
  
    Merged build started.


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153987965
  
    **[Test build #45108 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45108/consoleFull)** for PR 9477 at commit [`b42e7db`](https://github.com/apache/spark/commit/b42e7db68067d64b6b0e19bd00b3371ffde4b174).


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153941335
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45072/
    Test FAILed.


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153987324
  
     Merged build triggered.


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154054818
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45110/
    Test FAILed.


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9477#discussion_r43983033
  
    --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/ShuffleInMemorySorter.java ---
    @@ -32,24 +37,39 @@ public int compare(PackedRecordPointer left, PackedRecordPointer right) {
       }
       private static final SortComparator SORT_COMPARATOR = new SortComparator();
     
    +  private final MemoryConsumer consumer;
    +  private final TaskMemoryManager memoryManager;
    +
       /**
        * An array of record pointers and partition ids that have been encoded by
        * {@link PackedRecordPointer}. The sort operates on this array instead of directly manipulating
        * records.
        */
    -  private long[] array;
    +  private LongArray array;
     
       /**
        * The position in the pointer array where new records can be inserted.
        */
       private int pos = 0;
     
    -  public ShuffleInMemorySorter(int initialSize) {
    +  public ShuffleInMemorySorter(
    +      MemoryConsumer consumer,
    +      TaskMemoryManager memoryManager,
    +      int initialSize) {
    +    this.consumer = consumer;
    +    this.memoryManager = memoryManager;
         assert (initialSize > 0);
    -    this.array = new long[initialSize];
    +    this.array = new LongArray(memoryManager.allocatePage(initialSize * 8L, consumer));
    --- End diff --
    
    If this `allocatePage` call were to fail, I think you'd get an NPE here, since `allocatePage` would return null.
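
One way to avoid the NPE described here is to check the allocation result before wrapping it. The following is only a sketch of that guard, assuming (as the comment does) that a failed allocation is signalled by a null return. `PageSource` is a hypothetical stand-in for the allocator, and a `long[]` stands in for the returned page. None of these names come from Spark.

```java
final class PageAllocationSketch {
  /** Hypothetical allocator that returns null when no memory can be granted. */
  interface PageSource {
    long[] allocatePage(long sizeInBytes);
  }

  // Fail with a descriptive error instead of letting a null page
  // surface later as a NullPointerException.
  static long[] allocateOrThrow(PageSource source, long sizeInBytes) {
    long[] page = source.allocatePage(sizeInBytes);
    if (page == null) {
      throw new IllegalStateException(
        "Unable to allocate " + sizeInBytes + " bytes for the pointer array");
    }
    return page;
  }
}
```

Whether the right reaction is to throw, spill, or retry is a design decision for the patch author. The sketch only shows where the check would sit.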


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9477#discussion_r43983352
  
    --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/ShuffleInMemorySorter.java ---
    @@ -96,14 +111,12 @@ public long getMemoryUsage() {
        */
       public void insertRecord(long recordPointer, int partitionId) {
         if (!hasSpaceForAnotherRecord()) {
    -      if (array.length == Integer.MAX_VALUE) {
    -        throw new IllegalStateException("Sort pointer array has reached maximum size");
    -      } else {
    -        expandPointerArray();
    -      }
    +      // for testing
    --- End diff --
    
    For testing?


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153996536
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45096/
    Test PASSed.


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153962004
  
    **[Test build #1983 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1983/consoleFull)** for PR 9477 at commit [`3cb22d4`](https://github.com/apache/spark/commit/3cb22d4422e33a8dfafb0b181309a666ebd6d369).


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/9477


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9477#discussion_r44043737
  
    --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java ---
    @@ -293,9 +292,10 @@ private void growPointerArrayIfNecessary() throws IOException {
         assert(inMemSorter != null);
         if (!inMemSorter.hasSpaceForAnotherRecord()) {
           long used = inMemSorter.getMemoryUsage();
    -      long needed = used + inMemSorter.getMemoryToExpand();
    +      LongArray array;
           try {
    -        acquireMemory(needed);  // could trigger spilling
    +        // could trigger spilling
    +        array = allocateArray(used / 16 * 2);
    --- End diff --
    
    Should this be `/ 8 * 2` instead, since we want to double the number of slots in the array and each slot requires 8 bytes of memory?
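
The arithmetic behind this question can be checked with a small sketch. It assumes, as the comment does, that `allocateArray` takes a count of slots and that each slot is 8 bytes. `PointerArrayGrowthSketch` and `slotsRequested` are hypothetical names used only for illustration.

```java
final class PointerArrayGrowthSketch {
  // Assumption from the review comment: one slot is one 8-byte long.
  static final long BYTES_PER_SLOT = 8L;

  // Number of slots currently backed by `usedBytes` of memory.
  static long currentSlots(long usedBytes) {
    return usedBytes / BYTES_PER_SLOT;
  }

  // Slot count that would be passed to allocateArray for a given divisor,
  // mirroring the expression `used / divisor * 2` from the diff.
  static long slotsRequested(long usedBytes, long divisor) {
    return usedBytes / divisor * 2;
  }
}
```

With 1024 bytes in use the array holds 128 slots. Dividing by 16 requests 128 slots again, so the array would not grow, while dividing by 8 requests 256 slots, which doubles it.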


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154139271
  
    Merged build started.


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153912836
  
    Merged build started.


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153917831
  
    **[Test build #45072 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45072/consoleFull)** for PR 9477 at commit [`77555e1`](https://github.com/apache/spark/commit/77555e10664cf4a468a5a6ebd530a0b470a56356).


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154165525
  
     Merged build triggered.


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154210333
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45125/
    Test FAILed.


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154223721
  
    **[Test build #1989 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1989/consoleFull)** for PR 9477 at commit [`b367daf`](https://github.com/apache/spark/commit/b367dafd1fcf29367776a42b3ef3086506c70605).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `final class ShuffleSortDataFormat extends SortDataFormat<PackedRecordPointer, LongArray>`
       * `final class UnsafeSortDataFormat extends SortDataFormat<RecordPointerAndKeyPrefix, LongArray>`


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153983073
  
    **[Test build #1983 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1983/consoleFull)** for PR 9477 at commit [`3cb22d4`](https://github.com/apache/spark/commit/3cb22d4422e33a8dfafb0b181309a666ebd6d369).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9477#discussion_r43983708
  
    --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/ShuffleInMemorySorter.java ---
    @@ -32,24 +37,39 @@ public int compare(PackedRecordPointer left, PackedRecordPointer right) {
       }
       private static final SortComparator SORT_COMPARATOR = new SortComparator();
     
    +  private final MemoryConsumer consumer;
    +  private final TaskMemoryManager memoryManager;
    +
       /**
        * An array of record pointers and partition ids that have been encoded by
        * {@link PackedRecordPointer}. The sort operates on this array instead of directly manipulating
        * records.
        */
    -  private long[] array;
    +  private LongArray array;
     
       /**
        * The position in the pointer array where new records can be inserted.
        */
       private int pos = 0;
     
    -  public ShuffleInMemorySorter(int initialSize) {
    +  public ShuffleInMemorySorter(
    +      MemoryConsumer consumer,
    +      TaskMemoryManager memoryManager,
    --- End diff --
    
    Why does this class take both a `memoryManager` _and_ a consumer? Why not pass it just the consumer and use methods of the `consumer` to do the allocation?
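
The shape suggested here could look roughly like the sketch below, in which the sorter receives only a consumer and asks it for every array it needs. This is an illustration under assumptions, not the patch's code: `ArrayAllocator` is a hypothetical stand-in for the consumer, a `long[]` stands in for `LongArray`, and the growth policy (doubling) is chosen only to keep the example short.

```java
final class ConsumerOnlySorterSketch {
  /** Hypothetical stand-in for a memory consumer that owns allocation. */
  interface ArrayAllocator {
    long[] allocateArray(int slots);
  }

  private final ArrayAllocator consumer;
  private long[] array;
  private int pos = 0;

  // The consumer is the only collaborator, so there is no separate
  // memory manager for the sorter to keep in sync with it.
  ConsumerOnlySorterSketch(ArrayAllocator consumer, int initialSize) {
    if (initialSize <= 0) {
      throw new IllegalArgumentException("initialSize must be positive");
    }
    this.consumer = consumer;
    this.array = consumer.allocateArray(initialSize);
  }

  void insert(long packedPointer) {
    if (pos == array.length) {
      // Growth also goes through the consumer (overflow checks omitted).
      long[] bigger = consumer.allocateArray(array.length * 2);
      System.arraycopy(array, 0, bigger, 0, pos);
      array = bigger;
    }
    array[pos++] = packedPointer;
  }

  int size() {
    return pos;
  }

  int capacity() {
    return array.length;
  }
}
```

A caller would construct it with something like `new ConsumerOnlySorterSketch(n -> new long[n], 2)`. In the real code, the consumer would presumably account for the memory before handing the array back.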


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153962209
  
    Merged build started.


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153995771
  
    **[Test build #45096 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45096/consoleFull)** for PR 9477 at commit [`89319e0`](https://github.com/apache/spark/commit/89319e0540dda6cd70d03e3454c999998fecc3d8).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `final class ShuffleSortDataFormat extends SortDataFormat<PackedRecordPointer, LongArray>`
       * `final class UnsafeSortDataFormat extends SortDataFormat<RecordPointerAndKeyPrefix, LongArray>`


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-153963379
  
    **[Test build #45096 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45096/consoleFull)** for PR 9477 at commit [`89319e0`](https://github.com/apache/spark/commit/89319e0540dda6cd70d03e3454c999998fecc3d8).


[GitHub] spark pull request: [SPARK-7542] [SQL] Support off-heap index/sort...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9477#issuecomment-154170431
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45133/
    Test FAILed.

