You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by andrewor14 <gi...@git.apache.org> on 2015/09/18 02:17:08 UTC

[GitHub] spark pull request: [SPARK-10677] [SQL] [WIP] UnsafeExternalSorter...

GitHub user andrewor14 opened a pull request:

    https://github.com/apache/spark/pull/8805

    [SPARK-10677] [SQL] [WIP] UnsafeExternalSorter atomic acquire release

    Currently `UnsafeExternalSorter` acquires memory like this:
    (1) Request memory for a task
    (2) If (1) fails, spill and release all of our memory
    (3) Try request memory for a task again
    (4) If (3) fails, throw exception
    
    Because we release all of our memory, however, we introduce the chance for other tasks to jump in and steal our memory allocation, which causes issues like SPARK-10474. Instead, we should just release a fraction of the memory and keep the ones that we *know* we're going to use immediately afterwards.
    
    WIP because tests are coming.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/andrewor14/spark unsafe-atomic-release-acquire

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/8805.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #8805
    
----
commit 6ee2025f26883ca652565dea82562aa1acd4aaee
Author: Andrew Or <an...@databricks.com>
Date:   2015-09-17T23:12:34Z

    Atomic release and acquire
    
    This commit adds the functionality to reserve bytes when a sorter
    spills, such that we release only a fraction of all the memory we
    occupied. Note that we need to do this on the bytes level because
    we have things like overflow pages.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10677] [SQL] [WIP] UnsafeExternalSorter...

Posted by chenghao-intel <gi...@git.apache.org>.
Github user chenghao-intel commented on the pull request:

    https://github.com/apache/spark/pull/8805#issuecomment-141304224
  
    And the default data page is 4MB (grows only a single record size greater than that), so it will causes lots of spills if large amount records come, that's also lead performance issue when do the merge with PriorityQueue in the external sorting. Large number of records can be very common cases sql, as people may set a relative smaller partition number, as you know, that's the motive people use the sort-merge(join/aggregation) instead of the hash-based.
    
    Ideally, we'd better to take the large memory as data page, to reduce the spill times. But this requires a better strategy for the ShuffleMemory management.
    
    Sorry, again, it's not about this PR, but hopefully we can find a better mechanism for the memory allocation.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10677] [SQL] [WIP] UnsafeExternalSorter...

Posted by chenghao-intel <gi...@git.apache.org>.
Github user chenghao-intel commented on the pull request:

    https://github.com/apache/spark/pull/8805#issuecomment-141297824
  
    After investigation, SPARK-10474 probably caused by the HashAggregation, which eat out all of the memory before switching to SortBased aggregation. It will be great if we can jump the discussion at #8798 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10677] [SQL] [WIP] UnsafeExternalSorter...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/8805#issuecomment-141534690
  
    Actually @yhuai and I just found the real root cause of the original issue SPARK-10474. I'm closing this PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10677] [SQL] [WIP] UnsafeExternalSorter...

Posted by chenghao-intel <gi...@git.apache.org>.
Github user chenghao-intel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8805#discussion_r39815902
  
    --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java ---
    @@ -254,22 +281,25 @@ public int getNumberOfAllocatedPages() {
     
       /**
        * Free this sorter's in-memory data structures, including its data pages and pointer array.
    -   *
    +   * @param bytesToReserve number of bytes to hold onto when releasing memory for this task.
        * @return the number of bytes freed.
        */
    -  private long freeMemory() {
    +  private long freeMemory(long bytesToReserve) {
         updatePeakMemoryUsed();
         long memoryFreed = 0;
    +    long remainingBytesToReserve = bytesToReserve;
         for (MemoryBlock block : allocatedPages) {
    --- End diff --
    
    Do we really have to release all of the data pages? 
    Says we have large amount of data, and requires about 10 spills, each time when doing the spill, we will release all of the data pages, and then try to allocate one by one again in next iteration.
    
    Sorry, I know that's not the goal of this PR, but I actually I am thinking if that possible just allocated a fixed size of buffer for the data page and pointer array. And this can be pre-allocated and we will never allocate/release during the runtime.
    
    The only concern is how to determine the fixed size, probably we need a better estimation in the `ShuffleMemoryManager`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10677] [SQL] [WIP] UnsafeExternalSorter...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 closed the pull request at:

    https://github.com/apache/spark/pull/8805


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10677] [SQL] [WIP] UnsafeExternalSorter...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/8805#issuecomment-141295513
  
      [Test build #42628 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42628/consoleFull) for   PR 8805 at commit [`6ee2025`](https://github.com/apache/spark/commit/6ee2025f26883ca652565dea82562aa1acd4aaee).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10677] [SQL] [WIP] UnsafeExternalSorter...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/8805#issuecomment-141332046
  
      [Test build #42628 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42628/console) for   PR 8805 at commit [`6ee2025`](https://github.com/apache/spark/commit/6ee2025f26883ca652565dea82562aa1acd4aaee).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10677] [SQL] [WIP] UnsafeExternalSorter...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8805#discussion_r39885230
  
    --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java ---
    @@ -254,22 +281,25 @@ public int getNumberOfAllocatedPages() {
     
       /**
        * Free this sorter's in-memory data structures, including its data pages and pointer array.
    -   *
    +   * @param bytesToReserve number of bytes to hold onto when releasing memory for this task.
        * @return the number of bytes freed.
        */
    -  private long freeMemory() {
    +  private long freeMemory(long bytesToReserve) {
         updatePeakMemoryUsed();
         long memoryFreed = 0;
    +    long remainingBytesToReserve = bytesToReserve;
         for (MemoryBlock block : allocatedPages) {
    --- End diff --
    
    > Do we really have to release all of the data pages? 
    
    What do you mean? That's exactly what this patch avoids. If we know we need a page immediately after the spill, we should release `n - 1` pages, where `n` is the number of pages we currently hold.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10677] [SQL] [WIP] UnsafeExternalSorter...

Posted by chenghao-intel <gi...@git.apache.org>.
Github user chenghao-intel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8805#discussion_r39937272
  
    --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java ---
    @@ -254,22 +281,25 @@ public int getNumberOfAllocatedPages() {
     
       /**
        * Free this sorter's in-memory data structures, including its data pages and pointer array.
    -   *
    +   * @param bytesToReserve number of bytes to hold onto when releasing memory for this task.
        * @return the number of bytes freed.
        */
    -  private long freeMemory() {
    +  private long freeMemory(long bytesToReserve) {
         updatePeakMemoryUsed();
         long memoryFreed = 0;
    +    long remainingBytesToReserve = bytesToReserve;
         for (MemoryBlock block : allocatedPages) {
    --- End diff --
    
    I meant since we find a record is greater than page size (maybe multiple page size),  then most likely we will meet the "big" record again after the spilling, which means in this case we have to reserve/free pages again and again for each spill iteration.
    
    In order to keep a high performance/throughput, probably we'd better try to reserve a large page as much as we can, so we can reduce the frequency of spilling or allocation/free operations.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10677] [SQL] [WIP] UnsafeExternalSorter...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8805#issuecomment-141293697
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10677] [SQL] [WIP] UnsafeExternalSorter...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8805#issuecomment-141332101
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42628/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10677] [SQL] [WIP] UnsafeExternalSorter...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8805#issuecomment-141293682
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10677] [SQL] [WIP] UnsafeExternalSorter...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8805#issuecomment-141332100
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org