You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/09/16 11:59:00 UTC

[GitHub] [spark] tomvanbussel opened a new pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

tomvanbussel opened a new pull request #29772:
URL: https://github.com/apache/spark/pull/29772


   ### What changes were proposed in this pull request?
   
   This PR changes the way `UnsafeExternalSorter.SpillableIterator` checks whether it has spilled already, by checking whether `inMemSorter` is null. It also allows it to spill other `UnsafeSorterIterator`s than `UnsafeInMemorySorter.SortedIterator`.
   
   ### Why are the changes needed?
   
   Before this PR `UnsafeExternalSorter.SpillableIterator` could not spill when there are NULLs in the input and radix sorting is used. Currently, Spark determines whether UnsafeExternalSorter.SpillableIterator has spilled already by checking whether `upstream` is an instance of `UnsafeInMemorySorter.SortedIterator`. When radix sorting is used and there are NULLs in the input however, `upstream` will be an instance of `UnsafeExternalSorter.ChainedIterator` instead, but should still be spilled.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   A test was added to `UnsafeExternalSorterSuite` (and therefore also to `UnsafeExternalSorterRadixSortSuite`). I manually confirmed that the test failed in `UnsafeExternalSorterRadixSortSuite` without this patch.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
hvanhovell commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693376390


   ok to test


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693376824


   **[Test build #128768 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128768/testReport)** for PR 29772 at commit [`e6dbe2b`](https://github.com/apache/spark/commit/e6dbe2b3cbf18765cfc44d38ccb311cf06e9ad6f).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693603639


   Merged build finished. Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693492865


   Merged build finished. Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693492878


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128769/
   Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693524612






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693523947


   **[Test build #128776 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128776/testReport)** for PR 29772 at commit [`e514327`](https://github.com/apache/spark/commit/e514327b4e071f0e4c8e3402cbf03d54c83ab8a4).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693492865






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693395887


   **[Test build #128770 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128770/testReport)** for PR 29772 at commit [`4bfa122`](https://github.com/apache/spark/commit/4bfa1223aab3af0142c0ec975d7f8cb39a996aff).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tomvanbussel commented on a change in pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
tomvanbussel commented on a change in pull request #29772:
URL: https://github.com/apache/spark/pull/29772#discussion_r489506520



##########
File path: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java
##########
@@ -516,32 +520,34 @@ public int getNumRecords() {
       return numRecords;
     }
 
+    @Override
+    public long getCurrentPageNumber() {
+      throw new UnsupportedOperationException();
+    }
+
     public long spill() throws IOException {
       synchronized (this) {
-        if (!(upstream instanceof UnsafeInMemorySorter.SortedIterator && nextUpstream == null
-          && numRecords > 0)) {
+        if (inMemSorter == null || numRecords <= 0) {

Review comment:
       I think it was originally supposed to mean this, but it could also mean that all records have been read. This is actually a bug as during the call to spill it would prevent freeing the memory that's no longer necessary. I've kept this bug here to keep the changelist as small as possible (there's a little bit more to this than just removing the check). I'll address this issue in a separate ticket if that's okay.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693404301


   **[Test build #128771 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128771/testReport)** for PR 29772 at commit [`e514327`](https://github.com/apache/spark/commit/e514327b4e071f0e4c8e3402cbf03d54c83ab8a4).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tomvanbussel commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
tomvanbussel commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-694113476


   Thanks for the review! Reworded the description to be clearer.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tomvanbussel commented on a change in pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
tomvanbussel commented on a change in pull request #29772:
URL: https://github.com/apache/spark/pull/29772#discussion_r489506520



##########
File path: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java
##########
@@ -516,32 +520,34 @@ public int getNumRecords() {
       return numRecords;
     }
 
+    @Override
+    public long getCurrentPageNumber() {
+      throw new UnsupportedOperationException();
+    }
+
     public long spill() throws IOException {
       synchronized (this) {
-        if (!(upstream instanceof UnsafeInMemorySorter.SortedIterator && nextUpstream == null
-          && numRecords > 0)) {
+        if (inMemSorter == null || numRecords <= 0) {

Review comment:
       I think it was originally supposed to mean this, but it could also mean that all records have been read. This is actually a bug as during the call to spill it would prevent freeing the memory that's no longer necessary. I've kept this bug here to keep the changelist as small as possible (there's a little bit more to this than just removing the check). I'll address this issue in a separate ticket.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693377420






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell closed pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
hvanhovell closed pull request #29772:
URL: https://github.com/apache/spark/pull/29772


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693476482


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128768/
   Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693541419






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693603639






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693476467






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693523947


   **[Test build #128776 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128776/testReport)** for PR 29772 at commit [`e514327`](https://github.com/apache/spark/commit/e514327b4e071f0e4c8e3402cbf03d54c83ab8a4).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693396475






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693524612






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693359580


   Can one of the admins verify this patch?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693396475






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tomvanbussel commented on a change in pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
tomvanbussel commented on a change in pull request #29772:
URL: https://github.com/apache/spark/pull/29772#discussion_r489506520



##########
File path: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java
##########
@@ -516,32 +520,34 @@ public int getNumRecords() {
       return numRecords;
     }
 
+    @Override
+    public long getCurrentPageNumber() {
+      throw new UnsupportedOperationException();
+    }
+
     public long spill() throws IOException {
       synchronized (this) {
-        if (!(upstream instanceof UnsafeInMemorySorter.SortedIterator && nextUpstream == null
-          && numRecords > 0)) {
+        if (inMemSorter == null || numRecords <= 0) {

Review comment:
       I think it was originally supposed to mean this, but it could also mean that all records have been read. This is actually a bug as it would prevent freeing the memory that's no longer necessary. I've kept this bug here to keep the changelist as small as possible (there's a little bit more to this than just removing the check). I'll address this issue in a separate ticket if that's okay.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693523189


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693544693


   **[Test build #128771 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128771/testReport)** for PR 29772 at commit [`e514327`](https://github.com/apache/spark/commit/e514327b4e071f0e4c8e3402cbf03d54c83ab8a4).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
hvanhovell commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-694146407


   Alright merging to master/3.0/2.4. Thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell commented on a change in pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
hvanhovell commented on a change in pull request #29772:
URL: https://github.com/apache/spark/pull/29772#discussion_r489428553



##########
File path: core/src/test/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorterSuite.java
##########
@@ -359,6 +359,38 @@ public void forcedSpillingWithReadIterator() throws Exception {
     assertSpillFilesWereCleanedUp();
   }
 
+  @Test
+  public void forcedSpillingNullsWithReadIterator() throws Exception {
+    final UnsafeExternalSorter sorter = newSorter();
+    long[] record = new long[100];
+    final int recordSize = record.length * 8;
+    final int n = (int) pageSizeBytes / recordSize * 3;
+    for (int i = 0; i < n; i++) {
+      sorter.insertRecord(record, Platform.LONG_ARRAY_OFFSET, recordSize, 0, true);

Review comment:
       Can you also insert a couple of non-nulls? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693404301


   **[Test build #128771 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128771/testReport)** for PR 29772 at commit [`e514327`](https://github.com/apache/spark/commit/e514327b4e071f0e4c8e3402cbf03d54c83ab8a4).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tomvanbussel commented on a change in pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
tomvanbussel commented on a change in pull request #29772:
URL: https://github.com/apache/spark/pull/29772#discussion_r489435297



##########
File path: core/src/test/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorterSuite.java
##########
@@ -359,6 +359,38 @@ public void forcedSpillingWithReadIterator() throws Exception {
     assertSpillFilesWereCleanedUp();
   }
 
+  @Test
+  public void forcedSpillingNullsWithReadIterator() throws Exception {
+    final UnsafeExternalSorter sorter = newSorter();
+    long[] record = new long[100];
+    final int recordSize = record.length * 8;
+    final int n = (int) pageSizeBytes / recordSize * 3;
+    for (int i = 0; i < n; i++) {
+      sorter.insertRecord(record, Platform.LONG_ARRAY_OFFSET, recordSize, 0, true);

Review comment:
       Made every odd index non-null.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693476467


   Merged build finished. Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693360049


   Can one of the admins verify this patch?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tomvanbussel commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
tomvanbussel commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693358255


   cc @hvanhovell @cloud-fan 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693474918


   **[Test build #128768 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128768/testReport)** for PR 29772 at commit [`e6dbe2b`](https://github.com/apache/spark/commit/e6dbe2b3cbf18765cfc44d38ccb311cf06e9ad6f).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693540130


   **[Test build #128770 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128770/testReport)** for PR 29772 at commit [`4bfa122`](https://github.com/apache/spark/commit/4bfa1223aab3af0142c0ec975d7f8cb39a996aff).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693541419






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693391578


   **[Test build #128769 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128769/testReport)** for PR 29772 at commit [`6eaf7f0`](https://github.com/apache/spark/commit/6eaf7f0aa583a1d48361692b202ecb2d26c4fb10).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29772:
URL: https://github.com/apache/spark/pull/29772#discussion_r489502836



##########
File path: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java
##########
@@ -516,32 +520,34 @@ public int getNumRecords() {
       return numRecords;
     }
 
+    @Override
+    public long getCurrentPageNumber() {
+      throw new UnsupportedOperationException();
+    }
+
     public long spill() throws IOException {
       synchronized (this) {
-        if (!(upstream instanceof UnsafeInMemorySorter.SortedIterator && nextUpstream == null
-          && numRecords > 0)) {
+        if (inMemSorter == null || numRecords <= 0) {

Review comment:
       I haven't touched this file for a long time. Do you mind providing more context? Does `inMemSorter == null || numRecords <= 0` mean this sorter has not been inserted any records yet?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693359580


   Can one of the admins verify this patch?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693392372






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693545805






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693395887


   **[Test build #128770 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128770/testReport)** for PR 29772 at commit [`4bfa122`](https://github.com/apache/spark/commit/4bfa1223aab3af0142c0ec975d7f8cb39a996aff).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693603652


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128776/
   Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693545805






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cxzl25 commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
cxzl25 commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-875590137


   Recently, I encountered this problem in the production environment. There are millions of nulls in the memory. There is no way to spill. 
   I used your patch and the verification took effect. 
   Thank you.   @tomvanbussel 
   Before this, I also raised related questions. https://github.com/apache/spark/pull/26323


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693404977






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693391578


   **[Test build #128769 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128769/testReport)** for PR 29772 at commit [`6eaf7f0`](https://github.com/apache/spark/commit/6eaf7f0aa583a1d48361692b202ecb2d26c4fb10).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693377420






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693392372






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693376824


   **[Test build #128768 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128768/testReport)** for PR 29772 at commit [`e6dbe2b`](https://github.com/apache/spark/commit/e6dbe2b3cbf18765cfc44d38ccb311cf06e9ad6f).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693404977






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693491884


   **[Test build #128769 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128769/testReport)** for PR 29772 at commit [`6eaf7f0`](https://github.com/apache/spark/commit/6eaf7f0aa583a1d48361692b202ecb2d26c4fb10).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] viirya commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
viirya commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693863796


   > Currently, Spark determines whether UnsafeExternalSorter.SpillableIterator has spilled already by checking whether upstream is an instance of UnsafeInMemorySorter.SortedIterator
   
   Can we update the description a bit?
   
   This is reading like Spark thinks `UnsafeExternalSorter.SpillableIterator` has spilled already if `upstream` is `UnsafeInMemorySorter.SortedIterator`. But it actually is Spark thinks `UnsafeExternalSorter.SpillableIterator` has spilled already if `upstream` is not `UnsafeInMemorySorter.SortedIterator`, right?
   
   ```scala
   if (!(upstream instanceof UnsafeInMemorySorter.SortedIterator && nextUpstream == null
     && numRecords > 0)) {
     return 0L;
   }
   ```
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693360049


   Can one of the admins verify this patch?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29772: [SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there are nulls.

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29772:
URL: https://github.com/apache/spark/pull/29772#issuecomment-693602451


   **[Test build #128776 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128776/testReport)** for PR 29772 at commit [`e514327`](https://github.com/apache/spark/commit/e514327b4e071f0e4c8e3402cbf03d54c83ab8a4).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org