You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by ericl <gi...@git.apache.org> on 2016/05/18 02:23:21 UTC

[GitHub] spark pull request: [SPARK-14851] Support radix sort with nullable...

GitHub user ericl opened a pull request:

    https://github.com/apache/spark/pull/13161

    [SPARK-14851] Support radix sort with nullable longs

    ## What changes were proposed in this pull request?
    
    This adds support for radix sort of nullable long fields. When a sort field is null and radix sort is enabled, we keep nulls in a separate region of the sort buffer so that radix sort does not need to deal with them. This also has performance benefits when sorting smaller integer types, since two's complement nulls otherwise force a full-width radix sort.
    
    This strategy for nulls does mean the sort is no longer stable. cc @davies 
    
    ## How was this patch tested?
    
    Existing randomized sort tests for correctness. I also tested some TPCDS queries and there does not seem to be any significant regression for non-null sorts.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ericl/spark sc-2998

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13161.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13161
    
----
commit 7d29dc35488aa51f297bea64de1a36dd4ac7ed98
Author: Eric Liang <ek...@databricks.com>
Date:   2016-05-18T02:06:51Z

    Tue May 17 19:06:51 PDT 2016

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14851] Support radix sort with nullable...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13161#issuecomment-219908862
  
    **[Test build #58735 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58735/consoleFull)** for PR 13161 at commit [`7d29dc3`](https://github.com/apache/spark/commit/7d29dc35488aa51f297bea64de1a36dd4ac7ed98).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13161: [SPARK-14851] [Core] Support radix sort with null...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/13161


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13161: [SPARK-14851] [Core] Support radix sort with null...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13161#discussion_r66653192
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala ---
    @@ -64,49 +64,57 @@ case class SortOrder(child: Expression, direction: SortDirection)
     }
     
     /**
    - * An expression to generate a 64-bit long prefix used in sorting.
    + * An expression to generate a 64-bit long prefix used in sorting. If the sort must operate over
    + * null keys as well, this.nullValue can be used in place of emitted null prefixes in the sort.
      */
     case class SortPrefix(child: SortOrder) extends UnaryExpression {
     
    +  val nullValue = child.child.dataType match {
    +    case BooleanType | DateType | TimestampType | _: IntegralType =>
    +      Long.MinValue
    +    case dt: DecimalType if dt.precision - dt.scale <= Decimal.MAX_LONG_DIGITS =>
    +      Long.MinValue
    +    case _: DecimalType =>
    +      DoublePrefixComparator.computePrefix(Double.NegativeInfinity)
    +    case _ => 0L
    +  }
    +
       override def eval(input: InternalRow): Any = throw new UnsupportedOperationException
     
       override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
    --- End diff --
    
    These changes make sense to me but it'd great for @davies to take a second look too


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13161: [SPARK-14851] [Core] Support radix sort with null...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13161#discussion_r66650922
  
    --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java ---
    @@ -301,6 +323,19 @@ public SortedIterator getSortedIterator() {
           }
         }
         totalSortTimeNanos += System.nanoTime() - start;
    -    return new SortedIterator(pos / 2, offset);
    +    if (nullPos > 0) {
    +      assert radixSortSupport != null : "Nulls are only stored separately with radix sort";
    +      LinkedList<UnsafeSorterIterator> queue = new LinkedList<>();
    +      if (radixSortSupport.sortDescending()) {
    +        queue.add(new SortedIterator((pos - nullPos) / 2, offset));
    --- End diff --
    
    nit: can we add a comment along the lines of `Nulls are smaller than non-nulls...`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13161: [SPARK-14851] [Core] Support radix sort with nullable lo...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13161
  
    **[Test build #60314 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60314/consoleFull)** for PR 13161 at commit [`fc4a151`](https://github.com/apache/spark/commit/fc4a151b0d2c6e8e609d28b452b66c2298afe545).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14851] [Core] Support radix sort with n...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/13161#issuecomment-219956800
  
    Any benchmark data you can include?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14851] [Core] Support radix sort with n...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13161#issuecomment-219921050
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58735/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13161: [SPARK-14851] [Core] Support radix sort with nullable lo...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13161
  
    **[Test build #60190 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60190/consoleFull)** for PR 13161 at commit [`26bfbf2`](https://github.com/apache/spark/commit/26bfbf26ba77d2fb38c83171852a9d0ba8304eb1).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14851] Support radix sort with nullable...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13161#discussion_r63637193
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala ---
    @@ -64,49 +64,57 @@ case class SortOrder(child: Expression, direction: SortDirection)
     }
     
     /**
    - * An expression to generate a 64-bit long prefix used in sorting.
    + * An expression to generate a 64-bit long prefix used in sorting. If the sort must operate over
    + * null keys as well, this.nullValue can be used in place of emitted null prefixes in the sort.
      */
     case class SortPrefix(child: SortOrder) extends UnaryExpression {
     
    +  val nullValue = child.child.dataType match {
    +    case BooleanType | DateType | TimestampType | _: IntegralType =>
    +      Long.MinValue
    +    case dt: DecimalType if dt.precision - dt.scale <= Decimal.MAX_LONG_DIGITS =>
    +      Long.MinValue
    +    case dt: DecimalType =>
    +      DoublePrefixComparator.computePrefix(Double.NegativeInfinity)
    --- End diff --
    
    It seems `dt` is not used here. Maybe this could be as below:
    
    ```scala
    case _: DecimalType =>
      DoublePrefixComparator.computePrefix(Double.NegativeInfinity)
    ```



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14851] [Core] Support radix sort with n...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/13161#issuecomment-220139382
  
    Yea this should not go into 2.0.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13161: [SPARK-14851] [Core] Support radix sort with nullable lo...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13161
  
    **[Test build #60314 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60314/consoleFull)** for PR 13161 at commit [`fc4a151`](https://github.com/apache/spark/commit/fc4a151b0d2c6e8e609d28b452b66c2298afe545).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14851] Support radix sort with nullable...

Posted by ericl <gi...@git.apache.org>.
Github user ericl commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13161#discussion_r63637597
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala ---
    @@ -64,49 +64,57 @@ case class SortOrder(child: Expression, direction: SortDirection)
     }
     
     /**
    - * An expression to generate a 64-bit long prefix used in sorting.
    + * An expression to generate a 64-bit long prefix used in sorting. If the sort must operate over
    + * null keys as well, this.nullValue can be used in place of emitted null prefixes in the sort.
      */
     case class SortPrefix(child: SortOrder) extends UnaryExpression {
     
    +  val nullValue = child.child.dataType match {
    +    case BooleanType | DateType | TimestampType | _: IntegralType =>
    +      Long.MinValue
    +    case dt: DecimalType if dt.precision - dt.scale <= Decimal.MAX_LONG_DIGITS =>
    +      Long.MinValue
    +    case dt: DecimalType =>
    +      DoublePrefixComparator.computePrefix(Double.NegativeInfinity)
    --- End diff --
    
    Done


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13161: [SPARK-14851] [Core] Support radix sort with null...

Posted by ericl <gi...@git.apache.org>.
Github user ericl commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13161#discussion_r66682095
  
    --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java ---
    @@ -206,14 +216,27 @@ public void expandPointerArray(LongArray newArray) {
        * @param recordPointer pointer to a record in a data page, encoded by {@link TaskMemoryManager}.
        * @param keyPrefix a user-defined key prefix
        */
    -  public void insertRecord(long recordPointer, long keyPrefix) {
    +  public void insertRecord(long recordPointer, long keyPrefix, boolean prefixIsNull) {
         if (!hasSpaceForAnotherRecord()) {
           throw new IllegalStateException("There is no space for new record");
         }
    -    array.set(pos, recordPointer);
    -    pos++;
    -    array.set(pos, keyPrefix);
    -    pos++;
    +    if (prefixIsNull && radixSortSupport != null) {
    +      // Swap forward a non-null record to make room for this one at the beginning of the array.
    +      array.set(pos, array.get(nullPos));
    --- End diff --
    
    Added a case to SortSuite


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13161: [SPARK-14851] [Core] Support radix sort with nullable lo...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13161
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13161: [SPARK-14851] [Core] Support radix sort with null...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13161#discussion_r66653864
  
    --- Diff: core/src/test/scala/org/apache/spark/util/collection/unsafe/sort/RadixSortSuite.scala ---
    @@ -152,7 +152,7 @@ class RadixSortSuite extends SparkFunSuite with Logging {
           val (buf1, buf2) = generateKeyPrefixTestData(N, rand.nextLong & 0xff)
           referenceKeyPrefixSort(buf1, 0, N, sortType.referenceComparator)
           val outOffset = RadixSort.sortKeyPrefixArray(
    -        buf2, N, sortType.startByteIdx, sortType.endByteIdx,
    +        buf2, 0, N, sortType.startByteIdx, sortType.endByteIdx,
    --- End diff --
    
    Also, it'd be great to explicitly test boundary case for an array with all nulls


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13161: [SPARK-14851] [Core] Support radix sort with nullable lo...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on the issue:

    https://github.com/apache/spark/pull/13161
  
    LGTM pending @davies's approval for `SortOrder.scala`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13161: [SPARK-14851] [Core] Support radix sort with nullable lo...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13161
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60190/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13161: [SPARK-14851] [Core] Support radix sort with null...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13161#discussion_r66648811
  
    --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/RadixSort.java ---
    @@ -205,13 +210,14 @@ public static int sortKeyPrefixArray(
        * getCounts with some added parameters but that seems to hurt in benchmarks.
        */
       private static long[][] getKeyPrefixArrayCounts(
    -      LongArray array, int numRecords, int startByteIndex, int endByteIndex) {
    +      LongArray array, int startIndex, int numRecords, int startByteIndex, int endByteIndex) {
         long[][] counts = new long[8][];
         long bitwiseMax = 0;
         long bitwiseMin = -1L;
    -    long limit = array.getBaseOffset() + numRecords * 16;
    +    long baseOffset = array.getBaseOffset() + startIndex * 8;
    +    long limit = baseOffset + numRecords * 16;
    --- End diff --
    
    `numRecords * 16L`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13161: [SPARK-14851] [Core] Support radix sort with null...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13161#discussion_r66688459
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala ---
    @@ -64,49 +64,57 @@ case class SortOrder(child: Expression, direction: SortDirection)
     }
     
     /**
    - * An expression to generate a 64-bit long prefix used in sorting.
    + * An expression to generate a 64-bit long prefix used in sorting. If the sort must operate over
    + * null keys as well, this.nullValue can be used in place of emitted null prefixes in the sort.
      */
     case class SortPrefix(child: SortOrder) extends UnaryExpression {
     
    +  val nullValue = child.child.dataType match {
    --- End diff --
    
    Do we still need this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13161: [SPARK-14851] [Core] Support radix sort with null...

Posted by ericl <gi...@git.apache.org>.
Github user ericl commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13161#discussion_r66681990
  
    --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/RadixSort.java ---
    @@ -205,13 +210,14 @@ public static int sortKeyPrefixArray(
        * getCounts with some added parameters but that seems to hurt in benchmarks.
        */
       private static long[][] getKeyPrefixArrayCounts(
    -      LongArray array, int numRecords, int startByteIndex, int endByteIndex) {
    +      LongArray array, int startIndex, int numRecords, int startByteIndex, int endByteIndex) {
         long[][] counts = new long[8][];
         long bitwiseMax = 0;
         long bitwiseMin = -1L;
    -    long limit = array.getBaseOffset() + numRecords * 16;
    +    long baseOffset = array.getBaseOffset() + startIndex * 8;
    --- End diff --
    
    Done


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13161: [SPARK-14851] [Core] Support radix sort with null...

Posted by ericl <gi...@git.apache.org>.
Github user ericl commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13161#discussion_r66682082
  
    --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java ---
    @@ -93,6 +94,14 @@ public int compare(RecordPointerAndKeyPrefix r1, RecordPointerAndKeyPrefix r2) {
       private int pos = 0;
     
       /**
    +   * If sorting with radix sort, specifies the starting position in the sort buffer where records
    +   * with non-null prefixes are kept. Positions [0..nullPos) will contain null-prefixed records,
    +   * and positions [nullPos..pos) non-null prefixed records. This lets us avoid radix sorting
    +   * over null values.
    +   */
    +  private int nullPos = 0;
    --- End diff --
    
    Done


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14851] [Core] Support radix sort with nullable lo...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/13161
  
    cc @ooq


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14851] [Core] Support radix sort with n...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/13161#issuecomment-220112381
  
    @ericl Since this is a performance improvement for a special case (nullable LongType), the changes are not trivial, should we target this to 2.1 release?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13161: [SPARK-14851] [Core] Support radix sort with null...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13161#discussion_r66653788
  
    --- Diff: core/src/test/scala/org/apache/spark/util/collection/unsafe/sort/RadixSortSuite.scala ---
    @@ -152,7 +152,7 @@ class RadixSortSuite extends SparkFunSuite with Logging {
           val (buf1, buf2) = generateKeyPrefixTestData(N, rand.nextLong & 0xff)
    --- End diff --
    
    would this data have nulls?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13161: [SPARK-14851] [Core] Support radix sort with null...

Posted by ericl <gi...@git.apache.org>.
Github user ericl commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13161#discussion_r66682149
  
    --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java ---
    @@ -301,6 +323,19 @@ public SortedIterator getSortedIterator() {
           }
         }
         totalSortTimeNanos += System.nanoTime() - start;
    -    return new SortedIterator(pos / 2, offset);
    +    if (nullPos > 0) {
    +      assert radixSortSupport != null : "Nulls are only stored separately with radix sort";
    +      LinkedList<UnsafeSorterIterator> queue = new LinkedList<>();
    +      if (radixSortSupport.sortDescending()) {
    +        queue.add(new SortedIterator((pos - nullPos) / 2, offset));
    --- End diff --
    
    Done


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14851] [Core] Support radix sort with n...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13161#issuecomment-219922327
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58736/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13161: [SPARK-14851] [Core] Support radix sort with nullable lo...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the issue:

    https://github.com/apache/spark/pull/13161
  
    LGTM.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13161: [SPARK-14851] [Core] Support radix sort with null...

Posted by ericl <gi...@git.apache.org>.
Github user ericl commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13161#discussion_r66682189
  
    --- Diff: core/src/test/scala/org/apache/spark/util/collection/unsafe/sort/RadixSortSuite.scala ---
    @@ -152,7 +152,7 @@ class RadixSortSuite extends SparkFunSuite with Logging {
           val (buf1, buf2) = generateKeyPrefixTestData(N, rand.nextLong & 0xff)
           referenceKeyPrefixSort(buf1, 0, N, sortType.referenceComparator)
           val outOffset = RadixSort.sortKeyPrefixArray(
    -        buf2, N, sortType.startByteIdx, sortType.endByteIdx,
    +        buf2, 0, N, sortType.startByteIdx, sortType.endByteIdx,
    --- End diff --
    
    Done, ptal


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13161: [SPARK-14851] [Core] Support radix sort with nullable lo...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13161
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60314/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14851] [Core] Support radix sort with n...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13161#issuecomment-219920955
  
    **[Test build #58735 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58735/consoleFull)** for PR 13161 at commit [`7d29dc3`](https://github.com/apache/spark/commit/7d29dc35488aa51f297bea64de1a36dd4ac7ed98).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `    public static class Prefix `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14851] [Core] Support radix sort with n...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13161#issuecomment-219922220
  
    **[Test build #58736 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58736/consoleFull)** for PR 13161 at commit [`77d7b17`](https://github.com/apache/spark/commit/77d7b17f4b968d778df4597f6c72fc975ec2d333).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14851] [Core] Support radix sort with n...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13161#issuecomment-219921049
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14851] Support radix sort with nullable...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13161#issuecomment-219910107
  
    **[Test build #58736 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58736/consoleFull)** for PR 13161 at commit [`77d7b17`](https://github.com/apache/spark/commit/77d7b17f4b968d778df4597f6c72fc975ec2d333).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13161: [SPARK-14851] [Core] Support radix sort with null...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13161#discussion_r66649155
  
    --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java ---
    @@ -93,6 +94,14 @@ public int compare(RecordPointerAndKeyPrefix r1, RecordPointerAndKeyPrefix r2) {
       private int pos = 0;
     
       /**
    +   * If sorting with radix sort, specifies the starting position in the sort buffer where records
    +   * with non-null prefixes are kept. Positions [0..nullPos) will contain null-prefixed records,
    +   * and positions [nullPos..pos) non-null prefixed records. This lets us avoid radix sorting
    +   * over null values.
    +   */
    +  private int nullPos = 0;
    --- End diff --
    
    minor nit: would something `nullBoundary` or `nullBoundaryPos` better imply the semantics?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13161: [SPARK-14851] [Core] Support radix sort with null...

Posted by ericl <gi...@git.apache.org>.
Github user ericl commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13161#discussion_r66681992
  
    --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/RadixSort.java ---
    @@ -205,13 +210,14 @@ public static int sortKeyPrefixArray(
        * getCounts with some added parameters but that seems to hurt in benchmarks.
        */
       private static long[][] getKeyPrefixArrayCounts(
    -      LongArray array, int numRecords, int startByteIndex, int endByteIndex) {
    +      LongArray array, int startIndex, int numRecords, int startByteIndex, int endByteIndex) {
         long[][] counts = new long[8][];
         long bitwiseMax = 0;
         long bitwiseMin = -1L;
    -    long limit = array.getBaseOffset() + numRecords * 16;
    +    long baseOffset = array.getBaseOffset() + startIndex * 8;
    +    long limit = baseOffset + numRecords * 16;
    --- End diff --
    
    Done


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14851] Support radix sort with nullable...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the pull request:

    https://github.com/apache/spark/pull/13161#issuecomment-219910188
  
    (@ericl I see you set a component in JIRA. It would be nicer if the component is specified in the PR title as described in [Contributing+to+Spark#ContributingtoSpark-PullRequest](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-PullRequest))


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13161: [SPARK-14851] [Core] Support radix sort with nullable lo...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13161
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13161: [SPARK-14851] [Core] Support radix sort with nullable lo...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/13161
  
    Merging in master.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14851] [Core] Support radix sort with n...

Posted by ericl <gi...@git.apache.org>.
Github user ericl commented on the pull request:

    https://github.com/apache/spark/pull/13161#issuecomment-220148117
  
    @davies @rxin sounds good
    
    I also added a before/after result of sorting nullable longs in the PR description.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13161: [SPARK-14851] [Core] Support radix sort with nullable lo...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/13161
  
    FYI I accidentally merged this in branch-2.0 too, but I reverted it.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13161: [SPARK-14851] [Core] Support radix sort with null...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13161#discussion_r66650450
  
    --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java ---
    @@ -206,14 +216,27 @@ public void expandPointerArray(LongArray newArray) {
        * @param recordPointer pointer to a record in a data page, encoded by {@link TaskMemoryManager}.
        * @param keyPrefix a user-defined key prefix
        */
    -  public void insertRecord(long recordPointer, long keyPrefix) {
    +  public void insertRecord(long recordPointer, long keyPrefix, boolean prefixIsNull) {
         if (!hasSpaceForAnotherRecord()) {
           throw new IllegalStateException("There is no space for new record");
         }
    -    array.set(pos, recordPointer);
    -    pos++;
    -    array.set(pos, keyPrefix);
    -    pos++;
    +    if (prefixIsNull && radixSortSupport != null) {
    +      // Swap forward a non-null record to make room for this one at the beginning of the array.
    +      array.set(pos, array.get(nullPos));
    --- End diff --
    
    As discussed offline, it'd be nice to explicitly test this when pos and nullPos are both 0 (i.e., the first element in the array is one with a null prefix).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13161: [SPARK-14851] [Core] Support radix sort with null...

Posted by ericl <gi...@git.apache.org>.
Github user ericl commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13161#discussion_r66682167
  
    --- Diff: core/src/test/scala/org/apache/spark/util/collection/unsafe/sort/RadixSortSuite.scala ---
    @@ -152,7 +152,7 @@ class RadixSortSuite extends SparkFunSuite with Logging {
           val (buf1, buf2) = generateKeyPrefixTestData(N, rand.nextLong & 0xff)
    --- End diff --
    
    No, this is the actual radix sort which does not support nulls.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14851] [Core] Support radix sort with n...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13161#issuecomment-219922326
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13161: [SPARK-14851] [Core] Support radix sort with nullable lo...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13161
  
    **[Test build #60190 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60190/consoleFull)** for PR 13161 at commit [`26bfbf2`](https://github.com/apache/spark/commit/26bfbf26ba77d2fb38c83171852a9d0ba8304eb1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13161: [SPARK-14851] [Core] Support radix sort with null...

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13161#discussion_r66648782
  
    --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/RadixSort.java ---
    @@ -205,13 +210,14 @@ public static int sortKeyPrefixArray(
        * getCounts with some added parameters but that seems to hurt in benchmarks.
        */
       private static long[][] getKeyPrefixArrayCounts(
    -      LongArray array, int numRecords, int startByteIndex, int endByteIndex) {
    +      LongArray array, int startIndex, int numRecords, int startByteIndex, int endByteIndex) {
         long[][] counts = new long[8][];
         long bitwiseMax = 0;
         long bitwiseMin = -1L;
    -    long limit = array.getBaseOffset() + numRecords * 16;
    +    long baseOffset = array.getBaseOffset() + startIndex * 8;
    --- End diff --
    
    `startIndex * 8L` to prevent overflows


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org