You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by dilipbiswal <gi...@git.apache.org> on 2018/01/31 09:24:57 UTC

[GitHub] spark pull request #20453: [SPARK-23281][SQL] Query produces results in inco...

GitHub user dilipbiswal opened a pull request:

    https://github.com/apache/spark/pull/20453

    [SPARK-23281][SQL] Query produces results in incorrect order when a composite order by clause refers to both original columns and aliases

    ## What changes were proposed in this pull request?
    Here is the test snippet.
    ``` SQL
    scala> Seq[(Integer, Integer)](
         |         (1, 1),
         |         (1, 3),
         |         (2, 3),
         |         (3, 3),
         |         (4, null),
         |         (5, null)
         |       ).toDF("key", "value").createOrReplaceTempView("src")
    
    scala> sql(
         |         """
         |           |SELECT MAX(value) as value, key as col2
         |           |FROM src
         |           |GROUP BY key
         |           |ORDER BY value desc, key
         |         """.stripMargin).show
    +-----+----+
    |value|col2|
    +-----+----+
    |    3|   3|
    |    3|   2|
    |    3|   1|
    | null|   5|
    | null|   4|
    +-----+----+
    ```SQL
    Here is the explain output :
    
    ```SQL
    == Parsed Logical Plan ==
    'Sort ['value DESC NULLS LAST, 'key ASC NULLS FIRST], true
    +- 'Aggregate ['key], ['MAX('value) AS value#9, 'key AS col2#10]
       +- 'UnresolvedRelation `src`
    
    == Analyzed Logical Plan ==
    value: int, col2: int
    Project [value#9, col2#10]
    +- Sort [value#9 DESC NULLS LAST, col2#10 DESC NULLS LAST], true
       +- Aggregate [key#5], [max(value#6) AS value#9, key#5 AS col2#10]
          +- SubqueryAlias src
             +- Project [_1#2 AS key#5, _2#3 AS value#6]
                +- LocalRelation [_1#2, _2#3]
    ``` SQL
    The sort direction is being wrongly changed from ASC to DSC while resolving ```Sort``` in
    resolveAggregateFunctions.
    
    ## How was this patch tested?
    A few tests are added in SQLQuerySuite.
    
    (If this patch involves UI changes, please attach a screenshot; otherwise, remove this)
    
    Please review http://spark.apache.org/contributing.html before opening a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dilipbiswal/spark local_spark

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20453.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20453
    
----
commit 24e858c707b01982762e4695ea552ae517abb2cc
Author: Dilip Biswal <db...@...>
Date:   2018-01-31T08:59:24Z

    [SPARK-23281] Query produces results in incorrect order when a composite order by clause refers to both original columns and aliases

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20453: [SPARK-23281][SQL] Query produces results in incorrect o...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20453
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20453: [SPARK-23281][SQL] Query produces results in incorrect o...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on the issue:

    https://github.com/apache/spark/pull/20453
  
    cc @gatorsmile @cloud-fan 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20453: [SPARK-23281][SQL] Query produces results in incorrect o...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20453
  
    **[Test build #86870 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86870/testReport)** for PR 20453 at commit [`24e858c`](https://github.com/apache/spark/commit/24e858c707b01982762e4695ea552ae517abb2cc).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20453: [SPARK-23281][SQL] Query produces results in incorrect o...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20453
  
    **[Test build #86870 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86870/testReport)** for PR 20453 at commit [`24e858c`](https://github.com/apache/spark/commit/24e858c707b01982762e4695ea552ae517abb2cc).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20453: [SPARK-23281][SQL] Query produces results in inco...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20453#discussion_r165198141
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1493,7 +1493,7 @@ class Analyzer(
               // to push down this ordering expression and can reference the original aggregate
               // expression instead.
               val needsPushDown = ArrayBuffer.empty[NamedExpression]
    -          val evaluatedOrderings = resolvedAliasedOrdering.zip(sortOrder).map {
    +          val evaluatedOrderings = resolvedAliasedOrdering.zip(unresolvedSortOrders).map {
    --- End diff --
    
    @gatorsmile Yeah.. seems like it sean.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20453: [SPARK-23281][SQL] Query produces results in incorrect o...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20453
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20453: [SPARK-23281][SQL] Query produces results in inco...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20453#discussion_r165201975
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -18,7 +18,6 @@
     package org.apache.spark.sql
     
     import java.io.File
    -import java.math.MathContext
    --- End diff --
    
    BTW, please do not remove the unnecessary import next time. Thanks!
    
    When I backported it to 2.2, I hit a build break issue because it is still being used by the other test cases in 2.2


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20453: [SPARK-23281][SQL] Query produces results in incorrect o...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on the issue:

    https://github.com/apache/spark/pull/20453
  
    Thanks a LOT @gatorsmile 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20453: [SPARK-23281][SQL] Query produces results in inco...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20453#discussion_r165203182
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -18,7 +18,6 @@
     package org.apache.spark.sql
     
     import java.io.File
    -import java.math.MathContext
    --- End diff --
    
    @gatorsmile Oh.. OK.. Sorry about it. I will not remove next time.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20453: [SPARK-23281][SQL] Query produces results in incorrect o...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20453
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/426/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20453: [SPARK-23281][SQL] Query produces results in incorrect o...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20453
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86870/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20453: [SPARK-23281][SQL] Query produces results in incorrect o...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20453
  
    LGTM
    
    Thanks! Merged to master/2.3/2.2


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20453: [SPARK-23281][SQL] Query produces results in inco...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20453#discussion_r165193579
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1493,7 +1493,7 @@ class Analyzer(
               // to push down this ordering expression and can reference the original aggregate
               // expression instead.
               val needsPushDown = ArrayBuffer.empty[NamedExpression]
    -          val evaluatedOrderings = resolvedAliasedOrdering.zip(sortOrder).map {
    +          val evaluatedOrderings = resolvedAliasedOrdering.zip(unresolvedSortOrders).map {
    --- End diff --
    
    A good catch! Is this a bug since 1.6.x?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20453: [SPARK-23281][SQL] Query produces results in inco...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/20453


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org