You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@spark.apache.org by 10110346 <gi...@git.apache.org> on 2017/07/31 04:10:39 UTC

[GitHub] spark pull request #18779: [SPARK-21580][SQL]There's a bug with `Group by or...

GitHub user 10110346 opened a pull request:

    https://github.com/apache/spark/pull/18779

    [SPARK-21580][SQL]There's a bug with `Group by ordinal`

    ## What changes were proposed in this pull request?
    
    create temporary view data as select * from values
    (1, 1),
    (1, 2),
    (2, 1),
    (2, 2),
    (3, 1),
    (3, 2)
    as data(a, b);
    
    `select 3, 4, sum(b) from data group by 1, 2;`
    When running this case, the following exception occurred:
    `Error in query: GROUP BY position 4 is not in select list (valid range is [1, 3]); line 1 pos 10`
    
    ## How was this patch tested?
    add unit test case


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/10110346/spark groupby

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18779.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18779
    
----
commit 5319670abbd5668c82c505689949f53b0ca52e42
Author: liuxian <li...@zte.com.cn>
Date:   2017-07-31T03:29:09Z

    group by ordinal

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80246/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80232 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80232/testReport)** for PR 18779 at commit [`bfab7e5`](https://github.com/apache/spark/commit/bfab7e55609ed9c98534c23d41ff881a6eed84d5).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131360533
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2023,4 +2023,11 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
           assert(df1.join(df2, $"t1.i" === $"t2.i").cache().count() == 1)
         }
       }
    +
    +  test("order-by ordinal.") {
    +    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
    +    checkAnswer(
    +      df.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
    +      Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
    --- End diff --
    
    OK. I see. When we input query like `df.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3))`, the query plan looks like:
    
        Sort [7#22 ASC NULLS FIRST, a#5 ASC NULLS FIRST, b#6 ASC NULLS FIRST], true
        +- Project [7 AS 7#22, a#5, b#6]
           +- Project [_1#2 AS a#5, _2#3 AS b#6]
              +- LocalRelation [_1#2, _2#3]
    
    We have a `Project` below `Sort`. The ordinal `1` be replaced with the attribute `7#22`. So we won't get an int literal 7 here. That is why it passes.
    
    Can you have a test for ordinal order-by that show different behavior? If not, I think ordinal order-by should be safe from this bug. And we don't need to add this test.
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80197 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80197/testReport)** for PR 18779 at commit [`4fce4ab`](https://github.com/apache/spark/commit/4fce4ab3da0ce425b4ba1807d165b4ab05a812b7).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80277 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80277/testReport)** for PR 18779 at commit [`2bf42b1`](https://github.com/apache/spark/commit/2bf42b11cec794d19ed8f2fd000a9cd7aeb159bb).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80105 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80105/testReport)** for PR 18779 at commit [`a5667c8`](https://github.com/apache/spark/commit/a5667c83a562738ca6924ea80d9e0290969fbfcd).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131334204
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
    @@ -52,8 +52,18 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
     -- mixed cases: group-by ordinals and aliases
     select a, a AS k, count(b) from data group by k, 1;
     
    --- turn of group by ordinal
    +-- turn off group by ordinal
     set spark.sql.groupByOrdinal=false;
     
     -- can now group by negative literal
     select sum(b) from data group by -1;
    +
    +select 4, b from data group by 1, 2;
    +
    +set spark.sql.groupByOrdinal=true;
    +
    +select 4, b from data group by 1, 2;
    +
    +-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
    +select 3, 4, sum(b) from data group by 1, 2;
    +select 3 as c, 4 as d, sum(b) from data group by c, d;
    --- End diff --
    
    Oh, I see,thanks. 
    They are  not necessary, i will remove  them.
    Also these test cases already exist in the `DataFrameAggregateSuite`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131305760
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala ---
    @@ -1,54 +0,0 @@
    -/*
    - * Licensed to the Apache Software Foundation (ASF) under one or more
    - * contributor license agreements.  See the NOTICE file distributed with
    - * this work for additional information regarding copyright ownership.
    - * The ASF licenses this file to You under the Apache License, Version 2.0
    - * (the "License"); you may not use this file except in compliance with
    - * the License.  You may obtain a copy of the License at
    - *
    - *    http://www.apache.org/licenses/LICENSE-2.0
    - *
    - * Unless required by applicable law or agreed to in writing, software
    - * distributed under the License is distributed on an "AS IS" BASIS,
    - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    - * See the License for the specific language governing permissions and
    - * limitations under the License.
    - */
    -
    -package org.apache.spark.sql.catalyst.analysis
    -
    -import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, SortOrder}
    -import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LogicalPlan, Sort}
    -import org.apache.spark.sql.catalyst.rules.Rule
    -import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
    -import org.apache.spark.sql.internal.SQLConf
    -import org.apache.spark.sql.types.IntegerType
    -
    -/**
    - * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal expression.
    - */
    -class SubstituteUnresolvedOrdinals(conf: SQLConf) extends Rule[LogicalPlan] {
    -  private def isIntLiteral(e: Expression) = e match {
    -    case Literal(_, IntegerType) => true
    -    case _ => false
    -  }
    -
    -  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    --- End diff --
    
    Aha, sorry, I made a mistake here in inspecting Analyzer...
    
    Yeah, the whole bug is due to re-entrance of an analyzed plan. Currently we can solve it by using `resolveOperators`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @maropu  Could you  not put the test cases in the end of `group-by-ordinal.sql `? 
    because it has set `spark.sql.groupByOrdinal=false; ` in the end


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @10110346 Thanks for working this! Sorry I've confused you in previous comments. Current changes looks good to me.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @viirya  Only to `group-by ordinal`, i think this is a good idea.
    but this will also result in  inconsistent processing  between  `order-by ordinal` and   `group-by ordinal`.
    and i feel that  it's more complicated than the current changes


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @gatorsmile 
    
        scala> df.groupBy(lit(2)).agg(col("a")).queryExecution.logical
        res6: org.apache.spark.sql.catalyst.plans.logical.LogicalPlan =
        'Aggregate [2], [2 AS 2#51, unresolvedalias('a, None)]
        +- Project [_1#3 AS a#7, _2#4 AS b#8, _3#5 AS c#9]
           +- LocalRelation [_1#3, _2#4, _3#5]
    
        scala> df.groupBy(lit(2)).agg(col("a")).queryExecution.analyzed
        res7: org.apache.spark.sql.catalyst.plans.logical.LogicalPlan =
        Aggregate [a#7], [2 AS 2#59, a#7]
        +- Project [_1#3 AS a#7, _2#4 AS b#8, _3#5 AS c#9]
           +- LocalRelation [_1#3, _2#4, _3#5]
    
    The int literal `2` in grouping expression is replaced with `a`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    When using Dataset groupBy API, if you use int literals as grouping expressions, do we filter this case out for substituting `UnresolvedOrdinals`? Seems there is no related logic to prevent it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    I think this will fail the test case in `SubstituteUnresolvedOrdinalsSuite`. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80156 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80156/testReport)** for PR 18779 at commit [`17bdf88`](https://github.com/apache/spark/commit/17bdf8881f09256a868a17d7df85762d057429b9).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r130505545
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1010,7 +1014,16 @@ class Analyzer(
                     s"(valid range is [1, ${aggs.size}])")
               case o => o
             }
    -        Aggregate(newGroups, aggs, child)
    +        // If an aggregateExpression is integer, after replaced with this aggregateExpression, the
    +        // groupExpression will be considered as an ordinal still, so we should filter it.
    +        var filterGroups: Seq[Expression] = Nil
    +        newGroups.foreach { ng =>
    +          ng match {
    +            case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
    +            case _ => filterGroups :+= ng
    +          }
    +        }
    +        Aggregate(filterGroups, aggs, child)
    --- End diff --
    
    @maropu yap, I think so. Because `RemoveLiteralFromGroupExpressions` takes care of this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80105 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80105/testReport)** for PR 18779 at commit [`a5667c8`](https://github.com/apache/spark/commit/a5667c83a562738ca6924ea80d9e0290969fbfcd).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    yea, just suggestion. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131200152
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala ---
    @@ -1,54 +0,0 @@
    -/*
    - * Licensed to the Apache Software Foundation (ASF) under one or more
    - * contributor license agreements.  See the NOTICE file distributed with
    - * this work for additional information regarding copyright ownership.
    - * The ASF licenses this file to You under the Apache License, Version 2.0
    - * (the "License"); you may not use this file except in compliance with
    - * the License.  You may obtain a copy of the License at
    - *
    - *    http://www.apache.org/licenses/LICENSE-2.0
    - *
    - * Unless required by applicable law or agreed to in writing, software
    - * distributed under the License is distributed on an "AS IS" BASIS,
    - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    - * See the License for the specific language governing permissions and
    - * limitations under the License.
    - */
    -
    -package org.apache.spark.sql.catalyst.analysis
    -
    -import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, SortOrder}
    -import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LogicalPlan, Sort}
    -import org.apache.spark.sql.catalyst.rules.Rule
    -import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
    -import org.apache.spark.sql.internal.SQLConf
    -import org.apache.spark.sql.types.IntegerType
    -
    -/**
    - * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal expression.
    - */
    -class SubstituteUnresolvedOrdinals(conf: SQLConf) extends Rule[LogicalPlan] {
    -  private def isIntLiteral(e: Expression) = e match {
    -    case Literal(_, IntegerType) => true
    -    case _ => false
    -  }
    -
    -  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    --- End diff --
    
    let us fix the whole bug using one line change.  : )
    
    `transform` -> `resolveOperators`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    ok,thanks @viirya 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    After rethinking this, do we have any chance to have a group by ordinal or order by ordinal which is not coming from user input and produced by analysis?
    
    If the answer is no, we can simply run `SubstituteUnresolvedOrdinals` rule with `Once` and avoid those issues.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]There's a bug with `Group by ordinal`

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    BTW, we should modify the PR description too. Please briefly describe the problem, what the fix is. Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80244/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]There's a bug with `Group by ordinal`

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80067 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80067/testReport)** for PR 18779 at commit [`5319670`](https://github.com/apache/spark/commit/5319670abbd5668c82c505689949f53b0ca52e42).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131432438
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2023,4 +2023,11 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
           assert(df1.join(df2, $"t1.i" === $"t2.i").cache().count() == 1)
         }
       }
    +
    +  test("order-by ordinal.") {
    +    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
    +    checkAnswer(
    +      df.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
    --- End diff --
    
    The same here. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131323389
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
    @@ -52,8 +52,19 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
     -- mixed cases: group-by ordinals and aliases
     select a, a AS k, count(b) from data group by k, 1;
     
    --- turn of group by ordinal
    +-- turn off group by ordinal
     set spark.sql.groupByOrdinal=false;
     
     -- can now group by negative literal
     select sum(b) from data group by -1;
    +
    +-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
    +select 4, b from data group by 1, 2;
    +
    +set spark.sql.groupByOrdinal=true;
    +
    +select 4, b from data group by 1, 2;
    +
    +select 3, 4, sum(b) from data group by 1, 2;
    --- End diff --
    
    ok, i will to.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Thank you for finding the root cause. @maropu 
    
    Moving them to the parser sounds reasonable to me, but we also should avoid analyzing the analyzed plan again. Thanks for your works!  @10110346 @viirya 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131286224
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinalsSuite.scala ---
    @@ -1,66 +0,0 @@
    -/*
    - * Licensed to the Apache Software Foundation (ASF) under one or more
    - * contributor license agreements.  See the NOTICE file distributed with
    - * this work for additional information regarding copyright ownership.
    - * The ASF licenses this file to You under the Apache License, Version 2.0
    - * (the "License"); you may not use this file except in compliance with
    - * the License.  You may obtain a copy of the License at
    - *
    - *    http://www.apache.org/licenses/LICENSE-2.0
    - *
    - * Unless required by applicable law or agreed to in writing, software
    - * distributed under the License is distributed on an "AS IS" BASIS,
    - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    - * See the License for the specific language governing permissions and
    - * limitations under the License.
    - */
    -
    -package org.apache.spark.sql.catalyst.analysis
    -
    -import org.apache.spark.sql.catalyst.analysis.TestRelations.testRelation2
    -import org.apache.spark.sql.catalyst.dsl.expressions._
    -import org.apache.spark.sql.catalyst.dsl.plans._
    -import org.apache.spark.sql.catalyst.expressions.Literal
    -import org.apache.spark.sql.internal.SQLConf
    -
    -class SubstituteUnresolvedOrdinalsSuite extends AnalysisTest {
    -  private lazy val a = testRelation2.output(0)
    -  private lazy val b = testRelation2.output(1)
    -
    -  test("unresolved ordinal should not be unresolved") {
    --- End diff --
    
    This test can be moved to some place like `AnalysisSuite`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131053311
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ---
    @@ -528,7 +529,15 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
           } else {
             groupByExpressions
           }
    -      Aggregate(mappedGroupByExpressions, selectExpressions, query)
    +
    +      // Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal expression
    --- End diff --
    
    `order by`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131328048
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
    @@ -52,8 +52,18 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
     -- mixed cases: group-by ordinals and aliases
     select a, a AS k, count(b) from data group by k, 1;
     
    --- turn of group by ordinal
    +-- turn off group by ordinal
     set spark.sql.groupByOrdinal=false;
     
     -- can now group by negative literal
     select sum(b) from data group by -1;
    +
    +select 4, b from data group by 1, 2;
    +
    +set spark.sql.groupByOrdinal=true;
    +
    +select 4, b from data group by 1, 2;
    +
    +-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
    +select 3, 4, sum(b) from data group by 1, 2;
    +select 3 as c, 4 as d, sum(b) from data group by c, d;
    --- End diff --
    
    Do we need to add those query tests? They are actually no effect in testing this bug.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131351217
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/order-by-ordinal.sql ---
    @@ -34,3 +34,8 @@ set spark.sql.orderByOrdinal=false;
     -- 0 is now a valid literal
     select * from data order by 0;
     select * from data sort by 0;
    +
    +select 4 as k, 5, b from data order by k, 2, 3;
    +
    +set spark.sql.orderByOrdinal=true;
    +select 4 as k, 5, b from data order by k, 2, 3;
    --- End diff --
    
     it is not necessary, i will remove it.thanks


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Well, maybe we should revisit this after #17770 gets merged. Because after that, we won't go through analyzed plans anymore.
    
    At that time, we can simply solve all the issues by making `SubstituteUnresolvedOrdinals` run with `Once`.
    
    What do you think @gatorsmile @10110346 @maropu?
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131308502
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
    @@ -52,8 +52,19 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
     -- mixed cases: group-by ordinals and aliases
     select a, a AS k, count(b) from data group by k, 1;
     
    --- turn of group by ordinal
    +-- turn off group by ordinal
     set spark.sql.groupByOrdinal=false;
     
     -- can now group by negative literal
     select sum(b) from data group by -1;
    +
    +-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
    --- End diff --
    
    Could you move this line to the line before the line 68


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    I  have ran `SubstituteUnresolvedOrdinals` rule with `Once `,  it still looks like some problems, i will confirm it


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]There's a bug with `Group by or...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r130272987
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1010,7 +1014,16 @@ class Analyzer(
                     s"(valid range is [1, ${aggs.size}])")
               case o => o
             }
    -        Aggregate(newGroups, aggs, child)
    +        var filterGroups: Seq[Expression] = Nil
    --- End diff --
    
    Please leave a comment explaining why we need to filter the group-by exprs.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    ok, fixed: https://github.com/apache/spark/compare/master...maropu:SPARK-21580#diff-a0f2e45a5da747e9ec483f3557aa1b8bR210


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80232/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r130292918
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1010,7 +1014,16 @@ class Analyzer(
                     s"(valid range is [1, ${aggs.size}])")
               case o => o
             }
    -        Aggregate(newGroups, aggs, child)
    +        // If an aggregateExpression is integer, after replaced with this aggregateExpression, the
    +        // groupExpression will be considered as an ordinal still, so we should filter it.
    --- End diff --
    
    Please also comment on why this filtering is safe from changing grouping result. Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    LGTM. cc @gatorsmile for final check.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Actually I don't think `SubstituteUnresolvedOrdinals` rule should run with `fixedPoint`. It should be run with `Once`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80105/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    yea, it is like; it adds `UnresolvedOrdinal` in `AstBuilder` and, if `conf.groupByOrdinal`=false, analyzer drops `UnresolvedOrdinal`. Since `conf.groupByOrdinal`=true by defualt, dropping `UnresolvedOrdinal` rarely happens and I feel it is not a bad idea, maybe. https://github.com/apache/spark/compare/master...maropu:SPARK-21580-2


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @maropu `select 3 as c, 4 as d, sum(b) from data group by c, d`
    This test case still has exeception using your modification:GROUP BY position 4 is not in select list (valid range is [1, 3]); line 1 pos 15.
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @maropu Thanks for clarifying it.
    
    Although they looks similar, from semantics I'd treat them different rules.  However, I don't have strong opinion on this.
    
    Btw, `RemoveLiteralFromGroupExpressions` reminds me that we should not drop all literal grouping expressions here too.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r130503562
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1010,7 +1014,16 @@ class Analyzer(
                     s"(valid range is [1, ${aggs.size}])")
               case o => o
             }
    -        Aggregate(newGroups, aggs, child)
    +        // If an aggregateExpression is integer, after replaced with this aggregateExpression, the
    +        // groupExpression will be considered as an ordinal still, so we should filter it.
    +        var filterGroups: Seq[Expression] = Nil
    +        newGroups.foreach { ng =>
    +          ng match {
    +            case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
    +            case _ => filterGroups :+= ng
    +          }
    +        }
    +        Aggregate(filterGroups, aggs, child)
    --- End diff --
    
    Actually, We have supported  `GlobalAggregates` and `group by null`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r130506005
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1010,7 +1014,16 @@ class Analyzer(
                     s"(valid range is [1, ${aggs.size}])")
               case o => o
             }
    -        Aggregate(newGroups, aggs, child)
    +        // If an aggregateExpression is integer, after replaced with this aggregateExpression, the
    +        // groupExpression will be considered as an ordinal still, so we should filter it.
    +        var filterGroups: Seq[Expression] = Nil
    +        newGroups.foreach { ng =>
    +          ng match {
    +            case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
    --- End diff --
    
    This rule is before `CleanupAliases`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r130505351
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1010,7 +1014,16 @@ class Analyzer(
                     s"(valid range is [1, ${aggs.size}])")
               case o => o
             }
    -        Aggregate(newGroups, aggs, child)
    +        // If an aggregateExpression is integer, after replaced with this aggregateExpression, the
    +        // groupExpression will be considered as an ordinal still, so we should filter it.
    +        var filterGroups: Seq[Expression] = Nil
    +        newGroups.foreach { ng =>
    +          ng match {
    +            case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
    +            case _ => filterGroups :+= ng
    +          }
    +        }
    +        Aggregate(filterGroups, aggs, child)
    --- End diff --
    
    If we reimplement this by using `ResolvedOrdinal`, we probably have no chance to hit the empty issue hrere, I think.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131290132
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ---
    @@ -528,7 +541,15 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
           } else {
             groupByExpressions
           }
    -      Aggregate(mappedGroupByExpressions, selectExpressions, query)
    +
    +      // Replaces ordinal in 'group by' with UnresolvedOrdinal expression
    +      val newGroups = mappedGroupByExpressions.map {
    --- End diff --
    
    ditto.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80244 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80244/testReport)** for PR 18779 at commit [`2dc3610`](https://github.com/apache/spark/commit/2dc36107b33a989b18af39816eb87da8284abeba).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80246 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80246/testReport)** for PR 18779 at commit [`c1594c7`](https://github.com/apache/spark/commit/c1594c78b27a1beca446b5530ec2e88653619742).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131432337
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
    @@ -557,4 +557,22 @@ class DataFrameAggregateSuite extends QueryTest with SharedSQLContext {
         }
         assert(e.message.contains("aggregate functions are not allowed in GROUP BY"))
       }
    +
    +  test("SPARK-21580 ints in aggregation expressions are taken as group-by ordinal.") {
    +    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
    +    checkAnswer(
    +      df.groupBy(lit(3), lit(4)).agg(lit(6), lit(7), sum("b")),
    +      Seq(Row(3, 4, 6, 7, 9)))
    +    checkAnswer(
    +      df.groupBy(lit(3), lit(4)).agg(lit(6), 'b, sum("b")),
    +      Seq(Row(3, 4, 6, 1, 3), Row(3, 4, 6, 2, 6)))
    +
    +    df.createOrReplaceTempView("data")
    +    checkAnswer(
    +      spark.sql("select 3, 4, sum(b) from data group by 1, 2"),
    --- End diff --
    
    Nit: Please use upper cases for SQL keywords.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131292299
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala ---
    @@ -1,54 +0,0 @@
    -/*
    - * Licensed to the Apache Software Foundation (ASF) under one or more
    - * contributor license agreements.  See the NOTICE file distributed with
    - * this work for additional information regarding copyright ownership.
    - * The ASF licenses this file to You under the Apache License, Version 2.0
    - * (the "License"); you may not use this file except in compliance with
    - * the License.  You may obtain a copy of the License at
    - *
    - *    http://www.apache.org/licenses/LICENSE-2.0
    - *
    - * Unless required by applicable law or agreed to in writing, software
    - * distributed under the License is distributed on an "AS IS" BASIS,
    - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    - * See the License for the specific language governing permissions and
    - * limitations under the License.
    - */
    -
    -package org.apache.spark.sql.catalyst.analysis
    -
    -import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, SortOrder}
    -import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LogicalPlan, Sort}
    -import org.apache.spark.sql.catalyst.rules.Rule
    -import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
    -import org.apache.spark.sql.internal.SQLConf
    -import org.apache.spark.sql.types.IntegerType
    -
    -/**
    - * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal expression.
    - */
    -class SubstituteUnresolvedOrdinals(conf: SQLConf) extends Rule[LogicalPlan] {
    -  private def isIntLiteral(e: Expression) = e match {
    -    case Literal(_, IntegerType) => true
    -    case _ => false
    -  }
    -
    -  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    --- End diff --
    
    changing `Aggregate(grouping.map(trimAliases), cleanedAggs, child)` to  ` Aggregate(grouping.map(trimNonTopLevelAliases), cleanedAggs, child) in `CleanupAliases` ,
    looks like it can fix the whole bug 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Well, I think we can not avoid the various cases bringing int literals into grouping expressions.
    
    To fix it, I think we should not replace any int literals with `UnresolvedOrdinal` in analysis. We should create those `UnresolvedOrdinal` when parsing SQL query or preparing `Aggregate` in `Dataset` API.
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @10110346 Thanks, I looked into this failure and I could easily fix this like: https://github.com/apache/spark/compare/master...maropu:SPARK-21580#diff-57b3d87be744b7d79a9beacf8e5e5eb2R2287. But, I feel it'd be good to fix this more simply.
    
    @viirya I quickly checked that I applied `SubstituteUnresolvedOrdinals` once though, I couldn't fix this because we have the chance that we apply Analyzer `batches`  again into an already-analyzed logical plan:
    ```
    // Apply Analyzer `batches` in the SQL statement to built `DataFrame`.
    scala> val df = sql("""select 3, 4, sum(b) from data group by 1, 2""")
    
    // Then, again apply Analyzer `batches` again to execute `show`
    scala> df.show
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Another way to fix this is something like https://github.com/apache/spark/compare/master...maropu:SPARK-21580.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80070 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80070/testReport)** for PR 18779 at commit [`fae3190`](https://github.com/apache/spark/commit/fae3190e987ab4aa1bb45d7f058f510bf9690f12).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131309985
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
    @@ -52,8 +52,19 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
     -- mixed cases: group-by ordinals and aliases
     select a, a AS k, count(b) from data group by k, 1;
     
    --- turn of group by ordinal
    +-- turn off group by ordinal
     set spark.sql.groupByOrdinal=false;
     
     -- can now group by negative literal
     select sum(b) from data group by -1;
    +
    +-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
    +select 4, b from data group by 1, 2;
    +
    +set spark.sql.groupByOrdinal=true;
    +
    +select 4, b from data group by 1, 2;
    +
    +select 3, 4, sum(b) from data group by 1, 2;
    --- End diff --
    
    Can those tests this bug? No matter I use `transform` or  `resolveOperators`, they generate the same results.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @viirya Maybe adding `ResolvedOrdinal` is not very well.
     I have another problem:
    `select a, **4 AS k**, count(b) from data group by k, 1;`
    This test case has the same exception:
    `Error in query: GROUP BY position 4 is not in select list (valid range is [1, 3]); line 1 pos 10`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    ok, thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80237 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80237/testReport)** for PR 18779 at commit [`791bc33`](https://github.com/apache/spark/commit/791bc330f8573345e0b5890b43b25259b854bc3d).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131349769
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
    @@ -52,8 +52,14 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
     -- mixed cases: group-by ordinals and aliases
     select a, a AS k, count(b) from data group by k, 1;
     
    --- turn of group by ordinal
    +-- turn off group by ordinal
     set spark.sql.groupByOrdinal=false;
     
     -- can now group by negative literal
     select sum(b) from data group by -1;
    +
    +select 4, b from data group by 1, 2;
    +
    +set spark.sql.groupByOrdinal=true;
    +
    +select 4, b from data group by 1, 2;
    --- End diff --
    
    Why do we need to have change like this? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80206 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80206/testReport)** for PR 18779 at commit [`4fce4ab`](https://github.com/apache/spark/commit/4fce4ab3da0ce425b4ba1807d165b4ab05a812b7).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    I think it is a perfect solution,thank you very much. @viirya @maropu 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Thanks! Merging to master/2.2


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80070/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80206 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80206/testReport)** for PR 18779 at commit [`4fce4ab`](https://github.com/apache/spark/commit/4fce4ab3da0ce425b4ba1807d165b4ab05a812b7).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @10110346 As using `resolveOperators` can solve the whole bug, let's do it and simplify the whole change. Sorry for confusing.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/18779


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131329942
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
    @@ -557,4 +557,22 @@ class DataFrameAggregateSuite extends QueryTest with SharedSQLContext {
         }
         assert(e.message.contains("aggregate functions are not allowed in GROUP BY"))
       }
    +
    +  test("SPARK-21580 ints in aggregation expressions are taken as group-by ordinal.") {
    --- End diff --
    
    ok,thanks


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131298714
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala ---
    @@ -1,54 +0,0 @@
    -/*
    - * Licensed to the Apache Software Foundation (ASF) under one or more
    - * contributor license agreements.  See the NOTICE file distributed with
    - * this work for additional information regarding copyright ownership.
    - * The ASF licenses this file to You under the Apache License, Version 2.0
    - * (the "License"); you may not use this file except in compliance with
    - * the License.  You may obtain a copy of the License at
    - *
    - *    http://www.apache.org/licenses/LICENSE-2.0
    - *
    - * Unless required by applicable law or agreed to in writing, software
    - * distributed under the License is distributed on an "AS IS" BASIS,
    - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    - * See the License for the specific language governing permissions and
    - * limitations under the License.
    - */
    -
    -package org.apache.spark.sql.catalyst.analysis
    -
    -import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, SortOrder}
    -import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LogicalPlan, Sort}
    -import org.apache.spark.sql.catalyst.rules.Rule
    -import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
    -import org.apache.spark.sql.internal.SQLConf
    -import org.apache.spark.sql.types.IntegerType
    -
    -/**
    - * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal expression.
    - */
    -class SubstituteUnresolvedOrdinals(conf: SQLConf) extends Rule[LogicalPlan] {
    -  private def isIntLiteral(e: Expression) = e match {
    -    case Literal(_, IntegerType) => true
    -    case _ => false
    -  }
    -
    -  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    --- End diff --
    
    For this Change: `transform -> resolveOperators`,it can fix the whole bug, I have tested it. @gatorsmile  @viirya


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Aha, I see... I'm not sure this is an expected behaviour though, yea, we might have the compatibility issue as you said. I feel, if we explicitly support `group-by-ordinal` in dataset apis, the syntax 
     like `df.groupBy(functions.ordinal(1))` is more consistent with the other apis (it's a just my thought).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131309398
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
    @@ -52,8 +52,19 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
     -- mixed cases: group-by ordinals and aliases
     select a, a AS k, count(b) from data group by k, 1;
     
    --- turn of group by ordinal
    +-- turn off group by ordinal
     set spark.sql.groupByOrdinal=false;
     
     -- can now group by negative literal
     select sum(b) from data group by -1;
    +
    +-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
    +select 4, b from data group by 1, 2;
    +
    +set spark.sql.groupByOrdinal=true;
    +
    +select 4, b from data group by 1, 2;
    +
    +select 3, 4, sum(b) from data group by 1, 2;
    --- End diff --
    
    Can those tests this bug? No matter I use `transform` or  `resolveOperators`, they generate the same results.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @maropu does it solve @10110346's issue at https://github.com/apache/spark/pull/18779#issuecomment-319601312?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    `AstBuilder` can access conf. So it can only replace int literals with `UnresolvedOrdinal` when the config is enabled. Otherwise, leave it as it's.
    
    In Analyzer, we remove the rule `SubstituteUnresolvedOrdinals` and keep `ResolveOrdinalInOrderByAndGroupBy` untouched. 
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80191 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80191/testReport)** for PR 18779 at commit [`c41f209`](https://github.com/apache/spark/commit/c41f209b619f118b237b7641de9d7cc457e4c5e0).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80156/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131353213
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2023,4 +2023,11 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
           assert(df1.join(df2, $"t1.i" === $"t2.i").cache().count() == 1)
         }
       }
    +
    +  test("order-by ordinal.") {
    +    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
    +    checkAnswer(
    +      df.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
    +      Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
    --- End diff --
    
    I've run this test. Even without `transform` -> `resolveOperators`, it still works. Can you check it again?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Let's remove the rule `SubstituteUnresolvedOrdinals` and move all the `UnresolvedOrdinal` stuffs (order by and group by) into parser.
    
    Btw, also don't forget Dataset API, please see `Dataset.groupBy` and `RelationalGroupedDataset`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80191 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80191/testReport)** for PR 18779 at commit [`c41f209`](https://github.com/apache/spark/commit/c41f209b619f118b237b7641de9d7cc457e4c5e0).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131330713
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
    @@ -52,8 +52,18 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
     -- mixed cases: group-by ordinals and aliases
     select a, a AS k, count(b) from data group by k, 1;
     
    --- turn of group by ordinal
    +-- turn off group by ordinal
     set spark.sql.groupByOrdinal=false;
     
     -- can now group by negative literal
     select sum(b) from data group by -1;
    +
    +select 4, b from data group by 1, 2;
    +
    +set spark.sql.groupByOrdinal=true;
    +
    +select 4, b from data group by 1, 2;
    +
    +-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
    +select 3, 4, sum(b) from data group by 1, 2;
    +select 3 as c, 4 as d, sum(b) from data group by c, d;
    --- End diff --
    
    Because I ran them without the change `transform` -> `resolveOperators`, and there is no error. Do you have different result?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @maropu Because we already support it in Dataset API as the example https://github.com/apache/spark/pull/18779#issuecomment-319880753 shows, I'm afraid that there are users using this feature. If we remove it now, maybe there is compatibility issue.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80156 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80156/testReport)** for PR 18779 at commit [`17bdf88`](https://github.com/apache/spark/commit/17bdf8881f09256a868a17d7df85762d057429b9).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    I learned a lot from you, thanks all.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r130300193
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1010,7 +1014,16 @@ class Analyzer(
                     s"(valid range is [1, ${aggs.size}])")
               case o => o
             }
    -        Aggregate(newGroups, aggs, child)
    +        // If an aggregateExpression is integer, after replaced with this aggregateExpression, the
    +        // groupExpression will be considered as an ordinal still, so we should filter it.
    +        var filterGroups: Seq[Expression] = Nil
    +        newGroups.foreach { ng =>
    +          ng match {
    +            case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
    +            case _ => filterGroups :+= ng
    +          }
    +        }
    +        Aggregate(filterGroups, aggs, child)
    --- End diff --
    
    Similar to `RemoveLiteralFromGroupExpressions`, we shouldn't drop all grouping expressions if they are all int literals after resolved from `UnresolvedOrdinal`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    I checked the root cause of this and `SubstituteUnresolvedOrdinals` wrongly wraps again int literals with `UnresolvedOrdinal`.
    
    ```
    17/07/31 16:54:50 TRACE HiveSessionStateBuilder$$anon$1:
    === Applying Rule org.apache.spark.sql.catalyst.analysis.SubstituteUnresolvedOrdinals ===
    !GlobalLimit 21                                                                           'GlobalLimit 21
    !+- LocalLimit 21                                                                         +- 'LocalLimit 21
    !   +- Aggregate [3, 4], [3 AS 3#56, 4 AS 4#57, sum(cast(b#1 as bigint)) AS sum(b)#58L]      +- 'Aggregate [unresolvedordinal(3), unresolvedordinal(4)], [3 AS 3#56, 4 AS 4#57, sum(cast(b#1 as bigint)) AS sum(b)#58L]
           +- SubqueryAlias data                                                                    +- SubqueryAlias data
              +- Project [a#0, b#1]                                                                    +- Project [a#0, b#1]
                 +- SubqueryAlias data                                                                    +- SubqueryAlias data
                    +- LocalRelation [a#0, b#1]                                                              +- LocalRelation [a#0, b#1]
    ```
    It seems the current fix is similar with `RemoveLiteralFromGroupExpressions` in the optimizer, so we better move this optimizer rule to here?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r130504819
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1010,7 +1014,16 @@ class Analyzer(
                     s"(valid range is [1, ${aggs.size}])")
               case o => o
             }
    -        Aggregate(newGroups, aggs, child)
    +        // If an aggregateExpression is integer, after replaced with this aggregateExpression, the
    +        // groupExpression will be considered as an ordinal still, so we should filter it.
    +        var filterGroups: Seq[Expression] = Nil
    +        newGroups.foreach { ng =>
    +          ng match {
    +            case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
    +            case _ => filterGroups :+= ng
    +          }
    +        }
    +        Aggregate(filterGroups, aggs, child)
    --- End diff --
    
    When the aggregates that have an empty input, if we remove all grouping expressions, this triggers the ungrouped code path (which aways returns a single row). So the query semantics are changed.
    
    You meant this is not an issue anymore?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80197/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80249/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131361398
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
    @@ -557,4 +557,22 @@ class DataFrameAggregateSuite extends QueryTest with SharedSQLContext {
         }
         assert(e.message.contains("aggregate functions are not allowed in GROUP BY"))
       }
    +
    +  test("SPARK-21580 ints in aggregation expressions are taken as group-by ordinal.") {
    +    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
    +    checkAnswer(
    +      df.groupBy(lit(3), lit(4)).agg(lit(6), lit(7), sum("b")),
    +      Seq(Row(3, 4, 6, 7, 9)))
    +    checkAnswer(
    +      df.groupBy(lit(3), lit(4)).agg(lit(6), 'b, sum("b")),
    +      Seq(Row(3, 4, 6, 1, 3), Row(3, 4, 6, 2, 6)))
    +
    +    df.createOrReplaceTempView("data")
    +    checkAnswer(
    +      spark.sql("select 3, 4, sum(b) from data group by 1, 2"),
    +      Seq(Row(3, 4, 9)))
    +    checkAnswer(
    +      spark.sql("select 3 as c, 4 as d, sum(b) from data group by c, d"),
    +      Seq(Row(3, 4, 9)))
    --- End diff --
    
    I've verified that the above tests will fail without this fix.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Really appreciate your patience and your work! This is how we work in Spark SQL. Thanks!



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r130293284
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1010,7 +1014,16 @@ class Analyzer(
                     s"(valid range is [1, ${aggs.size}])")
               case o => o
             }
    -        Aggregate(newGroups, aggs, child)
    +        // If an aggregateExpression is integer, after replaced with this aggregateExpression, the
    +        // groupExpression will be considered as an ordinal still, so we should filter it.
    +        var filterGroups: Seq[Expression] = Nil
    +        newGroups.foreach { ng =>
    +          ng match {
    +            case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
    --- End diff --
    
    I think we may need to trim all possible Aliases, e.g., Alias(Alias(1, 'a'), 'b').


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131294567
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala ---
    @@ -1,54 +0,0 @@
    -/*
    - * Licensed to the Apache Software Foundation (ASF) under one or more
    - * contributor license agreements.  See the NOTICE file distributed with
    - * this work for additional information regarding copyright ownership.
    - * The ASF licenses this file to You under the Apache License, Version 2.0
    - * (the "License"); you may not use this file except in compliance with
    - * the License.  You may obtain a copy of the License at
    - *
    - *    http://www.apache.org/licenses/LICENSE-2.0
    - *
    - * Unless required by applicable law or agreed to in writing, software
    - * distributed under the License is distributed on an "AS IS" BASIS,
    - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    - * See the License for the specific language governing permissions and
    - * limitations under the License.
    - */
    -
    -package org.apache.spark.sql.catalyst.analysis
    -
    -import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, SortOrder}
    -import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LogicalPlan, Sort}
    -import org.apache.spark.sql.catalyst.rules.Rule
    -import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
    -import org.apache.spark.sql.internal.SQLConf
    -import org.apache.spark.sql.types.IntegerType
    -
    -/**
    - * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal expression.
    - */
    -class SubstituteUnresolvedOrdinals(conf: SQLConf) extends Rule[LogicalPlan] {
    -  private def isIntLiteral(e: Expression) = e match {
    -    case Literal(_, IntegerType) => true
    -    case _ => false
    -  }
    -
    -  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    --- End diff --
    
    Because it keeps the redundant Aliases at the top level. That may not what we want.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80249 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80249/testReport)** for PR 18779 at commit [`c1594c7`](https://github.com/apache/spark/commit/c1594c78b27a1beca446b5530ec2e88653619742).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @viirya btw, we still need to support group/sort-by-ordinal in Dataset? If this pr merged, the behaviour seems to change. I feel the syntax in Dataset is a little confusing..


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131360956
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2023,4 +2023,11 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
           assert(df1.join(df2, $"t1.i" === $"t2.i").cache().count() == 1)
         }
       }
    +
    +  test("order-by ordinal.") {
    +    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
    +    checkAnswer(
    +      df.select(lit(7), 'a, 'b).orderBy(lit(1), lit(2), lit(3)),
    +      Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
    --- End diff --
    
    Hmm, maybe we still can keep this test for the case that someone changes the Sort/Project relationship in the future.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @maropu The PR is at #17770.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]There's a bug with `Group by or...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r130272850
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1010,7 +1014,16 @@ class Analyzer(
                     s"(valid range is [1, ${aggs.size}])")
               case o => o
             }
    -        Aggregate(newGroups, aggs, child)
    +        var filterGroups: Seq[Expression] = Nil
    +        for (ng <- newGroups)
    +        if (ng.isInstanceOf[Alias]) {
    --- End diff --
    
    We can use pattern matching here like:
    
    ng match {
        case a: Alias => if (!isIntLiteral(a.child)) filterGroups :+= ng
        case  _ => filterGroups :+= ng
    }


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]There's a bug with `Group by or...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r130272963
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1010,7 +1014,16 @@ class Analyzer(
                     s"(valid range is [1, ${aggs.size}])")
               case o => o
             }
    -        Aggregate(newGroups, aggs, child)
    +        var filterGroups: Seq[Expression] = Nil
    +        for (ng <- newGroups)
    +        if (ng.isInstanceOf[Alias]) {
    --- End diff --
    
    Please also fix the indent of code here.
    
        newGroups.foreach { ng =>
           ng match {
               ...
           }
        }


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @10110346 I deleted the previous comment because I find I forgot that `resolveOperators` will be removed by #17770 in near future.
    
    
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @viirya great! thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131329917
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
    @@ -52,8 +52,18 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
     -- mixed cases: group-by ordinals and aliases
     select a, a AS k, count(b) from data group by k, 1;
     
    --- turn of group by ordinal
    +-- turn off group by ordinal
     set spark.sql.groupByOrdinal=false;
     
     -- can now group by negative literal
     select sum(b) from data group by -1;
    +
    +select 4, b from data group by 1, 2;
    +
    +set spark.sql.groupByOrdinal=true;
    +
    +select 4, b from data group by 1, 2;
    +
    +-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
    +select 3, 4, sum(b) from data group by 1, 2;
    +select 3 as c, 4 as d, sum(b) from data group by c, d;
    --- End diff --
    
    You mean these test cases are not necessary?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r130613115
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1010,7 +1014,16 @@ class Analyzer(
                     s"(valid range is [1, ${aggs.size}])")
               case o => o
             }
    -        Aggregate(newGroups, aggs, child)
    +        // If an aggregateExpression is integer, after replaced with this aggregateExpression, the
    +        // groupExpression will be considered as an ordinal still, so we should filter it.
    +        var filterGroups: Seq[Expression] = Nil
    +        newGroups.foreach { ng =>
    +          ng match {
    +            case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
    +            case _ => filterGroups :+= ng
    +          }
    +        }
    +        Aggregate(filterGroups, aggs, child)
    --- End diff --
    
    @10110346 Please see previous comment: https://github.com/apache/spark/pull/18779#issuecomment-319007812
    
    The basic idea is, instead of resolve `UnresolvedOrdinal` to actual aggregate expressions, we resolve it to `ResolvedOrdinal`.
    
    Then in the beginning of optimization, we replace all `ResolvedOrdinal` to actual aggregate expressions.
    
    What do you think?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @gatorsmile @viirya I looked into why it applied some analyzer rules into already-analyzed plans and I noticed that some rules used `transform/transformUp` instead of `resolveOperators` in `apply`. `SubstituteUnresolvedOrdinals` also doesn't use `resolveOperators`, so the rule is applied again into an already-analyzed plan https://github.com/apache/spark/compare/master...maropu:SPARK-21580-3.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    We need to backport this issue to branch-2.2? I think the opinion depends on the backport decision. If no, I'm with your suggestion (keep this issue as a blocker for branch-2.3).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80244 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80244/testReport)** for PR 18779 at commit [`2dc3610`](https://github.com/apache/spark/commit/2dc36107b33a989b18af39816eb87da8284abeba).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    oh, I missed. I'll re-check.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131432191
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
    @@ -557,4 +557,22 @@ class DataFrameAggregateSuite extends QueryTest with SharedSQLContext {
         }
         assert(e.message.contains("aggregate functions are not allowed in GROUP BY"))
       }
    +
    +  test("SPARK-21580 ints in aggregation expressions are taken as group-by ordinal.") {
    +    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
    +    checkAnswer(
    +      df.groupBy(lit(3), lit(4)).agg(lit(6), lit(7), sum("b")),
    +      Seq(Row(3, 4, 6, 7, 9)))
    +    checkAnswer(
    +      df.groupBy(lit(3), lit(4)).agg(lit(6), 'b, sum("b")),
    +      Seq(Row(3, 4, 6, 1, 3), Row(3, 4, 6, 2, 6)))
    +
    +    df.createOrReplaceTempView("data")
    --- End diff --
    
    One more comment. You need to use `withTempView`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80246 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80246/testReport)** for PR 18779 at commit [`c1594c7`](https://github.com/apache/spark/commit/c1594c78b27a1beca446b5530ec2e88653619742).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @10110346 Yeah, thanks. Please confirm it. I think running `SubstituteUnresolvedOrdinals` with `Once` should work.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131286674
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinalsSuite.scala ---
    @@ -1,66 +0,0 @@
    -/*
    - * Licensed to the Apache Software Foundation (ASF) under one or more
    - * contributor license agreements.  See the NOTICE file distributed with
    - * this work for additional information regarding copyright ownership.
    - * The ASF licenses this file to You under the Apache License, Version 2.0
    - * (the "License"); you may not use this file except in compliance with
    - * the License.  You may obtain a copy of the License at
    - *
    - *    http://www.apache.org/licenses/LICENSE-2.0
    - *
    - * Unless required by applicable law or agreed to in writing, software
    - * distributed under the License is distributed on an "AS IS" BASIS,
    - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    - * See the License for the specific language governing permissions and
    - * limitations under the License.
    - */
    -
    -package org.apache.spark.sql.catalyst.analysis
    -
    -import org.apache.spark.sql.catalyst.analysis.TestRelations.testRelation2
    -import org.apache.spark.sql.catalyst.dsl.expressions._
    -import org.apache.spark.sql.catalyst.dsl.plans._
    -import org.apache.spark.sql.catalyst.expressions.Literal
    -import org.apache.spark.sql.internal.SQLConf
    -
    -class SubstituteUnresolvedOrdinalsSuite extends AnalysisTest {
    -  private lazy val a = testRelation2.output(0)
    -  private lazy val b = testRelation2.output(1)
    -
    -  test("unresolved ordinal should not be unresolved") {
    -    // Expression OrderByOrdinal is unresolved.
    -    assert(!UnresolvedOrdinal(0).resolved)
    -  }
    -
    -  test("order by ordinal") {
    --- End diff --
    
    Similarly, we should add two tests like here to a parser test suite like `PlanParserSuite`.
    
    In the new tests, we test if order-by/group-by SQL queries can be correctly parsed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80206/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Sorry @10110346 would you mind to try and see if https://github.com/apache/spark/pull/18779#discussion_r131287607 works to solve all the issues.
    
    If it could, that maybe more simple.
    
    Current approach with parser looks good, but we need to do the same thing to Dataset APIs, like groupBy and sort. Unfortunately, we need to keep Dataset API support as @gatorsmile said.
    
    If we can solve this with less code change, that is better.
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r130507794
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1010,7 +1014,16 @@ class Analyzer(
                     s"(valid range is [1, ${aggs.size}])")
               case o => o
             }
    -        Aggregate(newGroups, aggs, child)
    +        // If an aggregateExpression is integer, after replaced with this aggregateExpression, the
    +        // groupExpression will be considered as an ordinal still, so we should filter it.
    +        var filterGroups: Seq[Expression] = Nil
    +        newGroups.foreach { ng =>
    +          ng match {
    +            case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
    +            case _ => filterGroups :+= ng
    +          }
    +        }
    +        Aggregate(filterGroups, aggs, child)
    --- End diff --
    
    thanks, I'm not very familiar with this module , how to do it?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Aha, I feel better idea to add `ResolvedOrdinal`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131328405
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
    @@ -557,4 +557,22 @@ class DataFrameAggregateSuite extends QueryTest with SharedSQLContext {
         }
         assert(e.message.contains("aggregate functions are not allowed in GROUP BY"))
       }
    +
    +  test("SPARK-21580 ints in aggregation expressions are taken as group-by ordinal.") {
    --- End diff --
    
    Please also add order-by test too. Maybe add to `DataFrameSuite`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131311220
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
    @@ -52,8 +52,19 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
     -- mixed cases: group-by ordinals and aliases
     select a, a AS k, count(b) from data group by k, 1;
     
    --- turn of group by ordinal
    +-- turn off group by ordinal
     set spark.sql.groupByOrdinal=false;
     
     -- can now group by negative literal
     select sum(b) from data group by -1;
    +
    +-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
    +select 4, b from data group by 1, 2;
    +
    +set spark.sql.groupByOrdinal=true;
    +
    +select 4, b from data group by 1, 2;
    +
    +select 3, 4, sum(b) from data group by 1, 2;
    --- End diff --
    
    We can move those test queries to the test suite like `DataFrameAggregateSuite`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @viirya yea, it does: https://github.com/apache/spark/compare/master...maropu:SPARK-21580#diff-a0f2e45a5da747e9ec483f3557aa1b8bR210. I also think `SubstituteUnresolvedOrdinals` should be applied once.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Our Dataset APIs support `conf.groupByOrdinal`? If so, this might surprise me. `conf.groupByOrdinal` was introduced for SQL APIs only.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80277 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80277/testReport)** for PR 18779 at commit [`2bf42b1`](https://github.com/apache/spark/commit/2bf42b11cec794d19ed8f2fd000a9cd7aeb159bb).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80237/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131432378
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
    @@ -557,4 +557,22 @@ class DataFrameAggregateSuite extends QueryTest with SharedSQLContext {
         }
         assert(e.message.contains("aggregate functions are not allowed in GROUP BY"))
       }
    +
    +  test("SPARK-21580 ints in aggregation expressions are taken as group-by ordinal.") {
    +    val df = Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)).toDF("a", "b")
    +    checkAnswer(
    +      df.groupBy(lit(3), lit(4)).agg(lit(6), lit(7), sum("b")),
    +      Seq(Row(3, 4, 6, 7, 9)))
    +    checkAnswer(
    +      df.groupBy(lit(3), lit(4)).agg(lit(6), 'b, sum("b")),
    +      Seq(Row(3, 4, 6, 1, 3), Row(3, 4, 6, 2, 6)))
    +
    +    df.createOrReplaceTempView("data")
    +    checkAnswer(
    +      spark.sql("select 3, 4, sum(b) from data group by 1, 2"),
    +      Seq(Row(3, 4, 9)))
    +    checkAnswer(
    +      spark.sql("select 3 as c, 4 as d, sum(b) from data group by c, d"),
    --- End diff --
    
    The same here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80067 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80067/testReport)** for PR 18779 at commit [`5319670`](https://github.com/apache/spark/commit/5319670abbd5668c82c505689949f53b0ca52e42).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @10110346 Can't we also do the same on order by ordinal?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131311047
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
    @@ -52,8 +52,19 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
     -- mixed cases: group-by ordinals and aliases
     select a, a AS k, count(b) from data group by k, 1;
     
    --- turn of group by ordinal
    +-- turn off group by ordinal
     set spark.sql.groupByOrdinal=false;
     
     -- can now group by negative literal
     select sum(b) from data group by -1;
    +
    +-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
    +select 4, b from data group by 1, 2;
    +
    +set spark.sql.groupByOrdinal=true;
    +
    +select 4, b from data group by 1, 2;
    +
    +select 3, 4, sum(b) from data group by 1, 2;
    --- End diff --
    
    Only the methods like `Dataset.show` causes en-entrance of analyzed plans. `collect` won't. I guess the queries here are evaluated with `collect`.
    
        scala> sql("select 4, b, sum(b) from data group by 1, 2").show
        org.apache.spark.sql.AnalysisException: GROUP BY position 4 is not in select list (valid range is [1, 3]); line 1 pos 7
    
        scala> sql("select 4, b, sum(b) from data group by 1, 2").collect
        res2: Array[org.apache.spark.sql.Row] = Array([4,3,3], [4,4,4], [4,2,2])        
    
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131290115
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ---
    @@ -245,14 +246,26 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
           query: LogicalPlan): LogicalPlan = withOrigin(ctx) {
         import ctx._
     
    +    // Replaces ordinal in 'order by' or 'sort by' with UnresolvedOrdinal expression
    +    def replacesOrdinal(orders: Seq[SortOrder]): Seq[SortOrder] = {
    --- End diff --
    
    Because we need to apply the same on Dataset API. We may need to extract this out so we can reuse it to Dataset API.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131309163
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
    @@ -52,8 +52,19 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
     -- mixed cases: group-by ordinals and aliases
     select a, a AS k, count(b) from data group by k, 1;
     
    --- turn of group by ordinal
    +-- turn off group by ordinal
     set spark.sql.groupByOrdinal=false;
     
     -- can now group by negative literal
     select sum(b) from data group by -1;
    +
    +-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
    --- End diff --
    
    OK,thanks


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80232 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80232/testReport)** for PR 18779 at commit [`bfab7e5`](https://github.com/apache/spark/commit/bfab7e55609ed9c98534c23d41ff881a6eed84d5).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    It might help to document this in the Dataset `groupBy` comment.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80249 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80249/testReport)** for PR 18779 at commit [`c1594c7`](https://github.com/apache/spark/commit/c1594c78b27a1beca446b5530ec2e88653619742).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131308421
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
    @@ -52,8 +52,19 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
     -- mixed cases: group-by ordinals and aliases
     select a, a AS k, count(b) from data group by k, 1;
     
    --- turn of group by ordinal
    +-- turn off group by ordinal
     set spark.sql.groupByOrdinal=false;
     
     -- can now group by negative literal
     select sum(b) from data group by -1;
    +
    +-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
    +select 4, b from data group by 1, 2;
    +
    +set spark.sql.groupByOrdinal=true;
    +
    +select 4, b from data group by 1, 2;
    +
    +select 3, 4, sum(b) from data group by 1, 2;
    --- End diff --
    
    Except this line, does this PR fix any other queries? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    sorry for my ambiguous explanation and I mean;
    ```
    scala> sql("""select 3, 4, sum(b) from data group by 1, 2""").show
    17/07/31 17:13:23 TRACE HiveSessionStateBuilder$$anon$1:
    
    // 1. Replace literals in grouping exprs with unresolveordinal here
    === Applying Rule org.apache.spark.sql.catalyst.analysis.SubstituteUnresolvedOrdinals ===
    !'Aggregate [1, 2], [unresolvedalias(3, None), unresolvedalias(4, None), unresolvedalias('sum('b), None)]   'Aggregate [unresolvedordinal(1), unresolvedordinal(2)], [unresolvedalias(3, None), unresolvedalias(4, None), unresolvedalias('sum('b), None)]
     +- 'UnresolvedRelation `data`                                                                              +- 'UnresolvedRelation `data`
    ...
    
    17/07/31 17:13:23 TRACE HiveSessionStateBuilder$$anon$1:
    
    // 2. And then, resolve unresolvedordinal by using agg expressions here (but, the resolved expressions are literals)
    === Applying Rule org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveOrdinalInOrderByAndGroupBy ===
    !'Aggregate [unresolvedordinal(1), unresolvedordinal(2)], [3 AS 3#4, 4 AS 4#5, sum(cast(b#1 as bigint)) AS sum(b)#6L]   Aggregate [3 AS 3#4, 4 AS 4#5], [3 AS 3#4, 4 AS 4#5, sum(cast(b#1 as bigint)) AS sum(b)#6L]
     +- SubqueryAlias data                                                                                                  +- SubqueryAlias data
        +- Project [a#0, b#1]                                                                                                  +- Project [a#0, b#1]
           +- SubqueryAlias data                                                                                                  +- SubqueryAlias data
              +- LocalRelation [a#0, b#1]                                                                                            +- LocalRelation [a#0, b#1]
    ...
    
    17/07/31 17:13:23 TRACE HiveSessionStateBuilder$$anon$1:
    
    // 3. the resolved expressions above are literals, so wrap them with unresolveordinal again here.
    === Applying Rule org.apache.spark.sql.catalyst.analysis.SubstituteUnresolvedOrdinals ===
    !GlobalLimit 21                                                                        'GlobalLimit 21
    !+- LocalLimit 21                                                                      +- 'LocalLimit 21
    !   +- Aggregate [3, 4], [3 AS 3#4, 4 AS 4#5, sum(cast(b#1 as bigint)) AS sum(b)#6L]      +- 'Aggregate [unresolvedordinal(3), unresolvedordinal(4)], [3 AS 3#4, 4 AS 4#5, sum(cast(b#1 as bigint)) AS sum(b)#6L]
           +- SubqueryAlias data                                                                 +- SubqueryAlias data
              +- Project [a#0, b#1]                                                                 +- Project [a#0, b#1]
                 +- SubqueryAlias data                                                                 +- SubqueryAlias data
                    +- LocalRelation [a#0, b#1]                                                           +- LocalRelation [a#0, b#1]
    
    // 4. Then, it fails here.
    org.apache.spark.sql.AnalysisException: GROUP BY position 4 is not in select list (valid range is [1, 3]); line 1 pos 10
      at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
      at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveOrdinalInOrderByAndGroupBy$$anonfun$apply$11$$anonfun$43.apply(Analyzer.scala:1008)
      at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveOrdinalInOrderByAndGroupBy$$anonfun$apply$11$$anonfun$43.apply(Analyzer.scala:1004)
      at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    ```
    
    I think replacing `unresolvedordinal`s with literals in `ResolveOrdinalInOrderByAndGroupBy` cause this failure, so we need to avoid this (I know this pr fixes in that way). I feel this solution of this pr is similar to `RemoveLiteralFromGroupExpressions ` in the optimizer, so I though we'd better to move this rule into the analyzer phase.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @maropu As an alternative approach, we can avoid to resolve `UnresolvedOrdinal` to actual expressions in this rule.
    
    Instead, we can create a `ResolvedOrdinal` and replace it with actual agg expression in an optimization rule.
    
    By doing this, we can fix this issue (because we don't replace the int literal with `UnresolvedOrdinal` again). We also don't need to duplicate the logic of remove literals from grouping expressions.
    
    @10110346 @maropu What do you think?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by maropu <gi...@git.apache.org>.

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    oh, yea. I feel it's ok to do so.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]There's a bug with `Group by ordinal`

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Thanks,i will update @viirya 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131058058
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala ---
    @@ -42,13 +42,5 @@ class SubstituteUnresolvedOrdinals(conf: SQLConf) extends Rule[LogicalPlan] {
             case other => other
           }
           withOrigin(s.origin)(s.copy(order = newOrders))
    -
    -    case a: Aggregate if conf.groupByOrdinal && a.groupingExpressions.exists(isIntLiteral) =>
    --- End diff --
    
    Yeah, we should do it. We should completely remove `SubstituteUnresolvedOrdinals`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    `k` is resolved to `4` in `ResolveAggAliasInGroupBy`,and then `4`  is resolved to `ResolvedOrdinal(4)`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80191/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80067/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80237 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80237/testReport)** for PR 18779 at commit [`791bc33`](https://github.com/apache/spark/commit/791bc330f8573345e0b5890b43b25259b854bc3d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r130336844
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1010,7 +1014,16 @@ class Analyzer(
                     s"(valid range is [1, ${aggs.size}])")
               case o => o
             }
    -        Aggregate(newGroups, aggs, child)
    +        // If an aggregateExpression is integer, after replaced with this aggregateExpression, the
    +        // groupExpression will be considered as an ordinal still, so we should filter it.
    +        var filterGroups: Seq[Expression] = Nil
    +        newGroups.foreach { ng =>
    +          ng match {
    +            case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
    +            case _ => filterGroups :+= ng
    +          }
    +        }
    +        Aggregate(filterGroups, aggs, child)
    --- End diff --
    
    Please see the comments at https://github.com/apache/spark/blob/75b168fd30bb9a52ae223b6f1df73da4b1316f2e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala#L1254-L1257. We should not make the grouping expressions empty.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131285585
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala ---
    @@ -1,54 +0,0 @@
    -/*
    - * Licensed to the Apache Software Foundation (ASF) under one or more
    - * contributor license agreements.  See the NOTICE file distributed with
    - * this work for additional information regarding copyright ownership.
    - * The ASF licenses this file to You under the Apache License, Version 2.0
    - * (the "License"); you may not use this file except in compliance with
    - * the License.  You may obtain a copy of the License at
    - *
    - *    http://www.apache.org/licenses/LICENSE-2.0
    - *
    - * Unless required by applicable law or agreed to in writing, software
    - * distributed under the License is distributed on an "AS IS" BASIS,
    - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    - * See the License for the specific language governing permissions and
    - * limitations under the License.
    - */
    -
    -package org.apache.spark.sql.catalyst.analysis
    -
    -import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, SortOrder}
    -import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LogicalPlan, Sort}
    -import org.apache.spark.sql.catalyst.rules.Rule
    -import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
    -import org.apache.spark.sql.internal.SQLConf
    -import org.apache.spark.sql.types.IntegerType
    -
    -/**
    - * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal expression.
    - */
    -class SubstituteUnresolvedOrdinals(conf: SQLConf) extends Rule[LogicalPlan] {
    -  private def isIntLiteral(e: Expression) = e match {
    -    case Literal(_, IntegerType) => true
    -    case _ => false
    -  }
    -
    -  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    --- End diff --
    
    By replacing `transform` with `resolveOperators`, we can just fix the issue brought by `SubstituteUnresolvedOrdinals` applies on analyzed plan.
    
    The other issues like int literals in aggregation expressions are taken as ordinal are still there.
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131309076
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/group-by-ordinal.sql ---
    @@ -52,8 +52,19 @@ select count(a), a from (select 1 as a) tmp group by 2 having a > 0;
     -- mixed cases: group-by ordinals and aliases
     select a, a AS k, count(b) from data group by k, 1;
     
    --- turn of group by ordinal
    +-- turn off group by ordinal
     set spark.sql.groupByOrdinal=false;
     
     -- can now group by negative literal
     select sum(b) from data group by -1;
    +
    +-- SPARK-21580 ints in aggregation expressions are taken as group-by ordinal
    +select 4, b from data group by 1, 2;
    +
    +set spark.sql.groupByOrdinal=true;
    +
    +select 4, b from data group by 1, 2;
    +
    +select 3, 4, sum(b) from data group by 1, 2;
    --- End diff --
    
    Also, it  fixes this query:`select 3 as c, 4 as d, sum(b) from data group by c, d;`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    LGTM except a few minor comments. The fix looks great now! Thanks everyone!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Unfortunately, our Dataset APIs support it. We have to keep the support. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @maropu I am not sure if I understand your idea correctly. Those int literals in group-by is intended to be wrapped in `UnresolvedOrdinal`. So they can be replaced with aggregation expressions at the corresponding ordinal position.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80197 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80197/testReport)** for PR 18779 at commit [`4fce4ab`](https://github.com/apache/spark/commit/4fce4ab3da0ce425b4ba1807d165b4ab05a812b7).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131299271
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala ---
    @@ -1,54 +0,0 @@
    -/*
    - * Licensed to the Apache Software Foundation (ASF) under one or more
    - * contributor license agreements.  See the NOTICE file distributed with
    - * this work for additional information regarding copyright ownership.
    - * The ASF licenses this file to You under the Apache License, Version 2.0
    - * (the "License"); you may not use this file except in compliance with
    - * the License.  You may obtain a copy of the License at
    - *
    - *    http://www.apache.org/licenses/LICENSE-2.0
    - *
    - * Unless required by applicable law or agreed to in writing, software
    - * distributed under the License is distributed on an "AS IS" BASIS,
    - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    - * See the License for the specific language governing permissions and
    - * limitations under the License.
    - */
    -
    -package org.apache.spark.sql.catalyst.analysis
    -
    -import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, SortOrder}
    -import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LogicalPlan, Sort}
    -import org.apache.spark.sql.catalyst.rules.Rule
    -import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
    -import org.apache.spark.sql.internal.SQLConf
    -import org.apache.spark.sql.types.IntegerType
    -
    -/**
    - * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal expression.
    - */
    -class SubstituteUnresolvedOrdinals(conf: SQLConf) extends Rule[LogicalPlan] {
    -  private def isIntLiteral(e: Expression) = e match {
    -    case Literal(_, IntegerType) => true
    -    case _ => false
    -  }
    -
    -  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    --- End diff --
    
    @10110346 May I ask which once you run `SubstituteUnresolvedOrdinals` with? `Once` or `fixedPoint`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r130330782
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1010,7 +1014,16 @@ class Analyzer(
                     s"(valid range is [1, ${aggs.size}])")
               case o => o
             }
    -        Aggregate(newGroups, aggs, child)
    +        // If an aggregateExpression is integer, after replaced with this aggregateExpression, the
    +        // groupExpression will be considered as an ordinal still, so we should filter it.
    +        var filterGroups: Seq[Expression] = Nil
    +        newGroups.foreach { ng =>
    +          ng match {
    +            case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
    +            case _ => filterGroups :+= ng
    +          }
    +        }
    +        Aggregate(filterGroups, aggs, child)
    --- End diff --
    
    It looks no problem here, although  we drop all grouping expressions


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131349806
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/order-by-ordinal.sql ---
    @@ -34,3 +34,8 @@ set spark.sql.orderByOrdinal=false;
     -- 0 is now a valid literal
     select * from data order by 0;
     select * from data sort by 0;
    +
    +select 4 as k, 5, b from data order by k, 2, 3;
    +
    +set spark.sql.orderByOrdinal=true;
    +select 4 as k, 5, b from data order by k, 2, 3;
    --- End diff --
    
    Do we still need to keep this change?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131300093
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala ---
    @@ -1,54 +0,0 @@
    -/*
    - * Licensed to the Apache Software Foundation (ASF) under one or more
    - * contributor license agreements.  See the NOTICE file distributed with
    - * this work for additional information regarding copyright ownership.
    - * The ASF licenses this file to You under the Apache License, Version 2.0
    - * (the "License"); you may not use this file except in compliance with
    - * the License.  You may obtain a copy of the License at
    - *
    - *    http://www.apache.org/licenses/LICENSE-2.0
    - *
    - * Unless required by applicable law or agreed to in writing, software
    - * distributed under the License is distributed on an "AS IS" BASIS,
    - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    - * See the License for the specific language governing permissions and
    - * limitations under the License.
    - */
    -
    -package org.apache.spark.sql.catalyst.analysis
    -
    -import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, SortOrder}
    -import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LogicalPlan, Sort}
    -import org.apache.spark.sql.catalyst.rules.Rule
    -import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
    -import org.apache.spark.sql.internal.SQLConf
    -import org.apache.spark.sql.types.IntegerType
    -
    -/**
    - * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal expression.
    - */
    -class SubstituteUnresolvedOrdinals(conf: SQLConf) extends Rule[LogicalPlan] {
    -  private def isIntLiteral(e: Expression) = e match {
    -    case Literal(_, IntegerType) => true
    -    case _ => false
    -  }
    -
    -  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    --- End diff --
    
     Run `SubstituteUnresolvedOrdinals` with ` fixedPoint`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r130504796
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1010,7 +1014,16 @@ class Analyzer(
                     s"(valid range is [1, ${aggs.size}])")
               case o => o
             }
    -        Aggregate(newGroups, aggs, child)
    +        // If an aggregateExpression is integer, after replaced with this aggregateExpression, the
    +        // groupExpression will be considered as an ordinal still, so we should filter it.
    +        var filterGroups: Seq[Expression] = Nil
    +        newGroups.foreach { ng =>
    +          ng match {
    +            case a: Alias => if (!isIntLiteral (a.child)) filterGroups :+= ng
    --- End diff --
    
    Have they  been trimmed in `aggregateExpressions`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/18779

@maropu I have a PR before to solve that. Due to some reasons it will be
merged on 2.3. I am out of laptop, will refer it once I can access laptop.

On Aug 3, 2017 5:07 PM, "Takeshi Yamamuro" <no...@github.com> wrote:

@gatorsmile <https://github.com/gatorsmile> @viirya
<https://github.com/viirya> I looked into why it applied some analyzer
rules into already-analyzed plans and I noticed that some rules used
transform/transformUp instead of resolveOperators in apply.
SubstituteUnresolvedOrdinals also doesn't use resolveOperators, so the rule
is applied again into an already-analyzed plan master...maropu:SPARK-21580-3
<https://github.com/apache/spark/compare/master...maropu:SPARK-21580-3>.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<https://github.com/apache/spark/pull/18779#issuecomment-319912782>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAEM9yUL3COM_lTj2xv-G-LYkO7w3Ffbks5sUY3egaJpZM4On4V0>
.

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]There's a bug with `Group by ordinal`

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    A specified title for this might be better. Such as "Integers in aggregation expressions are wrongly taken as group-by ordinal".


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131287607
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala ---
    @@ -1,54 +0,0 @@
    -/*
    - * Licensed to the Apache Software Foundation (ASF) under one or more
    - * contributor license agreements.  See the NOTICE file distributed with
    - * this work for additional information regarding copyright ownership.
    - * The ASF licenses this file to You under the Apache License, Version 2.0
    - * (the "License"); you may not use this file except in compliance with
    - * the License.  You may obtain a copy of the License at
    - *
    - *    http://www.apache.org/licenses/LICENSE-2.0
    - *
    - * Unless required by applicable law or agreed to in writing, software
    - * distributed under the License is distributed on an "AS IS" BASIS,
    - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    - * See the License for the specific language governing permissions and
    - * limitations under the License.
    - */
    -
    -package org.apache.spark.sql.catalyst.analysis
    -
    -import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, SortOrder}
    -import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LogicalPlan, Sort}
    -import org.apache.spark.sql.catalyst.rules.Rule
    -import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
    -import org.apache.spark.sql.internal.SQLConf
    -import org.apache.spark.sql.types.IntegerType
    -
    -/**
    - * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal expression.
    - */
    -class SubstituteUnresolvedOrdinals(conf: SQLConf) extends Rule[LogicalPlan] {
    -  private def isIntLiteral(e: Expression) = e match {
    -    case Literal(_, IntegerType) => true
    -    case _ => false
    -  }
    -
    -  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    --- End diff --
    
    If we change this to `resolveOperators` and let `SubstituteUnresolvedOrdinals` run with `Once`. Looks like it can also solve those issues.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80277/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131053346
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala ---
    @@ -42,13 +42,5 @@ class SubstituteUnresolvedOrdinals(conf: SQLConf) extends Rule[LogicalPlan] {
             case other => other
           }
           withOrigin(s.origin)(s.copy(order = newOrders))
    -
    -    case a: Aggregate if conf.groupByOrdinal && a.groupingExpressions.exists(isIntLiteral) =>
    --- End diff --
    
    Since you are moving group by to the parser, how about moving both to the parser?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18779#discussion_r131285941
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinalsSuite.scala ---
    @@ -1,66 +0,0 @@
    -/*
    - * Licensed to the Apache Software Foundation (ASF) under one or more
    - * contributor license agreements.  See the NOTICE file distributed with
    - * this work for additional information regarding copyright ownership.
    - * The ASF licenses this file to You under the Apache License, Version 2.0
    - * (the "License"); you may not use this file except in compliance with
    - * the License.  You may obtain a copy of the License at
    - *
    - *    http://www.apache.org/licenses/LICENSE-2.0
    - *
    - * Unless required by applicable law or agreed to in writing, software
    - * distributed under the License is distributed on an "AS IS" BASIS,
    - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    - * See the License for the specific language governing permissions and
    - * limitations under the License.
    - */
    -
    -package org.apache.spark.sql.catalyst.analysis
    -
    -import org.apache.spark.sql.catalyst.analysis.TestRelations.testRelation2
    -import org.apache.spark.sql.catalyst.dsl.expressions._
    -import org.apache.spark.sql.catalyst.dsl.plans._
    -import org.apache.spark.sql.catalyst.expressions.Literal
    -import org.apache.spark.sql.internal.SQLConf
    -
    -class SubstituteUnresolvedOrdinalsSuite extends AnalysisTest {
    --- End diff --
    
    We may move those tests to parser related test suites.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    **[Test build #80070 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80070/testReport)** for PR 18779 at commit [`fae3190`](https://github.com/apache/spark/commit/fae3190e987ab4aa1bb45d7f058f510bf9690f12).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by 10110346 <gi...@git.apache.org>.

Github user 10110346 commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    I  have updated this PR, please help to review it  again. @viirya  also cc @cloud-fan @gatorsmile 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

Posted by viirya <gi...@git.apache.org>.

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18779
  
    @10110346 Why? 
    
    In the query `select a, 4 AS k, count(b) from data group by k, 1`, `1` is resolved to `ResolvedOrdinal(1)` first. Then at the beginning of optimization, `ResolvedOrdinal(1)` is replaced by `a`.
    
    Is there any problem?
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org