Posted to reviews@spark.apache.org by "srielau (via GitHub)" <gi...@apache.org> on 2024/01/29 20:33:00 UTC

[PR] [SPARK-46908] [WIP] Support star clause outside select [spark]

srielau opened a new pull request, #44938:
URL: https://github.com/apache/spark/pull/44938

   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://spark.apache.org/contributing.html
     2. Ensure you have added or run the appropriate tests for your PR: https://spark.apache.org/developer-tools.html
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][SPARK-XXXX] Your PR title ...'.
     4. Be sure to keep the PR description updated to reflect all changes.
     5. Please write your PR title to summarize what this PR proposes.
     6. If possible, provide a concise example to reproduce the issue for a faster review.
     7. If you want to add a new configuration, please read the guideline first for naming configurations in
        'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
     8. If you want to add or modify an error type or message, please read the guideline first in
        'common/utils/src/main/resources/error/README.md'.
   -->
   
   ### What changes were proposed in this pull request?
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   
   
   ### Does this PR introduce _any_ user-facing change?
   <!--
   Note that it means *any* user-facing change including all aspects such as the documentation fix.
   If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description and/or an example to show the behavior difference if possible.
   If possible, please also clarify if this is a user-facing change compared to the released Spark versions or within the unreleased branches such as master.
   If no, write 'No'.
   -->
   
   
   ### How was this patch tested?
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   If benchmark tests were added, please run the benchmarks in GitHub Actions for the consistent environment, and the instructions could accord to: https://spark.apache.org/developer-tools.html#github-workflow-benchmarks.
   -->
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   <!--
   If generative AI tooling has been used in the process of authoring this patch, please include the
   phrase: 'Generated-by: ' followed by the name of the tool and its version.
   If no, write 'No'.
   Please refer to the [ASF Generative Tooling Guidance](https://www.apache.org/legal/generative-tooling.html) for details.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473743832


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -1864,8 +1867,8 @@ class Analyzer(override val catalogManager: CatalogManager) extends RuleExecutor
      * Build a project list for Project/Aggregate and expand the star if possible
      */
     private def buildExpandedProjectList(
-      exprs: Seq[NamedExpression],
-      child: LogicalPlan): Seq[NamedExpression] = {
+                                          exprs: Seq[NamedExpression],
+                                          child: LogicalPlan): Seq[NamedExpression] = {

Review Comment:
   unnecessary change





Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473842317


##########
docs/sql-ref-syntax-qry-star.md:
##########
@@ -0,0 +1,102 @@
+---
+layout: global
+title: star (*) Clause
+displayTitle: Star (*) Clause
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+### Description
+
+A shorthand to name all the referencable columns in the FROM clause or a specific table reference's columns or fields in the FROM clause.
+The star clause clause is most frequently used in the SELECT list.
+Spark also supports its use in function invocation and certain n-ary operations within the SELECT list and WHERE clause.
+
+### Syntax
+
+```
+[ name . ] * [ except_clause ]
+
+except_clause
+   EXCEPT ( { column_name | field_name } [, ...] )
+```
+
+### Parameters
+
+* **name**
+
+  If present limits the columns or fields to be named to those in the specified referencable field, columns, or table.
+
+* **`*`**
+
+  Collects all the referencable columns in the FROM clause or the optionally specified table_name or view_name into a column list.
+  The list of columns is ordered by the order of table_references and the order of columns within each table_reference.

Review Comment:
   This seems duplicated with the overall description, shall we remove it?





Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "srielau (via GitHub)" <gi...@apache.org>.
srielau commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473759727


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -1534,6 +1534,9 @@ class Analyzer(override val catalogManager: CatalogManager) extends RuleExecutor
       // If the projection list contains Stars, expand it.
       case p: Project if containsStar(p.projectList) =>
         p.copy(projectList = buildExpandedProjectList(p.projectList, p.child))
+      // If the filter list contains Stars, expand it.

Review Comment:
   Define all. I don't think it's useful for GROUP BY (ALL is better), and the semantics are unclear (or dangerous) for ORDER BY.
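
To make the Filter case in the hunk above concrete, here is a minimal toy sketch of expanding a star that appears as a function argument inside a filter condition. It is not the Analyzer rule from the PR: `Expr`, `Col`, `Star`, `Call`, and `expand` are hypothetical names invented for this illustration, and the sketch only assumes that a star is replaced by the input columns in their existing order.

```scala
object StarExpansionSketch {
  // Toy expression tree. These types are illustrative and are NOT Catalyst classes.
  sealed trait Expr
  final case class Col(table: String, name: String) extends Expr
  final case class Star(qualifier: Option[String]) extends Expr
  final case class Call(fn: String, args: Seq[Expr]) extends Expr

  // Replace every Star found among a function call's arguments with the
  // matching input columns, keeping the order in which `input` lists them.
  // A Star that is not a function argument is returned unchanged.
  def expand(e: Expr, input: Seq[Col]): Expr = e match {
    case Call(fn, args) =>
      val expanded: Seq[Expr] = args.flatMap {
        case Star(q) => input.filter(c => q.forall(_ == c.table))
        case other   => Seq(expand(other, input))
      }
      Call(fn, expanded)
    case other => other
  }

  def main(args: Array[String]): Unit = {
    val input = Seq(Col("ta", "c1"), Col("ta", "c2"), Col("tb", "ca"))
    // Models a condition such as coalesce(ta.*) used in a WHERE clause.
    println(expand(Call("coalesce", Seq(Star(Some("ta")))), input))
  }
}
```

The sketch deliberately handles only function arguments, which mirrors the narrow scope argued for here: expansion in a filter condition, with GROUP BY and ORDER BY left out.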
   





Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473743196


##########
sql/core/src/test/resources/sql-tests/analyzer-results/subquery/in-subquery/in-group-by.sql.out:
##########
@@ -1,4 +1,4 @@
--- Automatically generated by SQLQueryTestSuite
+git commit -- Automatically generated by SQLQueryTestSuite

Review Comment:
   unnecessary change





Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "srielau (via GitHub)" <gi...@apache.org>.
srielau commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473760092


##########
docs/sql-ref-syntax-qry-select-star.md:
##########
@@ -0,0 +1,102 @@
+---
+layout: global
+title: star (*) Clause
+displayTitle: Star (*) Clause
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+### Description
+
+A shorthand to name all the referencable columns in the FROM clause or a specific table reference in the FROM clause.
+The star clause clause is most frequently used in the SELECT list.
+Spark also supports its use in function invocation and certain n-ary operations within the SELECT list and WHERE clause.
+
+### Syntax
+
+```
+[ { table_name | view_name } . ] * [ except_clause ]

Review Comment:
   Seriously? It unnests fields??? People have been asking for that....





Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473735906


##########
docs/sql-ref-syntax-qry-select-star.md:
##########


Review Comment:
   Since it's no longer limited to the SELECT clause, shall we just name it `sql-ref-syntax-qry-star`?





Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473840793


##########
docs/sql-ref-syntax-qry-star.md:
##########
@@ -0,0 +1,102 @@
+---
+layout: global
+title: star (*) Clause
+displayTitle: Star (*) Clause
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+### Description
+
+A shorthand to name all the referencable columns in the FROM clause or a specific table reference's columns or fields in the FROM clause.
+The star clause clause is most frequently used in the SELECT list.

Review Comment:
   ```suggestion
   The star clause is most frequently used in the SELECT list.
   ```





Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473841600


##########
docs/sql-ref-syntax-qry-star.md:
##########
@@ -0,0 +1,102 @@
+---
+layout: global
+title: star (*) Clause
+displayTitle: Star (*) Clause
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+### Description
+
+A shorthand to name all the referencable columns in the FROM clause or a specific table reference's columns or fields in the FROM clause.
+The star clause clause is most frequently used in the SELECT list.
+Spark also supports its use in function invocation and certain n-ary operations within the SELECT list and WHERE clause.
+
+### Syntax
+
+```
+[ name . ] * [ except_clause ]
+
+except_clause
+   EXCEPT ( { column_name | field_name } [, ...] )
+```
+
+### Parameters
+
+* **name**
+
+  If present limits the columns or fields to be named to those in the specified referencable field, columns, or table.

Review Comment:
   ```suggestion
     If present limits the columns or fields to be named to those in the specified referencable field, column, or table.
   ```
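
As an illustration of the `[ name . ] * [ except_clause ]` syntax quoted in the hunk above, the following toy sketch models how an optional qualifier and an optional EXCEPT list narrow the expanded column list. `Column` and `expandStar` are hypothetical names, not Spark APIs. The sketch assumes the FROM-clause columns are already ordered by table reference and then by column position, and it ignores struct fields and case sensitivity.

```scala
object StarSyntaxSketch {
  // A column as seen by the FROM clause: owning table reference plus column name.
  final case class Column(table: String, name: String)

  // Expand `[ name . ] * [ EXCEPT ( ... ) ]` over the columns of the FROM clause.
  // `from` is assumed to be ordered by table reference, then by column position.
  def expandStar(
      from: Seq[Column],
      name: Option[String] = None,
      except: Set[String] = Set.empty): Seq[Column] =
    from
      .filter(c => name.forall(_ == c.table))  // optional qualifier
      .filterNot(c => except.contains(c.name)) // optional EXCEPT list

  def main(args: Array[String]): Unit = {
    val from = Seq(
      Column("TA", "c1"), Column("TA", "c2"),
      Column("TB", "ca"), Column("TB", "cb"))
    println(expandStar(from))                           // every column, in FROM order
    println(expandStar(from, name = Some("TA")))        // only the columns of TA
    println(expandStar(from, except = Set("c1", "cb"))) // everything except c1 and cb
  }
}
```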





Re: [PR] [SPARK-46908][SQL] Support star clause in WHERE clause [spark]

Posted by "srielau (via GitHub)" <gi...@apache.org>.
srielau commented on PR #44938:
URL: https://github.com/apache/spark/pull/44938#issuecomment-1922062069

   > @srielau +1 for the proposal! It would be great if we could mention the change in https://github.com/apache/spark/blob/master/docs/sql-migration-guide.md as well.
   
   Done @gengliangwang 




Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on PR #44938:
URL: https://github.com/apache/spark/pull/44938#issuecomment-1921936181

   @srielau +1 for the proposal! It would be great if we could mention the change in https://github.com/apache/spark/blob/master/docs/sql-migration-guide.md as well.




Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "srielau (via GitHub)" <gi...@apache.org>.
srielau commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1474691983


##########
docs/sql-ref-syntax-qry-star.md:
##########
@@ -0,0 +1,102 @@
+---
+layout: global
+title: star (*) Clause
+displayTitle: Star (*) Clause
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+### Description
+
+A shorthand to name all the referencable columns in the FROM clause or a specific table reference's columns or fields in the FROM clause.
+The star clause clause is most frequently used in the SELECT list.
+Spark also supports its use in function invocation and certain n-ary operations within the SELECT list and WHERE clause.
+
+### Syntax
+
+```
+[ name . ] * [ except_clause ]
+
+except_clause
+   EXCEPT ( { column_name | field_name } [, ...] )
+```
+
+### Parameters
+
+* **name**
+
+  If present limits the columns or fields to be named to those in the specified referencable field, columns, or table.
+
+* **`*`**
+
+  Collects all the referencable columns in the FROM clause or the optionally specified table_name or view_name into a column list.
+  The list of columns is ordered by the order of table_references and the order of columns within each table_reference.

Review Comment:
   Agreed





Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "srielau (via GitHub)" <gi...@apache.org>.
srielau commented on PR #44938:
URL: https://github.com/apache/spark/pull/44938#issuecomment-1921637655

   All comments addressed. This PR is ready to merge once it's done with the QA cycle.
   @cloud-fan  




Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473742834


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -1936,6 +1939,11 @@ class Analyzer(override val catalogManager: CatalogManager) extends RuleExecutor
             case o => o :: Nil
           })
         // count(*) has been replaced by count(1)

Review Comment:
   shall we keep this comment above the last default case match?





Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "srielau (via GitHub)" <gi...@apache.org>.
srielau commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473761563


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -1864,7 +1867,7 @@ class Analyzer(override val catalogManager: CatalogManager) extends RuleExecutor
      * Build a project list for Project/Aggregate and expand the star if possible
      */
     private def buildExpandedProjectList(
-      exprs: Seq[NamedExpression],
+                                          exprs: Seq[NamedExpression],

Review Comment:
   ```suggestion
         exprs: Seq[NamedExpression],
   ```





Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473773999


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -1534,6 +1534,9 @@ class Analyzer(override val catalogManager: CatalogManager) extends RuleExecutor
       // If the projection list contains Stars, expand it.
       case p: Project if containsStar(p.projectList) =>
         p.copy(projectList = buildExpandedProjectList(p.projectList, p.child))
+      // If the filter list contains Stars, expand it.

Review Comment:
   oh makes sense





Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "srielau (via GitHub)" <gi...@apache.org>.
srielau commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473760470


##########
sql/core/src/test/resources/sql-tests/analyzer-results/subquery/in-subquery/in-group-by.sql.out:
##########
@@ -1,4 +1,4 @@
--- Automatically generated by SQLQueryTestSuite
+git commit -- Automatically generated by SQLQueryTestSuite

Review Comment:
   ```suggestion
   -- Automatically generated by SQLQueryTestSuite
   ```





Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473734366


##########
docs/sql-ref-function-invocation.md:
##########
@@ -0,0 +1,112 @@
+---
+layout: global
+title: Function Invocation
+displayTitle: Function Invocation
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+### Description
+
+A function invocation executes a builtin function or a user-defined function after associating arguments to the function’s parameters.
+
+Spark supports positional parameter invocation as well as named parameter invocation.
+
+#### Positional parameter invocation
+
+Each argument is assigned to the matching parameter at the position it is specified.
+
+This notation can be used by all functions unless it is explicitly documented that named parameter invocation is required.
+
+If the function supports optional parameters, trailing parameters for which no arguments have been specified, are defaulted.
+
+#### Named parameter invocation
+
+Arguments are explicitly assigned to parameters using the parameter names published by the function.
+
+This notation must be used for a select subset of built-in functions which allow numerous optional parameters, making positional parameter invocation impractical.
+These functions may allow a mixed invocation where a leading set of parameters are expected to be assigned by position and the trailing, optional set of parameters by name.
+
+### Syntax
+
+```sql
+function_name ( [ argExpr | table_argument ] [, ...]
+                [ namedParameter => [ argExpr | table_argument ] [, ...] )
+
+table_argument
+  { TABLE ( { table_name | query } )
+    [ table_partition ]
+    [ table_order ]
+
+table_partitioning
+  { WITH SINGLE PARTITION |
+    { PARTITION | DISTRIBUTE } BY { partition_expr | ( partition_expr [, ...] ) } }
+
+table_ordering
+  { { ORDER | SORT } BY { order_by_expr | ( order_by_expr [, ...] } }
+```
+
+### Parameters
+
+- **function_name**
+
+  The name of the built-in or user defined function. When resolving an unqualified function_name Databricks will first consider a built-in or temporary function, and then a function in the current schema.

Review Comment:
   ```suggestion
     The name of the built-in or user defined function. When resolving an unqualified function_name Spark will first consider a built-in or temporary function, and then a function in the current schema.
   ```
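
The positional, named, and mixed invocation rules in the quoted doc can be illustrated with a small sketch. This is not Spark's function-resolution code: `InvocationSketch`, `Param`, and `bind` are hypothetical names, and the error cases (unknown name, double assignment, missing mandatory argument) are assumptions about reasonable behavior rather than anything stated in the PR.

```scala
// Hypothetical model of argument binding; not Spark's actual resolver.
object InvocationSketch {
  // A parameter with an optional default; None means the parameter is mandatory.
  final case class Param(name: String, default: Option[Any] = None)

  // Binds leading arguments by position, then the remaining ones by name,
  // and falls back to defaults for parameters that were left unassigned.
  def bind(
      params: Seq[Param],
      positional: Seq[Any],
      named: Map[String, Any] = Map.empty): Either[String, Map[String, Any]] = {
    if (positional.length > params.length) {
      Left(s"too many positional arguments: ${positional.length}")
    } else {
      // zip stops at the shorter sequence, so trailing parameters stay unassigned.
      val byPosition: Map[String, Any] = params.map(_.name).zip(positional).toMap
      val unknown = named.keySet -- params.map(_.name)
      val assignedTwice = named.keySet.intersect(byPosition.keySet)
      if (unknown.nonEmpty) {
        Left(s"unknown parameter(s): ${unknown.mkString(", ")}")
      } else if (assignedTwice.nonEmpty) {
        Left(s"parameter(s) assigned twice: ${assignedTwice.mkString(", ")}")
      } else {
        val assigned = byPosition ++ named
        val missing = params.filter(p => !assigned.contains(p.name) && p.default.isEmpty)
        if (missing.nonEmpty) {
          Left(s"missing argument(s): ${missing.map(_.name).mkString(", ")}")
        } else {
          // Every parameter is now either assigned or has a default.
          Right(params.map(p => p.name -> assigned.getOrElse(p.name, p.default.get)).toMap)
        }
      }
    }
  }
}
```

With a hypothetical parameter list `Seq(Param("str"), Param("len"), Param("pad", Some(" ")))`, calling `bind(params, Seq("hi", 5))` defaults the trailing `pad`, while `bind(params, Seq("hi"), Map("len" -> 5, "pad" -> "*"))` models the mixed form where the leading argument is positional and the rest are named.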



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46908][SQL] Support star clause in WHERE clause [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan closed pull request #44938: [SPARK-46908][SQL] Support star clause in WHERE clause
URL: https://github.com/apache/spark/pull/44938




Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "gatorsmile (via GitHub)" <gi...@apache.org>.
gatorsmile commented on PR #44938:
URL: https://github.com/apache/spark/pull/44938#issuecomment-1919482021

   @gengliangwang @cloud-fan 




Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473734951


##########
docs/sql-ref-function-invocation.md:
##########
@@ -0,0 +1,112 @@
+---
+layout: global
+title: Function Invocation
+displayTitle: Function Invocation
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+### Description
+
+A function invocation executes a built-in function or a user-defined function after associating arguments with the function’s parameters.
+
+Spark supports positional parameter invocation as well as named parameter invocation.
+
+#### Positional parameter invocation
+
+Each argument is assigned to the matching parameter at the position it is specified.
+
+This notation can be used by all functions unless it is explicitly documented that named parameter invocation is required.
+
+If the function supports optional parameters, trailing parameters for which no arguments have been specified are defaulted.
+
+#### Named parameter invocation
+
+Arguments are explicitly assigned to parameters using the parameter names published by the function.
+
+This notation must be used for a select subset of built-in functions which allow numerous optional parameters, making positional parameter invocation impractical.
+These functions may allow a mixed invocation where a leading set of parameters is expected to be assigned by position and the trailing, optional set of parameters by name.
+
+### Syntax
+
+```sql
+function_name ( [ argExpr | table_argument ] [, ...]
+                [ namedParameter => [ argExpr | table_argument ] [, ...] ] )
+
+table_argument
+  { TABLE ( { table_name | query } )
+    [ table_partitioning ]
+    [ table_ordering ] }
+
+table_partitioning
+  { WITH SINGLE PARTITION |
+    { PARTITION | DISTRIBUTE } BY { partition_expr | ( partition_expr [, ...] ) } }
+
+table_ordering
+  { { ORDER | SORT } BY { order_by_expr | ( order_by_expr [, ...] ) } }
+```
+
+### Parameters
+
+- **function_name**
+
+  The name of the built-in or user defined function. When resolving an unqualified function_name Databricks will first consider a built-in or temporary function, and then a function in the current schema.
+
+- **argExpr**
+
+  Any expression which can be implicitly cast to the parameter it is associated with.
+
+  The function may impose further restrictions on the argument, such as mandating literals, constant expressions, or specific values.
+
+- **namedParameter**
+
+  The unqualified name of a parameter to which the argExpr will be assigned.
+
+  Named parameter notation is supported for Python UDFs and specific built-in functions.
+
+- **table_argument**
+
+  Specifies an argument for a parameter that is a table.
+
+  - **TABLE ( table_name )**
+
+    Identifies a table to pass to the function by name.
+
+  - **TABLE ( query )**
+
+    Passes the result of query to the function.
+
+  - **table_partitioning**
+
+    Optionally specifies that the table argument is partitioned. If not specified the partitioning is determined by Databricks.

Review Comment:
   ```suggestion
       Optionally specifies that the table argument is partitioned. If not specified the partitioning is determined by Spark.
   ```





Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "srielau (via GitHub)" <gi...@apache.org>.
srielau commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473833796


##########
docs/sql-ref-syntax-qry-select-star.md:
##########


Review Comment:
   Done



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -1936,6 +1939,11 @@ class Analyzer(override val catalogManager: CatalogManager) extends RuleExecutor
             case o => o :: Nil
           })
         // count(*) has been replaced by count(1)

Review Comment:
   ```suggestion
   ```



##########
docs/sql-ref-syntax-qry-select-star.md:
##########
@@ -0,0 +1,102 @@
+---
+layout: global
+title: star (*) Clause
+displayTitle: Star (*) Clause
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+### Description
+
+A shorthand to name all the referenceable columns in the FROM clause or a specific table reference in the FROM clause.
+The star clause is most frequently used in the SELECT list.
+Spark also supports its use in function invocation and certain n-ary operations within the SELECT list and WHERE clause.
+
+### Syntax
+
+```
+[ { table_name | view_name } . ] * [ except_clause ]

Review Comment:
   Doc-ed and added a test. Are there any other secrets in Spark I should know about?



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -1936,6 +1939,11 @@ class Analyzer(override val catalogManager: CatalogManager) extends RuleExecutor
             case o => o :: Nil
           })
         // count(*) has been replaced by count(1)
+        case p: In if containsStar(p.children) =>
+          p.copy(list = p.list.flatMap {
+            case s: Star => expand(s, child)
+            case o => o :: Nil
+          })

Review Comment:
   ```suggestion
             })
             // count(*) has been replaced by count(1)
   ```
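
The `In` case in the hunk above flat-maps over the IN list and replaces each star with the columns of the child plan. A minimal standalone sketch of that idea follows. It deliberately does not use Catalyst: `Expr`, `Column`, `Literal`, `Star`, `In`, and `expandInList` are simplified stand-ins defined here for illustration, and `childOutput` is an assumed substitute for whatever `expand(s, child)` returns in the real analyzer.

```scala
// Simplified stand-ins for Catalyst expressions; hypothetical, for illustration only.
object StarExpansionSketch {
  sealed trait Expr
  final case class Column(name: String) extends Expr
  final case class Literal(value: Any) extends Expr
  case object Star extends Expr
  final case class In(value: Expr, list: Seq[Expr]) extends Expr

  // Replaces every Star in the IN list with the child's output columns,
  // leaving all other list elements in place and in order.
  def expandInList(in: In, childOutput: Seq[Column]): In =
    in.copy(list = in.list.flatMap {
      case Star => childOutput
      case other => Seq(other)
    })
}
```

Under these assumptions, a predicate of the shape `1 IN (*)` over a child with columns `c1` and `c2` would be modeled as `expandInList(In(Literal(1), Seq(Star)), Seq(Column("c1"), Column("c2")))`, which yields an `In` whose list is the two columns.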





Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "srielau (via GitHub)" <gi...@apache.org>.
srielau commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473836043


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -1935,7 +1938,12 @@ class Analyzer(override val catalogManager: CatalogManager) extends RuleExecutor
             case s: Star => expand(s, child)
             case o => o :: Nil
           })
-        // count(*) has been replaced by count(1)
+        case p: In if containsStar(p.children) =>
+          p.copy(list = p.list.flatMap {
+            case s: Star => expand(s, child)
+            case o => o :: Nil
+          })
+          // count(*) has been replaced by count(1)

Review Comment:
   ```suggestion
           // count(*) has been replaced by count(1)
   ```





Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473736459


##########
docs/sql-ref-syntax-qry-select-star.md:
##########
@@ -0,0 +1,102 @@
+---
+layout: global
+title: star (*) Clause
+displayTitle: Star (*) Clause
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+### Description
+
+A shorthand to name all the referenceable columns in the FROM clause or a specific table reference in the FROM clause.
+The star clause is most frequently used in the SELECT list.
+Spark also supports its use in function invocation and certain n-ary operations within the SELECT list and WHERE clause.
+
+### Syntax
+
+```
+[ { table_name | view_name } . ] * [ except_clause ]

Review Comment:
   It's actually more than that: the qualifier can also be a column or an inner field, e.g. `t1.col.*` and `t1.col.innerField.*`





Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473742093


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -1534,6 +1534,9 @@ class Analyzer(override val catalogManager: CatalogManager) extends RuleExecutor
       // If the projection list contains Stars, expand it.
       case p: Project if containsStar(p.projectList) =>
         p.copy(projectList = buildExpandedProjectList(p.projectList, p.child))
+      // If the filter list contains Stars, expand it.

Review Comment:
   shall we apply it for all operators?





Re: [PR] [SPARK-46908] Support star clause in WHERE clause [spark]

Posted by "srielau (via GitHub)" <gi...@apache.org>.
srielau commented on code in PR #44938:
URL: https://github.com/apache/spark/pull/44938#discussion_r1473761186


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -1864,8 +1867,8 @@ class Analyzer(override val catalogManager: CatalogManager) extends RuleExecutor
      * Build a project list for Project/Aggregate and expand the star if possible
      */
     private def buildExpandedProjectList(
-      exprs: Seq[NamedExpression],
-      child: LogicalPlan): Seq[NamedExpression] = {
+                                          exprs: Seq[NamedExpression],
+                                          child: LogicalPlan): Seq[NamedExpression] = {

Review Comment:
   ```suggestion
         child: LogicalPlan): Seq[NamedExpression] = {
   ```





Re: [PR] [SPARK-46908][SQL] Support star clause in WHERE clause [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on PR #44938:
URL: https://github.com/apache/spark/pull/44938#issuecomment-1922613459

   thanks, merging to master!

