Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/02/08 07:21:31 UTC

[GitHub] [spark] viirya opened a new pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

viirya opened a new pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499
 
 
   
   ### What changes were proposed in this pull request?
   
   This patch proposes to throw an analysis exception if the untyped `Dataset.select` is given a typed column expression.
   
   This patch also proposes to make `Dataset.selectUntyped` a public API for selecting multiple typed column expressions.
   
   ### Why are the changes needed?
   
   `Dataset` provides a few typed `select` helper functions for selecting typed column expressions. They support at most 5 typed columns. If a user tries to select more than 5 typed columns, the call silently resolves to the untyped `Dataset.select`, which can cause a confusing unresolved error.
   
   We should explicitly let users know that they are incorrectly calling untyped `select` with typed columns.
   
   Because the typed `Dataset.select` cannot be used to select more than 5 typed columns, this patch also makes `selectUntyped` public.
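
   The silent fallback described above can be illustrated with a minimal, Spark-free sketch. `ToyDataset` and the toy `Column` and `TypedColumn` classes below are hypothetical stand-ins, not Spark's real classes, and the typed overloads stop at arity 2 rather than 5 to keep the example short.

   ```scala
   // Toy stand-ins for illustration only; these are not Spark's classes.
   class Column(val name: String)
   class TypedColumn[-T, U](name: String) extends Column(name)

   class ToyDataset[T] {
     // Typed overloads exist only up to a fixed arity (2 in this toy).
     def select[U1](c1: TypedColumn[T, U1]): String = "typed"
     def select[U1, U2](c1: TypedColumn[T, U1], c2: TypedColumn[T, U2]): String = "typed"

     // Untyped varargs overload: accepts any number of columns, typed or not.
     def select(cols: Column*): String = "untyped"
   }

   object OverloadDemo {
     def main(args: Array[String]): Unit = {
       val ds = new ToyDataset[Int]
       val a = new TypedColumn[Int, Long]("a")
       val b = new TypedColumn[Int, Long]("b")
       val c = new TypedColumn[Int, Long]("c")

       println(ds.select(a, b))    // resolves to a typed overload
       println(ds.select(a, b, c)) // no typed overload of arity 3, so the varargs one is chosen
     }
   }
   ```

   The compiler reports nothing in the second call, because a typed column is still a `Column`; the caller simply ends up on the untyped path.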
   
   ### Does this PR introduce any user-facing change?
   
   Yes. If users intentionally or unintentionally call the untyped `Dataset.select` API with a typed column, an analysis exception will be thrown. Users can use the `selectUntyped` API to select multiple typed columns.
   
   ### How was this patch tested?
   
   Unit tests.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591219315
 
 
   **[Test build #118944 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118944/testReport)** for PR 27499 at commit [`68d17f7`](https://github.com/apache/spark/commit/68d17f7e71a9ca861a8939453887e71d8498f2f4).



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591068971
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591283341
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118944/
   Test PASSed.



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r376740061
 
 

 ##########
 File path: project/MimaExcludes.scala
 ##########
 @@ -492,7 +492,10 @@ object MimaExcludes {
     ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.ml.regression.AFTSurvivalRegression.setPredictionCol"),
 
     // [SPARK-29543][SS][UI] Init structured streaming ui
-    ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.streaming.StreamingQueryListener#QueryStartedEvent.this")
+    ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.streaming.StreamingQueryListener#QueryStartedEvent.this"),
+
+    // [SPARK-30590][SQL] Untyped select API cannot take typed column expression
+    ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.sql.functions.count")
 
 Review comment:
   Put it under 3.0 exclude rules temporarily. The version number in the master branch is still 3.0.0.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r377444876
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,6 +1430,11 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
 
 Review comment:
   What about a fix like this?
   
   ```diff
   diff --git a/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala b/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
   index 16f1cac3e0f..91b6e13b100 100644
   --- a/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
   +++ b/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
   @@ -1430,12 +1430,11 @@ class Dataset[T] private[sql](
       */
      @scala.annotation.varargs
      def select(cols: Column*): DataFrame = withPlan {
   -    cols.find(_.isInstanceOf[TypedColumn[_, _]]).foreach { typedCol =>
   -      throw new AnalysisException(s"$typedCol is a typed column that " +
   -        "cannot be passed in untyped `select` API. If you are going to select " +
   -        "multiple typed columns, you can use `Dataset.selectUntyped` API.")
   +    val newCols = cols.map {
   +      case tc: TypedColumn[_, _] => tc.withInputType(exprEnc, logicalPlan.output)
   +      case c => c
        }
   -    Project(cols.map(_.named), logicalPlan)
   +    Project(newCols.map(_.named), logicalPlan)
   ```



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583794989
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#discussion_r384134831
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Column.scala
 ##########
 @@ -97,6 +97,17 @@ class TypedColumn[-T, U](
     new TypedColumn[T, U](newExpr, encoder)
   }
 
+  /**
+   * This method is used internally in SparkSQL to check if a `TypedColumn` has been inserted with
+   * specific input type and schema by `withInputType`.
+   */
+  private[sql] def needInputType: Boolean = {
+    expr.find {
+      case ta: TypedAggregateExpression if ta.inputDeserializer.isEmpty => true
 
 Review comment:
   ok.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584456383
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22959/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584517379
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118209/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584488387
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22969/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589727930
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23550/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590853996
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584488115
 
 
   **[Test build #118209 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118209/testReport)** for PR 27499 at commit [`b784ba5`](https://github.com/apache/spark/commit/b784ba52fa403886b919a3899a39d114517c2b33).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589934790
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23563/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589953255
 
 
   **[Test build #118813 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118813/testReport)** for PR 27499 at commit [`096ce42`](https://github.com/apache/spark/commit/096ce420f8e5d04dac1b88e46c53820ace857507).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584451116
 
 
   retest this please



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r382897674
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,6 +1430,11 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
+    cols.find(_.isInstanceOf[TypedColumn[_, _]]).foreach { typedCol =>
+      throw new AnalysisException(s"$typedCol is a typed column that " +
+        "cannot be passed in untyped `select` API. If you are going to select " +
+        "multiple typed columns, you can use `Dataset.selectUntyped` API.")
+    }
 
 Review comment:
   `count` has already shipped as a `TypedColumn` for a while, so changing it to `Column` now would be a breaking change.
   
   In order to not break current usage, I think the fix is to allow feasible typed columns (e.g. `count`) in the untyped `select`.
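
   A rough, Spark-free sketch of that idea follows. The classes and the `needsInputType` flag are hypothetical stand-ins introduced only for illustration; they are not Spark's `Column`, `TypedColumn`, or the actual check in this PR. The untyped `select` accepts typed columns that can be used as they are and rejects only those that still need an input type.

   ```scala
   // Toy stand-ins for illustration only; these are not Spark's classes.
   class Column(val name: String)

   // `needsInputType` is a hypothetical flag meaning "this typed column must
   // still be bound to the Dataset's element type before it can be used".
   class TypedColumn(name: String, val needsInputType: Boolean) extends Column(name)

   object UntypedSelectCheck {
     // Returns the selected column names, or an error message for the first
     // typed column that cannot be used without an input type.
     def select(cols: Column*): Either[String, Seq[String]] =
       cols.collectFirst {
         case tc: TypedColumn if tc.needsInputType =>
           s"${tc.name} is a typed column that needs an input type"
       } match {
         case Some(err) => Left(err)
         case None      => Right(cols.map(_.name))
       }
   }
   ```

   Under this sketch, a typed column such as a `count`-like aggregate (flag `false`) passes through the untyped path unchanged, while a typed aggregator column (flag `true`) is rejected with an explicit message.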



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583714235
 
 
   **[Test build #118060 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118060/testReport)** for PR 27499 at commit [`8aafa57`](https://github.com/apache/spark/commit/8aafa573a9cda815fdfbfe5c8864c8595c4e1f93).
    * This patch **fails due to an unknown error code, -9**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583794989
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590738410
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583711745
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22826/
   Test PASSed.



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#discussion_r384629274
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,7 +1430,18 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
-    Project(cols.map(_.named), logicalPlan)
+    val untypedCols = cols.map {
+      case typedCol: TypedColumn[_, _] =>
+        if (!typedCol.needInputType) {
 
 Review comment:
   Oh ok. Previously I didn't want to make `select` look complicated. Inlined it now.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r377746240
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala
 ##########
 @@ -394,4 +403,21 @@ class DatasetAggregatorSuite extends QueryTest with SharedSparkSession {
     checkAnswer(group, Row("bob", Row(true, 3)) :: Nil)
     checkDataset(group.as[OptionBooleanIntData], OptionBooleanIntData("bob", Some((true, 3))))
   }
+
+  test("SPARK-30590: select multiple typed column expressions") {
+    val df = Seq((1, 2, 3, 4, 5, 6)).toDF("a", "b", "c", "d", "e", "f")
+    val fooAgg = (i: Int) => FooAgg(i).toColumn.name(s"foo_agg_$i")
+
+    val agg1 = df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5))
+    checkDataset(agg1, (3, 5, 7, 9, 11))
+
+    val agg2 = df.selectUntyped(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
+      .asInstanceOf[Dataset[(Int, Int, Int, Int, Int, Int)]]
 
 Review comment:
   That is also an option.
   
   `selectUntyped` is not actually a user-facing API. Since we have not heard any feedback about this issue until now, I think it is just a corner case, so I thought exposing it should be enough.
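   
   For context, here is a minimal sketch of why the sixth column in the test changes which `select` is picked. It assumes the typed `select` overloads stop at a fixed arity, so any longer argument list resolves to the varargs `select(cols: Column*)`. `MiniColumn`, `MiniTypedColumn`, and `MiniDataset` are hypothetical stand-ins, not Spark classes, and only a two-column typed overload is modeled:

```scala
// Hypothetical stand-ins for illustration only; these are not Spark's classes.
class MiniColumn(val name: String)
class MiniTypedColumn[U](name: String) extends MiniColumn(name)

class MiniDataset {
  // Typed overload for a fixed arity: the element types stay in the signature.
  def select[U1, U2](c1: MiniTypedColumn[U1], c2: MiniTypedColumn[U2]): String =
    s"typed(${c1.name}, ${c2.name})"

  // Untyped varargs overload: accepts any number of columns and forgets their types.
  def select(cols: MiniColumn*): String =
    cols.map(_.name).mkString("untyped(", ", ", ")")
}

object OverloadSketch {
  val a = new MiniTypedColumn[Int]("a")
  val b = new MiniTypedColumn[Int]("b")
  val c = new MiniTypedColumn[Int]("c")
  val ds = new MiniDataset

  val two: String = ds.select(a, b)      // resolves to the typed overload
  val three: String = ds.select(a, b, c) // only the varargs overload applies
}
```

   In this model, adding one more argument than the largest typed overload accepts silently switches the call to the untyped path, which is the situation the new test exercises.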
   
   
   
   
   



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r377752050
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala
 ##########
 @@ -394,4 +403,21 @@ class DatasetAggregatorSuite extends QueryTest with SharedSparkSession {
     checkAnswer(group, Row("bob", Row(true, 3)) :: Nil)
     checkDataset(group.as[OptionBooleanIntData], OptionBooleanIntData("bob", Some((true, 3))))
   }
+
+  test("SPARK-30590: select multiple typed column expressions") {
+    val df = Seq((1, 2, 3, 4, 5, 6)).toDF("a", "b", "c", "d", "e", "f")
+    val fooAgg = (i: Int) => FooAgg(i).toColumn.name(s"foo_agg_$i")
+
+    val agg1 = df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5))
+    checkDataset(agg1, (3, 5, 7, 9, 11))
+
+    val agg2 = df.selectUntyped(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
+      .asInstanceOf[Dataset[(Int, Int, Int, Int, Int, Int)]]
+    checkDataset(agg2, (3, 5, 7, 9, 11, 13))
+
+    val err = intercept[AnalysisException] {
+      df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
 
 Review comment:
   I think we can do it as @HyukjinKwon suggested in https://github.com/apache/spark/pull/27499#discussion_r377444876; the result is that we lose the type information from the typed columns.
   
   



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583728341
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591219315
 
 
   **[Test build #118944 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118944/testReport)** for PR 27499 at commit [`68d17f7`](https://github.com/apache/spark/commit/68d17f7e71a9ca861a8939453887e71d8498f2f4).



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584452834
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589637816
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118777/
   Test FAILed.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591652441
 
 
   **[Test build #118984 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118984/testReport)** for PR 27499 at commit [`7d045fb`](https://github.com/apache/spark/commit/7d045fbfb1b6bf35c5cc49e9948db2c5140bb15d).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589934785
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590738417
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23658/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583794868
 
 
   **[Test build #118080 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118080/testReport)** for PR 27499 at commit [`c9d3cd3`](https://github.com/apache/spark/commit/c9d3cd34c48435fd2323180648cbaadb9d81a7dc).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589804979
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118800/
   Test FAILed.



[GitHub] [spark] SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584456183
 
 
   **[Test build #118197 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118197/testReport)** for PR 27499 at commit [`b784ba5`](https://github.com/apache/spark/commit/b784ba52fa403886b919a3899a39d114517c2b33).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583777012
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22842/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589953399
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589804726
 
 
   **[Test build #118800 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118800/testReport)** for PR 27499 at commit [`53ba69c`](https://github.com/apache/spark/commit/53ba69c5843d23a93e4be70b71126625a5ba4789).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589727930
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23550/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589730908
 
 
   **[Test build #118800 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118800/testReport)** for PR 27499 at commit [`53ba69c`](https://github.com/apache/spark/commit/53ba69c5843d23a93e4be70b71126625a5ba4789).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590738417
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23658/
   Test PASSed.



[GitHub] [spark] viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591219030
 
 
   retest this please



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression without input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression without input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590689514
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23646/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589569725
 
 
   **[Test build #118777 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118777/testReport)** for PR 27499 at commit [`45feb5c`](https://github.com/apache/spark/commit/45feb5c9ae6f7a184aee45e27755db5d59a028d0).



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591068391
 
 
   **[Test build #118931 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118931/testReport)** for PR 27499 at commit [`68d17f7`](https://github.com/apache/spark/commit/68d17f7e71a9ca861a8939453887e71d8498f2f4).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591219657
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23693/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584517379
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118209/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590736309
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118897/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591219643
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583784899
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583714247
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118060/
   Test FAILed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#discussion_r384423658
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,7 +1430,18 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
-    Project(cols.map(_.named), logicalPlan)
+    val untypedCols = cols.map {
+      case typedCol: TypedColumn[_, _] =>
+        if (!typedCol.needInputType) {
 
 Review comment:
   Just noticed: why don't we inline this method? Then the changes stay centralized here. Java users, who are not bound by `private[spark]`, can still call the methods in `TypedColumn`, so it is better to avoid adding one if we can.
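
The visibility concern can be checked with a small, self-contained sketch. The package `sketch`, the class `TypedCol`, and the object `VisibilityCheck` are hypothetical stand-ins, not Spark code; only the method name `needInputType` is borrowed from the diff above. Scala enforces a qualified-private modifier such as `private[spark]` at compile time for Scala callers only, and the sketch uses reflection to inspect what the generated class file exposes to other JVM languages.

```scala
package sketch {
  // Hypothetical stand-in for a column class; this is not Spark's TypedColumn.
  class TypedCol {
    // Qualified-private: Scala code outside the `sketch` package cannot call this.
    private[sketch] def needInputType: Boolean = true
  }
}

object VisibilityCheck {
  // Inspects the JVM-level modifiers of the method, which is what a Java
  // caller's compiler consults when deciding whether a call is allowed.
  def isPublicInBytecode: Boolean = {
    val m = classOf[sketch.TypedCol].getDeclaredMethod("needInputType")
    java.lang.reflect.Modifier.isPublic(m.getModifiers)
  }

  def main(args: Array[String]): Unit =
    println(s"needInputType public in bytecode: $isPublicInBytecode")
}
```

Inlining the check into the caller sidesteps the question, since no extra member is added to the public class at all.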



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584479410
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584456183
 
 
   **[Test build #118197 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118197/testReport)** for PR 27499 at commit [`b784ba5`](https://github.com/apache/spark/commit/b784ba52fa403886b919a3899a39d114517c2b33).



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590738410
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583714993
 
 
   **[Test build #118062 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118062/testReport)** for PR 27499 at commit [`8aafa57`](https://github.com/apache/spark/commit/8aafa573a9cda815fdfbfe5c8864c8595c4e1f93).



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584456380
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584451959
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584410251
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22951/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591534724
 
 
   **[Test build #118984 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118984/testReport)** for PR 27499 at commit [`7d045fb`](https://github.com/apache/spark/commit/7d045fbfb1b6bf35c5cc49e9948db2c5140bb15d).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589953403
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118813/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589570111
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23530/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583711659
 
 
   **[Test build #118060 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118060/testReport)** for PR 27499 at commit [`8aafa57`](https://github.com/apache/spark/commit/8aafa573a9cda815fdfbfe5c8864c8595c4e1f93).



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583794991
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118080/
   Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r377029321
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,6 +1430,11 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
 
 Review comment:
   Is it possible to support mixed typed and untyped columns?



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589637809
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589934785
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584410251
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22951/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583777012
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22842/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590736044
 
 
   **[Test build #118897 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118897/testReport)** for PR 27499 at commit [`83958fb`](https://github.com/apache/spark/commit/83958fb5ec2bbcfdcf1f4ad13d3128ed7535aa67).
    * This patch **fails due to an unknown error code, -9**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591282698
 
 
   **[Test build #118944 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118944/testReport)** for PR 27499 at commit [`68d17f7`](https://github.com/apache/spark/commit/68d17f7e71a9ca861a8939453887e71d8498f2f4).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584451734
 
 
   **[Test build #118196 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118196/testReport)** for PR 27499 at commit [`b784ba5`](https://github.com/apache/spark/commit/b784ba52fa403886b919a3899a39d114517c2b33).



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589934696
 
 
   **[Test build #118813 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118813/testReport)** for PR 27499 at commit [`096ce42`](https://github.com/apache/spark/commit/096ce420f8e5d04dac1b88e46c53820ace857507).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589804968
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591068391
 
 
   **[Test build #118931 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118931/testReport)** for PR 27499 at commit [`68d17f7`](https://github.com/apache/spark/commit/68d17f7e71a9ca861a8939453887e71d8498f2f4).



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583784905
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22845/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583778216
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591170664
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r377442595
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,6 +1430,11 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
 
 Review comment:
   Hm, but previously a typed column such as `count` could be used as an untyped column:
   
   ```scala
   scala> spark.range(1).select(count("id"), count("id"), count("id"), count("id"), count("id"), count("id")).show()
   +---------+---------+---------+---------+---------+---------+
   |count(id)|count(id)|count(id)|count(id)|count(id)|count(id)|
   +---------+---------+---------+---------+---------+---------+
   |        1|        1|        1|        1|        1|        1|
   +---------+---------+---------+---------+---------+---------+
   
   
   scala> count("id")
   res0: org.apache.spark.sql.TypedColumn[Any,Long] = count(id)
   ```
   



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584452829
 
 
   **[Test build #118196 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118196/testReport)** for PR 27499 at commit [`b784ba5`](https://github.com/apache/spark/commit/b784ba52fa403886b919a3899a39d114517c2b33).
    * This patch **fails build dependency tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589953403
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118813/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589727920
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589804979
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118800/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591170664
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583728341
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583728290
 
 
   **[Test build #118062 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118062/testReport)** for PR 27499 at commit [`8aafa57`](https://github.com/apache/spark/commit/8aafa573a9cda815fdfbfe5c8864c8595c4e1f93).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584517372
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r380650371
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala
 ##########
 @@ -394,4 +403,21 @@ class DatasetAggregatorSuite extends QueryTest with SharedSparkSession {
     checkAnswer(group, Row("bob", Row(true, 3)) :: Nil)
     checkDataset(group.as[OptionBooleanIntData], OptionBooleanIntData("bob", Some((true, 3))))
   }
+
+  test("SPARK-30590: select multiple typed column expressions") {
+    val df = Seq((1, 2, 3, 4, 5, 6)).toDF("a", "b", "c", "d", "e", "f")
+    val fooAgg = (i: Int) => FooAgg(i).toColumn.name(s"foo_agg_$i")
+
+    val agg1 = df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5))
+    checkDataset(agg1, (3, 5, 7, 9, 11))
+
+    val agg2 = df.selectUntyped(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
+      .asInstanceOf[Dataset[(Int, Int, Int, Int, Int, Int)]]
+    checkDataset(agg2, (3, 5, 7, 9, 11, 13))
+
+    val err = intercept[AnalysisException] {
+      df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
 
 Review comment:
   Sorry for the late reply. I'm on vacation now, so my replies will be slow.
   
   Yea, currently the call silently resolves to the untyped `select` if the number of typed columns is more than 5. An analysis exception is then thrown, which is very confusing for users.
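   
   The silent fallback described here comes down to Scala overload resolution. A minimal sketch, assuming hypothetical stand-ins `Col`, `TypedCol`, and `MiniDataset` (none of these are Spark classes, and the typed overloads stop at two arguments here rather than five):
   
   ```scala
   // Hypothetical stand-ins for illustration only; these are not Spark's classes.
   class Col(val name: String)
   class TypedCol[U](name: String) extends Col(name)

   object MiniDataset {
     // Typed overloads exist only up to a fixed arity (two in this sketch).
     def select[A](c1: TypedCol[A]): String = "typed-1"
     def select[A, B](c1: TypedCol[A], c2: TypedCol[B]): String = "typed-2"

     // Untyped varargs overload: accepts any Col, and a TypedCol is a Col.
     def select(cols: Col*): String = s"untyped-${cols.length}"
   }
   ```
   
   With these definitions, a call with two `TypedCol` arguments should resolve to the typed overload, while adding a third argument makes the same call resolve to the varargs method without any compiler warning.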
   



[GitHub] [spark] cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r377577480
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala
 ##########
 @@ -394,4 +403,21 @@ class DatasetAggregatorSuite extends QueryTest with SharedSparkSession {
     checkAnswer(group, Row("bob", Row(true, 3)) :: Nil)
     checkDataset(group.as[OptionBooleanIntData], OptionBooleanIntData("bob", Some((true, 3))))
   }
+
+  test("SPARK-30590: select multiple typed column expressions") {
+    val df = Seq((1, 2, 3, 4, 5, 6)).toDF("a", "b", "c", "d", "e", "f")
+    val fooAgg = (i: Int) => FooAgg(i).toColumn.name(s"foo_agg_$i")
+
+    val agg1 = df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5))
+    checkDataset(agg1, (3, 5, 7, 9, 11))
+
+    val agg2 = df.selectUntyped(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
+      .asInstanceOf[Dataset[(Int, Int, Int, Int, Int, Int)]]
+    checkDataset(agg2, (3, 5, 7, 9, 11, 13))
+
+    val err = intercept[AnalysisException] {
+      df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
 
 Review comment:
   I'm a bit unsure about this case. `TypedColumn` is also a `Column`, so why can't we use it like a `Column`? It's typed, but it also has a corresponding catalyst schema, and we should be able to use it in untyped operations.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r382490723
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala
 ##########
 @@ -352,8 +352,7 @@ object functions {
    * @group agg_funcs
    * @since 1.3.0
    */
-  def count(columnName: String): TypedColumn[Any, Long] =
-    count(Column(columnName)).as(ExpressionEncoder[Long]())
+  def count(columnName: String): Column = count(Column(columnName))
 
 Review comment:
   It seems like the right change, but let's revert this line considering it's the code freeze period.



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#discussion_r384122068
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala
 ##########
 @@ -394,4 +403,19 @@ class DatasetAggregatorSuite extends QueryTest with SharedSparkSession {
     checkAnswer(group, Row("bob", Row(true, 3)) :: Nil)
     checkDataset(group.as[OptionBooleanIntData], OptionBooleanIntData("bob", Some((true, 3))))
   }
+
+  test("SPARK-30590: untyped select should not accept typed column without input type") {
+    val df = Seq((1, 2, 3, 4, 5, 6)).toDF("a", "b", "c", "d", "e", "f")
+    val fooAgg = (i: Int) => FooAgg(i).toColumn.name(s"foo_agg_$i")
+
+    val agg1 = df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5))
+    checkDataset(agg1, (3, 5, 7, 9, 11))
+
+    // Passes typed columns to untyped `Dataset.select` API.
+    val err = intercept[AnalysisException] {
+      df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
 
 Review comment:
   Yea, to be clear, if we add a 6th overload of typed `select`, a call to the untyped `select` with 6 typed `count` columns could return `Dataset[(Long, Long, ...)]` instead of `DataFrame`.
   
   I think you meant something like the existing `selectUntyped`? Its naming is confusing, though.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591219643
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584410247
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#discussion_r384423658
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,7 +1430,18 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
-    Project(cols.map(_.named), logicalPlan)
+    val untypedCols = cols.map {
+      case typedCol: TypedColumn[_, _] =>
+        if (!typedCol.needInputType) {
 
 Review comment:
   Just noticed: why don't we inline this method? Then we can centralize the changes here. The methods in `TypedColumn` can still be accessed by Java users, who ignore `private[spark]`.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584517372
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584456383
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22959/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583714247
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118060/
   Test FAILed.



[GitHub] [spark] cloud-fan commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591800215
 
 
   thanks, merging to master/3.0!



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590852899
 
 
   **[Test build #118909 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118909/testReport)** for PR 27499 at commit [`83958fb`](https://github.com/apache/spark/commit/83958fb5ec2bbcfdcf1f4ad13d3128ed7535aa67).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591170672
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118931/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590854004
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118909/
   Test PASSed.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r377443955
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,6 +1430,11 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
 
 Review comment:
   Ah, but the problem is more specific to `Aggregator`, because users can set the type explicitly and an `encoder` is required later, specifically via `TypedColumn.withInputType` -> `TypedAggregateExpression.withInputInfo`.
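   
   To make the encoder dependency concrete, here is a loose model of a typed aggregate whose input decoder is filled in later. `AggSketch`, its `withInputInfo`, and the two select helpers are hypothetical simplifications; they only echo the names mentioned in this comment and do not reproduce Spark's real signatures.
   
   ```scala
   // Hypothetical model for illustration; not Spark's real classes.
   final case class AggSketch(name: String, inputDecoder: Option[String => Int] = None) {
     // Stand-in for the step that supplies input type information.
     def withInputInfo(decoder: String => Int): AggSketch = copy(inputDecoder = Some(decoder))

     // Summing is only possible once the decoder is known.
     def eval(rows: Seq[String]): Either[String, Int] = inputDecoder match {
       case Some(decode) => Right(rows.map(decode).sum)
       case None         => Left(s"$name: input type was never supplied")
     }
   }

   object SelectSketch {
     // A typed select knows the element type, so it can supply the decoder.
     def typedSelect(agg: AggSketch, rows: Seq[String]): Either[String, Int] =
       agg.withInputInfo(_.toInt).eval(rows)

     // An untyped select treats the column as a plain expression and never supplies it.
     def untypedSelect(agg: AggSketch, rows: Seq[String]): Either[String, Int] =
       agg.eval(rows)
   }
   ```
   
   In this model the same aggregate evaluates through `typedSelect` but yields a `Left` through `untypedSelect`, which is the shape of the failure being discussed.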



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r382898306
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,7 +1430,23 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
-    Project(cols.map(_.named), logicalPlan)
+    val untypedCols = cols.map {
+      case typedCol: TypedColumn[_, _] =>
 
 Review comment:
   In order to not make a breaking change, we need to accept simple typed columns like `count`. @cloud-fan @HyukjinKwon 
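   
   One way to read this requirement: the untyped `select` keeps accepting typed columns that can be evaluated without an input type, and rejects only those that still need one. A rough sketch with hypothetical stand-ins (`Col`, `TypedCol`, `checkUntyped`, and the error text are assumptions, not the code in this patch; only the name `needInputType` comes from the diff):
   
   ```scala
   // Hypothetical model for illustration; not Spark's real classes.
   class Col(val name: String)
   class TypedCol(name: String, val needInputType: Boolean) extends Col(name)

   object UntypedSelectSketch {
     // Left(error) if any typed column still needs an input type,
     // otherwise Right(names of the columns to project).
     def checkUntyped(cols: Col*): Either[String, List[String]] = {
       val offending = cols.collect { case t: TypedCol if t.needInputType => t.name }
       if (offending.isEmpty) {
         Right(cols.map(_.name).toList)
       } else {
         val names = offending.mkString(", ")
         Left(s"Typed column(s) $names need an input type")
       }
     }
   }
   ```
   
   Under this reading, a `count`-like column with `needInputType = false` passes through unchanged, so existing callers are not broken, while an `Aggregator`-based column is reported up front.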



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583711744
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583714868
 
 
   retest this please



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589570111
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23530/
   Test PASSed.



[GitHub] [spark] viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589728748
 
 
   Updated the description. I think the title is still ok?



[GitHub] [spark] cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r383120488
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,7 +1430,23 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
-    Project(cols.map(_.named), logicalPlan)
+    val untypedCols = cols.map {
+      case typedCol: TypedColumn[_, _] =>
+        val isSimpleEncoder = typedCol.encoder.namedExpressions.head match {
+          case Alias(_: BoundReference, _) if !typedCol.encoder.isSerializedAsStruct => true
+          case _ => false
+        }
+        if (isSimpleEncoder) {
+          // This typed column produces simple type output that can be fit into untyped `DataFrame`.
+          typedCol.withInputType(exprEnc, logicalPlan.output)
 
 Review comment:
   I mean, `df.select(count("*"))` works without calling `withInputType`, right?



[GitHub] [spark] cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r381138896
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala
 ##########
 @@ -394,4 +403,21 @@ class DatasetAggregatorSuite extends QueryTest with SharedSparkSession {
     checkAnswer(group, Row("bob", Row(true, 3)) :: Nil)
     checkDataset(group.as[OptionBooleanIntData], OptionBooleanIntData("bob", Some((true, 3))))
   }
+
+  test("SPARK-30590: select multiple typed column expressions") {
+    val df = Seq((1, 2, 3, 4, 5, 6)).toDF("a", "b", "c", "d", "e", "f")
+    val fooAgg = (i: Int) => FooAgg(i).toColumn.name(s"foo_agg_$i")
+
+    val agg1 = df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5))
+    checkDataset(agg1, (3, 5, 7, 9, 11))
+
+    val agg2 = df.selectUntyped(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
+      .asInstanceOf[Dataset[(Int, Int, Int, Int, Int, Int)]]
+    checkDataset(agg2, (3, 5, 7, 9, 11, 13))
+
+    val err = intercept[AnalysisException] {
+      df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
 
 Review comment:
   We should at least make the error message clear as this PR does. But exposing `selectUntyped` needs more discussion.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589727920
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583778220
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118077/
   Test FAILed.



[GitHub] [spark] viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589986070
 
 
   also cc @dongjoon-hyun 



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584451959
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590736309
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118897/
   Test FAILed.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583784791
 
 
   **[Test build #118080 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118080/testReport)** for PR 27499 at commit [`c9d3cd3`](https://github.com/apache/spark/commit/c9d3cd34c48435fd2323180648cbaadb9d81a7dc).



[GitHub] [spark] cloud-fan commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590745253
 
 
   LGTM, let's highlight that it only refines the error message in the `Does this PR introduce any user-facing change?` section.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584409775
 
 
   **[Test build #118189 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118189/testReport)** for PR 27499 at commit [`b784ba5`](https://github.com/apache/spark/commit/b784ba52fa403886b919a3899a39d114517c2b33).



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583715081
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584452840
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118196/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583778220
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118077/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583711744
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589934790
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23563/
   Test PASSed.



[GitHub] [spark] cloud-fan closed pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499
 
 
   



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591653223
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584409775
 
 
   **[Test build #118189 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118189/testReport)** for PR 27499 at commit [`b784ba5`](https://github.com/apache/spark/commit/b784ba52fa403886b919a3899a39d114517c2b33).



[GitHub] [spark] cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r383136365
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,7 +1430,23 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
-    Project(cols.map(_.named), logicalPlan)
+    val untypedCols = cols.map {
+      case typedCol: TypedColumn[_, _] =>
+        val isSimpleEncoder = typedCol.encoder.namedExpressions.head match {
+          case Alias(_: BoundReference, _) if !typedCol.encoder.isSerializedAsStruct => true
+          case _ => false
+        }
+        if (isSimpleEncoder) {
+          // This typed column produces simple type output that can be fit into untyped `DataFrame`.
+          typedCol.withInputType(exprEnc, logicalPlan.output)
 
 Review comment:
   So here we are supporting more cases than before?



[GitHub] [spark] cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r378195085
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala
 ##########
 @@ -394,4 +403,21 @@ class DatasetAggregatorSuite extends QueryTest with SharedSparkSession {
     checkAnswer(group, Row("bob", Row(true, 3)) :: Nil)
     checkDataset(group.as[OptionBooleanIntData], OptionBooleanIntData("bob", Some((true, 3))))
   }
+
+  test("SPARK-30590: select multiple typed column expressions") {
+    val df = Seq((1, 2, 3, 4, 5, 6)).toDF("a", "b", "c", "d", "e", "f")
+    val fooAgg = (i: Int) => FooAgg(i).toColumn.name(s"foo_agg_$i")
+
+    val agg1 = df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5))
+    checkDataset(agg1, (3, 5, 7, 9, 11))
+
+    val agg2 = df.selectUntyped(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
+      .asInstanceOf[Dataset[(Int, Int, Int, Int, Int, Int)]]
+    checkDataset(agg2, (3, 5, 7, 9, 11, 13))
+
+    val err = intercept[AnalysisException] {
+      df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
 
 Review comment:
   OK I see the point. If this is supported, then we can't add new typed `select` overloads as it breaks compatibility (the return type changes). What's the current behavior without your fix?
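
The compatibility concern in this comment can be illustrated with a small, self-contained sketch. It does not use Spark: `Col`, `TypedCol`, and `SelectOverloadSketch` are hypothetical stand-ins that only mimic the shape of an untyped varargs `select` next to fixed-arity typed `select` overloads. It shows how Scala overload resolution prefers a typed overload when one with matching arity exists and otherwise falls back to the varargs method.

```scala
// Toy stand-ins for illustration only; these are NOT Spark's Column, TypedColumn, or Dataset.
class Col(val name: String)
final class TypedCol[U](name: String) extends Col(name)

object SelectOverloadSketch {
  // Untyped varargs fallback, similar in shape to `select(cols: Column*): DataFrame`.
  def select(cols: Col*): String = s"untyped result with ${cols.size} columns"

  // Fixed-arity typed overloads, similar in shape to the typed `select` variants.
  def select[A](c1: TypedCol[A]): String = "typed result of arity 1"
  def select[A, B](c1: TypedCol[A], c2: TypedCol[B]): String = "typed result of arity 2"

  def main(args: Array[String]): Unit = {
    val a = new TypedCol[Int]("a")
    val b = new TypedCol[Int]("b")
    val c = new TypedCol[Int]("c")
    // Two typed columns: the more specific arity-2 typed overload is chosen.
    println(select(a, b))
    // Three typed columns: no typed overload of that arity exists, so the varargs method is chosen.
    println(select(a, b, c))
  }
}
```

In this sketch, adding a three-argument typed overload later would change which method `select(a, b, c)` resolves to, and with it the result type seen by existing callers. That appears to be the return-type change the comment refers to.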



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589570102
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584447877
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-592403149
 
 
   Created SPARK-30983 for discussion of typed select API.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583777503
 
 
   **[Test build #118077 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118077/testReport)** for PR 27499 at commit [`ab7060e`](https://github.com/apache/spark/commit/ab7060ee4dc5616cc5ef8aa92190724807e5716c).



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r377456180
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,6 +1430,11 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
 
 Review comment:
   It cannot work because a typed column's output is a domain object rather than a column. That is why the untyped `select` returns `DataFrame`, while the typed `select` returns `Dataset[U1]`, `Dataset[(U1, U2)]`, `Dataset[(U1, U2, U3)]`, and so on.
   
   `count` worked before because it is actually an untyped column that was wrongly wrapped in a typed column.
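
To illustrate the point about return types, here is a minimal, dependency-free sketch. `Table` and `TypedCol` are hypothetical stand-ins invented for this example, not Spark's `Dataset` or `TypedColumn`. The sketch only shows why a fixed-arity overload can return a precise tuple type while a varargs overload cannot.

```scala
object SelectSketch {
  // Hypothetical stand-in for a typed column: it knows how to decode a value of type U.
  final case class TypedCol[U](name: String, decode: Any => U)

  // Hypothetical stand-in for a dataset: each row maps column names to values.
  final case class Table(rows: Seq[Map[String, Any]]) {
    // Untyped, varargs: element types are unknown at compile time,
    // so every result row is just a Seq[Any].
    def select(names: String*): Seq[Seq[Any]] =
      rows.map(row => names.map(n => row(n)))

    // Typed, fixed arity: U1 and U2 are known to the compiler,
    // so the result rows are tuples of exactly those types.
    def select[U1, U2](c1: TypedCol[U1], c2: TypedCol[U2]): Seq[(U1, U2)] =
      rows.map(row => (c1.decode(row(c1.name)), c2.decode(row(c2.name))))
  }
}
```

In this model, supporting more typed columns means adding one overload per arity, since a single varargs signature has no way to express the tuple type of its result.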



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583715081
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r382475709
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala
 ##########
 @@ -352,8 +352,7 @@ object functions {
    * @group agg_funcs
    * @since 1.3.0
    */
-  def count(columnName: String): TypedColumn[Any, Long] =
-    count(Column(columnName)).as(ExpressionEncoder[Long]())
+  def count(columnName: String): Column = count(Column(columnName))
 
 Review comment:
   This is a breaking change, right?
   
   At least https://github.com/apache/spark/pull/27499/files#diff-2c67e6ae3d5115b5521681f6ef871b1dR43 is broken.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591283329
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584451734
 
 
   **[Test build #118196 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118196/testReport)** for PR 27499 at commit [`b784ba5`](https://github.com/apache/spark/commit/b784ba52fa403886b919a3899a39d114517c2b33).



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r377375821
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,6 +1430,11 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
 
 Review comment:
   For a typed column, there is an encoder that specifies how its output can be encoded. We have no such information for untyped columns.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584451963
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22958/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589934696
 
 
   **[Test build #118813 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118813/testReport)** for PR 27499 at commit [`096ce42`](https://github.com/apache/spark/commit/096ce420f8e5d04dac1b88e46c53820ace857507).



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r380507471
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala
 ##########
 @@ -394,4 +403,21 @@ class DatasetAggregatorSuite extends QueryTest with SharedSparkSession {
     checkAnswer(group, Row("bob", Row(true, 3)) :: Nil)
     checkDataset(group.as[OptionBooleanIntData], OptionBooleanIntData("bob", Some((true, 3))))
   }
+
+  test("SPARK-30590: select multiple typed column expressions") {
+    val df = Seq((1, 2, 3, 4, 5, 6)).toDF("a", "b", "c", "d", "e", "f")
+    val fooAgg = (i: Int) => FooAgg(i).toColumn.name(s"foo_agg_$i")
+
+    val agg1 = df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5))
+    checkDataset(agg1, (3, 5, 7, 9, 11))
+
+    val agg2 = df.selectUntyped(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
+      .asInstanceOf[Dataset[(Int, Int, Int, Int, Int, Int)]]
+    checkDataset(agg2, (3, 5, 7, 9, 11, 13))
+
+    val err = intercept[AnalysisException] {
+      df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
 
 Review comment:
   The current behaviour seems to be to throw an exception (as described in the JIRA).
   
   ```
   scala> df.select(fooAgg(1),fooAgg(2),fooAgg(3),fooAgg(4),fooAgg(5),fooAgg(6)).show
   
   org.apache.spark.sql.AnalysisException: unresolved operator 'Aggregate [fooagg(FooAgg(1), None, None, None, input[0, int, false] AS value#114, assertnotnull(cast(value#114 as int)), input[0, int, false] AS value#113, IntegerType, IntegerType, false) AS foo_agg_1#116, fooagg(FooAgg(2), None, None, None, input[0, int, false] AS value#119, assertnotnull(cast(value#119 as int)), input[0, int, false] AS value#118, IntegerType, IntegerType, false) AS foo_agg_2#121, fooagg(FooAgg(3), None, None, None, input[0, int, false] AS value#124, assertnotnull(cast(value#124 as int)), input[0, int, false] AS value#123, IntegerType, IntegerType, false) AS foo_agg_3#126, fooagg(FooAgg(4), None, None, None, input[0, int, false] AS value#129, assertnotnull(cast(value#129 as int)), input[0, int, false] AS value#128, IntegerType, IntegerType, false) AS foo_agg_4#131, fooagg(FooAgg(5), None, None, None, input[0, int, false] AS value#134, assertnotnull(cast(value#134 as int)), input[0, int, false] AS value#133, IntegerType, IntegerType, false) AS foo_agg_5#136, fooagg(FooAgg(6), None, None, None, input[0, int, false] AS value#139, assertnotnull(cast(value#139 as int)), input[0, int, false] AS value#138, IntegerType, IntegerType, false) AS foo_agg_6#141];;
   'Aggregate [fooagg(FooAgg(1), None, None, None, input[0, int, false] AS value#114, assertnotnull(cast(value#114 as int)), input[0, int, false] AS value#113, IntegerType, IntegerType, false) AS foo_agg_1#116, fooagg(FooAgg(2), None, None, None, input[0, int, false] AS value#119, assertnotnull(cast(value#119 as int)), input[0, int, false] AS value#118, IntegerType, IntegerType, false) AS foo_agg_2#121, fooagg(FooAgg(3), None, None, None, input[0, int, false] AS value#124, assertnotnull(cast(value#124 as int)), input[0, int, false] AS value#123, IntegerType, IntegerType, false) AS foo_agg_3#126, fooagg(FooAgg(4), None, None, None, input[0, int, false] AS value#129, assertnotnull(cast(value#129 as int)), input[0, int, false] AS value#128, IntegerType, IntegerType, false) AS foo_agg_4#131, fooagg(FooAgg(5), None, None, None, input[0, int, false] AS value#134, assertnotnull(cast(value#134 as int)), input[0, int, false] AS value#133, IntegerType, IntegerType, false) AS foo_agg_5#136, fooagg(FooAgg(6), None, None, None, input[0, int, false] AS value#139, assertnotnull(cast(value#139 as int)), input[0, int, false] AS value#138, IntegerType, IntegerType, false) AS foo_agg_6#141]
   +- Project [_1#6 AS a#13, _2#7 AS b#14, _3#8 AS c#15, _4#9 AS d#16, _5#10 AS e#17, _6#11 AS F#18]
    +- LocalRelation [_1#6, _2#7, _3#8, _4#9, _5#10, _6#11]
   
   at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:43)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:95)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$3.apply(CheckAnalysis.scala:431)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$3.apply(CheckAnalysis.scala:430)
    at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:430)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:95)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:108)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:105)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:201)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:105)
    at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:57)
    at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:55)
    at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:47)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:78)
    at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$withPlan(Dataset.scala:3412)
    at org.apache.spark.sql.Dataset.select(Dataset.scala:1340)
   ```
   
   Resolving the aggregation expression requires the encoder, class, etc. to be set (via `TypedColumn.withInputType`); however, they are not set here. See `SimpleTypedAggregateExpression.withInputInfo`.
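
As a rough illustration of the failure mode described above, the following dependency-free sketch models a typed aggregate that cannot be evaluated until input information has been attached. `TypedAgg`, the `withInputType` defined here, and `evaluate` are hypothetical simplifications for this example; they are not Spark's `TypedColumn` or `SimpleTypedAggregateExpression`.

```scala
object InputTypeSketch {
  type Row = Map[String, Any]

  // Hypothetical typed aggregate: sums an Int decoded from each input row.
  // `inputDeserializer` stays empty until input info is bound.
  final case class TypedAgg(name: String, inputDeserializer: Option[Row => Int] = None) {
    // Returns a copy with the input decoding filled in.
    def withInputType(deserializer: Row => Int): TypedAgg =
      copy(inputDeserializer = Some(deserializer))

    // Without a deserializer the aggregate is unresolved and cannot run.
    def evaluate(rows: Seq[Row]): Either[String, Int] = inputDeserializer match {
      case Some(decode) => Right(rows.map(decode).sum)
      case None         => Left(s"unresolved operator: $name has no input type")
    }
  }
}
```

In this model, a code path that never calls `withInputType` always ends in the `Left` branch, which loosely mirrors the "unresolved operator" error in the trace.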



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583711659
 
 
   **[Test build #118060 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118060/testReport)** for PR 27499 at commit [`8aafa57`](https://github.com/apache/spark/commit/8aafa573a9cda815fdfbfe5c8864c8595c4e1f93).



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584452840
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118196/
   Test FAILed.



[GitHub] [spark] SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583784791
 
 
   **[Test build #118080 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118080/testReport)** for PR 27499 at commit [`c9d3cd3`](https://github.com/apache/spark/commit/c9d3cd34c48435fd2323180648cbaadb9d81a7dc).



[GitHub] [spark] cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r383104983
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,7 +1430,23 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
-    Project(cols.map(_.named), logicalPlan)
+    val untypedCols = cols.map {
+      case typedCol: TypedColumn[_, _] =>
+        val isSimpleEncoder = typedCol.encoder.namedExpressions.head match {
+          case Alias(_: BoundReference, _) if !typedCol.encoder.isSerializedAsStruct => true
+          case _ => false
+        }
+        if (isSimpleEncoder) {
+          // This typed column produces simple type output that can be fit into untyped `DataFrame`.
+          typedCol.withInputType(exprEnc, logicalPlan.output)
 
 Review comment:
   Previously we didn't call `withInputType` for `count`, right?



[GitHub] [spark] SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590689300
 
 
   **[Test build #118897 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118897/testReport)** for PR 27499 at commit [`83958fb`](https://github.com/apache/spark/commit/83958fb5ec2bbcfdcf1f4ad13d3128ed7535aa67).



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584410247
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r376738792
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala
 ##########
 @@ -352,8 +352,7 @@ object functions {
    * @group agg_funcs
    * @since 1.3.0
    */
-  def count(columnName: String): TypedColumn[Any, Long] =
-    count(Column(columnName)).as(ExpressionEncoder[Long]())
+  def count(columnName: String): Column = count(Column(columnName))
 
 Review comment:
   It seems to me that this is wrongly a `TypedColumn`. `Count` is a `DeclarativeAggregate`.



[GitHub] [spark] cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#discussion_r384423658
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,7 +1430,18 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
-    Project(cols.map(_.named), logicalPlan)
+    val untypedCols = cols.map {
+      case typedCol: TypedColumn[_, _] =>
+        if (!typedCol.needInputType) {
 
 Review comment:
   Just noticed: why don't we inline this method? Then we can centralize the changes here. The methods in `TypedColumn` can still be accessed by Java users who ignore `private[spark]`, so it is better to avoid adding one if we can.
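
One way to picture the inlining suggestion is sketched below with hypothetical, dependency-free types. `Col`, `Plain`, `Typed`, and `SketchAnalysisException` are invented for this example and are not Spark classes. The check lives inside `select` itself, so the column type gains no new method. How the real check is computed is not shown in this thread; the `inputBound` flag is an assumption.

```scala
object InlineCheckSketch {
  sealed trait Col { def name: String }
  final case class Plain(name: String) extends Col
  // `inputBound` is an assumed flag: true once input type info has been attached.
  final case class Typed(name: String, inputBound: Boolean) extends Col

  final class SketchAnalysisException(message: String) extends RuntimeException(message)

  // Untyped select: returns the names of the accepted columns.
  def select(cols: Col*): Seq[String] = cols.map {
    case typed: Typed =>
      // Inlined check, computed here instead of via a helper method on Typed.
      val needInputType = !typed.inputBound
      if (needInputType) {
        throw new SketchAnalysisException(
          s"Typed column ${typed.name} needs input type and cannot be used in untyped select")
      }
      typed.name
    case other => other.name
  }
}
```

Keeping the predicate local to `select` means nothing extra is exposed on the column class, which is the concern raised about Java callers.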



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584488380
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583777009
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r377575371
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala
 ##########
 @@ -394,4 +403,21 @@ class DatasetAggregatorSuite extends QueryTest with SharedSparkSession {
     checkAnswer(group, Row("bob", Row(true, 3)) :: Nil)
     checkDataset(group.as[OptionBooleanIntData], OptionBooleanIntData("bob", Some((true, 3))))
   }
+
+  test("SPARK-30590: select multiple typed column expressions") {
+    val df = Seq((1, 2, 3, 4, 5, 6)).toDF("a", "b", "c", "d", "e", "f")
+    val fooAgg = (i: Int) => FooAgg(i).toColumn.name(s"foo_agg_$i")
+
+    val agg1 = df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5))
+    checkDataset(agg1, (3, 5, 7, 9, 11))
+
+    val agg2 = df.selectUntyped(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
+      .asInstanceOf[Dataset[(Int, Int, Int, Int, Int, Int)]]
 
 Review comment:
   Is this really a good use case? It looks to me like `selectUntyped` is a bad user-facing API that is hard to use. Maybe we should follow other places and add more overloads that take up to 22 columns?
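
The usability concern can be seen in a small, dependency-free sketch. Because of type erasure, a cast like the `asInstanceOf[Dataset[(Int, ...)]]` in the test is unchecked: a wrong element type is accepted at the cast and only fails later, when an element is used. The `selectUntyped` below is a hypothetical stand-in that returns a plain `Seq[Any]`, not the Spark method.

```scala
import scala.util.Try

object UncheckedCastSketch {
  // Hypothetical stand-in: returns one result row whose static type is lost.
  def selectUntyped(a: Int, b: Int): Seq[Any] = Seq((a, b))

  // Correct cast: the caller has to know the element type and spell it out.
  val right: Seq[(Int, Int)] = selectUntyped(1, 2).asInstanceOf[Seq[(Int, Int)]]

  // Wrong cast: accepted here, because type arguments are erased at runtime.
  val wrong: Seq[(String, String)] = selectUntyped(1, 2).asInstanceOf[Seq[(String, String)]]

  // The mistake only surfaces when an element is actually used as a String.
  val useWrong: Try[Int] = Try(wrong.head._1.length)
}
```

Arity-specific overloads avoid this by letting the compiler derive the tuple type, at the cost of many near-identical method definitions.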



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591653223
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r377910377
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala
 ##########
 @@ -394,4 +403,21 @@ class DatasetAggregatorSuite extends QueryTest with SharedSparkSession {
     checkAnswer(group, Row("bob", Row(true, 3)) :: Nil)
     checkDataset(group.as[OptionBooleanIntData], OptionBooleanIntData("bob", Some((true, 3))))
   }
+
+  test("SPARK-30590: select multiple typed column expressions") {
+    val df = Seq((1, 2, 3, 4, 5, 6)).toDF("a", "b", "c", "d", "e", "f")
+    val fooAgg = (i: Int) => FooAgg(i).toColumn.name(s"foo_agg_$i")
+
+    val agg1 = df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5))
+    checkDataset(agg1, (3, 5, 7, 9, 11))
+
+    val agg2 = df.selectUntyped(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
+      .asInstanceOf[Dataset[(Int, Int, Int, Int, Int, Int)]]
+    checkDataset(agg2, (3, 5, 7, 9, 11, 13))
+
+    val err = intercept[AnalysisException] {
+      df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
 
 Review comment:
   In this case, users could be unaware that they are using an untyped select with typed columns. The result might surprise them, as the type information is lost.
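
A minimal sketch of how that fallback can go unnoticed. It uses stand-in classes, not Spark's real `Column`, `TypedColumn`, or `Dataset`, and the object name `OverloadSketch` is hypothetical. It also defines a typed overload for two columns only, purely to keep the example short.

```scala
object OverloadSketch {
  // Stand-ins for illustration only; these are not Spark's real classes.
  class Column(val name: String)
  class TypedColumn[U](name: String) extends Column(name)

  // Typed overload: the result types stay visible in the signature.
  def select[U1, U2](c1: TypedColumn[U1], c2: TypedColumn[U2]): String = "typed"

  // Untyped varargs overload: accepts any number of columns, typed or not.
  def select(cols: Column*): String = "untyped"

  def col(i: Int): TypedColumn[Int] = new TypedColumn[Int](s"c$i")

  // Two typed columns match the typed overload.
  val two: String = select(col(1), col(2))

  // A third argument has no typed overload in this sketch, so the compiler
  // picks the varargs one; nothing at the call site signals the switch.
  val three: String = select(col(1), col(2), col(3))
}
```

The call sites look identical apart from the argument count, which is why the loss of type information can come as a surprise.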
   
   



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583777009
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584488387
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22969/
   Test PASSed.



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r381663074
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala
 ##########
 @@ -394,4 +403,21 @@ class DatasetAggregatorSuite extends QueryTest with SharedSparkSession {
     checkAnswer(group, Row("bob", Row(true, 3)) :: Nil)
     checkDataset(group.as[OptionBooleanIntData], OptionBooleanIntData("bob", Some((true, 3))))
   }
+
+  test("SPARK-30590: select multiple typed column expressions") {
+    val df = Seq((1, 2, 3, 4, 5, 6)).toDF("a", "b", "c", "d", "e", "f")
+    val fooAgg = (i: Int) => FooAgg(i).toColumn.name(s"foo_agg_$i")
+
+    val agg1 = df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5))
+    checkDataset(agg1, (3, 5, 7, 9, 11))
+
+    val agg2 = df.selectUntyped(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
+      .asInstanceOf[Dataset[(Int, Int, Int, Int, Int, Int)]]
+    checkDataset(agg2, (3, 5, 7, 9, 11, 13))
+
+    val err = intercept[AnalysisException] {
+      df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
 
 Review comment:
   Ok. It sounds reasonable to me. Where would be the better place to raise a discussion for this, the dev mailing list or JIRA?



[GitHub] [spark] HyukjinKwon commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591806127
 
 
   +1 LGTM too



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression without input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression without input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590689509
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590736205
 
 
   retest this please.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589570102
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r377575371
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala
 ##########
 @@ -394,4 +403,21 @@ class DatasetAggregatorSuite extends QueryTest with SharedSparkSession {
     checkAnswer(group, Row("bob", Row(true, 3)) :: Nil)
     checkDataset(group.as[OptionBooleanIntData], OptionBooleanIntData("bob", Some((true, 3))))
   }
+
+  test("SPARK-30590: select multiple typed column expressions") {
+    val df = Seq((1, 2, 3, 4, 5, 6)).toDF("a", "b", "c", "d", "e", "f")
+    val fooAgg = (i: Int) => FooAgg(i).toColumn.name(s"foo_agg_$i")
+
+    val agg1 = df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5))
+    checkDataset(agg1, (3, 5, 7, 9, 11))
+
+    val agg2 = df.selectUntyped(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
+      .asInstanceOf[Dataset[(Int, Int, Int, Int, Int, Int)]]
 
 Review comment:
   Is this really a good use case? It looks to me like `selectUntyped` is a bad user-facing API that is hard to use. Maybe we should follow other places and add more overloads that take up to 22 columns?



[GitHub] [spark] cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#discussion_r383721848
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Column.scala
 ##########
 @@ -97,6 +97,17 @@ class TypedColumn[-T, U](
     new TypedColumn[T, U](newExpr, encoder)
   }
 
+  /**
+   * This method is used internally in SparkSQL to check if a `TypedColumn` has been inserted with
+   * specific input type and schema by `withInputType`.
+   */
+  private[sql] def needInputType: Boolean = {
+    expr.find {
+      case ta: TypedAggregateExpression if ta.inputDeserializer.isEmpty => true
 
 Review comment:
   nit: `case ta: TypedAggregateExpression => ta.inputDeserializer.isEmpty`
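
A minimal sketch of why the two spellings behave the same. The expression tree, the `find` helper, and the `TypedAgg` class are stand-ins invented for this example, not Catalyst's real types. The fallback `case _ => false` and the trailing `.isDefined` are assumptions, since the quoted hunk ends before them.

```scala
object GuardSketch {
  // Hypothetical miniature expression tree, for illustration only.
  sealed trait Expr { def children: Seq[Expr] }
  case class Leaf(name: String) extends Expr { def children: Seq[Expr] = Nil }
  case class Wrap(child: Expr) extends Expr { def children: Seq[Expr] = Seq(child) }
  // Stand-in for TypedAggregateExpression with an optional input deserializer.
  case class TypedAgg(inputDeserializer: Option[String]) extends Expr {
    def children: Seq[Expr] = Nil
  }

  // Pre-order search for the first node satisfying the predicate.
  def find(e: Expr)(p: Expr => Boolean): Option[Expr] =
    if (p(e)) Some(e)
    else e.children.foldLeft(Option.empty[Expr])((acc, c) => acc.orElse(find(c)(p)))

  // Original spelling: the condition lives in a pattern guard.
  def needInputTypeGuard(e: Expr): Boolean = find(e) {
    case ta: TypedAgg if ta.inputDeserializer.isEmpty => true
    case _ => false
  }.isDefined

  // Suggested spelling: the condition is the case body itself.
  def needInputTypeBody(e: Expr): Boolean = find(e) {
    case ta: TypedAgg => ta.inputDeserializer.isEmpty
    case _ => false
  }.isDefined
}
```

Both versions return `false` for a `TypedAgg` whose deserializer is already set; the second simply avoids the redundant `=> true`.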



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584447877
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589637809
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583784899
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r377464564
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,6 +1430,11 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
 
 Review comment:
   Yes, I think so, unless I am missing something.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r377460627
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,6 +1430,11 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
 
 Review comment:
   Hm.. I see. So to support typed columns properly, we should either expose `selectUntyped` or add typed overloads of `def select[U1, ...](...)` for all cases.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583711745
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22826/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584451963
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22958/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584479220
 
 
   **[Test build #118197 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118197/testReport)** for PR 27499 at commit [`b784ba5`](https://github.com/apache/spark/commit/b784ba52fa403886b919a3899a39d114517c2b33).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590737883
 
 
   **[Test build #118909 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118909/testReport)** for PR 27499 at commit [`83958fb`](https://github.com/apache/spark/commit/83958fb5ec2bbcfdcf1f4ad13d3128ed7535aa67).



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r382815570
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,6 +1430,11 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
+    cols.find(_.isInstanceOf[TypedColumn[_, _]]).foreach { typedCol =>
+      throw new AnalysisException(s"$typedCol is a typed column that " +
+        "cannot be passed in untyped `select` API. If you are going to select " +
+        "multiple typed columns, you can use `Dataset.selectUntyped` API.")
+    }
 
 Review comment:
   hmm, we also have such usage in our test:
   
   ```scala
   df.select(count("*"), countDistinct("*"))
   ```
   
   So this is still a breaking change? @cloud-fan @HyukjinKwon 
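
A minimal sketch of the concern, using stand-in classes rather than Spark's real ones (the object `TypedCheckSketch` and the helper `firstTyped` are hypothetical). It assumes, as the `functions.scala` hunk in this thread shows, that the string overload of `count` returns a `TypedColumn`, while `countDistinct` returns a plain `Column`.

```scala
object TypedCheckSketch {
  // Stand-ins for illustration only; these are not Spark's real classes.
  class Column(val name: String)
  class TypedColumn[U](name: String) extends Column(name)

  // Mirrors the return types discussed in this thread, with simplified signatures.
  def count(columnName: String): TypedColumn[Long] =
    new TypedColumn[Long](s"count($columnName)")
  def countDistinct(columnName: String): Column =
    new Column(s"count(DISTINCT $columnName)")

  // The same kind of check as in the hunk above: find the first typed column.
  // The real patch throws an AnalysisException where this returns a name.
  def firstTyped(cols: Column*): Option[String] =
    cols.find(_.isInstanceOf[TypedColumn[_]]).map(_.name)

  // The test-suite style call is flagged because of count("*").
  val flagged: Option[String] = firstTyped(count("*"), countDistinct("*"))
  // A call with only plain columns passes the check.
  val clean: Option[String] = firstTyped(countDistinct("*"))
}
```

Under these assumptions, a blanket `isInstanceOf` check rejects a call that never relied on typed semantics.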
   



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression without input type

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression without input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590689300
 
 
   **[Test build #118897 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118897/testReport)** for PR 27499 at commit [`83958fb`](https://github.com/apache/spark/commit/83958fb5ec2bbcfdcf1f4ad13d3128ed7535aa67).



[GitHub] [spark] cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r381826270
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala
 ##########
 @@ -394,4 +403,21 @@ class DatasetAggregatorSuite extends QueryTest with SharedSparkSession {
     checkAnswer(group, Row("bob", Row(true, 3)) :: Nil)
     checkDataset(group.as[OptionBooleanIntData], OptionBooleanIntData("bob", Some((true, 3))))
   }
+
+  test("SPARK-30590: select multiple typed column expressions") {
+    val df = Seq((1, 2, 3, 4, 5, 6)).toDF("a", "b", "c", "d", "e", "f")
+    val fooAgg = (i: Int) => FooAgg(i).toColumn.name(s"foo_agg_$i")
+
+    val agg1 = df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5))
+    checkDataset(agg1, (3, 5, 7, 9, 11))
+
+    val agg2 = df.selectUntyped(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
+      .asInstanceOf[Dataset[(Int, Int, Int, Int, Int, Int)]]
+    checkDataset(agg2, (3, 5, 7, 9, 11, 13))
+
+    val err = intercept[AnalysisException] {
+      df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
 
 Review comment:
   We can open a JIRA first.



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r383124033
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,7 +1430,23 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
-    Project(cols.map(_.named), logicalPlan)
+    val untypedCols = cols.map {
+      case typedCol: TypedColumn[_, _] =>
+        val isSimpleEncoder = typedCol.encoder.namedExpressions.head match {
+          case Alias(_: BoundReference, _) if !typedCol.encoder.isSerializedAsStruct => true
+          case _ => false
+        }
+        if (isSimpleEncoder) {
+          // This typed column produces simple type output that can be fit into untyped `DataFrame`.
+          typedCol.withInputType(exprEnc, logicalPlan.output)
 
 Review comment:
   Yes, for `TypedColumn` that doesn't contain `TypedAggregateExpression`, `withInputType` is no-op, so you don't need to call `withInputType` for `df.select(count("*"))`.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression without input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression without input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590689514
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23646/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591534724
 
 
   **[Test build #118984 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118984/testReport)** for PR 27499 at commit [`7d045fb`](https://github.com/apache/spark/commit/7d045fbfb1b6bf35c5cc49e9948db2c5140bb15d).



[GitHub] [spark] cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r382475709
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala
 ##########
 @@ -352,8 +352,7 @@ object functions {
    * @group agg_funcs
    * @since 1.3.0
    */
-  def count(columnName: String): TypedColumn[Any, Long] =
-    count(Column(columnName)).as(ExpressionEncoder[Long]())
+  def count(columnName: String): Column = count(Column(columnName))
 
 Review comment:
   This is a breaking change, right?
   
   At least https://github.com/apache/spark/pull/27499/files#diff-2c67e6ae3d5115b5521681f6ef871b1dR43 is broken.
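
To make the concern concrete, here is a minimal sketch using toy stand-ins rather than Spark's classes. `ToyDataset`, `UntypedResult`, `TypedResult`, and `CountSignatures` are hypothetical names; only the shape of the two `select` overloads and the two `count` signatures is taken from the diff above. It shows how changing the return type from `TypedColumn[Any, Long]` to `Column` moves the same call from the typed overload to the untyped one.

```scala
// Toy stand-ins (hypothetical); they only mirror the shape of the overloads.
class Column(val name: String)
class TypedColumn[-T, U](name: String) extends Column(name)

final case class UntypedResult(columns: Seq[String])
final case class TypedResult[U](column: String)

class ToyDataset[T] {
  // Untyped select: any number of columns, static result type is lost.
  def select(cols: Column*): UntypedResult = UntypedResult(cols.map(_.name))
  // Typed select: a single typed column keeps its result type U.
  def select[U](c: TypedColumn[T, U]): TypedResult[U] = TypedResult[U](c.name)
}

object CountSignatures {
  // Old signature: callers get the typed overload.
  def countTyped(columnName: String): TypedColumn[Any, Long] =
    new TypedColumn[Any, Long](s"count($columnName)")
  // Proposed signature: the same call now resolves to the untyped overload.
  def countUntyped(columnName: String): Column =
    new Column(s"count($columnName)")
}
```

A caller that annotates the result as `TypedResult[Long]` compiles against `countTyped` but not against `countUntyped`, which is presumably the kind of source break the comment refers to.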



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589637483
 
 
   **[Test build #118777 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118777/testReport)** for PR 27499 at commit [`45feb5c`](https://github.com/apache/spark/commit/45feb5c9ae6f7a184aee45e27755db5d59a028d0).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591283329
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589569725
 
 
   **[Test build #118777 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118777/testReport)** for PR 27499 at commit [`45feb5c`](https://github.com/apache/spark/commit/45feb5c9ae6f7a184aee45e27755db5d59a028d0).



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584488115
 
 
   **[Test build #118209 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118209/testReport)** for PR 27499 at commit [`b784ba5`](https://github.com/apache/spark/commit/b784ba52fa403886b919a3899a39d114517c2b33).



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591170278
 
 
   **[Test build #118931 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118931/testReport)** for PR 27499 at commit [`68d17f7`](https://github.com/apache/spark/commit/68d17f7e71a9ca861a8939453887e71d8498f2f4).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583794991
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118080/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584447693
 
 
   **[Test build #118189 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118189/testReport)** for PR 27499 at commit [`b784ba5`](https://github.com/apache/spark/commit/b784ba52fa403886b919a3899a39d114517c2b33).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584455790
 
 
   retest this please



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584456380
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583728342
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118062/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583715083
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22828/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583715083
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22828/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591653231
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118984/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591535519
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23732/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590737883
 
 
   **[Test build #118909 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118909/testReport)** for PR 27499 at commit [`83958fb`](https://github.com/apache/spark/commit/83958fb5ec2bbcfdcf1f4ad13d3128ed7535aa67).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590854004
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118909/
   Test PASSed.



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r382673651
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala
 ##########
 @@ -352,8 +352,7 @@ object functions {
    * @group agg_funcs
    * @since 1.3.0
    */
-  def count(columnName: String): TypedColumn[Any, Long] =
-    count(Column(columnName)).as(ExpressionEncoder[Long]())
+  def count(columnName: String): Column = count(Column(columnName))
 
 Review comment:
   Ok. :)



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583728342
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118062/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591535511
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583714244
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589730908
 
 
   **[Test build #118800 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118800/testReport)** for PR 27499 at commit [`53ba69c`](https://github.com/apache/spark/commit/53ba69c5843d23a93e4be70b71126625a5ba4789).



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584479410
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584487805
 
 
   retest this please.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583714993
 
 
   **[Test build #118062 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118062/testReport)** for PR 27499 at commit [`8aafa57`](https://github.com/apache/spark/commit/8aafa573a9cda815fdfbfe5c8864c8595c4e1f93).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584447885
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118189/
   Test FAILed.



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r382469636
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala
 ##########
 @@ -394,4 +403,21 @@ class DatasetAggregatorSuite extends QueryTest with SharedSparkSession {
     checkAnswer(group, Row("bob", Row(true, 3)) :: Nil)
     checkDataset(group.as[OptionBooleanIntData], OptionBooleanIntData("bob", Some((true, 3))))
   }
+
+  test("SPARK-30590: select multiple typed column expressions") {
+    val df = Seq((1, 2, 3, 4, 5, 6)).toDF("a", "b", "c", "d", "e", "f")
+    val fooAgg = (i: Int) => FooAgg(i).toColumn.name(s"foo_agg_$i")
+
+    val agg1 = df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5))
+    checkDataset(agg1, (3, 5, 7, 9, 11))
+
+    val agg2 = df.selectUntyped(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
+      .asInstanceOf[Dataset[(Int, Int, Int, Int, Int, Int)]]
+    checkDataset(agg2, (3, 5, 7, 9, 11, 13))
+
+    val err = intercept[AnalysisException] {
+      df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
 
 Review comment:
   Ok. Let me remove the `selectUntyped` change for now.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590736298
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r383112627
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,7 +1430,23 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
-    Project(cols.map(_.named), logicalPlan)
+    val untypedCols = cols.map {
+      case typedCol: TypedColumn[_, _] =>
+        val isSimpleEncoder = typedCol.encoder.namedExpressions.head match {
+          case Alias(_: BoundReference, _) if !typedCol.encoder.isSerializedAsStruct => true
+          case _ => false
+        }
+        if (isSimpleEncoder) {
+          // This typed column produces simple type output that can be fit into untyped `DataFrame`.
+          typedCol.withInputType(exprEnc, logicalPlan.output)
 
 Review comment:
   Typed `count` is not backed by a `TypedAggregateExpression`, and `withInputType` only rewrites `TypedAggregateExpression` nodes, so it leaves `count` unchanged.
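   
   To illustrate the point, here is a minimal toy sketch, not Spark's implementation. The `Expr`, `PlainAgg`, `TypedAgg`, and `Alias` classes and this `withInputType` signature are hypothetical stand-ins; the only behavior taken from the comment above is that the rewrite touches typed aggregate nodes and nothing else.
   
   ```scala
   // Toy model only: hypothetical stand-ins, not Spark's Catalyst classes.
   object WithInputTypeSketch {
     sealed trait Expr
     // Stand-in for an ordinary aggregate, such as the one behind typed `count`.
     final case class PlainAgg(name: String) extends Expr
     // Stand-in for TypedAggregateExpression: it carries optional input information.
     final case class TypedAgg(name: String, inputDeserializer: Option[String]) extends Expr
     final case class Alias(child: Expr, name: String) extends Expr

     // Mimics the idea of `withInputType`: only typed aggregate nodes are rewritten,
     // every other node is returned as-is.
     def withInputType(e: Expr, deserializer: String): Expr = e match {
       case TypedAgg(n, None) => TypedAgg(n, Some(deserializer))
       case Alias(child, n)   => Alias(withInputType(child, deserializer), n)
       case other             => other
     }

     def main(args: Array[String]): Unit = {
       val countLike = Alias(PlainAgg("count"), "cnt")
       val fooLike = Alias(TypedAgg("FooAgg", None), "foo_agg_1")
       println(withInputType(countLike, "rowDeserializer"))
       println(withInputType(fooLike, "rowDeserializer"))
     }
   }
   ```
   
   In this model the `count`-like column comes back unchanged, while the `FooAgg`-like column picks up the input information.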



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591283341
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118944/
   Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#discussion_r383724273
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala
 ##########
 @@ -394,4 +403,19 @@ class DatasetAggregatorSuite extends QueryTest with SharedSparkSession {
     checkAnswer(group, Row("bob", Row(true, 3)) :: Nil)
     checkDataset(group.as[OptionBooleanIntData], OptionBooleanIntData("bob", Some((true, 3))))
   }
+
+  test("SPARK-30590: untyped select should not accept typed column without input type") {
+    val df = Seq((1, 2, 3, 4, 5, 6)).toDF("a", "b", "c", "d", "e", "f")
+    val fooAgg = (i: Int) => FooAgg(i).toColumn.name(s"foo_agg_$i")
+
+    val agg1 = df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5))
+    checkDataset(agg1, (3, 5, 7, 9, 11))
+
+    // Passes typed columns to untyped `Dataset.select` API.
+    val err = intercept[AnalysisException] {
+      df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
 
 Review comment:
   Not related to this PR, just a note:
   
   We have 5 overloads of typed `select`, and typed `count` is supported in both typed and untyped `select`. This means that if we add a 6th overload of typed `select`, it can break queries that call the untyped `select` with 6 typed `count`s.
   
   I'm not sure what the best way forward is. Maybe we should add a new `typedSelect` method to disambiguate it from the untyped version.
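   
   The overload concern can be sketched with plain Scala, outside Spark. Everything here is a hypothetical stand-in (`Column`, `TypedColumn`, and `Ds` are not Spark's classes, and the typed overloads stop at arity 2 instead of 5) meant only to show how overload resolution picks between a fixed-arity typed method and a varargs untyped one.
   
   ```scala
   // Hypothetical stand-ins for illustration; not Spark's Column/TypedColumn/Dataset.
   object SelectOverloadSketch {
     class Column(val name: String)
     class TypedColumn[T, U](name: String) extends Column(name)

     class Ds[T] {
       // Untyped varargs variant: accepts any number of columns.
       def select(cols: Column*): String = s"untyped select of ${cols.size} columns"
       // Typed variants exist only up to a fixed arity (two here).
       def select[U1](c1: TypedColumn[T, U1]): String = "typed select, arity 1"
       def select[U1, U2](c1: TypedColumn[T, U1], c2: TypedColumn[T, U2]): String =
         "typed select, arity 2"
     }

     def main(args: Array[String]): Unit = {
       val ds = new Ds[Int]
       def c(n: String) = new TypedColumn[Int, Long](n)
       // Two typed columns: the fixed-arity typed overload is the more specific match.
       println(ds.select(c("a"), c("b")))
       // Three typed columns: no typed overload of that arity, so the varargs one is used.
       println(ds.select(c("a"), c("b"), c("c")))
     }
   }
   ```
   
   If an arity-3 typed overload were added to `Ds` later, the second call would silently switch from the untyped to the typed variant, which is the kind of behavior change described above.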



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591068979
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23679/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583784905
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22845/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589953399
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591535511
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r382858004
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,6 +1430,11 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
+    cols.find(_.isInstanceOf[TypedColumn[_, _]]).foreach { typedCol =>
+      throw new AnalysisException(s"$typedCol is a typed column that " +
+        "cannot be passed in untyped `select` API. If you are going to select " +
+        "multiple typed columns, you can use `Dataset.selectUntyped` API.")
+    }
 
 Review comment:
   Ah ... true .. 



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591219657
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23693/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591535519
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23732/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583777503
 
 
   **[Test build #118077 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118077/testReport)** for PR 27499 at commit [`ab7060e`](https://github.com/apache/spark/commit/ab7060ee4dc5616cc5ef8aa92190724807e5716c).



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583778205
 
 
   **[Test build #118077 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118077/testReport)** for PR 27499 at commit [`ab7060e`](https://github.com/apache/spark/commit/ab7060ee4dc5616cc5ef8aa92190724807e5716c).
    * This patch **fails MiMa tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589637816
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118777/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591068979
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23679/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590736298
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591804431
 
 
   Thanks! I will open a JIRA ticket to discuss the typed select API.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584452834
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583714244
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584488380
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591068971
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591170672
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118931/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590853996
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584447885
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118189/
   Test FAILed.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r377444876
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,6 +1430,11 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
 
 Review comment:
   What about a fix like this?
   
   ```diff
   diff --git a/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala b/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
   index 16f1cac3e0f..91b6e13b100 100644
   --- a/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
   +++ b/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
   @@ -1430,12 +1430,11 @@ class Dataset[T] private[sql](
       */
      @scala.annotation.varargs
      def select(cols: Column*): DataFrame = withPlan {
   -    cols.find(_.isInstanceOf[TypedColumn[_, _]]).foreach { typedCol =>
   -      throw new AnalysisException(s"$typedCol is a typed column that " +
   -        "cannot be passed in untyped `select` API. If you are going to select " +
   -        "multiple typed columns, you can use `Dataset.selectUntyped` API.")
   +    val newCols = cols.map {
   +      case tc: TypedColumn[_, _] => tc.withInputType(exprEnc, logicalPlan.output)
   +      case c => c
        }
   -    Project(cols.map(_.named), logicalPlan)
   +    Project(newCols.map(_.named), logicalPlan)
   ```
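
To make the shape of the suggested change easier to follow, here is a minimal sketch in plain Scala. It uses toy stand-ins rather than Spark's own classes: `Col`, `PlainCol`, `TypedCol`, the `boundTo` field, and `bindTyped` are all hypothetical names, and the string `"Person"` stands in for the `exprEnc` / `logicalPlan.output` arguments in the diff. It only illustrates the idea of rewriting typed columns and passing every other column through unchanged.

```scala
// Toy stand-ins, NOT Spark's classes: they only model the shape of the rewrite.
sealed trait Col { def name: String }
final case class PlainCol(name: String) extends Col
// `boundTo` is a hypothetical field recording whether an input type was attached.
final case class TypedCol(name: String, boundTo: Option[String] = None) extends Col {
  def withInputType(input: String): TypedCol = copy(boundTo = Some(input))
}

object SelectSketch {
  // Same pattern as the suggested diff: typed columns get an input type,
  // every other column is returned as-is.
  def bindTyped(cols: Seq[Col], input: String): Seq[Col] = cols.map {
    case tc: TypedCol => tc.withInputType(input)
    case c            => c
  }

  def main(args: Array[String]): Unit = {
    val rewritten = bindTyped(Seq(PlainCol("id"), TypedCol("avg")), "Person")
    println(rewritten)
  }
}
```

In this toy model the untyped column is untouched and only the typed one changes, which is why the rewrite can sit in front of the existing `Project(...)` call without affecting ordinary callers.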

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression without input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression without input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-590689509
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584479414
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118197/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
URL: https://github.com/apache/spark/pull/27499#issuecomment-591653231
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118984/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584479414
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/118197/
   Test FAILed.



[GitHub] [spark] SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-584517001
 
 
   **[Test build #118209 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/118209/testReport)** for PR 27499 at commit [`b784ba5`](https://github.com/apache/spark/commit/b784ba52fa403886b919a3899a39d114517c2b33).
    * This patch **fails due to an unknown error code, -9**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583778216
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] HyukjinKwon commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589582072
 
 
   Seems fine except for https://github.com/apache/spark/pull/27499#discussion_r376738792. The title and PR description might need updating too.



[GitHub] [spark] AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-589804968
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#issuecomment-583811423
 
 
   cc @cloud-fan @dongjoon-hyun @HyukjinKwon 



[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression
URL: https://github.com/apache/spark/pull/27499#discussion_r383386646
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
 ##########
 @@ -1430,7 +1430,23 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def select(cols: Column*): DataFrame = withPlan {
-    Project(cols.map(_.named), logicalPlan)
+    val untypedCols = cols.map {
+      case typedCol: TypedColumn[_, _] =>
+        val isSimpleEncoder = typedCol.encoder.namedExpressions.head match {
+          case Alias(_: BoundReference, _) if !typedCol.encoder.isSerializedAsStruct => true
+          case _ => false
+        }
+        if (isSimpleEncoder) {
+          // This typed column produces simple type output that can be fit into untyped `DataFrame`.
+          typedCol.withInputType(exprEnc, logicalPlan.output)
 
 Review comment:
   Oh, I get your point now. Yea, we should not allow more cases than before.
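
As a rough illustration of the guard discussed in this hunk, the sketch below uses a toy model instead of Spark's classes. `ToyTypedCol`, the `serializedAsStruct` flag, and `resolve` are hypothetical names; the flag plays the role of the encoder check behind `isSimpleEncoder`. The quoted hunk does not show what happens when the check fails, so the rejection path here is purely an assumption.

```scala
// Toy stand-in, NOT Spark's TypedColumn: `serializedAsStruct` is a hypothetical
// flag playing the role of the encoder check in the quoted hunk.
final case class ToyTypedCol(name: String, serializedAsStruct: Boolean)

object GuardSketch {
  // Right: the column produces a simple value, so it is bound to the input type.
  // Left: assumed rejection path; the quoted hunk does not show this branch.
  def resolve(col: ToyTypedCol, inputType: String): Either[String, String] =
    if (!col.serializedAsStruct) Right(s"${col.name} bound to $inputType")
    else Left(s"${col.name} is not supported in untyped select")

  def main(args: Array[String]): Unit = {
    println(resolve(ToyTypedCol("count", serializedAsStruct = false), "Person"))
    println(resolve(ToyTypedCol("pair", serializedAsStruct = true), "Person"))
  }
}
```

Keeping the accepted set narrow in this way matches the point made in the comment: the untyped `select` should not start accepting more cases than it did before.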
