Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/02/25 20:58:24 UTC

[GitHub] [spark] viirya commented on a change in pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type

URL: https://github.com/apache/spark/pull/27499#discussion_r384122068
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala
 ##########
 @@ -394,4 +403,19 @@ class DatasetAggregatorSuite extends QueryTest with SharedSparkSession {
     checkAnswer(group, Row("bob", Row(true, 3)) :: Nil)
     checkDataset(group.as[OptionBooleanIntData], OptionBooleanIntData("bob", Some((true, 3))))
   }
+
+  test("SPARK-30590: untyped select should not accept typed column without input type") {
+    val df = Seq((1, 2, 3, 4, 5, 6)).toDF("a", "b", "c", "d", "e", "f")
+    val fooAgg = (i: Int) => FooAgg(i).toColumn.name(s"foo_agg_$i")
+
+    val agg1 = df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5))
+    checkDataset(agg1, (3, 5, 7, 9, 11))
+
+    // Passes typed columns to untyped `Dataset.select` API.
+    val err = intercept[AnalysisException] {
+      df.select(fooAgg(1), fooAgg(2), fooAgg(3), fooAgg(4), fooAgg(5), fooAgg(6))
 
 Review comment:
   Yes, to be clear: if we add a sixth typed `select` overload, a call with six typed `count` columns, which today resolves to the untyped `select`, could return `Dataset[(Long, Long, ...)]` instead of `DataFrame`.
   
   I think you meant something like the existing `selectUntyped`? Its naming is confusing, though.
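
To illustrate the overload-resolution concern, here is a minimal sketch that uses toy classes rather than Spark's real API. `Column`, `TypedColumn`, `ToyDataset`, `Untyped`, `Typed`, and `count` below are hypothetical stand-ins, and the typed overloads stop at arity 2 instead of 5 to keep it short. It only shows that the same call shape changes its static return type depending on whether a typed overload of matching arity exists.

```scala
// Toy stand-ins for Spark's Column, TypedColumn and Dataset. These are
// hypothetical classes for illustration only, not the real Spark API.
class Column(val name: String)
class TypedColumn[U](n: String) extends Column(n)

// Plays the role of DataFrame (the result of the untyped select).
final case class Untyped(columns: Seq[String])
// Plays the role of Dataset[(U1, U2, ...)] (the result of a typed select).
final case class Typed(arity: Int)

class ToyDataset {
  // Untyped varargs select.
  def select(cols: Column*): Untyped = Untyped(cols.map(_.name))

  // Typed overloads, defined only up to arity 2 in this toy.
  def select[U1](c1: TypedColumn[U1]): Typed = Typed(1)
  def select[U1, U2](c1: TypedColumn[U1], c2: TypedColumn[U2]): Typed = Typed(2)
}

object OverloadSketch {
  val ds = new ToyDataset

  // Hypothetical helper producing a typed column, loosely like a typed `count`.
  def count(i: Int): TypedColumn[Long] = new TypedColumn[Long](s"count_$i")

  // Two typed columns: the fixed-arity typed overload is more specific and wins.
  val two: Typed = ds.select(count(1), count(2))

  // Three typed columns: no typed overload of arity 3 exists, so the call
  // resolves to the untyped varargs select.
  val three: Untyped = ds.select(count(1), count(2), count(3))
}
```

If a three-argument typed overload were added to `ToyDataset`, the declaration `val three: Untyped` would stop compiling because the call would then return `Typed`. That is the same kind of source-compatibility change described above for a sixth typed `select`.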

