Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/03/17 16:09:30 UTC

[GitHub] [spark] Ngone51 opened a new pull request #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Ngone51 opened a new pull request #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937
 
 
   
   ### What changes were proposed in this pull request?
   
   Support case class parameters for typed Scala UDFs, e.g.
   
   ```
   case class TestData(key: Int, value: String)
   val f = (d: TestData) => d.key * d.value.toInt
   val myUdf = udf(f)
   val df = Seq(("data", TestData(50, "2"))).toDF("col1", "col2")
   checkAnswer(df.select(myUdf(Column("col2"))), Row(100) :: Nil)
   ```
   
   
   ### Why are the changes needed?
   
   Currently, a Spark UDF can only work on data types like java.lang.String, o.a.s.sql.Row, Seq[_], etc. This is inconvenient if a user wants to apply an operation to a single column of struct type: the data must be accessed through a Row object rather than through a domain object, as Dataset operations allow. It would be great if UDFs could work on types that are supported by Dataset, e.g. case classes.
   
   ### Does this PR introduce any user-facing change?
   
   Yes. Users can now use a typed Scala UDF with a case class as an input parameter.
   
   ### How was this patch tested?
   
   Added unit tests.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600964701
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r394350861
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala
 ##########
 @@ -59,14 +64,22 @@ case class ScalaUDF(
 
   override def toString: String = s"${udfName.getOrElse("UDF")}(${children.mkString(", ")})"
 
+  private def createToScalaConverter(i: Int, dataType: DataType): Any => Any = {
+    val encoder = inputEncoders(i)
+    encoder.isSerializedAsStructForTopLevel match {
+      case true => r: Any => encoder.resolveAndBind().fromRow(r.asInstanceOf[InternalRow])
 
 Review comment:
   We shouldn't call `resolveAndBind` for each input row
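
   A minimal sketch of this concern, using a hypothetical `FakeEncoder` rather than Spark's `ExpressionEncoder`: if the resolution step runs outside the per-row closure, it runs once per converter instead of once per row. The call counter and the `Seq[Any]` row representation are assumptions made only so the difference is observable.

   ```scala
   // Hypothetical stand-in for an encoder whose resolution step is expensive.
   final class FakeEncoder {
     var resolveCalls = 0
     def resolveAndBind(): FakeEncoder = { resolveCalls += 1; this }
     def fromRow(row: Seq[Any]): (Int, String) =
       (row(0).asInstanceOf[Int], row(1).asInstanceOf[String])
   }

   object ConverterSketch {
     // Resolves inside the closure, so every input row pays the cost.
     def perRowConverter(enc: FakeEncoder): Seq[Any] => Any =
       row => enc.resolveAndBind().fromRow(row)

     // Resolves once when the converter is built, then reuses the result.
     def hoistedConverter(enc: FakeEncoder): Seq[Any] => Any = {
       val resolved = enc.resolveAndBind()
       row => resolved.fromRow(row)
     }
   }
   ```

   In this sketch both converters produce the same values; only the number of `resolveAndBind` calls differs.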



[GitHub] [spark] SparkQA commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600677564
 
 
   **[Test build #119997 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119997/testReport)** for PR 27937 at commit [`867ad06`](https://github.com/apache/spark/commit/867ad06f8b7068b8cd24970fc45d8fd8dc12428d).



[GitHub] [spark] Ngone51 commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
Ngone51 commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r396181782
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala
 ##########
 @@ -48,25 +46,87 @@ case class ScalaUDF(
     function: AnyRef,
     dataType: DataType,
     children: Seq[Expression],
-    inputPrimitives: Seq[Boolean],
-    inputTypes: Seq[AbstractDataType] = Nil,
+    inputEncoders: Seq[Option[ExpressionEncoder[_]]] = Nil,
     udfName: Option[String] = None,
     nullable: Boolean = true,
     udfDeterministic: Boolean = true)
   extends Expression with NonSQLExpression with UserDefinedExpression {
 
   override lazy val deterministic: Boolean = udfDeterministic && children.forall(_.deterministic)
 
+  private lazy val resolvedEnc = mutable.HashMap[Int, ExpressionEncoder[_]]()
+
   override def toString: String = s"${udfName.getOrElse("UDF")}(${children.mkString(", ")})"
 
+  /**
+   * The analyzer should be aware of Scala primitive types so as to make the
+   * UDF return null if there is any null input value of these types. On the
+   * other hand, Java UDFs can only have boxed types, thus this parameter will
+   * always be all false.
+   */
+  def inputPrimitives: Seq[Boolean] = {
+    inputEncoders.map { encoderOpt =>
+      // It's possible that some of the inputs don't have a specific encoder(e.g. `Any`)
+      if (encoderOpt.isDefined) {
+        val encoder = encoderOpt.get
+        if (encoder.isSerializedAsStruct) {
+          // struct type is not primitive
+          false
+        } else {
+          // `nullable` is false iff the type is primitive
+          !encoder.schema.head.nullable
+        }
+      } else {
+        // Any type is not primitive
+        false
+      }
+    }
+  }
+
+  /**
+   * The expected input types of this UDF, used to perform type coercion. If we do
+   * not want to perform coercion, simply use "Nil". Note that it would've been
+   * better to use Option of Seq[DataType] so we can use "None" as the case for no
+   * type coercion. However, that would require more refactoring of the codebase.
+   */
+  def inputTypes: Seq[AbstractDataType] = {
 
 Review comment:
   Similarly, the input types of Java UDF and untyped Scala UDF are always `Nil`.
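
   The branching that the diff above uses to derive `inputPrimitives` from optional encoders can be sketched in isolation. `EncoderInfo` and its two fields are hypothetical stand-ins, not Spark's `ExpressionEncoder` API; the sketch only mirrors the three cases in the diff (struct, single non-struct field, no encoder).

   ```scala
   // Hypothetical, simplified view of the encoder properties the diff inspects.
   final case class EncoderInfo(isSerializedAsStruct: Boolean, firstFieldNullable: Boolean)

   object InputPrimitivesSketch {
     // One flag per UDF input: true only when an encoder exists, is not a
     // struct, and its single field is non-nullable (a Scala primitive).
     def inputPrimitives(inputEncoders: Seq[Option[EncoderInfo]]): Seq[Boolean] =
       inputEncoders.map {
         case Some(enc) if !enc.isSerializedAsStruct => !enc.firstFieldNullable
         case _ => false // struct-typed input, or no encoder (e.g. `Any`)
       }
   }
   ```

   Because the result is a `map` over the encoders, an empty encoder list yields an empty result in this sketch.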



[GitHub] [spark] SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601054801
 
 
   **[Test build #120041 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120041/testReport)** for PR 27937 at commit [`23ca098`](https://github.com/apache/spark/commit/23ca0988637571a4b1e210acbbee7c34c927d56b).



[GitHub] [spark] Ngone51 commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
Ngone51 commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601051210
 
 
   retest this please.



[GitHub] [spark] SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601021141
 
 
   **[Test build #120018 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120018/testReport)** for PR 27937 at commit [`23ca098`](https://github.com/apache/spark/commit/23ca0988637571a4b1e210acbbee7c34c927d56b).
    * This patch **fails due to an unknown error code, -9**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] Ngone51 commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
Ngone51 commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600163963
 
 
   cc @cloud-fan 



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600158325
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] cloud-fan closed pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937
 
 
   



[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601056202
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24757/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602324100
 
 
   **[Test build #120173 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120173/testReport)** for PR 27937 at commit [`8e82f3f`](https://github.com/apache/spark/commit/8e82f3f75a770fc9c6163a483f297eac38c30edd).



[GitHub] [spark] SparkQA commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600510988
 
 
   **[Test build #119983 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119983/testReport)** for PR 27937 at commit [`d600cf7`](https://github.com/apache/spark/commit/d600cf79edf5329fc8187143bf63d834d1598809).



[GitHub] [spark] AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600821062
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r395491256
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala
 ##########
 @@ -48,25 +46,87 @@ case class ScalaUDF(
     function: AnyRef,
     dataType: DataType,
     children: Seq[Expression],
-    inputPrimitives: Seq[Boolean],
-    inputTypes: Seq[AbstractDataType] = Nil,
+    inputEncoders: Seq[Option[ExpressionEncoder[_]]] = Nil,
     udfName: Option[String] = None,
     nullable: Boolean = true,
     udfDeterministic: Boolean = true)
   extends Expression with NonSQLExpression with UserDefinedExpression {
 
   override lazy val deterministic: Boolean = udfDeterministic && children.forall(_.deterministic)
 
+  private lazy val resolvedEnc = mutable.HashMap[Int, ExpressionEncoder[_]]()
+
   override def toString: String = s"${udfName.getOrElse("UDF")}(${children.mkString(", ")})"
 
+  /**
+   * The analyzer should be aware of Scala primitive types so as to make the
+   * UDF return null if there is any null input value of these types. On the
+   * other hand, Java UDFs can only have boxed types, thus this parameter will
+   * always be all false.
+   */
+  def inputPrimitives: Seq[Boolean] = {
 
 Review comment:
   can this be Nil?



[GitHub] [spark] AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600556228
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119983/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601178680
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601056193
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600964037
 
 
   **[Test build #120018 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120018/testReport)** for PR 27937 at commit [`23ca098`](https://github.com/apache/spark/commit/23ca0988637571a4b1e210acbbee7c34c927d56b).



[GitHub] [spark] AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600964701
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602408123
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601021739
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601054801
 
 
   **[Test build #120041 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120041/testReport)** for PR 27937 at commit [`23ca098`](https://github.com/apache/spark/commit/23ca0988637571a4b1e210acbbee7c34c927d56b).



[GitHub] [spark] AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600511538
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24705/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602644676
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120188/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601231277
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601231291
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24778/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602484459
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24901/
   Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r395490695
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala
 ##########
 @@ -200,8 +200,8 @@ class UDFRegistration private[sql] (functionRegistry: FunctionRegistry) extends
    */
   def register[RT: TypeTag, A1: TypeTag](name: String, func: Function1[A1, RT]): UserDefinedFunction = {
     val ScalaReflection.Schema(dataType, nullable) = ScalaReflection.schemaFor[RT]
-    val inputSchemas: Seq[Option[ScalaReflection.Schema]] = Try(ScalaReflection.schemaFor[A1]).toOption :: Nil
-    val udf = SparkUserDefinedFunction(func, dataType, inputSchemas).withName(name)
+    val inputEncoders: Seq[Option[ExpressionEncoder[_]]] = Try(ExpressionEncoder[A1]()).toOption :: Nil
 
 Review comment:
   maybe we can remove `applyOption`



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601178692
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120041/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601021743
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120018/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602408130
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120173/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600678270
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r394353700
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/UDFSuite.scala
 ##########
 @@ -551,4 +550,32 @@ class UDFSuite extends QueryTest with SharedSparkSession {
     }
     assert(e.getMessage.contains("Invalid arguments for function cast"))
   }
+
+  test("only one case class parameter") {
+    val f = (d: TestData) => d.key * d.value.toInt
+    val myUdf = udf(f)
+    val df = Seq(("data", TestData(50, "2"))).toDF("col1", "col2")
+    checkAnswer(df.select(myUdf(Column("col2"))), Row(100) :: Nil)
+  }
+
+  test("one case class with primitive parameter") {
+    val f = (i: Int, p: TestData) => p.key * i
+    val myUdf = udf(f)
+    val df = Seq((2, TestData(50, "data"))).toDF("col1", "col2")
+    checkAnswer(df.select(myUdf(Column("col1"), Column("col2"))), Row(100) :: Nil)
+  }
+
+  test("multiple case class parameters") {
+    val f = (d1: TestData, d2: TestData) => d1.key * d2.key
+    val myUdf = udf(f)
+    val df = Seq((TestData(10, "d1"), TestData(50, "d2"))).toDF("col1", "col2")
+    checkAnswer(df.select(myUdf(Column("col1"), Column("col2"))), Row(500) :: Nil)
+  }
+
+  test("input case class parameter and return case class ") {
 
 Review comment:
   Can we test nested case classes as well?



[GitHub] [spark] SparkQA commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600157593
 
 
   **[Test build #119938 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119938/testReport)** for PR 27937 at commit [`2b186bd`](https://github.com/apache/spark/commit/2b186bdd46ad229dd337a0405595d09446884145).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600158332
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24661/
   Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r394350313
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala
 ##########
 @@ -59,14 +64,22 @@ case class ScalaUDF(
 
   override def toString: String = s"${udfName.getOrElse("UDF")}(${children.mkString(", ")})"
 
+  private def createToScalaConverter(i: Int, dataType: DataType): Any => Any = {
+    val encoder = inputEncoders(i)
+    encoder.isSerializedAsStructForTopLevel match {
 
 Review comment:
   It's weird to pattern match on a boolean; can we write `if else` instead?
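
For illustration only, a minimal sketch of the two forms in plain Scala. `FakeEncoder` and the returned labels are hypothetical stand-ins, not Spark's `ExpressionEncoder` API; the point is just that a `match` on a Boolean and an `if`/`else` select the same branch.

```scala
object ConverterChoice {
  // Hypothetical stand-in for an encoder that exposes a Boolean flag.
  final case class FakeEncoder(isStruct: Boolean)

  // Pattern-matching form, similar in shape to the diff under review.
  def viaMatch(enc: FakeEncoder): String = enc.isStruct match {
    case true  => "row-converter"
    case false => "catalyst-converter"
  }

  // Equivalent if/else form, as the review comment suggests.
  def viaIfElse(enc: FakeEncoder): String =
    if (enc.isStruct) "row-converter" else "catalyst-converter"
}
```

Both methods return the same value for either flag, so the `if`/`else` form is a drop-in replacement that reads more directly.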



[GitHub] [spark] SparkQA removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601230418
 
 
   **[Test build #120061 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120061/testReport)** for PR 27937 at commit [`842d6fa`](https://github.com/apache/spark/commit/842d6fa7453d0cd34a41ebf2eb13c93c899ad83d).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601056202
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24757/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600208226
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119938/
   Test FAILed.



[GitHub] [spark] Ngone51 commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
Ngone51 commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r396181776
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala
 ##########
 @@ -48,25 +46,87 @@ case class ScalaUDF(
     function: AnyRef,
     dataType: DataType,
     children: Seq[Expression],
-    inputPrimitives: Seq[Boolean],
-    inputTypes: Seq[AbstractDataType] = Nil,
+    inputEncoders: Seq[Option[ExpressionEncoder[_]]] = Nil,
     udfName: Option[String] = None,
     nullable: Boolean = true,
     udfDeterministic: Boolean = true)
   extends Expression with NonSQLExpression with UserDefinedExpression {
 
   override lazy val deterministic: Boolean = udfDeterministic && children.forall(_.deterministic)
 
+  private lazy val resolvedEnc = mutable.HashMap[Int, ExpressionEncoder[_]]()
+
   override def toString: String = s"${udfName.getOrElse("UDF")}(${children.mkString(", ")})"
 
+  /**
+   * The analyzer should be aware of Scala primitive types so as to make the
+   * UDF return null if there is any null input value of these types. On the
+   * other hand, Java UDFs can only have boxed types, thus this parameter will
+   * always be all false.
+   */
+  def inputPrimitives: Seq[Boolean] = {
 
 Review comment:
   It can be `Nil`. Previously, a Java UDF returned `children.map(_ => false)`, which indeed has the same effect as `Nil`. Also, an untyped Scala UDF always passes `Nil`.
   
   But a typed Scala UDF will always have `inputPrimitives` and `inputTypes`.
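
A small sketch of why `Nil` and an all-`false` sequence can behave the same, assuming the flags are only consulted to find inputs that need a null guard. The helper `needNullCheck` is hypothetical and not part of Spark.

```scala
object PrimitiveFlags {
  // Returns the indices of inputs flagged as primitive, i.e. the ones
  // that would need a null guard before the UDF is invoked.
  def needNullCheck(inputPrimitives: Seq[Boolean]): Seq[Int] =
    inputPrimitives.zipWithIndex.collect { case (true, i) => i }
}
```

Under that assumption, `Nil` and `Seq(false, false)` both yield no guarded inputs, whereas a typed Scala UDF with a primitive parameter yields at least one index.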



[GitHub] [spark] Ngone51 commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
Ngone51 commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600677598
 
 
   @cloud-fan Oh, I missed your review comments after the updates. The ones now marked `Outdated` were not actually addressed; I'll update again later.



[GitHub] [spark] cloud-fan commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r395490333
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala
 ##########
 @@ -200,8 +200,8 @@ class UDFRegistration private[sql] (functionRegistry: FunctionRegistry) extends
    */
   def register[RT: TypeTag, A1: TypeTag](name: String, func: Function1[A1, RT]): UserDefinedFunction = {
     val ScalaReflection.Schema(dataType, nullable) = ScalaReflection.schemaFor[RT]
-    val inputSchemas: Seq[Option[ScalaReflection.Schema]] = Try(ScalaReflection.schemaFor[A1]).toOption :: Nil
-    val udf = SparkUserDefinedFunction(func, dataType, inputSchemas).withName(name)
+    val inputEncoders: Seq[Option[ExpressionEncoder[_]]] = Try(ExpressionEncoder[A1]()).toOption :: Nil
 
 Review comment:
   Why do we need `Try`, given that we have `ExpressionEncoder.applyOption`?
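
To make the question concrete, here is a plain-Scala sketch of the pattern. `encoderFor` and `encoderForOption` are hypothetical names standing in for a throwing factory and an `Option`-returning helper; they are not Spark APIs, and the sketch assumes the helper already absorbs the failure case.

```scala
import scala.util.Try

object OptionalEncoder {
  // Hypothetical factory that throws for type names it cannot handle.
  def encoderFor(typeName: String): String = typeName match {
    case "Int" | "String" => s"Encoder[$typeName]"
    case other =>
      throw new UnsupportedOperationException(s"No encoder for $other")
  }

  // Hypothetical helper that already turns the failure into None,
  // so callers do not need their own Try(...).toOption wrapper.
  def encoderForOption(typeName: String): Option[String] =
    Try(encoderFor(typeName)).toOption
}
```

If such a helper exists, calling it directly gives the same result as wrapping the throwing factory in `Try(...).toOption` at every call site.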



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602644659
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601231291
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24778/
   Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r395489257
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala
 ##########
 @@ -48,25 +46,87 @@ case class ScalaUDF(
     function: AnyRef,
     dataType: DataType,
     children: Seq[Expression],
-    inputPrimitives: Seq[Boolean],
-    inputTypes: Seq[AbstractDataType] = Nil,
+    inputEncoders: Seq[Option[ExpressionEncoder[_]]] = Nil,
     udfName: Option[String] = None,
     nullable: Boolean = true,
     udfDeterministic: Boolean = true)
   extends Expression with NonSQLExpression with UserDefinedExpression {
 
   override lazy val deterministic: Boolean = udfDeterministic && children.forall(_.deterministic)
 
+  private lazy val resolvedEnc = mutable.HashMap[Int, ExpressionEncoder[_]]()
+
   override def toString: String = s"${udfName.getOrElse("UDF")}(${children.mkString(", ")})"
 
+  /**
+   * The analyzer should be aware of Scala primitive types so as to make the
+   * UDF return null if there is any null input value of these types. On the
+   * other hand, Java UDFs can only have boxed types, thus this parameter will
+   * always be all false.
+   */
+  def inputPrimitives: Seq[Boolean] = {
+    inputEncoders.map { encoderOpt =>
+      // It's possible that some of the inputs don't have a specific encoder(e.g. `Any`)
+      if (encoderOpt.isDefined) {
+        val encoder = encoderOpt.get
+        if (encoder.isSerializedAsStruct) {
+          // struct type is not primitive
+          false
+        } else {
+          // `nullable` is false iff the type is primitive
+          !encoder.schema.head.nullable
+        }
+      } else {
+        // Any type is not primitive
+        false
+      }
+    }
+  }
+
+  /**
+   * The expected input types of this UDF, used to perform type coercion. If we do
+   * not want to perform coercion, simply use "Nil". Note that it would've been
+   * better to use Option of Seq[DataType] so we can use "None" as the case for no
+   * type coercion. However, that would require more refactoring of the codebase.
+   */
+  def inputTypes: Seq[AbstractDataType] = {
+    inputEncoders.map { encoderOpt =>
+      if (encoderOpt.isDefined) {
+        val encoder = encoderOpt.get
+        if (encoder.isSerializedAsStruct) {
+          encoder.schema
+        } else {
+          encoder.schema.head.dataType
+        }
+      } else {
+        AnyDataType
+      }
+    }
+  }
+
+  private def createToScalaConverter(i: Int, dataType: DataType): Any => Any = {
+    if (inputEncoders.isEmpty) {
+      // for untyped Scala UDF
+      CatalystTypeConverters.createToScalaConverter(dataType)
+    } else {
+      val encoder = inputEncoders(i)
+      if (encoder.isDefined && encoder.get.isSerializedAsStructForTopLevel) {
+        val enc = resolvedEnc.getOrElseUpdate(i, encoder.get.resolveAndBind())
 
 Review comment:
   Why do we need `resolvedEnc`? I think we can simply write
   ```
   val enc = encoder.get.resolveAndBind()
   row: Any => enc.fromRow(row.asInstanceOf[InternalRow])
   ```
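
The idea behind this suggestion can be sketched without Spark: if the expensive step runs once when the converter is built, the returned closure captures the result and no per-index cache is needed. `resolve` and `createConverter` below are hypothetical stand-ins, and the sketch assumes each converter is created only once per input, which this excerpt does not show.

```scala
object ResolveOnce {
  // Counts how often the (hypothetical) expensive resolution step runs.
  var resolveCalls = 0

  def resolve(name: String): String = {
    resolveCalls += 1
    s"resolved($name)"
  }

  // Resolve once at creation time; the closure captures `enc`,
  // so repeated conversions reuse it without a HashMap lookup.
  def createConverter(name: String): Any => String = {
    val enc = resolve(name)
    row => s"$enc <- $row"
  }
}
```

Calling the returned function many times triggers only the single `resolve` call made when the converter was created.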



[GitHub] [spark] AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600556222
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600511531
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601382417
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120061/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601382407
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600208209
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600511538
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24705/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600208226
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119938/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601382417
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120061/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600677564
 
 
   **[Test build #119997 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119997/testReport)** for PR 27937 at commit [`867ad06`](https://github.com/apache/spark/commit/867ad06f8b7068b8cd24970fc45d8fd8dc12428d).



[GitHub] [spark] SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601230418
 
 
   **[Test build #120061 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120061/testReport)** for PR 27937 at commit [`842d6fa`](https://github.com/apache/spark/commit/842d6fa7453d0cd34a41ebf2eb13c93c899ad83d).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602484459
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24901/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602644659
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] Ngone51 commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
Ngone51 commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r396181906
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala
 ##########
 @@ -48,25 +46,87 @@ case class ScalaUDF(
     function: AnyRef,
     dataType: DataType,
     children: Seq[Expression],
-    inputPrimitives: Seq[Boolean],
-    inputTypes: Seq[AbstractDataType] = Nil,
+    inputEncoders: Seq[Option[ExpressionEncoder[_]]] = Nil,
     udfName: Option[String] = None,
     nullable: Boolean = true,
     udfDeterministic: Boolean = true)
   extends Expression with NonSQLExpression with UserDefinedExpression {
 
   override lazy val deterministic: Boolean = udfDeterministic && children.forall(_.deterministic)
 
+  private lazy val resolvedEnc = mutable.HashMap[Int, ExpressionEncoder[_]]()
+
   override def toString: String = s"${udfName.getOrElse("UDF")}(${children.mkString(", ")})"
 
+  /**
+   * The analyzer should be aware of Scala primitive types so as to make the
+   * UDF return null if there is any null input value of these types. On the
+   * other hand, Java UDFs can only have boxed types, thus this parameter will
+   * always be all false.
+   */
+  def inputPrimitives: Seq[Boolean] = {
+    inputEncoders.map { encoderOpt =>
+      // It's possible that some of the inputs don't have a specific encoder(e.g. `Any`)
+      if (encoderOpt.isDefined) {
+        val encoder = encoderOpt.get
+        if (encoder.isSerializedAsStruct) {
+          // struct type is not primitive
+          false
+        } else {
+          // `nullable` is false iff the type is primitive
+          !encoder.schema.head.nullable
+        }
+      } else {
+        // Any type is not primitive
+        false
+      }
+    }
+  }
+
+  /**
+   * The expected input types of this UDF, used to perform type coercion. If we do
+   * not want to perform coercion, simply use "Nil". Note that it would've been
+   * better to use Option of Seq[DataType] so we can use "None" as the case for no
+   * type coercion. However, that would require more refactoring of the codebase.
+   */
+  def inputTypes: Seq[AbstractDataType] = {
+    inputEncoders.map { encoderOpt =>
+      if (encoderOpt.isDefined) {
+        val encoder = encoderOpt.get
+        if (encoder.isSerializedAsStruct) {
+          encoder.schema
+        } else {
+          encoder.schema.head.dataType
+        }
+      } else {
+        AnyDataType
+      }
+    }
+  }
+
+  private def createToScalaConverter(i: Int, dataType: DataType): Any => Any = {
+    if (inputEncoders.isEmpty) {
+      // for untyped Scala UDF
+      CatalystTypeConverters.createToScalaConverter(dataType)
+    } else {
+      val encoder = inputEncoders(i)
+      if (encoder.isDefined && encoder.get.isSerializedAsStructForTopLevel) {
+        val enc = resolvedEnc.getOrElseUpdate(i, encoder.get.resolveAndBind())
 
 Review comment:
   Makes sense. Updated.
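
For readers of the hunk above: the doc comment on `inputPrimitives` says the UDF should return null when a null arrives in a parameter of a Scala primitive type. The sketch below illustrates that rule in plain Scala. `NullGuardSketch` and `guarded` are hypothetical names, not Spark code, and this is not how the analyzer implements the check.

```scala
object NullGuardSketch {
  // Wraps `f` so that it is skipped (and null is returned) whenever a null
  // value appears in a position flagged as primitive. Nulls in non-primitive
  // positions are passed through to `f` unchanged.
  def guarded(inputPrimitives: Seq[Boolean], f: Seq[Any] => Any): Seq[Any] => Any =
    args => {
      val nullInPrimitiveSlot = inputPrimitives.zip(args).exists {
        case (isPrimitive, arg) => isPrimitive && arg == null
      }
      if (nullInPrimitiveSlot) null else f(args)
    }
}
```

With flags `Seq(true, false)`, a null in the first position short-circuits to null, while a null in the second position still reaches the wrapped function.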



[GitHub] [spark] SparkQA commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600208089
 
 
   **[Test build #119938 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119938/testReport)** for PR 27937 at commit [`2b186bd`](https://github.com/apache/spark/commit/2b186bdd46ad229dd337a0405595d09446884145).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] SparkQA removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602483677
 
 
   **[Test build #120188 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120188/testReport)** for PR 27937 at commit [`b0b298e`](https://github.com/apache/spark/commit/b0b298e2d42785c54b1ffb10125741bce7e217e8).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600208209
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602407564
 
 
   **[Test build #120173 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120173/testReport)** for PR 27937 at commit [`8e82f3f`](https://github.com/apache/spark/commit/8e82f3f75a770fc9c6163a483f297eac38c30edd).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600678287
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24717/
   Test PASSed.



[GitHub] [spark] Ngone51 commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
Ngone51 commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r396181856
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala
 ##########
 @@ -200,8 +200,8 @@ class UDFRegistration private[sql] (functionRegistry: FunctionRegistry) extends
    */
   def register[RT: TypeTag, A1: TypeTag](name: String, func: Function1[A1, RT]): UserDefinedFunction = {
     val ScalaReflection.Schema(dataType, nullable) = ScalaReflection.schemaFor[RT]
-    val inputSchemas: Seq[Option[ScalaReflection.Schema]] = Try(ScalaReflection.schemaFor[A1]).toOption :: Nil
-    val udf = SparkUserDefinedFunction(func, dataType, inputSchemas).withName(name)
+    val inputEncoders: Seq[Option[ExpressionEncoder[_]]] = Try(ExpressionEncoder[A1]()).toOption :: Nil
 
 Review comment:
   Removed.



[GitHub] [spark] SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602483677
 
 
   **[Test build #120188 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120188/testReport)** for PR 27937 at commit [`b0b298e`](https://github.com/apache/spark/commit/b0b298e2d42785c54b1ffb10125741bce7e217e8).



[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601178692
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120041/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601178680
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r394351893
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala
 ##########
 @@ -181,7 +181,8 @@ class UDFRegistration private[sql] (functionRegistry: FunctionRegistry) extends
   def register[RT: TypeTag](name: String, func: Function0[RT]): UserDefinedFunction = {
     val ScalaReflection.Schema(dataType, nullable) = ScalaReflection.schemaFor[RT]
     val inputSchemas: Seq[Option[ScalaReflection.Schema]] = Nil
-    val udf = SparkUserDefinedFunction(func, dataType, inputSchemas).withName(name)
+    val inputEncoders: Seq[ExpressionEncoder[_]] = Nil
 
 Review comment:
   Can we update the script that generates these methods?
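
One way such a generator could emit the `inputEncoders` line for each arity, as a sketch only. `RegisterCodegenSketch` and its method names are hypothetical and are not the script that lives in `UDFRegistration.scala`. The generated text follows the one-argument form that appears elsewhere in this PR (`Try(ExpressionEncoder[A1]()).toOption :: Nil`).

```scala
object RegisterCodegenSketch {
  // Builds the right-hand side of the `inputEncoders` definition for a
  // register method with `arity` type parameters A1..An.
  // Arity 0 yields just "Nil".
  def inputEncodersExpr(arity: Int): String =
    (1 to arity).foldRight("Nil") { (i, acc) =>
      s"Try(ExpressionEncoder[A$i]()).toOption :: $acc"
    }

  // Full line of generated source for the given arity.
  def inputEncodersLine(arity: Int): String =
    s"val inputEncoders: Seq[Option[ExpressionEncoder[_]]] = ${inputEncodersExpr(arity)}"
}
```

Generating the line from the arity keeps the many hand-expanded `register` overloads consistent when the parameter type changes.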



[GitHub] [spark] SparkQA removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600510988
 
 
   **[Test build #119983 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119983/testReport)** for PR 27937 at commit [`d600cf7`](https://github.com/apache/spark/commit/d600cf79edf5329fc8187143bf63d834d1598809).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600556228
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119983/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601021739
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r395492320
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala
 ##########
 @@ -48,25 +46,87 @@ case class ScalaUDF(
     function: AnyRef,
     dataType: DataType,
     children: Seq[Expression],
-    inputPrimitives: Seq[Boolean],
-    inputTypes: Seq[AbstractDataType] = Nil,
+    inputEncoders: Seq[Option[ExpressionEncoder[_]]] = Nil,
     udfName: Option[String] = None,
     nullable: Boolean = true,
     udfDeterministic: Boolean = true)
   extends Expression with NonSQLExpression with UserDefinedExpression {
 
   override lazy val deterministic: Boolean = udfDeterministic && children.forall(_.deterministic)
 
+  private lazy val resolvedEnc = mutable.HashMap[Int, ExpressionEncoder[_]]()
+
   override def toString: String = s"${udfName.getOrElse("UDF")}(${children.mkString(", ")})"
 
+  /**
+   * The analyzer should be aware of Scala primitive types so as to make the
+   * UDF return null if there is any null input value of these types. On the
+   * other hand, Java UDFs can only have boxed types, thus this parameter will
+   * always be all false.
+   */
+  def inputPrimitives: Seq[Boolean] = {
+    inputEncoders.map { encoderOpt =>
+      // It's possible that some of the inputs don't have a specific encoder(e.g. `Any`)
+      if (encoderOpt.isDefined) {
+        val encoder = encoderOpt.get
+        if (encoder.isSerializedAsStruct) {
+          // struct type is not primitive
+          false
+        } else {
+          // `nullable` is false iff the type is primitive
+          !encoder.schema.head.nullable
+        }
+      } else {
+        // Any type is not primitive
+        false
+      }
+    }
+  }
+
+  /**
+   * The expected input types of this UDF, used to perform type coercion. If we do
+   * not want to perform coercion, simply use "Nil". Note that it would've been
+   * better to use Option of Seq[DataType] so we can use "None" as the case for no
+   * type coercion. However, that would require more refactoring of the codebase.
+   */
+  def inputTypes: Seq[AbstractDataType] = {
 
 Review comment:
   same here.



[GitHub] [spark] SparkQA commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600556087
 
 
   **[Test build #119983 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119983/testReport)** for PR 27937 at commit [`d600cf7`](https://github.com/apache/spark/commit/d600cf79edf5329fc8187143bf63d834d1598809).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601177289
 
 
   **[Test build #120041 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120041/testReport)** for PR 27937 at commit [`23ca098`](https://github.com/apache/spark/commit/23ca0988637571a4b1e210acbbee7c34c927d56b).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602324100
 
 
   **[Test build #120173 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120173/testReport)** for PR 27937 at commit [`8e82f3f`](https://github.com/apache/spark/commit/8e82f3f75a770fc9c6163a483f297eac38c30edd).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602408123
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] cloud-fan commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-603291951
 
 
   thanks, merging to master/3.0!



[GitHub] [spark] cloud-fan commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r396265521
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala
 ##########
 @@ -59,14 +56,75 @@ case class ScalaUDF(
 
   override def toString: String = s"${udfName.getOrElse("UDF")}(${children.mkString(", ")})"
 
+  /**
+   * The analyzer should be aware of Scala primitive types so as to make the
+   * UDF return null if there is any null input value of these types. On the
+   * other hand, Java UDFs can only have boxed types, thus this parameter will
+   * always be all false.
 
 Review comment:
   We need to make sure the comment is accurate. `Java UDFs can only have boxed types, thus this parameter will always be all false.` This is wrong now.
   
   I agree that `Nil` is fine in this case, but the comment needs to be updated.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601382407
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602324393
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24886/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602484444
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602641661
 
 
   **[Test build #120188 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120188/testReport)** for PR 27937 at commit [`b0b298e`](https://github.com/apache/spark/commit/b0b298e2d42785c54b1ffb10125741bce7e217e8).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600556222
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r395492654
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala
 ##########
 @@ -48,25 +46,87 @@ case class ScalaUDF(
     function: AnyRef,
     dataType: DataType,
     children: Seq[Expression],
-    inputPrimitives: Seq[Boolean],
-    inputTypes: Seq[AbstractDataType] = Nil,
+    inputEncoders: Seq[Option[ExpressionEncoder[_]]] = Nil,
     udfName: Option[String] = None,
     nullable: Boolean = true,
     udfDeterministic: Boolean = true)
   extends Expression with NonSQLExpression with UserDefinedExpression {
 
   override lazy val deterministic: Boolean = udfDeterministic && children.forall(_.deterministic)
 
+  private lazy val resolvedEnc = mutable.HashMap[Int, ExpressionEncoder[_]]()
+
   override def toString: String = s"${udfName.getOrElse("UDF")}(${children.mkString(", ")})"
 
+  /**
+   * The analyzer should be aware of Scala primitive types so as to make the
+   * UDF return null if there is any null input value of these types. On the
+   * other hand, Java UDFs can only have boxed types, thus this parameter will
+   * always be all false.
+   */
+  def inputPrimitives: Seq[Boolean] = {
+    inputEncoders.map { encoderOpt =>
+      // It's possible that some of the inputs don't have a specific encoder(e.g. `Any`)
+      if (encoderOpt.isDefined) {
+        val encoder = encoderOpt.get
+        if (encoder.isSerializedAsStruct) {
+          // struct type is not primitive
+          false
+        } else {
+          // `nullable` is false iff the type is primitive
+          !encoder.schema.head.nullable
+        }
+      } else {
+        // Any type is not primitive
+        false
+      }
+    }
+  }
+
+  /**
+   * The expected input types of this UDF, used to perform type coercion. If we do
+   * not want to perform coercion, simply use "Nil". Note that it would've been
+   * better to use Option of Seq[DataType] so we can use "None" as the case for no
+   * type coercion. However, that would require more refactoring of the codebase.
+   */
+  def inputTypes: Seq[AbstractDataType] = {
 
 Review comment:
   unless we guarantee that `inputEncoders` always has the same length as `children`.
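
   As a rough illustration of the length concern, the sketch below normalizes an encoder list so that it lines up with the children. It is not code from this PR: `EncoderLengthSketch`, `alignWithChildren`, and `childCount` are hypothetical names, and the encoder type is left generic so the snippet runs without Spark.

```scala
object EncoderLengthSketch {
  // Hypothetical helper: make a possibly empty encoder list line up with the children.
  def alignWithChildren[E](childCount: Int, inputEncoders: Seq[Option[E]]): Seq[Option[E]] =
    if (inputEncoders.isEmpty) {
      // No encoder information was supplied: use one empty slot per child.
      Seq.fill(childCount)(None)
    } else {
      // Otherwise enforce the invariant the review comment asks about.
      require(inputEncoders.length == childCount,
        s"expected $childCount encoders, got ${inputEncoders.length}")
      inputEncoders
    }
}
```

   With the list normalized once, code that maps over `inputEncoders` would always produce one result per child.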



[GitHub] [spark] SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601381367
 
 
   **[Test build #120061 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120061/testReport)** for PR 27937 at commit [`842d6fa`](https://github.com/apache/spark/commit/842d6fa7453d0cd34a41ebf2eb13c93c899ad83d).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600821069
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119997/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600158332
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24661/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600678287
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24717/
   Test PASSed.



[GitHub] [spark] Ngone51 commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
Ngone51 commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-603588619
 
 
   thanks!



[GitHub] [spark] AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600511531
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602324391
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602644676
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120188/
   Test PASSed.



[GitHub] [spark] cloud-fan commented on a change in pull request #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r394353443
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala
 ##########
 @@ -4420,7 +4422,8 @@ object functions {
   def udf[RT: TypeTag, A1: TypeTag, A2: TypeTag](f: Function2[A1, A2, RT]): UserDefinedFunction = {
     val ScalaReflection.Schema(dataType, nullable) = ScalaReflection.schemaFor[RT]
     val inputSchemas = Try(ScalaReflection.schemaFor(typeTag[A1])).toOption :: Try(ScalaReflection.schemaFor(typeTag[A2])).toOption :: Nil
-    val udf = SparkUserDefinedFunction(f, dataType, inputSchemas)
+    val inputEncoders: Seq[ExpressionEncoder[_]] = ExpressionEncoder[A1]() :: ExpressionEncoder[A2]() :: Nil
 
 Review comment:
   If the type is `Any`, we may fail to create an encoder, and we should catch that failure, as we already do for `ScalaReflection.schemaFor` above.
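
   A minimal sketch of the suggested pattern, assuming a hypothetical `encoderFor` factory in place of the real encoder constructor (it and `EncoderStub` are illustrative stand-ins, not Spark APIs). Each creation is wrapped in `Try(...).toOption`, so an unsupported type becomes `None` instead of an exception:

```scala
import scala.util.Try

// Hypothetical stand-in for an encoder; not Spark's ExpressionEncoder.
final case class EncoderStub(typeName: String)

object EncoderCreationSketch {
  // Hypothetical factory that throws for types it cannot handle, such as `Any`.
  def encoderFor(typeName: String): EncoderStub = typeName match {
    case "Int" | "String" | "Person" => EncoderStub(typeName)
    case other => throw new UnsupportedOperationException(s"No encoder for $other")
  }

  // Catch the failure per input so one unsupported type does not break UDF creation.
  def inputEncoders(typeNames: Seq[String]): Seq[Option[EncoderStub]] =
    typeNames.map(name => Try(encoderFor(name)).toOption)
}
```

   This is the same `Try(...).toOption` shape the quoted diff uses for `ScalaReflection.schemaFor`.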



[GitHub] [spark] cloud-fan commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r395492216
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala
 ##########
 @@ -48,25 +46,87 @@ case class ScalaUDF(
     function: AnyRef,
     dataType: DataType,
     children: Seq[Expression],
-    inputPrimitives: Seq[Boolean],
-    inputTypes: Seq[AbstractDataType] = Nil,
+    inputEncoders: Seq[Option[ExpressionEncoder[_]]] = Nil,
     udfName: Option[String] = None,
     nullable: Boolean = true,
     udfDeterministic: Boolean = true)
   extends Expression with NonSQLExpression with UserDefinedExpression {
 
   override lazy val deterministic: Boolean = udfDeterministic && children.forall(_.deterministic)
 
+  private lazy val resolvedEnc = mutable.HashMap[Int, ExpressionEncoder[_]]()
+
   override def toString: String = s"${udfName.getOrElse("UDF")}(${children.mkString(", ")})"
 
+  /**
+   * The analyzer should be aware of Scala primitive types so as to make the
+   * UDF return null if there is any null input value of these types. On the
+   * other hand, Java UDFs can only have boxed types, thus this parameter will
+   * always be all false.
+   */
+  def inputPrimitives: Seq[Boolean] = {
 
 Review comment:
   I think we need to return `children.map(_ => false)` if `inputEncoders` is empty.
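
   The suggestion could look roughly like the sketch below. It is an assumption-laden illustration rather than the PR's final code: `PrimitiveEncoderStub` is a hypothetical stand-in that carries only the two properties the quoted diff consults, and `childCount` replaces `children.length` so the snippet runs without Spark.

```scala
// Hypothetical stand-in for the two encoder properties read in the quoted diff.
final case class PrimitiveEncoderStub(isSerializedAsStruct: Boolean, firstFieldNullable: Boolean)

object InputPrimitivesSketch {
  def inputPrimitives(
      childCount: Int,
      inputEncoders: Seq[Option[PrimitiveEncoderStub]]): Seq[Boolean] =
    if (inputEncoders.isEmpty) {
      // No encoders at all: one `false` per child, mirroring children.map(_ => false).
      Seq.fill(childCount)(false)
    } else {
      inputEncoders.map {
        // A struct-encoded input is not primitive.
        case Some(enc) if enc.isSerializedAsStruct => false
        // Otherwise the input is primitive exactly when its single field is non-nullable.
        case Some(enc) => !enc.firstFieldNullable
        // No encoder for this input (e.g. `Any`): not primitive.
        case None => false
      }
    }
}
```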



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601056193
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602324391
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602408130
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120173/



[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602484444
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-602324393
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24886/



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600821062
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600157593
 
 
   **[Test build #119938 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119938/testReport)** for PR 27937 at commit [`2b186bd`](https://github.com/apache/spark/commit/2b186bdd46ad229dd337a0405595d09446884145).



[GitHub] [spark] AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600821069
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119997/



[GitHub] [spark] cloud-fan commented on a change in pull request #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r394352753
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala
 ##########
 @@ -94,6 +94,7 @@ private[spark] case class SparkUserDefinedFunction(
     f: AnyRef,
     dataType: DataType,
     inputSchemas: Seq[Option[ScalaReflection.Schema]],
+    inputEncoders: Seq[ExpressionEncoder[_]] = Nil,
 
 Review comment:
   Or is it possible to get the schema and nullability from the encoder itself?



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600678270
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600964708
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24736/



[GitHub] [spark] SparkQA commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600964037
 
 
   **[Test build #120018 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120018/testReport)** for PR 27937 at commit [`23ca098`](https://github.com/apache/spark/commit/23ca0988637571a4b1e210acbbee7c34c927d56b).



[GitHub] [spark] cloud-fan commented on a change in pull request #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#discussion_r394352525
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala
 ##########
 @@ -94,6 +94,7 @@ private[spark] case class SparkUserDefinedFunction(
     f: AnyRef,
     dataType: DataType,
     inputSchemas: Seq[Option[ScalaReflection.Schema]],
+    inputEncoders: Seq[ExpressionEncoder[_]] = Nil,
 
 Review comment:
   Shall we have a simple `inputInfos: Seq[Option[(ScalaReflection.Schema, ExpressionEncoder[_])]]` instead of two separate parameters?
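
For illustration, a minimal sketch of what pairing the two per-argument sequences into one could look like. `Schema`, `StubEncoder`, `InputInfosSketch` and `combine` are hypothetical stand-ins invented for this example; they are not the real `ScalaReflection.Schema` or `ExpressionEncoder` classes, and this is not the code from the PR.

```scala
// Hypothetical stand-ins for ScalaReflection.Schema and ExpressionEncoder;
// the real Spark classes carry far more than this.
final case class Schema(dataType: String, nullable: Boolean)
final case class StubEncoder(typeName: String)

object InputInfosSketch {
  // One entry per UDF argument; None means no type information was captured.
  type InputInfo = Option[(Schema, StubEncoder)]

  // Pair each argument's schema with its encoder. An argument whose schema
  // is None stays None in the combined sequence.
  def combine(
      schemas: Seq[Option[Schema]],
      encoders: Seq[StubEncoder]): Seq[InputInfo] = {
    require(schemas.length == encoders.length, "expected one encoder per argument")
    schemas.zip(encoders).map { case (schema, enc) => schema.map(s => (s, enc)) }
  }

  def main(args: Array[String]): Unit = {
    val infos = combine(
      Seq(Some(Schema("struct<name:string,age:int>", nullable = true)), None),
      Seq(StubEncoder("Person"), StubEncoder("Any")))
    infos.foreach(println)
  }
}
```

With a single sequence, the schema and the encoder of an argument cannot drift out of sync, which is presumably the appeal of the suggestion; whether it fits the real `SparkUserDefinedFunction` signature is left to the PR discussion.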



[GitHub] [spark] SparkQA commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
SparkQA commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600820013
 
 
   **[Test build #119997 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119997/testReport)** for PR 27937 at commit [`867ad06`](https://github.com/apache/spark/commit/867ad06f8b7068b8cd24970fc45d8fd8dc12428d).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600158325
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601021743
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120018/



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-601231277
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on issue #27937: [WIP][SPARK-30127][SQL] Support case class parameter for typed Scala UDF
URL: https://github.com/apache/spark/pull/27937#issuecomment-600964708
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24736/
