You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by mgaido91 <gi...@git.apache.org> on 2017/12/05 16:22:44 UTC

[GitHub] spark pull request #19900: [SPARK-22695][SQL] ScalaUDF should not use global...

GitHub user mgaido91 opened a pull request:

    https://github.com/apache/spark/pull/19900

    [SPARK-22695][SQL] ScalaUDF should not use global variables

    ## What changes were proposed in this pull request?
    
    ScalaUDF is using global variables which are not needed. This can generate some unneeded entries in the constant pool.
    
    The PR replaces the unneeded global variables with local variables.
    
    ## How was this patch tested?
    
    added UT


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mgaido91/spark SPARK-22695

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19900.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19900
    
----

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19900: [SPARK-22695][SQL] ScalaUDF should not use global variab...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19900
  
    **[Test build #84559 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84559/testReport)** for PR 19900 at commit [`f188d55`](https://github.com/apache/spark/commit/f188d55083c20f85583929c04bf916ef494b744a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19900: [SPARK-22695][SQL] ScalaUDF should not use global variab...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19900
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19900: [SPARK-22695][SQL] ScalaUDF should not use global...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19900


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19900: [SPARK-22695][SQL] ScalaUDF should not use global...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19900#discussion_r155219578
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala ---
    @@ -982,35 +982,30 @@ case class ScalaUDF(
     
       // scalastyle:on line.size.limit
     
    -  // Generate codes used to convert the arguments to Scala type for user-defined functions
    -  private[this] def genCodeForConverter(ctx: CodegenContext, index: Int): String = {
    -    val converterClassName = classOf[Any => Any].getName
    -    val typeConvertersClassName = CatalystTypeConverters.getClass.getName + ".MODULE$"
    -    val expressionClassName = classOf[Expression].getName
    -    val scalaUDFClassName = classOf[ScalaUDF].getName
    +  private val converterClassName = classOf[Any => Any].getName
    +  private val expressionClassName = classOf[Expression].getName
    +  private val scalaUDFClassName = classOf[ScalaUDF].getName
    +  private val typeConvertersClassName = CatalystTypeConverters.getClass.getName + ".MODULE$"
     
    +  // Generate codes used to convert the arguments to Scala type for user-defined functions
    +  private[this] def genCodeForConverter(ctx: CodegenContext, index: Int): (String, String) = {
         val converterTerm = ctx.freshName("converter")
         val expressionIdx = ctx.references.size - 1
    -    ctx.addMutableState(converterClassName, converterTerm,
    -      s"$converterTerm = ($converterClassName)$typeConvertersClassName" +
    -        s".createToScalaConverter(((${expressionClassName})((($scalaUDFClassName)" +
    -          s"references[$expressionIdx]).getChildren().apply($index))).dataType());")
    -    converterTerm
    +    (converterTerm,
    +      s"$converterClassName $converterTerm = ($converterClassName)$typeConvertersClassName" +
    +        s".createToScalaConverter((($expressionClassName)((($scalaUDFClassName)" +
    +        s"references[$expressionIdx]).getChildren().apply($index))).dataType());")
       }
     
       override def doGenCode(
           ctx: CodegenContext,
           ev: ExprCode): ExprCode = {
    +    val thisClassName = this.getClass.getName
    +    val scalaUDF = ctx.freshName("scalaUDF")
    +    val scalaUDFRef = ctx.addReferenceMinorObj(this, thisClassName)
    --- End diff --
    
    `ctx.addReferenceMinorObj` has a default value for class name, which is `obj.getClass.getNane`, so the `thisClassName` is redundant.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19900: [SPARK-22695][SQL] ScalaUDF should not use global variab...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19900
  
    **[Test build #84555 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84555/testReport)** for PR 19900 at commit [`eef8036`](https://github.com/apache/spark/commit/eef803639ba350eaf6b7c930b06d58d4d7a0a1f9).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19900: [SPARK-22695][SQL] ScalaUDF should not use global variab...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19900
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19900: [SPARK-22695][SQL] ScalaUDF should not use global...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19900#discussion_r155239094
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala ---
    @@ -982,35 +982,29 @@ case class ScalaUDF(
     
       // scalastyle:on line.size.limit
     
    -  // Generate codes used to convert the arguments to Scala type for user-defined functions
    -  private[this] def genCodeForConverter(ctx: CodegenContext, index: Int): String = {
    -    val converterClassName = classOf[Any => Any].getName
    -    val typeConvertersClassName = CatalystTypeConverters.getClass.getName + ".MODULE$"
    -    val expressionClassName = classOf[Expression].getName
    -    val scalaUDFClassName = classOf[ScalaUDF].getName
    +  private val converterClassName = classOf[Any => Any].getName
    +  private val scalaUDFClassName = classOf[ScalaUDF].getName
    +  private val typeConvertersClassName = CatalystTypeConverters.getClass.getName + ".MODULE$"
     
    +  // Generate codes used to convert the arguments to Scala type for user-defined functions
    +  private[this] def genCodeForConverter(ctx: CodegenContext, index: Int): (String, String) = {
         val converterTerm = ctx.freshName("converter")
         val expressionIdx = ctx.references.size - 1
    -    ctx.addMutableState(converterClassName, converterTerm,
    -      s"$converterTerm = ($converterClassName)$typeConvertersClassName" +
    -        s".createToScalaConverter(((${expressionClassName})((($scalaUDFClassName)" +
    -          s"references[$expressionIdx]).getChildren().apply($index))).dataType());")
    -    converterTerm
    +    (converterTerm,
    +      s"$converterClassName $converterTerm = ($converterClassName)$typeConvertersClassName" +
    +        s".createToScalaConverter(((Expression)((($scalaUDFClassName)" +
    +        s"references[$expressionIdx]).getChildren().apply($index))).dataType());")
       }
     
       override def doGenCode(
           ctx: CodegenContext,
           ev: ExprCode): ExprCode = {
    +    val thisClassName = this.getClass.getName
    --- End diff --
    
    yes, thanks, nice catch! I am updating it. Thank you.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19900: [SPARK-22695][SQL] ScalaUDF should not use global variab...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19900
  
    **[Test build #84559 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84559/testReport)** for PR 19900 at commit [`f188d55`](https://github.com/apache/spark/commit/f188d55083c20f85583929c04bf916ef494b744a).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19900: [SPARK-22695][SQL] ScalaUDF should not use global variab...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19900
  
    thanks, merging to master!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19900: [SPARK-22695][SQL] ScalaUDF should not use global variab...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19900
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84555/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19900: [SPARK-22695][SQL] ScalaUDF should not use global...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19900#discussion_r155238814
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala ---
    @@ -982,35 +982,29 @@ case class ScalaUDF(
     
       // scalastyle:on line.size.limit
     
    -  // Generate codes used to convert the arguments to Scala type for user-defined functions
    -  private[this] def genCodeForConverter(ctx: CodegenContext, index: Int): String = {
    -    val converterClassName = classOf[Any => Any].getName
    -    val typeConvertersClassName = CatalystTypeConverters.getClass.getName + ".MODULE$"
    -    val expressionClassName = classOf[Expression].getName
    -    val scalaUDFClassName = classOf[ScalaUDF].getName
    +  private val converterClassName = classOf[Any => Any].getName
    +  private val scalaUDFClassName = classOf[ScalaUDF].getName
    +  private val typeConvertersClassName = CatalystTypeConverters.getClass.getName + ".MODULE$"
     
    +  // Generate codes used to convert the arguments to Scala type for user-defined functions
    +  private[this] def genCodeForConverter(ctx: CodegenContext, index: Int): (String, String) = {
         val converterTerm = ctx.freshName("converter")
         val expressionIdx = ctx.references.size - 1
    -    ctx.addMutableState(converterClassName, converterTerm,
    -      s"$converterTerm = ($converterClassName)$typeConvertersClassName" +
    -        s".createToScalaConverter(((${expressionClassName})((($scalaUDFClassName)" +
    -          s"references[$expressionIdx]).getChildren().apply($index))).dataType());")
    -    converterTerm
    +    (converterTerm,
    +      s"$converterClassName $converterTerm = ($converterClassName)$typeConvertersClassName" +
    +        s".createToScalaConverter(((Expression)((($scalaUDFClassName)" +
    +        s"references[$expressionIdx]).getChildren().apply($index))).dataType());")
       }
     
       override def doGenCode(
           ctx: CodegenContext,
           ev: ExprCode): ExprCode = {
    +    val thisClassName = this.getClass.getName
    +    val scalaUDF = ctx.freshName("scalaUDF")
    +    val scalaUDFRef = ctx.addReferenceMinorObj(this, thisClassName)
     
    -    val scalaUDF = ctx.addReferenceObj("scalaUDF", this)
    --- End diff --
    
    yes, sure, thanks. I would be happy to work on it.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19900: [SPARK-22695][SQL] ScalaUDF should not use global variab...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19900
  
    LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19900: [SPARK-22695][SQL] ScalaUDF should not use global...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19900#discussion_r155237198
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala ---
    @@ -982,35 +982,29 @@ case class ScalaUDF(
     
       // scalastyle:on line.size.limit
     
    -  // Generate codes used to convert the arguments to Scala type for user-defined functions
    -  private[this] def genCodeForConverter(ctx: CodegenContext, index: Int): String = {
    -    val converterClassName = classOf[Any => Any].getName
    -    val typeConvertersClassName = CatalystTypeConverters.getClass.getName + ".MODULE$"
    -    val expressionClassName = classOf[Expression].getName
    -    val scalaUDFClassName = classOf[ScalaUDF].getName
    +  private val converterClassName = classOf[Any => Any].getName
    +  private val scalaUDFClassName = classOf[ScalaUDF].getName
    +  private val typeConvertersClassName = CatalystTypeConverters.getClass.getName + ".MODULE$"
     
    +  // Generate codes used to convert the arguments to Scala type for user-defined functions
    +  private[this] def genCodeForConverter(ctx: CodegenContext, index: Int): (String, String) = {
         val converterTerm = ctx.freshName("converter")
         val expressionIdx = ctx.references.size - 1
    -    ctx.addMutableState(converterClassName, converterTerm,
    -      s"$converterTerm = ($converterClassName)$typeConvertersClassName" +
    -        s".createToScalaConverter(((${expressionClassName})((($scalaUDFClassName)" +
    -          s"references[$expressionIdx]).getChildren().apply($index))).dataType());")
    -    converterTerm
    +    (converterTerm,
    +      s"$converterClassName $converterTerm = ($converterClassName)$typeConvertersClassName" +
    +        s".createToScalaConverter(((Expression)((($scalaUDFClassName)" +
    +        s"references[$expressionIdx]).getChildren().apply($index))).dataType());")
       }
     
       override def doGenCode(
           ctx: CodegenContext,
           ev: ExprCode): ExprCode = {
    +    val thisClassName = this.getClass.getName
    --- End diff --
    
    isn't it just `scalaUDFClassName`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19900: [SPARK-22695][SQL] ScalaUDF should not use global variab...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19900
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84559/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19900: [SPARK-22695][SQL] ScalaUDF should not use global...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19900#discussion_r155220121
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala ---
    @@ -982,35 +982,30 @@ case class ScalaUDF(
     
       // scalastyle:on line.size.limit
     
    -  // Generate codes used to convert the arguments to Scala type for user-defined functions
    -  private[this] def genCodeForConverter(ctx: CodegenContext, index: Int): String = {
    -    val converterClassName = classOf[Any => Any].getName
    -    val typeConvertersClassName = CatalystTypeConverters.getClass.getName + ".MODULE$"
    -    val expressionClassName = classOf[Expression].getName
    -    val scalaUDFClassName = classOf[ScalaUDF].getName
    +  private val converterClassName = classOf[Any => Any].getName
    +  private val expressionClassName = classOf[Expression].getName
    +  private val scalaUDFClassName = classOf[ScalaUDF].getName
    +  private val typeConvertersClassName = CatalystTypeConverters.getClass.getName + ".MODULE$"
     
    +  // Generate codes used to convert the arguments to Scala type for user-defined functions
    +  private[this] def genCodeForConverter(ctx: CodegenContext, index: Int): (String, String) = {
         val converterTerm = ctx.freshName("converter")
         val expressionIdx = ctx.references.size - 1
    -    ctx.addMutableState(converterClassName, converterTerm,
    -      s"$converterTerm = ($converterClassName)$typeConvertersClassName" +
    -        s".createToScalaConverter(((${expressionClassName})((($scalaUDFClassName)" +
    -          s"references[$expressionIdx]).getChildren().apply($index))).dataType());")
    -    converterTerm
    +    (converterTerm,
    +      s"$converterClassName $converterTerm = ($converterClassName)$typeConvertersClassName" +
    +        s".createToScalaConverter((($expressionClassName)((($scalaUDFClassName)" +
    +        s"references[$expressionIdx]).getChildren().apply($index))).dataType());")
       }
     
       override def doGenCode(
           ctx: CodegenContext,
           ev: ExprCode): ExprCode = {
    +    val thisClassName = this.getClass.getName
    +    val scalaUDF = ctx.freshName("scalaUDF")
    +    val scalaUDFRef = ctx.addReferenceMinorObj(this, thisClassName)
    --- End diff --
    
    oh i see, it's used later.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19900: [SPARK-22695][SQL] ScalaUDF should not use global variab...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19900
  
    **[Test build #84491 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84491/testReport)** for PR 19900 at commit [`4eaca3e`](https://github.com/apache/spark/commit/4eaca3e4176eb8107f0e9f23d2d4116cd9cba04b).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19900: [SPARK-22695][SQL] ScalaUDF should not use global...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19900#discussion_r155221538
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDFSuite.scala ---
    @@ -47,4 +48,10 @@ class ScalaUDFSuite extends SparkFunSuite with ExpressionEvalHelper {
         assert(e2.getMessage.contains("Failed to execute user defined function"))
       }
     
    +  test("SPARK-22695: ScalaUDF should not use global variables") {
    +    val ctx = new CodegenContext
    +    ScalaUDF((s: String) => s + "x", StringType, Literal("a") :: Nil).genCode(ctx)
    +    // we have one variable (globalIsNull) introduced by reduceCodeSize
    --- End diff --
    
    yes


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19900: [SPARK-22695][SQL] ScalaUDF should not use global variab...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19900
  
    **[Test build #84491 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84491/testReport)** for PR 19900 at commit [`4eaca3e`](https://github.com/apache/spark/commit/4eaca3e4176eb8107f0e9f23d2d4116cd9cba04b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19900: [SPARK-22695][SQL] ScalaUDF should not use global...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19900#discussion_r155220823
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDFSuite.scala ---
    @@ -47,4 +48,10 @@ class ScalaUDFSuite extends SparkFunSuite with ExpressionEvalHelper {
         assert(e2.getMessage.contains("Failed to execute user defined function"))
       }
     
    +  test("SPARK-22695: ScalaUDF should not use global variables") {
    +    val ctx = new CodegenContext
    +    ScalaUDF((s: String) => s + "x", StringType, Literal("a") :: Nil).genCode(ctx)
    +    // we have one variable (globalIsNull) introduced by reduceCodeSize
    --- End diff --
    
    wow this simple UDF will trigger the code splitting logic in `reduceCodeSize`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19900: [SPARK-22695][SQL] ScalaUDF should not use global...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19900#discussion_r155238140
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala ---
    @@ -982,35 +982,29 @@ case class ScalaUDF(
     
       // scalastyle:on line.size.limit
     
    -  // Generate codes used to convert the arguments to Scala type for user-defined functions
    -  private[this] def genCodeForConverter(ctx: CodegenContext, index: Int): String = {
    -    val converterClassName = classOf[Any => Any].getName
    -    val typeConvertersClassName = CatalystTypeConverters.getClass.getName + ".MODULE$"
    -    val expressionClassName = classOf[Expression].getName
    -    val scalaUDFClassName = classOf[ScalaUDF].getName
    +  private val converterClassName = classOf[Any => Any].getName
    +  private val scalaUDFClassName = classOf[ScalaUDF].getName
    +  private val typeConvertersClassName = CatalystTypeConverters.getClass.getName + ".MODULE$"
     
    +  // Generate codes used to convert the arguments to Scala type for user-defined functions
    +  private[this] def genCodeForConverter(ctx: CodegenContext, index: Int): (String, String) = {
         val converterTerm = ctx.freshName("converter")
         val expressionIdx = ctx.references.size - 1
    -    ctx.addMutableState(converterClassName, converterTerm,
    -      s"$converterTerm = ($converterClassName)$typeConvertersClassName" +
    -        s".createToScalaConverter(((${expressionClassName})((($scalaUDFClassName)" +
    -          s"references[$expressionIdx]).getChildren().apply($index))).dataType());")
    -    converterTerm
    +    (converterTerm,
    +      s"$converterClassName $converterTerm = ($converterClassName)$typeConvertersClassName" +
    +        s".createToScalaConverter(((Expression)((($scalaUDFClassName)" +
    +        s"references[$expressionIdx]).getChildren().apply($index))).dataType());")
       }
     
       override def doGenCode(
           ctx: CodegenContext,
           ev: ExprCode): ExprCode = {
    +    val thisClassName = this.getClass.getName
    +    val scalaUDF = ctx.freshName("scalaUDF")
    +    val scalaUDFRef = ctx.addReferenceMinorObj(this, thisClassName)
     
    -    val scalaUDF = ctx.addReferenceObj("scalaUDF", this)
    --- End diff --
    
    We may need to revisit all the usage of `ctx.addReferenceObj`, I created https://issues.apache.org/jira/browse/SPARK-22716 for it. @mgaido91 do you have interests?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19900: [SPARK-22695][SQL] ScalaUDF should not use global variab...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19900
  
    **[Test build #84555 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84555/testReport)** for PR 19900 at commit [`eef8036`](https://github.com/apache/spark/commit/eef803639ba350eaf6b7c930b06d58d4d7a0a1f9).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19900: [SPARK-22695][SQL] ScalaUDF should not use global...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19900#discussion_r155220498
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala ---
    @@ -982,35 +982,30 @@ case class ScalaUDF(
     
       // scalastyle:on line.size.limit
     
    -  // Generate codes used to convert the arguments to Scala type for user-defined functions
    -  private[this] def genCodeForConverter(ctx: CodegenContext, index: Int): String = {
    -    val converterClassName = classOf[Any => Any].getName
    -    val typeConvertersClassName = CatalystTypeConverters.getClass.getName + ".MODULE$"
    -    val expressionClassName = classOf[Expression].getName
    -    val scalaUDFClassName = classOf[ScalaUDF].getName
    +  private val converterClassName = classOf[Any => Any].getName
    +  private val expressionClassName = classOf[Expression].getName
    +  private val scalaUDFClassName = classOf[ScalaUDF].getName
    +  private val typeConvertersClassName = CatalystTypeConverters.getClass.getName + ".MODULE$"
     
    +  // Generate codes used to convert the arguments to Scala type for user-defined functions
    +  private[this] def genCodeForConverter(ctx: CodegenContext, index: Int): (String, String) = {
         val converterTerm = ctx.freshName("converter")
         val expressionIdx = ctx.references.size - 1
    -    ctx.addMutableState(converterClassName, converterTerm,
    -      s"$converterTerm = ($converterClassName)$typeConvertersClassName" +
    -        s".createToScalaConverter(((${expressionClassName})((($scalaUDFClassName)" +
    -          s"references[$expressionIdx]).getChildren().apply($index))).dataType());")
    -    converterTerm
    +    (converterTerm,
    +      s"$converterClassName $converterTerm = ($converterClassName)$typeConvertersClassName" +
    +        s".createToScalaConverter((($expressionClassName)((($scalaUDFClassName)" +
    +        s"references[$expressionIdx]).getChildren().apply($index))).dataType());")
       }
     
       override def doGenCode(
           ctx: CodegenContext,
           ev: ExprCode): ExprCode = {
    +    val thisClassName = this.getClass.getName
    +    val scalaUDF = ctx.freshName("scalaUDF")
    +    val scalaUDFRef = ctx.addReferenceMinorObj(this, thisClassName)
    --- End diff --
    
    I am using `thisClassName` also later (line 1045), that is why I passed it, despite it is not needed. What is your suggestion? Just not passing it as a parameter or getting rid of the `thisClassName ` variable itself? Thanks,


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19900: [SPARK-22695][SQL] ScalaUDF should not use global...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19900#discussion_r155222185
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala ---
    @@ -982,35 +982,30 @@ case class ScalaUDF(
     
       // scalastyle:on line.size.limit
     
    -  // Generate codes used to convert the arguments to Scala type for user-defined functions
    -  private[this] def genCodeForConverter(ctx: CodegenContext, index: Int): String = {
    -    val converterClassName = classOf[Any => Any].getName
    -    val typeConvertersClassName = CatalystTypeConverters.getClass.getName + ".MODULE$"
    -    val expressionClassName = classOf[Expression].getName
    -    val scalaUDFClassName = classOf[ScalaUDF].getName
    +  private val converterClassName = classOf[Any => Any].getName
    +  private val expressionClassName = classOf[Expression].getName
    +  private val scalaUDFClassName = classOf[ScalaUDF].getName
    +  private val typeConvertersClassName = CatalystTypeConverters.getClass.getName + ".MODULE$"
     
    +  // Generate codes used to convert the arguments to Scala type for user-defined functions
    +  private[this] def genCodeForConverter(ctx: CodegenContext, index: Int): (String, String) = {
         val converterTerm = ctx.freshName("converter")
         val expressionIdx = ctx.references.size - 1
    -    ctx.addMutableState(converterClassName, converterTerm,
    -      s"$converterTerm = ($converterClassName)$typeConvertersClassName" +
    -        s".createToScalaConverter(((${expressionClassName})((($scalaUDFClassName)" +
    -          s"references[$expressionIdx]).getChildren().apply($index))).dataType());")
    -    converterTerm
    +    (converterTerm,
    +      s"$converterClassName $converterTerm = ($converterClassName)$typeConvertersClassName" +
    +        s".createToScalaConverter((($expressionClassName)((($scalaUDFClassName)" +
    +        s"references[$expressionIdx]).getChildren().apply($index))).dataType());")
       }
     
       override def doGenCode(
           ctx: CodegenContext,
           ev: ExprCode): ExprCode = {
    +    val thisClassName = this.getClass.getName
    +    val scalaUDF = ctx.freshName("scalaUDF")
    +    val scalaUDFRef = ctx.addReferenceMinorObj(this, thisClassName)
    --- End diff --
    
    ok we can keep it.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19900: [SPARK-22695][SQL] ScalaUDF should not use global variab...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19900
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84491/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19900: [SPARK-22695][SQL] ScalaUDF should not use global...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19900#discussion_r155218848
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala ---
    @@ -982,35 +982,30 @@ case class ScalaUDF(
     
       // scalastyle:on line.size.limit
     
    -  // Generate codes used to convert the arguments to Scala type for user-defined functions
    -  private[this] def genCodeForConverter(ctx: CodegenContext, index: Int): String = {
    -    val converterClassName = classOf[Any => Any].getName
    -    val typeConvertersClassName = CatalystTypeConverters.getClass.getName + ".MODULE$"
    -    val expressionClassName = classOf[Expression].getName
    -    val scalaUDFClassName = classOf[ScalaUDF].getName
    +  private val converterClassName = classOf[Any => Any].getName
    +  private val expressionClassName = classOf[Expression].getName
    --- End diff --
    
    `Expression` is pre-imported, we can just write `Expression` in the generated code.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19900: [SPARK-22695][SQL] ScalaUDF should not use global variab...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19900
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19900: [SPARK-22695][SQL] ScalaUDF should not use global variab...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19900
  
    @cloud-fan @kiszk @viirya may you please review this? Thanks


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org