You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by mgaido91 <gi...@git.apache.org> on 2017/12/10 13:27:50 UTC

[GitHub] spark pull request #19940: [SPARK-22750][SQL] Reuse mutable states when poss...

GitHub user mgaido91 opened a pull request:

    https://github.com/apache/spark/pull/19940

    [SPARK-22750][SQL] Reuse mutable states when possible

    ## What changes were proposed in this pull request?
    
    The PR introduces a new method `addSingleMutableState ` to `CodeGenerator` to allow reusing and sharing the same global variable between different Expressions. This helps reducing the number of global variables needed, which is important to limit the impact on the constant pool.
    
    ## How was this patch tested?
    
    added UTs


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mgaido91/spark SPARK-22750

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19940.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19940
    
----
commit 978bfd6a490cfef2c0f2da1190830ef6a64287af
Author: Marco Gaido <ma...@gmail.com>
Date:   2017-12-10T11:23:10Z

    [SPARK-22750][SQL] Reuse mutable states when possible

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #85202 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85202/testReport)** for PR 19940 at commit [`c650196`](https://github.com/apache/spark/commit/c650196a4c0c4391d3112678803bd601af9aa5fb).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    @cloud-fan @kiszk @viirya might you please help reviewing this? Thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #84714 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84714/testReport)** for PR 19940 at commit [`ce76015`](https://github.com/apache/spark/commit/ce76015c9ae2f5a0d04941af6bcd8e6b99168c00).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85212/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19940: [SPARK-22750][SQL] Reuse mutable states when poss...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19940#discussion_r158324941
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala ---
    @@ -1193,7 +1196,8 @@ case class ToUTCTimestamp(left: Expression, right: Expression)
             val dtu = DateTimeUtils.getClass.getName.stripSuffix("$")
             val tzTerm = ctx.addMutableState(tzClass, "tz",
               v => s"""$v = $dtu.getTimeZone("$tz");""")
    -        val utcTerm = ctx.addMutableState(tzClass, "utc",
    +        val utcTerm = "tzUTC"
    +        ctx.addImmutableStateIfNotExists(tzClass, utcTerm,
               v => s"""$v = $dtu.getTimeZone("UTC");""")
    --- End diff --
    
    unrelated question: in the codebase sometimes we use UTC sometimes we use GMT, is it corrected?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19940: [SPARK-22750][SQL] Reuse mutable states when poss...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19940#discussion_r158336014
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala ---
    @@ -1193,7 +1196,8 @@ case class ToUTCTimestamp(left: Expression, right: Expression)
             val dtu = DateTimeUtils.getClass.getName.stripSuffix("$")
             val tzTerm = ctx.addMutableState(tzClass, "tz",
               v => s"""$v = $dtu.getTimeZone("$tz");""")
    -        val utcTerm = ctx.addMutableState(tzClass, "utc",
    +        val utcTerm = "tzUTC"
    +        ctx.addImmutableStateIfNotExists(tzClass, utcTerm,
               v => s"""$v = $dtu.getTimeZone("UTC");""")
    --- End diff --
    
    yes, there is no difference between them in practice. But I think that being consistent would be better for readability


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #85252 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85252/testReport)** for PR 19940 at commit [`6143b22`](https://github.com/apache/spark/commit/6143b221ffe6387231dbdacc9a4aebb997f8e815).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #84714 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84714/testReport)** for PR 19940 at commit [`ce76015`](https://github.com/apache/spark/commit/ce76015c9ae2f5a0d04941af6bcd8e6b99168c00).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    1. Yes, to ensure that there are no problems, variables' value should not be changed. I agree that we can find a better API name, to enforce this: what about `addImmutableStateIfNotExists`? Or do you have any suggestion?
    2. Beacause as @viirya pointed out [here](https://github.com/apache/spark/pull/19940#issuecomment-350724795), they are not initialized in the declaration, but in the `init` method of the generated class, thus it cannot be made final unfortunately.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Jenkins, retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    the test errors are unrelated to this change. Any other comments @cloud-fan @kiszk @viirya ?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #85273 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85273/testReport)** for PR 19940 at commit [`7397cf3`](https://github.com/apache/spark/commit/7397cf3957add3e0ce335f904f4a20b62060b566).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    @cloud-fan @kiszk @viirya  any more comments on this?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #84695 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84695/testReport)** for PR 19940 at commit [`978bfd6`](https://github.com/apache/spark/commit/978bfd6a490cfef2c0f2da1190830ef6a64287af).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84956/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #84972 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84972/testReport)** for PR 19940 at commit [`e2725d2`](https://github.com/apache/spark/commit/e2725d2ac8c7c66fb99332414b25f549de42b40f).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19940: [SPARK-22750][SQL] Reuse mutable states when poss...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19940#discussion_r156051417
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
    @@ -184,6 +190,27 @@ class CodegenContext {
         mutableStates += ((javaType, variableName, initCode))
       }
     
    +  /**
    +   * Add a mutable state as a field to the generated class only if it does not exist yet a field
    +   * with that name. This helps reducing the number of the generated class' fields, since the same
    +   * variable can be reused by many functions.
    +   *
    +   * Internally, this method calls `addMutableState`.
    +   *
    +   * @param javaType Java type of the field.
    +   * @param variableName Name of the field.
    +   * @param initCode The statement(s) to put into the init() method to initialize this field.
    +   */
    +  def addSingleMutableState(
    +      javaType: String,
    +      variableName: String,
    +      initCode: String = ""): Unit = {
    --- End diff --
    
    How can we support different initCode for the same name?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19940: [SPARK-22750][SQL] Reuse mutable states when poss...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19940#discussion_r157168911
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala ---
    @@ -401,4 +401,16 @@ class CodeGenerationSuite extends SparkFunSuite with ExpressionEvalHelper {
         ctx.addReferenceObj("foo", foo)
         assert(ctx.mutableStates.isEmpty)
       }
    +
    +  test("SPARK-22750: addSingleMutableState") {
    --- End diff --
    
    yes, thanks, good catch!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85202/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    I don't think so, it would be very risky, since `splitExpressions` might put the initialization outside the constructor.
    What about 1? Any thought?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    the test failure is not related to the PR. It looks the same R failure we had some days ago, I thought it was solved:
    ```
    checking CRAN incoming feasibility ...Error in .check_package_CRAN_incoming(pkgdir) : 
      dims [product 22] do not match the length of object [0]
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19940: [SPARK-22750][SQL] Reuse mutable states when poss...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19940#discussion_r157157565
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
    @@ -170,6 +170,14 @@ class CodegenContext {
       val mutableStates: mutable.ArrayBuffer[(String, String, String)] =
         mutable.ArrayBuffer.empty[(String, String, String)]
     
    +  /**
    +   * A map containing the mutable states which have been defined so far using
    +   * `addSingleMutableState`. Each entry contains the name of the mutable state as key and its
    --- End diff --
    
    Is `addSingleMutableState` old one?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #85261 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85261/testReport)** for PR 19940 at commit [`6143b22`](https://github.com/apache/spark/commit/6143b221ffe6387231dbdacc9a4aebb997f8e815).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85273/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #84956 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84956/testReport)** for PR 19940 at commit [`e2725d2`](https://github.com/apache/spark/commit/e2725d2ac8c7c66fb99332414b25f549de42b40f).
     * This patch **fails SparkR unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #84846 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84846/testReport)** for PR 19940 at commit [`90f517c`](https://github.com/apache/spark/commit/90f517c38364eda0e905ea0f23b413806307ec10).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    I have one question. We are implementing #19811 to compact mutable states. Even when it will be merged, does this PR can reduce large number of constant pool entries?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84775/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Jenkins, retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #84972 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84972/testReport)** for PR 19940 at commit [`e2725d2`](https://github.com/apache/spark/commit/e2725d2ac8c7c66fb99332414b25f549de42b40f).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85261/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    @viirya we have seen that using arrays affects performance. Thus if we can reduce their usage it is better.
    
    I don't think that debugging is harder. These variables I made shared are never assigned, but in the initialization. Do you have an other opinion? Or are you thinking for something specific?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #85273 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85273/testReport)** for PR 19940 at commit [`7397cf3`](https://github.com/apache/spark/commit/7397cf3957add3e0ce335f904f4a20b62060b566).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    @kiszk for instance it can remove one entry for every timestamp function (to_timestamp or from_utc_timestamp). Of course #19811 is the most important PR, because it solves the problem. But I think we all agree that if we can avoid to waste global variables it is better and there have been many PRs to avoid the usage of global variables. This is one of the many.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19940: [SPARK-22750][SQL] Reuse mutable states when poss...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19940#discussion_r156073714
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
    @@ -184,6 +190,27 @@ class CodegenContext {
         mutableStates += ((javaType, variableName, initCode))
       }
     
    +  /**
    +   * Add a mutable state as a field to the generated class only if it does not exist yet a field
    +   * with that name. This helps reducing the number of the generated class' fields, since the same
    +   * variable can be reused by many functions.
    +   *
    +   * Internally, this method calls `addMutableState`.
    +   *
    +   * @param javaType Java type of the field.
    +   * @param variableName Name of the field.
    +   * @param initCode The statement(s) to put into the init() method to initialize this field.
    +   */
    +  def addSingleMutableState(
    +      javaType: String,
    +      variableName: String,
    +      initCode: String = ""): Unit = {
    +    if (!singleMutableStates.contains(variableName)) {
    +      addMutableState(javaType, variableName, initCode)
    --- End diff --
    
    Please also check if the java type is the same. If one expression uses the same name with different type, we should alert it early.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    @viirya this is the requirement I followed in this change which ensures that is it safe to share the variable across all the operators, since all the access are read only and there cannot be influences. Maybe this might be relaxed in the future, but if we follow this requirement, we are sure that this is safe.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19940: [SPARK-22750][SQL] Reuse mutable states when poss...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19940#discussion_r157157638
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala ---
    @@ -401,4 +401,16 @@ class CodeGenerationSuite extends SparkFunSuite with ExpressionEvalHelper {
         ctx.addReferenceObj("foo", foo)
         assert(ctx.mutableStates.isEmpty)
       }
    +
    +  test("SPARK-22750: addSingleMutableState") {
    --- End diff --
    
    Is `addSingleMutableState` old one?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #84846 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84846/testReport)** for PR 19940 at commit [`90f517c`](https://github.com/apache/spark/commit/90f517c38364eda0e905ea0f23b413806307ec10).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #85252 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85252/testReport)** for PR 19940 at commit [`6143b22`](https://github.com/apache/spark/commit/6143b221ffe6387231dbdacc9a4aebb997f8e815).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #85212 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85212/testReport)** for PR 19940 at commit [`6143b22`](https://github.com/apache/spark/commit/6143b221ffe6387231dbdacc9a4aebb997f8e815).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84846/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    thanks, merging to master!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Jenkins, retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    A high level question is, do we need to share mutable status, if we can compact global variables into array later?
    
    Will sharing mutable status increase the difficulty of debugging codegen in the future?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #84775 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84775/testReport)** for PR 19940 at commit [`ce76015`](https://github.com/apache/spark/commit/ce76015c9ae2f5a0d04941af6bcd8e6b99168c00).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85252/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19940: [SPARK-22750][SQL] Reuse mutable states when poss...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19940


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #85261 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85261/testReport)** for PR 19940 at commit [`6143b22`](https://github.com/apache/spark/commit/6143b221ffe6387231dbdacc9a4aebb997f8e815).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Got it. It seems to be . I have two questions.
    
    1. Is this mutable? It looks it is global, but immutable. IIUC, is there any other better API name?
    2. As @viirya pointed out [here](https://github.com/apache/spark/pull/19940#issuecomment-350724536), why don't we declare `final`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19940: [SPARK-22750][SQL] Reuse mutable states when poss...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19940#discussion_r156057818
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
    @@ -184,6 +190,27 @@ class CodegenContext {
         mutableStates += ((javaType, variableName, initCode))
       }
     
    +  /**
    +   * Add a mutable state as a field to the generated class only if it does not exist yet a field
    +   * with that name. This helps reducing the number of the generated class' fields, since the same
    +   * variable can be reused by many functions.
    +   *
    +   * Internally, this method calls `addMutableState`.
    +   *
    +   * @param javaType Java type of the field.
    +   * @param variableName Name of the field.
    +   * @param initCode The statement(s) to put into the init() method to initialize this field.
    +   */
    +  def addSingleMutableState(
    +      javaType: String,
    +      variableName: String,
    +      initCode: String = ""): Unit = {
    +    if (!singleMutableStates.contains(variableName)) {
    +      addMutableState(javaType, variableName, initCode)
    --- End diff --
    
    if you want, I can add it.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19940: [SPARK-22750][SQL] Reuse mutable states when poss...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19940#discussion_r156054632
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
    @@ -184,6 +190,27 @@ class CodegenContext {
         mutableStates += ((javaType, variableName, initCode))
       }
     
    +  /**
    +   * Add a mutable state as a field to the generated class only if it does not exist yet a field
    +   * with that name. This helps reducing the number of the generated class' fields, since the same
    +   * variable can be reused by many functions.
    +   *
    +   * Internally, this method calls `addMutableState`.
    +   *
    +   * @param javaType Java type of the field.
    +   * @param variableName Name of the field.
    +   * @param initCode The statement(s) to put into the init() method to initialize this field.
    +   */
    +  def addSingleMutableState(
    +      javaType: String,
    +      variableName: String,
    +      initCode: String = ""): Unit = {
    +    if (!singleMutableStates.contains(variableName)) {
    +      addMutableState(javaType, variableName, initCode)
    --- End diff --
    
    shall we add an assert here to make sure `initCode` is same with the previous one?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19940: [SPARK-22750][SQL] Reuse mutable states when poss...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19940#discussion_r158323966
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
    @@ -195,6 +195,14 @@ class CodegenContext {
     
       }
     
    +  /**
    +   * A map containing the mutable states which have been defined so far using
    +   * `addImmutableStateIfNotExists`. Each entry contains the name of the mutable state as key and
    +   * its Java type and init code as value.
    +   */
    +  val singleMutableStates: mutable.Map[String, (String, String)] =
    --- End diff --
    
    immutableStates?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84972/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #85212 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85212/testReport)** for PR 19940 at commit [`6143b22`](https://github.com/apache/spark/commit/6143b221ffe6387231dbdacc9a4aebb997f8e815).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Jenkins, retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #84775 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84775/testReport)** for PR 19940 at commit [`ce76015`](https://github.com/apache/spark/commit/ce76015c9ae2f5a0d04941af6bcd8e6b99168c00).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    @mgaido91  Do you mean the shared global variables are required to be only assigned once (initialization) and never changed again?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #84956 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84956/testReport)** for PR 19940 at commit [`e2725d2`](https://github.com/apache/spark/commit/e2725d2ac8c7c66fb99332414b25f549de42b40f).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Oh, the initialization is not right away in declaration.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84714/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #84695 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84695/testReport)** for PR 19940 at commit [`978bfd6`](https://github.com/apache/spark/commit/978bfd6a490cfef2c0f2da1190830ef6a64287af).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Shall we make them as `final` variables to guarantee this? I think this is an important requirement to prevent something wrong when wrongly using the shared global variables.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    For 2., I noticed there are two types of initialization. One is in `init` method by using `addPartitionInitializationStatement`. The other is in constructor that includes an initialization statement in `addSingleMutableState`. Can we declare `final` for latter cases?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19940: [SPARK-22750][SQL] Reuse mutable states when poss...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19940#discussion_r156052035
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
    @@ -184,6 +190,27 @@ class CodegenContext {
         mutableStates += ((javaType, variableName, initCode))
       }
     
    +  /**
    +   * Add a mutable state as a field to the generated class only if it does not exist yet a field
    +   * with that name. This helps reducing the number of the generated class' fields, since the same
    +   * variable can be reused by many functions.
    +   *
    +   * Internally, this method calls `addMutableState`.
    +   *
    +   * @param javaType Java type of the field.
    +   * @param variableName Name of the field.
    +   * @param initCode The statement(s) to put into the init() method to initialize this field.
    +   */
    +  def addSingleMutableState(
    +      javaType: String,
    +      variableName: String,
    +      initCode: String = ""): Unit = {
    --- End diff --
    
    It is not supported.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    **[Test build #85202 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85202/testReport)** for PR 19940 at commit [`c650196`](https://github.com/apache/spark/commit/c650196a4c0c4391d3112678803bd601af9aa5fb).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19940: [SPARK-22750][SQL] Reuse mutable states when possible

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19940
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84695/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org