You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@spark.apache.org by kiszk <gi...@git.apache.org> on 2017/11/12 19:06:17 UTC

[GitHub] spark pull request #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit pr...

GitHub user kiszk opened a pull request:

    https://github.com/apache/spark/pull/19730

    [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem with cast

    ## What changes were proposed in this pull request?
    
    This PR changes `cast` code generation to place generated code for expression for fields of a structure into separated methods if these size could be large.
    
    ## How was this patch tested?
    
    Added new test cases into `CastSuite`


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kiszk/spark SPARK-22500

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19730.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19730
    
----
commit 90136717ca01a7fe426963fffabe3516d506382d
Author: Kazuaki Ishizaki <is...@jp.ibm.com>
Date:   2017-11-12T19:02:53Z

    initial commit

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83750/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit pr...

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19730#discussion_r150584342
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -1039,13 +1039,19 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String
               }
             }
            """
    -    }.mkString("\n")
    +    }
    +    val fieldsEvalCodes = if (ctx.INPUT_ROW != null && ctx.currentVars == null) {
    --- End diff --
    
    shouldn't be `ctx.currentVars != null`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit pr...

Posted by kiszk <gi...@git.apache.org>.

Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19730#discussion_r150609735
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -1039,13 +1039,19 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String
               }
             }
            """
    -    }.mkString("\n")
    +    }
    +    val fieldsEvalCodes = if (ctx.INPUT_ROW != null && ctx.currentVars == null) {
    --- End diff --
    
    If [`ctx.currentVars != null`](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala#L768), we need to use `mkString("\n")`.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    **[Test build #84064 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84064/testReport)** for PR 19730 at commit [`c46ef5c`](https://github.com/apache/spark/commit/c46ef5c90dd986d22c69df94507c490252b06ce5).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit pr...

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19730#discussion_r150419888
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -1015,7 +1015,9 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String
         }
         val rowClass = classOf[GenericInternalRow].getName
         val result = ctx.freshName("result")
    +    ctx.addMutableState(s"$rowClass", result, "")
    --- End diff --
    
    I think that since we not assigning it, we might keep it as a local variable and pass it to the generated methods. In this way we can avoid to introduce new global variables. Have you tried that?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by kiszk <gi...@git.apache.org>.

Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    Working for this. I met another problem. [This code](https://github.com/kiszk/spark/blob/83fef403b92a96a13421901d161a0df5e6a6d7b3/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala#L855) creates a lot of global variable.  
    Since `UTF8String.IntWrapper` can be reused, I am writing the following code.
    ```
          val wrapper = "intWrapper"
          if (!ctx.mutableStates.exists(s => s._1 == wrapper)) {
            ctx.addMutableState("UTF8String.IntWrapper", wrapper,
              s"$wrapper = new UTF8String.IntWrapper();")
          }
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    **[Test build #83790 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83790/testReport)** for PR 19730 at commit [`94dfde2`](https://github.com/apache/spark/commit/94dfde25a395cfb1f221ab2018030b12c7011069).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    **[Test build #83970 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83970/testReport)** for PR 19730 at commit [`83fef40`](https://github.com/apache/spark/commit/83fef403b92a96a13421901d161a0df5e6a6d7b3).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    **[Test build #84079 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84079/testReport)** for PR 19730 at commit [`574691f`](https://github.com/apache/spark/commit/574691f3fa048129591fb4bafe3bdb07ebf5517e).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit pr...

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19730


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by gatorsmile <gi...@git.apache.org>.

Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    Yeah. We can fix them in this PR. 
    
    BTW, could you check all the other calls of `addMutableState` and fix them too? 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit pr...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19730#discussion_r151725673
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -1039,13 +1039,19 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String
               }
             }
            """
    -    }.mkString("\n")
    +    }
    +    val fieldsEvalCodes = if (ctx.INPUT_ROW != null && ctx.currentVars == null) {
    +      ctx.splitExpressions(fieldsEvalCode, "castStruct",
    +        ("InternalRow", ctx.INPUT_ROW) :: (rowClass, result) :: ("InternalRow", tmpRow) :: Nil)
    --- End diff --
    
    I mean, we don't need to pass in `ctx.INPUT_ROW` to the split functions.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit pr...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19730#discussion_r151477316
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -1039,13 +1039,19 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String
               }
             }
            """
    -    }.mkString("\n")
    +    }
    +    val fieldsEvalCodes = if (ctx.INPUT_ROW != null && ctx.currentVars == null) {
    +      ctx.splitExpressions(fieldsEvalCode, "castStruct",
    +        ("InternalRow", ctx.INPUT_ROW) :: (rowClass, result) :: ("InternalRow", tmpRow) :: Nil)
    --- End diff --
    
    how about inner struct? We also need to pass in the `ctx.INPUT_ROW`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit pr...

Posted by mgaido91 <gi...@git.apache.org>.

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19730#discussion_r150419908
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -1015,7 +1015,9 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String
         }
         val rowClass = classOf[GenericInternalRow].getName
         val result = ctx.freshName("result")
    +    ctx.addMutableState(s"$rowClass", result, "")
         val tmpRow = ctx.freshName("tmpRow")
    +    ctx.addMutableState("InternalRow", tmpRow, "")
    --- End diff --
    
    ditto


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    **[Test build #84079 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84079/testReport)** for PR 19730 at commit [`574691f`](https://github.com/apache/spark/commit/574691f3fa048129591fb4bafe3bdb07ebf5517e).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    I think this is a different issue and should be fixed with another PR. @kiszk how about we change the test to cast int to long to avoid this issue?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit pr...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19730#discussion_r151723565
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala ---
    @@ -827,4 +827,34 @@ class CastSuite extends SparkFunSuite with ExpressionEvalHelper {
     
         checkEvaluation(cast(Literal.create(input, from), to), input)
       }
    +
    +  test("SPARK-22500: cast for struct should not generate codes beyond 64KB") {
    +    val N = 1000
    +
    +    val from1 = new StructType(
    +      (1 to N).map(i => StructField(s"s$i", StringType)).toArray)
    +    val to1 = new StructType(
    +      (1 to N).map(i => StructField(s"i$i", IntegerType)).toArray)
    +    val input1 = Row.fromSeq((1 to N).map(i => i.toString))
    +    val output1 = Row.fromSeq((1 to N))
    +    checkEvaluation(cast(Literal.create(input1, from1), to1), output1)
    +
    +    val from2 = new StructType(
    +      (1 to N).map(i => StructField(s"a$i", ArrayType(StringType, containsNull = false))).toArray)
    --- End diff --
    
    or just test this case.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    thanks, merging to master!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit pr...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19730#discussion_r151476367
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -1039,13 +1039,19 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String
               }
             }
            """
    -    }.mkString("\n")
    +    }
    +    val fieldsEvalCodes = if (ctx.INPUT_ROW != null && ctx.currentVars == null) {
    +      ctx.splitExpressions(fieldsEvalCode, "castStruct",
    +        ("InternalRow", ctx.INPUT_ROW) :: (rowClass, result) :: ("InternalRow", tmpRow) :: Nil)
    --- End diff --
    
    what about inner struct? I think we should use `evPrim` here instead of `ctx.INPUT_ROW`.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84079/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit pr...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19730#discussion_r152222498
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala ---
    @@ -827,4 +827,49 @@ class CastSuite extends SparkFunSuite with ExpressionEvalHelper {
     
         checkEvaluation(cast(Literal.create(input, from), to), input)
       }
    +
    +  test("SPARK-22500: cast for struct should not generate codes beyond 64KB") {
    +    val N = 1000
    +    val M = 250
    +
    +    val from1 = new StructType(
    +      (1 to N).map(i => StructField(s"s$i", StringType)).toArray)
    +    val to1 = new StructType(
    +      (1 to N).map(i => StructField(s"i$i", IntegerType)).toArray)
    +    val input1 = Row.fromSeq((1 to N).map(i => i.toString))
    +    val output1 = Row.fromSeq((1 to N))
    +    checkEvaluation(cast(Literal.create(input1, from1), to1), output1)
    +
    +    val from2 = new StructType(
    +      (1 to N).map(i => StructField(s"a$i", ArrayType(StringType, containsNull = false))).toArray)
    +    val to2 = new StructType(
    +      (1 to N).map(i => StructField(s"i$i", ArrayType(IntegerType, containsNull = true))).toArray)
    +    val input2 = Row.fromSeq((1 to N).map(_ => Seq("456", "true", "78.9")))
    +    val output2 = Row.fromSeq((1 to N).map(_ => Seq(456, null, 78)))
    +    checkEvaluation(cast(Literal.create(input2, from2), to2), output2)
    +
    +    val from3 = new StructType(
    +      (1 to N).map(i => StructField(s"s$i",
    +        StructType(Seq(StructField("l$i", IntegerType, nullable = true))))).toArray)
    +    val to3 = new StructType(
    +      (1 to N).map(i => StructField(s"s$i",
    +        StructType(Seq(StructField("l$i", LongType, nullable = true))))).toArray)
    +    val input3 = Row.fromSeq((1 to N).map(i => Row(i)))
    +    val output3 = Row.fromSeq((1 to N).map(i => Row(i.toLong)))
    +    checkEvaluation(cast(Literal.create(input3, from3), to3), output3)
    +
    +    val fromInner = new StructType(
    +      (1 to M).map(i => StructField(s"s$i", DoubleType)).toArray)
    +    val toInner = new StructType(
    +      (1 to M).map(i => StructField(s"i$i", IntegerType)).toArray)
    +    val inputInner = Row.fromSeq((1 to M).map(i => i + 0.5))
    +    val outputInner = Row.fromSeq((1 to M))
    +    val fromOuter = new StructType(
    +      (1 to M).map(i => StructField(s"s$i", fromInner)).toArray)
    +    val toOuter = new StructType(
    +      (1 to M).map(i => StructField(s"s$i", toInner)).toArray)
    +    val inputOuter = Row.fromSeq((1 to M).map(_ => inputInner))
    +    val outputOuter = Row.fromSeq((1 to M).map(_ => outputInner))
    +    checkEvaluation(cast(Literal.create(inputOuter, fromOuter), toOuter), outputOuter)
    --- End diff --
    
    I think this case is good enough to cover all the above cases?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    ping @kiszk 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    **[Test build #83790 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83790/testReport)** for PR 19730 at commit [`94dfde2`](https://github.com/apache/spark/commit/94dfde25a395cfb1f221ab2018030b12c7011069).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit pr...

Posted by kiszk <gi...@git.apache.org>.

Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19730#discussion_r151509423
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -1039,13 +1039,19 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String
               }
             }
            """
    -    }.mkString("\n")
    +    }
    +    val fieldsEvalCodes = if (ctx.INPUT_ROW != null && ctx.currentVars == null) {
    +      ctx.splitExpressions(fieldsEvalCode, "castStruct",
    +        ("InternalRow", ctx.INPUT_ROW) :: (rowClass, result) :: ("InternalRow", tmpRow) :: Nil)
    --- End diff --
    
    Inner struct case in the following code works well.
    ```
        val struct = Literal.create(
          Row(
            UTF8String.fromString("123.4"),
            Seq("456", "true", "78.9"),
            Row(7)),
          StructType(Seq(
            StructField("i", StringType, nullable = true),
            StructField("a",
              ArrayType(StringType, containsNull = false), nullable = true),
            StructField("s",
              StructType(Seq(
                StructField("i", IntegerType, nullable = true)))))))
    
        val ret = cast(struct, StructType(Seq(
          StructField("d", DoubleType, nullable = true),
          StructField("a",
            ArrayType(IntegerType, containsNull = true), nullable = true),
          StructField("s",
            StructType(Seq(
              StructField("l", LongType, nullable = true)))))))
    
        assert(ret.resolved === true)
        checkEvaluation(ret, Row(123.4, Seq(456, null, 78), Row(7L)))
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    **[Test build #83792 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83792/testReport)** for PR 19730 at commit [`39ffaa9`](https://github.com/apache/spark/commit/39ffaa9e59131dcd955d280c26434d931a61c834).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    **[Test build #83750 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83750/testReport)** for PR 19730 at commit [`9013671`](https://github.com/apache/spark/commit/90136717ca01a7fe426963fffabe3516d506382d).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    **[Test build #84064 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84064/testReport)** for PR 19730 at commit [`c46ef5c`](https://github.com/apache/spark/commit/c46ef5c90dd986d22c69df94507c490252b06ce5).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    **[Test build #83792 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83792/testReport)** for PR 19730 at commit [`39ffaa9`](https://github.com/apache/spark/commit/39ffaa9e59131dcd955d280c26434d931a61c834).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83792/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    **[Test build #83970 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83970/testReport)** for PR 19730 at commit [`83fef40`](https://github.com/apache/spark/commit/83fef403b92a96a13421901d161a0df5e6a6d7b3).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84064/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    **[Test build #83750 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83750/testReport)** for PR 19730 at commit [`9013671`](https://github.com/apache/spark/commit/90136717ca01a7fe426963fffabe3516d506382d).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83970/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit pr...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19730#discussion_r151723430
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala ---
    @@ -827,4 +827,34 @@ class CastSuite extends SparkFunSuite with ExpressionEvalHelper {
     
         checkEvaluation(cast(Literal.create(input, from), to), input)
       }
    +
    +  test("SPARK-22500: cast for struct should not generate codes beyond 64KB") {
    +    val N = 1000
    +
    +    val from1 = new StructType(
    +      (1 to N).map(i => StructField(s"s$i", StringType)).toArray)
    +    val to1 = new StructType(
    +      (1 to N).map(i => StructField(s"i$i", IntegerType)).toArray)
    +    val input1 = Row.fromSeq((1 to N).map(i => i.toString))
    +    val output1 = Row.fromSeq((1 to N))
    +    checkEvaluation(cast(Literal.create(input1, from1), to1), output1)
    +
    +    val from2 = new StructType(
    +      (1 to N).map(i => StructField(s"a$i", ArrayType(StringType, containsNull = false))).toArray)
    --- End diff --
    
    I'd expect something like
    ```
    val from2 = new StructType(
      (1 to N).map(i => StructField(s"s$i", from1)).toArray)
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by kiszk <gi...@git.apache.org>.

Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    @cloud-fan I see, I will create another PR to fix this global variable issue.
    @gatorsmile I will check other calls.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19730: [SPARK-22500][SQL] Fix 64KB JVM bytecode limit problem w...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19730
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83790/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org