You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Barry Becker (JIRA)" <ji...@apache.org> on 2016/11/14 19:32:58 UTC

[jira] [Commented] (SPARK-8443) GenerateMutableProjection Exceeds JVM Code Size Limits

    [ https://issues.apache.org/jira/browse/SPARK-8443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15664779#comment-15664779 ] 

Barry Becker commented on SPARK-8443:
-------------------------------------

I see the same error in spark 1.6.3. Is there a workaround? Should this issue be reopened? I am working with a dataset with more than 200 columns. Before 1.6.3 I could not even try this case because of SPARK-16664.
{code}
Caused by: org.codehaus.janino.JaninoRuntimeException: Code of method "(Lorg/apache/spark/sql/catalyst/InternalRow;Lorg/apache/spark/sql/catalyst/InternalRow;)I" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB
        at org.codehaus.janino.CodeContext.makeSpace(CodeContext.java:941)
        at org.codehaus.janino.CodeContext.write(CodeContext.java:836)
{code}

> GenerateMutableProjection Exceeds JVM Code Size Limits
> ------------------------------------------------------
>
>                 Key: SPARK-8443
>                 URL: https://issues.apache.org/jira/browse/SPARK-8443
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.0
>            Reporter: Sen Fang
>            Assignee: Sen Fang
>             Fix For: 1.5.0
>
>
> GenerateMutableProjection put all expressions columns into a single apply function. When there are a lot of columns, the apply function code size exceeds the 64kb limit, which is a hard limit on jvm that cannot change.
> This comes up when we were aggregating about 100 columns using codegen and unsafe feature.
> I wrote an unit test that reproduces this issue. 
> https://github.com/saurfang/spark/blob/codegen_size_limit/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala
> This test currently fails at 2048 expressions. It seems the master is more tolerant than branch-1.4 about this because code is more concise.
> While the code on master has changed since branch-1.4, I am able to reproduce the problem in master. For now I hacked my way in branch-1.4 to workaround this problem by wrapping each expression with a separate function then call those functions sequentially in apply. The proper way is probably check the length of the projectCode and break it up as necessary. (This seems to be easier in master actually since we are building code by string rather than quasiquote)
> Let me know if anyone has additional thoughts on this, I'm happy to contribute a pull request.
> Attaching stack trace produced by unit test
> {code}
> [info] - code size limit *** FAILED *** (7 seconds, 103 milliseconds)
> [info]   com.google.common.util.concurrent.UncheckedExecutionException: org.codehaus.janino.JaninoRuntimeException: Code of method "(Ljava/lang/Object;)Ljava/lang/Object;" of class "SC$SpecificProjection" grows beyond 64 KB
> [info]   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2263)
> [info]   at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
> [info]   at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
> [info]   at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
> [info]   at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:285)
> [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(CodeGenerationSuite.scala:50)
> [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(CodeGenerationSuite.scala:48)
> [info]   at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:144)
> [info]   at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:144)
> [info]   at scala.collection.immutable.Range.foreach(Range.scala:141)
> [info]   at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:144)
> [info]   at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:105)
> [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2.apply$mcV$sp(CodeGenerationSuite.scala:47)
> [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2.apply(CodeGenerationSuite.scala:47)
> [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2.apply(CodeGenerationSuite.scala:47)
> [info]   at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
> [info]   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
> [info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
> [info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
> [info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
> [info]   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
> [info]   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:42)
> [info]   at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
> [info]   at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
> [info]   at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
> [info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
> [info]   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
> [info]   at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
> [info]   at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
> [info]   at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
> [info]   at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
> [info]   at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
> [info]   at scala.collection.immutable.List.foreach(List.scala:318)
> [info]   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
> [info]   at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396)
> [info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483)
> [info]   at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208)
> [info]   at org.scalatest.FunSuite.runTests(FunSuite.scala:1555)
> [info]   at org.scalatest.Suite$class.run(Suite.scala:1424)
> [info]   at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555)
> [info]   at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
> [info]   at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
> [info]   at org.scalatest.SuperEngine.runImpl(Engine.scala:545)
> [info]   at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212)
> [info]   at org.scalatest.FunSuite.run(FunSuite.scala:1555)
> [info]   at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462)
> [info]   at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:671)
> [info]   at sbt.ForkMain$Run$2.call(ForkMain.java:294)
> [info]   at sbt.ForkMain$Run$2.call(ForkMain.java:284)
> [info]   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [info]   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [info]   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [info]   at java.lang.Thread.run(Thread.java:745)
> [info]   Cause: org.codehaus.janino.JaninoRuntimeException: Code of method "(Ljava/lang/Object;)Ljava/lang/Object;" of class "SC$SpecificProjection" grows beyond 64 KB
> [info]   at org.codehaus.janino.CodeContext.makeSpace(CodeContext.java:941)
> [info]   at org.codehaus.janino.CodeContext.write(CodeContext.java:874)
> [info]   at org.codehaus.janino.CodeContext.writeBranch(CodeContext.java:965)
> [info]   at org.codehaus.janino.UnitCompiler.writeBranch(UnitCompiler.java:10261)
> [info]   at org.codehaus.janino.UnitCompiler.compileBoolean2(UnitCompiler.java:2862)
> [info]   at org.codehaus.janino.UnitCompiler.access$4800(UnitCompiler.java:185)
> [info]   at org.codehaus.janino.UnitCompiler$8.visitAmbiguousName(UnitCompiler.java:2832)
> [info]   at org.codehaus.janino.Java$AmbiguousName.accept(Java.java:3138)
> [info]   at org.codehaus.janino.UnitCompiler.compileBoolean(UnitCompiler.java:2842)
> [info]   at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1741)
> [info]   at org.codehaus.janino.UnitCompiler.access$1200(UnitCompiler.java:185)
> [info]   at org.codehaus.janino.UnitCompiler$4.visitIfStatement(UnitCompiler.java:937)
> [info]   at org.codehaus.janino.Java$IfStatement.accept(Java.java:2157)
> [info]   at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:958)
> [info]   at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1007)
> [info]   at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:2293)
> [info]   at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:822)
> [info]   at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:794)
> [info]   at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:507)
> [info]   at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:658)
> [info]   at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:662)
> [info]   at org.codehaus.janino.UnitCompiler.access$600(UnitCompiler.java:185)
> [info]   at org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:350)
> [info]   at org.codehaus.janino.Java$MemberClassDeclaration.accept(Java.java:1035)
> [info]   at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:354)
> [info]   at org.codehaus.janino.UnitCompiler.compileDeclaredMemberTypes(UnitCompiler.java:769)
> [info]   at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:532)
> [info]   at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:393)
> [info]   at org.codehaus.janino.UnitCompiler.access$400(UnitCompiler.java:185)
> [info]   at org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:347)
> [info]   at org.codehaus.janino.Java$PackageMemberClassDeclaration.accept(Java.java:1139)
> [info]   at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:354)
> [info]   at org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:322)
> [info]   at org.codehaus.janino.SimpleCompiler.compileToClassLoader(SimpleCompiler.java:383)
> [info]   at org.codehaus.janino.ClassBodyEvaluator.compileToClass(ClassBodyEvaluator.java:315)
> [info]   at org.codehaus.janino.ClassBodyEvaluator.cook(ClassBodyEvaluator.java:233)
> [info]   at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:192)
> [info]   at org.codehaus.commons.compiler.Cookable.cook(Cookable.java:84)
> [info]   at org.codehaus.commons.compiler.Cookable.cook(Cookable.java:77)
> [info]   at org.codehaus.janino.ClassBodyEvaluator.<init>(ClassBodyEvaluator.java:72)
> [info]   at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.compile(CodeGenerator.scala:245)
> [info]   at org.apache.spark.sql.catalyst.expressions.codegen.GenerateMutableProjection$.create(GenerateMutableProjection.scala:87)
> [info]   at org.apache.spark.sql.catalyst.expressions.codegen.GenerateMutableProjection$.create(GenerateMutableProjection.scala:29)
> [info]   at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:272)
> [info]   at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
> [info]   at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
> [info]   at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
> [info]   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2257)
> [info]   at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
> [info]   at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
> [info]   at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
> [info]   at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:285)
> [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(CodeGenerationSuite.scala:50)
> [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(CodeGenerationSuite.scala:48)
> [info]   at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:144)
> [info]   at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:144)
> [info]   at scala.collection.immutable.Range.foreach(Range.scala:141)
> [info]   at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:144)
> [info]   at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:105)
> [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2.apply$mcV$sp(CodeGenerationSuite.scala:47)
> [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2.apply(CodeGenerationSuite.scala:47)
> [info]   at org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2.apply(CodeGenerationSuite.scala:47)
> [info]   at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
> [info]   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
> [info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
> [info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
> [info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
> [info]   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
> [info]   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:42)
> [info]   at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
> [info]   at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
> [info]   at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
> [info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
> [info]   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
> [info]   at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
> [info]   at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
> [info]   at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
> [info]   at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
> [info]   at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
> [info]   at scala.collection.immutable.List.foreach(List.scala:318)
> [info]   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
> [info]   at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396)
> [info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483)
> [info]   at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208)
> [info]   at org.scalatest.FunSuite.runTests(FunSuite.scala:1555)
> [info]   at org.scalatest.Suite$class.run(Suite.scala:1424)
> [info]   at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555)
> [info]   at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
> [info]   at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
> [info]   at org.scalatest.SuperEngine.runImpl(Engine.scala:545)
> [info]   at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212)
> [info]   at org.scalatest.FunSuite.run(FunSuite.scala:1555)
> [info]   at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462)
> [info]   at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:671)
> [info]   at sbt.ForkMain$Run$2.call(ForkMain.java:294)
> [info]   at sbt.ForkMain$Run$2.call(ForkMain.java:284)
> [info]   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [info]   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [info]   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [info]   at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org