Posted to issues@spark.apache.org by "Aleksander Eskilson (JIRA)" <ji...@apache.org> on 2016/10/19 22:03:58 UTC
[jira] [Created] (SPARK-18016) Code Generation Fails When Encoding Large Object to Wide Dataset
Aleksander Eskilson created SPARK-18016:
-------------------------------------------
Summary: Code Generation Fails When Encoding Large Object to Wide Dataset
Key: SPARK-18016
URL: https://issues.apache.org/jira/browse/SPARK-18016
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.1.0
Reporter: Aleksander Eskilson
When attempting to encode collections of large Java objects to Datasets having very wide or deeply nested schemas, code generation can fail, yielding:
{code}
Caused by: org.codehaus.janino.JaninoRuntimeException: Constant pool for class org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection has grown past JVM limit of 0xFFFF
at org.codehaus.janino.util.ClassFile.addToConstantPool(ClassFile.java:499)
at org.codehaus.janino.util.ClassFile.addConstantNameAndTypeInfo(ClassFile.java:439)
at org.codehaus.janino.util.ClassFile.addConstantMethodrefInfo(ClassFile.java:358)
at org.codehaus.janino.UnitCompiler.writeConstantMethodrefInfo(UnitCompiler.java:11114)
at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:4547)
at org.codehaus.janino.UnitCompiler.access$7500(UnitCompiler.java:206)
at org.codehaus.janino.UnitCompiler$12.visitMethodInvocation(UnitCompiler.java:3774)
at org.codehaus.janino.UnitCompiler$12.visitMethodInvocation(UnitCompiler.java:3762)
at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:4328)
at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3762)
at org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:4933)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:3180)
at org.codehaus.janino.UnitCompiler.access$5000(UnitCompiler.java:206)
at org.codehaus.janino.UnitCompiler$9.visitMethodInvocation(UnitCompiler.java:3151)
at org.codehaus.janino.UnitCompiler$9.visitMethodInvocation(UnitCompiler.java:3139)
at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:4328)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:3139)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2112)
at org.codehaus.janino.UnitCompiler.access$1700(UnitCompiler.java:206)
at org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1377)
at org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1370)
at org.codehaus.janino.Java$ExpressionStatement.accept(Java.java:2558)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1370)
at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1450)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:2811)
at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1262)
at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1234)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:538)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:890)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:894)
at org.codehaus.janino.UnitCompiler.access$600(UnitCompiler.java:206)
at org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:377)
at org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:369)
at org.codehaus.janino.Java$MemberClassDeclaration.accept(Java.java:1128)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:369)
at org.codehaus.janino.UnitCompiler.compileDeclaredMemberTypes(UnitCompiler.java:1209)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:564)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:420)
at org.codehaus.janino.UnitCompiler.access$400(UnitCompiler.java:206)
at org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:374)
at org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:369)
at org.codehaus.janino.Java$AbstractPackageMemberClassDeclaration.accept(Java.java:1309)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:369)
at org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:345)
at org.codehaus.janino.SimpleCompiler.compileToClassLoader(SimpleCompiler.java:396)
at org.codehaus.janino.ClassBodyEvaluator.compileToClass(ClassBodyEvaluator.java:311)
at org.codehaus.janino.ClassBodyEvaluator.cook(ClassBodyEvaluator.java:229)
at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:196)
at org.codehaus.commons.compiler.Cookable.cook(Cookable.java:91)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:905)
... 35 more
{code}
During generation of the code for SpecificUnsafeProjection, all the mutable variables are declared up front. If there are too many, the generated class appears to exceed the JVM's constant pool limit of 0xFFFF (65,535) entries, as reported in the exception above.
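To give a rough sense of scale, here is a back-of-the-envelope sketch in Java. It assumes (my assumption, not something measured from Spark's generated code) that each uniquely named field that is declared and accessed in a class costs about three constant pool entries: a Utf8 entry for the name, a NameAndType entry, and a Fieldref entry. The class name, the baseline of 500 entries, and the field counts are all hypothetical.

```java
public class ConstantPoolEstimate {
    // Limit quoted in the Janino exception; treated here as a simple upper bound.
    public static final int CONSTANT_POOL_LIMIT = 0xFFFF;

    // Assumed cost per uniquely named field that is declared and accessed in
    // the same class: Utf8 (name) + NameAndType + Fieldref. The descriptor
    // Utf8 is shared by fields of the same type, so it is not counted here.
    public static final int ENTRIES_PER_FIELD = 3;

    // Estimated total entries for a class with the given number of fields,
    // plus a baseline for everything else (methods, class refs, literals).
    public static long estimateEntries(int fieldCount, int baselineEntries) {
        return (long) fieldCount * ENTRIES_PER_FIELD + baselineEntries;
    }

    // True when the estimate is larger than the limit.
    public static boolean exceedsLimit(int fieldCount, int baselineEntries) {
        return estimateEntries(fieldCount, baselineEntries) > CONSTANT_POOL_LIMIT;
    }

    public static void main(String[] args) {
        int baseline = 500; // hypothetical number of non-field entries
        for (int fields : new int[] {1_000, 10_000, 25_000}) {
            System.out.printf("%d fields -> ~%d entries, exceeds limit: %b%n",
                    fields, estimateEntries(fields, baseline), exceedsLimit(fields, baseline));
        }
    }
}
```

Under these assumptions, a class would run out of constant pool space somewhere in the low tens of thousands of mutable fields. The real threshold depends on what else the generated class references.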
This issue seems related to (but is not fixed by) SPARK-17702, which itself was about the size of individual methods growing beyond the 64 KB limit. SPARK-17702 was resolved by breaking extractions into smaller methods [1], but this issue looks to be about the sheer number of up-front declared variables [2].
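For illustration, the general idea of splitting generated code into smaller methods can be sketched as below. This is a simplified, hypothetical generator, not Spark's actual implementation; the names `SplitStatements`, `apply_N`, and the `Object row` parameter are made up for the example. Note that the helpers still live in the same class, so any class-level fields they touch still share one constant pool, which would be consistent with method splitting alone not resolving this issue.

```java
import java.util.Arrays;
import java.util.List;

public class SplitStatements {
    // Groups statements into helper methods of at most chunkSize statements
    // each and returns the generated source: the helper definitions followed
    // by a driver method that calls every helper in order.
    public static String split(List<String> statements, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        StringBuilder helpers = new StringBuilder();
        StringBuilder calls = new StringBuilder();
        int methodIndex = 0;
        for (int start = 0; start < statements.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, statements.size());
            String name = "apply_" + methodIndex++;
            helpers.append("private void ").append(name).append("(Object row) {\n");
            for (String stmt : statements.subList(start, end)) {
                helpers.append("  ").append(stmt).append("\n");
            }
            helpers.append("}\n");
            calls.append("  ").append(name).append("(row);\n");
        }
        return helpers + "public void apply(Object row) {\n" + calls + "}\n";
    }

    public static void main(String[] args) {
        // Three placeholder statements split into methods of at most two each.
        List<String> stmts = Arrays.asList("value0 = f0(row);", "value1 = f1(row);", "value2 = f2(row);");
        System.out.print(split(stmts, 2));
    }
}
```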
I've created a small project [3] that declares a list of "wide" and "nested" Bean objects and attempts to encode them to a Dataset. This code can trigger the failure on Spark 2.1.0-SNAPSHOT. I'll also attach the error log, which shows the generated code and the stack trace.
[1] - https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala#L383
[2] - https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala#L376
[3] - https://github.com/bdrillard/spark-codegen-error