Posted to issues@spark.apache.org by "Justin Miller (JIRA)" <ji...@apache.org> on 2016/10/14 14:18:20 UTC

[jira] [Issue Comment Deleted] (SPARK-17936) "CodeGenerator - failed to compile: org.codehaus.janino.JaninoRuntimeException: Code of" method Error

     [ https://issues.apache.org/jira/browse/SPARK-17936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Justin Miller updated SPARK-17936:
----------------------------------
    Comment: was deleted

(was: I did look through them and I don't think they're related. Note that the error is different, and it occurs while writing data, not while reading large amounts of data.)

> "CodeGenerator - failed to compile: org.codehaus.janino.JaninoRuntimeException: Code of" method Error
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-17936
>                 URL: https://issues.apache.org/jira/browse/SPARK-17936
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.0.1
>            Reporter: Justin Miller
>
> Greetings. I'm migrating a project from Spark 1.6.2 to 2.0.1. The project uses Spark Streaming to convert Thrift structs coming from Kafka into Parquet files stored in S3. The conversion works fine in 1.6.2, but in 2.0.1 it fails with what looks like a bug. Stack trace below.
> org.codehaus.janino.JaninoRuntimeException: Code of method "(Lorg/apache/spark/sql/catalyst/expressions/GeneratedClass;[Ljava/lang/Object;)V" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection" grows beyond 64 KB
> 	at org.codehaus.janino.CodeContext.makeSpace(CodeContext.java:941)
> 	at org.codehaus.janino.CodeContext.write(CodeContext.java:854)
> 	at org.codehaus.janino.UnitCompiler.writeShort(UnitCompiler.java:10242)
> 	at org.codehaus.janino.UnitCompiler.writeLdc(UnitCompiler.java:9058)
> Later in the same run, the executor also dies with an OutOfMemoryError:
> 07:35:30.191 ERROR o.a.s.u.SparkUncaughtExceptionHandler - Uncaught exception in thread Thread[Executor task launch worker-6,5,run-main-group-0]
> java.lang.OutOfMemoryError: Java heap space
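> For reference, here's a standalone sketch (mine, not from our job; the column count and output path are illustrative) that appears to trigger the same class of failure by projecting a very wide row, pushing the generated SpecificUnsafeProjection past the JVM's 64 KB method limit:
>     // Hypothetical repro, not our production job: build a very wide
>     // projection so the generated SpecificUnsafeProjection method
>     // grows beyond 64 KB. Column count and path are illustrative.
>     import org.apache.spark.sql.SparkSession
>     import org.apache.spark.sql.functions.col
>     val spark = SparkSession.builder()
>       .master("local[*]").appName("wide-row-repro").getOrCreate()
>     val width = 4000  // illustrative; large enough to blow the method-size limit
>     val wide = spark.range(10)
>       .select((1 to width).map(i => (col("id") + i).alias(s"c$i")): _*)
>     wide.write.mode("append").parquet("/tmp/wide-row-repro")
> (Whether our Thrift schema is wide enough to hit the limit the same way is an open question.)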
> I've seen similar issues reported, but those were always on the query/read side. I suspect this one happens at write time, since the error appears only after the batch interval (batchDuration) has elapsed. Here's the write snippet:
> import scala.util.{Failure, Success}
> import org.apache.spark.sql.SQLContext
> import org.apache.spark.sql.SaveMode.Append
> stream.
>       flatMap {
>         // Keep rows that deserialized cleanly; count and drop failures.
>         case Success(row) =>
>           thriftParseSuccess += 1
>           Some(row)
>         case Failure(ex) =>
>           thriftParseErrors += 1
>           logger.error("Error during deserialization: ", ex)
>           None
>       }.foreachRDD { rdd =>
>         // Convert each micro-batch to a DataFrame and append it to Parquet on S3.
>         val sqlContext = SQLContext.getOrCreate(rdd.context)
>         transformer(sqlContext.createDataFrame(rdd, converter.schema))
>           .coalesce(coalesceSize)
>           .write
>           .mode(Append)
>           .partitionBy(partitioning: _*)
>           .parquet(parquetPath)
>       }
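> One mitigation often suggested for 64 KB codegen errors is disabling whole-stage code generation (sketch below). I honestly don't know whether it applies here, since the failing class is a generated SpecificUnsafeProjection rather than a whole-stage plan:
>     // Hedged sketch: disable whole-stage codegen on the session used for
>     // the write. The key spark.sql.codegen.wholeStage exists in Spark 2.0;
>     // whether it avoids this particular projection failure is unverified.
>     val sqlContext = SQLContext.getOrCreate(rdd.context)
>     sqlContext.setConf("spark.sql.codegen.wholeStage", "false")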
> Please let me know if there's anything I can do to help diagnose this.
> Best,
> Justin



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org