
Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2017/10/09 14:42:00 UTC

[jira] [Issue Comment Deleted] (SPARK-22226) Code generation fails for dataframes with 10000 columns

     [ https://issues.apache.org/jira/browse/SPARK-22226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-22226:
------------------------------
    Comment: was deleted

(was: Would the resolution of the linked issue not resolve this? Because that work is already pretty far along, I don't know if it's useful to solve specific cases differently.)

> Code generation fails for dataframes with 10000 columns
> -------------------------------------------------------
>
>                 Key: SPARK-22226
>                 URL: https://issues.apache.org/jira/browse/SPARK-22226
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Marco Gaido
>
> Code generation for very wide datasets can fail because the constant pool limit is reached.
> This can have many causes. One of them is that we currently split the definitions of the generated methods among several {{NestedClass}} classes, but all of these methods are called from the main class. Since entries are added to the main class's constant pool for each method invocation, this limits the number of columns that can be handled and, for very wide datasets, leads to:
> {noformat}
> org.codehaus.janino.JaninoRuntimeException: Constant pool for class org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificMutableProjection has grown past JVM limit of 0xFFFF
> {noformat}
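
As a rough illustration of why calling many generated methods from a single class runs into this limit, here is a back-of-the-envelope sketch in Python. It is not Spark or Janino code. It assumes that each call to a distinctly named method costs the calling class three constant pool entries (a Methodref, a NameAndType, and a Utf8 entry for the method name) and that descriptors and class entries are shared; the real cost in generated code may well be higher. The function names and the `baseline_entries` parameter are hypothetical.

```python
# Back-of-the-envelope model of constant pool growth in a single JVM class.
# All names here are illustrative; none of this is Spark or Janino API.

# The class file stores constant_pool_count in 16 bits, and valid entries
# are indexed from 1 to constant_pool_count - 1.
JVM_CONSTANT_POOL_LIMIT = 0xFFFF

# Assumption: one Methodref + one NameAndType + one Utf8 (method name) per
# distinctly named method that the class invokes. Descriptors and the
# owning class entries are assumed to be shared, so they are ignored.
ENTRIES_PER_DISTINCT_CALL = 3


def entries_for_calls(num_methods: int, baseline_entries: int = 0) -> int:
    """Estimate the entries used when one class calls num_methods distinct methods."""
    return baseline_entries + num_methods * ENTRIES_PER_DISTINCT_CALL


def fits_in_constant_pool(num_methods: int, baseline_entries: int = 0) -> bool:
    """True if the estimated entry count stays below the 16-bit limit."""
    return entries_for_calls(num_methods, baseline_entries) < JVM_CONSTANT_POOL_LIMIT


def max_distinct_calls(baseline_entries: int = 0) -> int:
    """Largest number of distinct method calls that still fits under the model."""
    usable = JVM_CONSTANT_POOL_LIMIT - 1 - baseline_entries
    return max(usable // ENTRIES_PER_DISTINCT_CALL, 0)


if __name__ == "__main__":
    print(max_distinct_calls())
    print(fits_in_constant_pool(10_000))
    print(fits_in_constant_pool(30_000))
```

Under these assumptions the budget is consumed by the calling class regardless of where the methods are defined, which is why moving only the definitions into nested classes does not lift the limit.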


