Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2018/10/17 08:56:00 UTC

[jira] [Resolved] (SPARK-25749) Exception thrown while reading avro file with large schema

     [ https://issues.apache.org/jira/browse/SPARK-25749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-25749.
----------------------------------
    Resolution: Invalid

> Exception thrown while reading avro file with large schema
> ----------------------------------------------------------
>
>                 Key: SPARK-25749
>                 URL: https://issues.apache.org/jira/browse/SPARK-25749
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0, 2.3.1, 2.3.2
>            Reporter: Raj
>            Priority: Major
>         Attachments: EncoderExample.scala, MainCC.scala, build.sbt, exception
>
>
> Hi, we are migrating our jobs from Spark 2.2.0 to Spark 2.3.1. One of the jobs reads an Avro source that has a large nested schema. The job fails on Spark 2.3.1 (I have also tested on Spark 2.3.0 and Spark 2.3.2, and the job fails there as well). I am able to reproduce this with some sample data and a dummy case class. Please find attached:
> *Code*: EncoderExample.scala, MainCC.scala & build.sbt
> *Exception log*: exception
> PS:
> I am getting the exception {{java.lang.OutOfMemoryError: Java heap space}}. I have tried increasing the JVM heap size in Eclipse, but that does not help either.
> I have also tested the code on Spark 2.2.2, where it works fine. It seems this bug was introduced in Spark 2.3.0.
>  
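The reported pattern above can be sketched as follows. This is a hypothetical reproduction, not the reporter's attached EncoderExample.scala/MainCC.scala: the case classes, field layout, and input path are placeholders, and it assumes the external spark-avro package that Spark 2.3.x used (com.databricks:spark-avro).

```scala
// Hypothetical sketch of the reported failure mode: decoding an Avro source
// into a Dataset of a large, deeply nested case class on Spark 2.3.x.
import org.apache.spark.sql.SparkSession

// Stand-ins for a wide, deeply nested schema (the real schema is much larger).
case class Leaf(a: String, b: Long, c: Double)
case class Branch(l1: Leaf, l2: Leaf, l3: Leaf, l4: Leaf)
case class Root(b1: Branch, b2: Branch, b3: Branch, b4: Branch)

object AvroLargeSchemaRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("avro-large-schema-repro")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Spark 2.3.x reads Avro via the external spark-avro data source.
    val df = spark.read
      .format("com.databricks.spark.avro")
      .load("/path/to/large-schema.avro") // placeholder path

    // Resolving the encoder for the nested case class is where the reporter
    // observes java.lang.OutOfMemoryError: Java heap space on 2.3.x.
    val ds = df.as[Root]
    ds.show(5)

    spark.stop()
  }
}
```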



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org