Posted to issues@spark.apache.org by "Patrick Wendell (JIRA)" <ji...@apache.org> on 2014/05/16 13:06:09 UTC

[jira] [Updated] (SPARK-1851) Upgrade Avro dependency to 1.7.6 so Spark can read Avro files

     [ https://issues.apache.org/jira/browse/SPARK-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Wendell updated SPARK-1851:
-----------------------------------

    Assignee: Sandy Ryza

> Upgrade Avro dependency to 1.7.6 so Spark can read Avro files
> -------------------------------------------------------------
>
>                 Key: SPARK-1851
>                 URL: https://issues.apache.org/jira/browse/SPARK-1851
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>            Priority: Critical
>             Fix For: 1.0.0
>
>
> I tried to set up a basic example in which a Spark job reads an Avro container file using Avro specific records.  This results in a ClassNotFoundException: can't convert GenericData.Record to com.cloudera.sparkavro.User.
> The reason is:
> * When creating records, to decide whether to be specific or generic, Avro tries to load a class with the name specified in the schema.
> * Initially, executors just have the system jars (which include Avro), and load the app jars dynamically with a URLClassLoader that's set as the context classloader for the task threads.
> * Avro tries to load the generated classes with SpecificData.class.getClassLoader(), which sidesteps this URLClassLoader and goes up to the AppClassLoader.
> Avro 1.7.6 has a change (AVRO-987) that falls back to the thread's context classloader when loading the class through SpecificData.class.getClassLoader() fails.  I tested with Avro 1.7.6 and did not observe the problem.
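
The lookup order described above can be sketched with plain JDK classloading. This is only an illustration of the fallback idea, not Avro's actual code: the class name ClassLoaderFallback and the method loadClass are hypothetical, and a loader that cannot see application classes stands in for the loader that holds the system jars.

```java
// Hypothetical helper, not part of Avro or Spark: illustrates the
// "try the primary loader, then the thread's context classloader" order.
final class ClassLoaderFallback {

    private ClassLoaderFallback() {}

    /**
     * Tries to load the named class with the primary loader first. If that
     * fails, falls back to the current thread's context classloader.
     * Returns null when neither loader can find the class, which is the
     * point where a caller would have to settle for a generic record.
     */
    static Class<?> loadClass(String name, ClassLoader primary) {
        try {
            // initialize=false: only locate the class, do not run static init
            return Class.forName(name, false, primary);
        } catch (ClassNotFoundException primaryFailure) {
            ClassLoader context = Thread.currentThread().getContextClassLoader();
            if (context == null) {
                return null;
            }
            try {
                return Class.forName(name, false, context);
            } catch (ClassNotFoundException contextFailure) {
                return null;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for the loader that only sees system jars: the parent of
        // the system classloader cannot see application classes.
        ClassLoader restricted = ClassLoader.getSystemClassLoader().getParent();

        // Stand-in for the executor's URLClassLoader: the loader that
        // actually loaded this application class.
        Thread.currentThread()
              .setContextClassLoader(ClassLoaderFallback.class.getClassLoader());

        String appClassName = ClassLoaderFallback.class.getName();
        Class<?> found = loadClass(appClassName, restricted);
        System.out.println("resolved via fallback: " + (found != null));
    }
}
```

In this sketch the restricted loader plays the role of SpecificData.class.getClassLoader(), and the context classloader plays the role of the URLClassLoader that the executor sets on its task threads. Without the fallback branch, the application class would not be found and the caller would be left with the generic representation.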



--
This message was sent by Atlassian JIRA
(v6.2#6252)