Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2016/01/11 11:22:39 UTC

[jira] [Resolved] (SPARK-6589) SQLUserDefinedType failed in spark-shell

     [ https://issues.apache.org/jira/browse/SPARK-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-6589.
------------------------------
    Resolution: Not A Problem

I think this is, effectively, "not a problem" in the sense that this is just how the shell works. The shell necessarily puts the classes it compiles into its own classloader, which is a child of Spark's, so Spark's own code can't see those classes. To make this work you have to supply your classes to Spark itself, on its classpath, not only to the shell; defining the UDT purely in the shell basically won't work.
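A small sketch you can paste at the spark-shell prompt to see that hierarchy; the exact loader names will vary with the Spark version and deployment:

{code}
// Print the classloader parent chain for a class defined in this shell.
// REPL-defined classes live in the shell's own loader; the loaders above it
// are what Spark's internal Class.forName calls consult, and parent loaders
// never delegate *down* to the shell's loader.
def loaderChain(cl: ClassLoader): List[ClassLoader] =
  if (cl == null) Nil else cl :: loaderChain(cl.getParent)

class DefinedInShell
loaderChain(classOf[DefinedInShell].getClassLoader).foreach(println)
{code}

The practical upshot is the one above: the UDT class has to be visible to the loader that loaded Spark's own classes (for example by putting the jar on the driver classpath with {{--driver-class-path}} / {{spark.driver.extraClassPath}}, depending on how the shell is launched), not only to the shell's loader.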

> SQLUserDefinedType failed in spark-shell
> ----------------------------------------
>
>                 Key: SPARK-6589
>                 URL: https://issues.apache.org/jira/browse/SPARK-6589
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.2.0
>         Environment: CDH 5.3.2
>            Reporter: Benyi Wang
>
> {{DataType.fromJson}} will fail in spark-shell if the schema includes "udt". It works when run in an application.
> As a result, I cannot read a Parquet file that includes a UDT field. {{DataType.fromCaseClass}} does not support UDT.
> I can load the class, which shows that my UDT is on the classpath:
> {code}
> scala> Class.forName("com.bwang.MyTestUDT")
> res6: Class[_] = class com.bwang.MyTestUDT
> {code}
> But DataType fails:
> {code}
> scala> DataType.fromJson(json)
> java.lang.ClassNotFoundException: com.bwang.MyTestUDT
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>         at java.lang.Class.forName0(Native Method)
>         at java.lang.Class.forName(Class.java:190)
>         at org.apache.spark.sql.catalyst.types.DataType$.parseDataType(dataTypes.scala:77)
> {code}
> The reason is that {{DataType.fromJson}} tries to load {{udtClass}} using this code:
> {code}
>     case JSortedObject(
>         ("class", JString(udtClass)),
>         ("pyClass", _),
>         ("sqlType", _),
>         ("type", JString("udt"))) =>
>       Class.forName(udtClass).newInstance().asInstanceOf[UserDefinedType[_]]
>   }
> {code}
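> {{Class.forName(name)}} with one argument resolves against the loader of the calling class (here, whatever loaded {{DataType}}); the three-argument overload takes an explicit loader. A minimal sketch, assuming it is typed at the spark-shell prompt where {{this.getClass.getClassLoader}} is the REPL's {{TranslatingClassLoader}} (shown below), so the same lookup succeeds:
> {code}
> // The shell's own loader can see the UDT (it is the loader that served the
> // successful Class.forName("com.bwang.MyTestUDT") above), so passing it
> // explicitly works where the one-argument form fails.
> val replLoader = this.getClass.getClassLoader
> Class.forName("com.bwang.MyTestUDT", /* initialize = */ true, replLoader)
> {code}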
> Unfortunately, my UDT is loaded by {{SparkIMain$TranslatingClassLoader}}, while {{DataType}} is loaded by {{Launcher$AppClassLoader}}:
> {code}
> scala> DataType.getClass.getClassLoader
> res2: ClassLoader = sun.misc.Launcher$AppClassLoader@6876fb1b
> scala> this.getClass.getClassLoader
> res3: ClassLoader = org.apache.spark.repl.SparkIMain$TranslatingClassLoader@63d36b29
> {code}
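> The behaviour is easy to reproduce outside Spark. A minimal sketch with a throwaway child loader; {{extra.jar}} and {{com.example.OnlyInChild}} are placeholders for a jar only the child knows about:
> {code}
> import java.net.{URL, URLClassLoader}
>
> // Parent-first delegation: the child sees everything its parent can load,
> // but the parent never asks the child, so classes known only to the child
> // are invisible to code loaded by the parent.
> val parent = getClass.getClassLoader
> val child  = new URLClassLoader(Array(new URL("file:extra.jar")), parent)
>
> child.loadClass("java.lang.String")                      // ok, delegated to parent
> Class.forName("com.example.OnlyInChild", true, child)    // ok, found in extra.jar
> parent.loadClass("com.example.OnlyInChild")              // throws ClassNotFoundException
> {code}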



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org