Posted to dev@zeppelin.apache.org by "Ophir Cohen (JIRA)" <ji...@apache.org> on 2015/07/02 11:42:05 UTC
[jira] [Created] (ZEPPELIN-150) Registered UDFs does not work on
Spark jobs initiated from Zeppelin
Ophir Cohen created ZEPPELIN-150:
------------------------------------
Summary: Registered UDFs does not work on Spark jobs initiated from Zeppelin
Key: ZEPPELIN-150
URL: https://issues.apache.org/jira/browse/ZEPPELIN-150
Project: Zeppelin
Issue Type: Bug
Components: Interpreters
Affects Versions: 0.5.0
Environment: - Zeppelin 0.5.0
- Spark 1.3.1 on top of a YARN cluster
- Hadoop 2.4
Reporter: Ophir Cohen
When using a registered UDF from Zeppelin, we get _java.lang.ClassNotFoundException: org.apache.zeppelin.spark.ZeppelinContext_
(see the full exception below).
h5. Steps to reproduce:
1. Create and register the UDF:
{code}
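// A trivial UDF that always returns a constant
def getNum(): Int = {
  100
}
// Register it on the HiveContext (hc) under the SQL name "getNum"
hc.udf.register("getNum", getNum _)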
{code}
2. Use it in a query on an existing table:
{code}
%sql select getNum() from filteredNc limit 1
{code}
Failed.
3. Run the same query directly on the HiveContext:
{code}
hc.sql("select getNum() from filteredNc limit 1").collect
{code}
Failed.
h5. A few insights:
1. In the Spark shell it works as expected.
2. The bug occurs only with RDDs/tables that originate from an external source (Hive/S3 Parquet files). Creating a new DataFrame and registering it as a table works as expected.
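For contrast, the working case from insight 2 might look roughly like the sketch below. This is an illustration only, not the exact code that was run: the case class {{Rec}} and the table name {{localTable}} are hypothetical, {{sc}} is assumed to be the SparkContext that Zeppelin provides, and {{hc}} and {{getNum}} are the HiveContext and UDF from step 1.
{code}
// Hypothetical sketch: Rec and localTable are made-up names for illustration.
// Assumes sc (Zeppelin's SparkContext), hc (the HiveContext) and the
// getNum UDF registered in step 1 are already available in the notebook.
case class Rec(id: Int)

// Build a DataFrame from local data instead of Hive/S3 Parquet files
val localDf = hc.createDataFrame(sc.parallelize(Seq(Rec(1), Rec(2))))
localDf.registerTempTable("localTable")

// The same UDF call, this time against the locally created table
hc.sql("select getNum() from localTable limit 1").collect
{code}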
The (almost) full exception:
{code}
WARN [2015-06-28 08:43:53,850] ({task-result-getter-0} Logging.scala[logWarning]:71) - Lost task 0.2 in stage 23.0 (TID 1626, ip-10-216-204-246.ec2.internal): java.lang.NoClassDefFoundError: Lorg/apache/zeppelin/spark/ZeppelinContext;
at java.lang.Class.getDeclaredFields0(Native Method)
at java.lang.Class.privateGetDeclaredFields(Class.java:2499)
at java.lang.Class.getDeclaredField(Class.java:1951)
at java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1659)
<Many more of ObjectStreamClass lines of exception>
Caused by: java.lang.ClassNotFoundException: org.apache.zeppelin.spark.ZeppelinContext
at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:69)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 103 more
Caused by: java.lang.ClassNotFoundException: org.apache.zeppelin.spark.ZeppelinContext
at java.lang.ClassLoader.findClass(ClassLoader.java:531)
at org.apache.spark.util.ParentClassLoader.findClass(ParentClassLoader.scala:26)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:34)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:30)
at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:64)
... 105 more
{code}