Posted to dev@zeppelin.apache.org by "Jin-Hyeok, Cha (JIRA)" <ji...@apache.org> on 2019/03/21 06:42:00 UTC

[jira] [Created] (ZEPPELIN-4082) Error occurred when using UDF with scoped notebook

Jin-Hyeok, Cha created ZEPPELIN-4082:
----------------------------------------

             Summary: Error occurred when using UDF with scoped notebook
                 Key: ZEPPELIN-4082
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-4082
             Project: Zeppelin
          Issue Type: Bug
          Components: Interpreters
    Affects Versions: 0.8.1
         Environment: * Zeppelin v0.8.1
 * Spark v2.4.0 (1 Master, N Workers)
 * Hadoop (embedded, probably v2.7.x)
 * The interpreter is instantiated *Per Note* in a *scoped* process
            Reporter: Jin-Hyeok, Cha


When I defined my own function using the UDF (User-Defined Function) feature, I got an error message like this:

 
{code:java}
java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
{code}

I defined a simple function:

 
{code:scala}
import java.text.SimpleDateFormat

// Returns the difference between two timestamps in whole hours, or -1 if parsing fails
def diffHour(s1: String, s2: String): Long = {
  var hour = 0L
  try {
    val sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss")
    val d1 = sdf.parse(s1)
    val d2 = sdf.parse(s2)
    hour = d2.getTime - d1.getTime  // difference in milliseconds
    hour /= 1000 * 60 * 60          // milliseconds -> hours
  } catch {
    case e: Exception => hour = -1
  }
  hour
}
{code}
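Calling the function directly in a Scala paragraph gives the expected results (the timestamps below are made up for illustration):
{code:scala}
// 3.5 hours apart; Long division truncates to 3
diffHour("2019-01-01 00:00:00", "2019-01-01 03:30:00")  // res: Long = 3
// Unparseable input falls into the catch block and returns -1
diffHour("not a date", "2019-01-01 00:00:00")           // res: Long = -1
{code}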
 

Then I registered the function with the Spark SQL context:
{code:scala}
sqlContext.udf.register("diffHour", diffHour _)
{code}
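For completeness: in Spark 2.x the same registration should also be possible through the SparkSession that Zeppelin exposes as `spark`, e.g.:
{code:scala}
// Equivalent registration via the Spark 2.x SparkSession API
spark.udf.register("diffHour", diffHour _)
{code}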
I expected to be able to use the function from SQL, as in the paragraph below.
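For reference, `users` here is just a table with an `id` column and a string `time` column; a hypothetical stand-in to make that paragraph reproducible could be:
{code:scala}
// Hypothetical sample data, only so the %sql query below is reproducible
import spark.implicits._
val users = Seq(
  (1, "2019-01-01 03:00:00"),
  (2, "2019-01-02 12:30:00")
).toDF("id", "time")
users.createOrReplaceTempView("users")
{code}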

 
{code:sql}
%sql
SELECT
  id,
  time,
  diffHour(time, '2019-01-01 00:00:00') AS hour
FROM users
{code}
But running it produced the error I mentioned at the beginning.

I used the *Per Note* and *scoped* settings for the Spark interpreter.
When I changed the interpreter binding to *Globally*, the error no longer occurred.
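
In case it helps with debugging: a way to check whether the separate %sql interpreter instance is involved might be to run the same query from the Scala side, through the same sqlContext that registered the UDF (just a sketch, I have not verified this):
{code:scala}
// Sketch: issue the query through the same sqlContext that registered the UDF,
// bypassing the separate %sql paragraph
sqlContext.sql(
  "SELECT id, time, diffHour(time, '2019-01-01 00:00:00') AS hour FROM users"
).show()
{code}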

 

How can I fix it?

Please help me.

 


