Posted to users@zeppelin.apache.org by shyla deshpande <de...@gmail.com> on 2017/06/01 01:44:17 UTC

Task not serializable error when I try to cache the spark sql table

Hello all,

I am getting an org.apache.spark.SparkException: Task not serializable error
when I try to cache the Spark SQL table. I am using a UDF on a column of the
table and want to cache the resulting table. I can execute the paragraph
successfully when there is no caching.

Please help! Thanks

UDF:
    // Define the UDF and register it under the name "fn1"
    def fn1(res: String): Int = {
      100
    }
    spark.udf.register("fn1", fn1(_: String): Int)

    // Read the Cassandra table and expose it as a temp view
    spark
      .read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "k", "table" -> "t"))
      .load
      .createOrReplaceTempView("t1")

    // Apply the UDF, register the result as a view, and cache it
    val df1 = spark.sql("SELECT col1, col2, fn1(col3) FROM t1")

    df1.createOrReplaceTempView("t2")

    spark.catalog.cacheTable("t2")
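
A minimal sketch of the same flow that sometimes sidesteps this error in
notebook/REPL code: register the UDF as a plain function literal instead of an
eta-expanded def, so the registered function does not capture the paragraph's
enclosing wrapper object. The keyspace, table, and column names below are the
same placeholders used above, and the serialization cause is an assumption, not
a confirmed diagnosis.

    // Sketch, assuming the error comes from the UDF's closure: a def declared in a
    // Zeppelin/REPL paragraph is a member of that paragraph's wrapper object, and
    // eta-expanding it with fn1(_: String) can drag that non-serializable wrapper
    // into the task closure. A function literal that references nothing from the
    // outer scope avoids capturing it.
    spark.udf.register("fn1", (res: String) => 100)

    spark
      .read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "k", "table" -> "t"))  // placeholder names from the original post
      .load
      .createOrReplaceTempView("t1")

    val df1 = spark.sql("SELECT col1, col2, fn1(col3) FROM t1")
    df1.createOrReplaceTempView("t2")

    // Caching runs the UDF on the executors, which is typically where a
    // non-serializable closure surfaces as "Task not serializable".
    spark.catalog.cacheTable("t2")
    spark.table("t2").count()  // materialize the cache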

Re: Task not serializable error when I try to cache the spark sql table

Posted by Jongyoul Lee <jo...@gmail.com>.
Hi,

Which version of Spark do you use?
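
If it helps, one quick way to check from a Spark paragraph (assuming the usual
%spark interpreter, where a SparkSession is bound as spark and the SparkContext
as sc):

    // Prints the Spark version the interpreter is running against
    println(spark.version)
    // On older setups where only the SparkContext is exposed:
    println(sc.version)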


-- 
이종열, Jongyoul Lee, 李宗烈
http://madeng.net