Posted to users@zeppelin.apache.org by Charmee Patel <ch...@gmail.com> on 2015/04/19 07:38:02 UTC

Queries using hive context failing with class incompatibility

Hi,

I am running hive queries from within zeppelin based on the usage described
in this pull request

https://github.com/NFLabs/zeppelin/pull/306

My query is
%sql select * from db.table limit 1

Based on the pull request, using %sql should use hiveContext to run queries
by default but I am getting an invocation target exception on the GUI. Logs
show that schema is being read correctly for this task (
ParquetTypes.scala[readSchemaFromFile]) but it fails after that with this
exception

 WARN [2015-04-19 01:29:26,037] ({task-result-getter-0}
Logging.scala[logWarning]:71) - Lost task 0.0 in stage 1.0 (TID 4,
syn-c1-dev02.syntasa.com): java.io.InvalidClassException:
org.apache.spark.sql.parquet.ParquetTableScan; local class incompatible:
stream classdesc serialVersionUID = -1987731824605925228, local class
serialVersionUID = -1838176468117613003

When I run the same query as
val res = hiveContext.sql("select * from db.table limit 1")
it works and returns a SchemaRDD with the correct schema. However, res.first
fails with the exact same error as above.

How can I resolve the ParquetTableScan class's version difference?
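For illustration, the failure mode in the log can be reproduced with plain Java serialization and two stand-in classes (the class names below are hypothetical, not Spark's actual ParquetTableScan; only the two UID values are taken from the log): deserialization compares the serialVersionUID recorded in the stream against the serialVersionUID of the class on the local classpath, and throws InvalidClassException when they differ.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class UidMismatchDemo {
    // "stream classdesc" side: the UID baked into the serialized task bytes
    // by whichever Spark build produced them.
    static class ScanFromStream implements Serializable {
        private static final long serialVersionUID = -1987731824605925228L;
    }

    // "local class" side: the UID of the class found on the receiving JVM's
    // classpath, here from a different build of the same class.
    static class ScanOnExecutor implements Serializable {
        private static final long serialVersionUID = -1838176468117613003L;
    }

    public static void main(String[] args) {
        long stream = ObjectStreamClass.lookup(ScanFromStream.class).getSerialVersionUID();
        long local  = ObjectStreamClass.lookup(ScanOnExecutor.class).getSerialVersionUID();
        // Java serialization refuses to deserialize when these differ, raising
        // java.io.InvalidClassException: ... local class incompatible.
        System.out.println("stream=" + stream + " local=" + local + " match=" + (stream == local));
    }
}
```

In other words, the driver and the executors are loading ParquetTableScan from two different Spark builds, so the task bytes serialized on one side cannot be read on the other.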

Thanks,
Charmee

Re: Queries using hive context failing with class incompatibility

Posted by moon soo Lee <mo...@apache.org>.
Hi,

I'm using hiveContext + Parquet-format tables + Zeppelin with a Spark 1.3.0
cluster, and it works well for me.

Could you first check your Spark cluster version and your Zeppelin build
command, to confirm that the Spark versions match?
And did you set the SPARK_HOME env variable for Zeppelin? If so, please unset
it and try again.
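A minimal sketch of that version check (the helper name and the hard-coded version strings are assumptions for illustration; in practice, compare the version reported by sc.version inside a Zeppelin paragraph with the version shown on the cluster's UI):

```java
public class SparkVersionCheck {
    // Treats versions as compatible when major.minor match, e.g. 1.3.x vs 1.3.y.
    // Serialized task classes from differing builds can still disagree, but a
    // major.minor mismatch is the first thing to rule out.
    static boolean compatible(String zeppelinBuild, String cluster) {
        String[] a = zeppelinBuild.split("\\.");
        String[] b = cluster.split("\\.");
        return a.length >= 2 && b.length >= 2
                && a[0].equals(b[0]) && a[1].equals(b[1]);
    }

    public static void main(String[] args) {
        System.out.println(compatible("1.3.0", "1.3.0")); // matching versions
        System.out.println(compatible("1.2.1", "1.3.0")); // mismatch: rebuild Zeppelin against the cluster's Spark
    }
}
```

If the versions differ, rebuilding Zeppelin against the cluster's Spark version (and leaving SPARK_HOME unset, as suggested above) should make the serialized classes line up again.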

Thanks,
moon

On Sun, Apr 19, 2015 at 2:38 PM Charmee Patel <ch...@gmail.com> wrote:

> Hi,
>
> I am running hive queries from within zeppelin based on the usage
> described in this pull request
>
> https://github.com/NFLabs/zeppelin/pull/306
>
> My query is
> %sql select * from db.table limit 1
>
> Based on the pull request, using %sql should use hiveContext to run
> queries by default but I am getting an invocation target exception on the
> GUI. Logs show that schema is being read correctly for this task (
> ParquetTypes.scala[readSchemaFromFile]) but it fails after that with this
> exception
>
>  WARN [2015-04-19 01:29:26,037] ({task-result-getter-0}
> Logging.scala[logWarning]:71) - Lost task 0.0 in stage 1.0 (TID 4,
> syn-c1-dev02.syntasa.com): java.io.InvalidClassException:
> org.apache.spark.sql.parquet.ParquetTableScan; local class incompatible:
> stream classdesc serialVersionUID = -1987731824605925228, local class
> serialVersionUID = -1838176468117613003
>
> When I use the same query as
> val res = hiveContext.sql("select * from db.table limit 1")
> it works and returns a schemaRDD with correct schema. However res.first
> fails with exact same error as above.
>
> How can I resolve ParquetTableScan class's version difference?
>
> Thanks,
> Charmee
>