Posted to dev@spark.apache.org by Debasish Das <de...@gmail.com> on 2015/06/24 17:18:39 UTC

Spark SQL 1.3 Exception

Hi,

I have a table created by Impala with the following I/O formats and SerDe:

inputFormat:parquet.hive.DeprecatedParquetInputFormat,
outputFormat:parquet.hive.DeprecatedParquetOutputFormat,
serdeInfo:SerDeInfo(name:null,
serializationLib:parquet.hive.serde.ParquetHiveSerDe, parameters:{})

I am trying to read this table with Spark SQL 1.3 to see whether caching
improves my query latency, but I get this exception:

java.lang.ClassNotFoundException: Class parquet.hive.serde.ParquetHiveSerDe not found

I understand that parquet.hive.serde.ParquetHiveSerDe is deprecated in
Hive 0.13 (which I am using), but it seems Impala still used it when it
wrote the table.

I also tried passing the bundle jar, which contains
org.apache.parquet.hive.serde.ParquetHiveSerDe, to the Spark 1.3 shell / SQL
with the --jars option, but I am not sure what to do next. How do I configure
the SerDe in SQLContext?
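
To double-check which class names the bundle jar really provides, here is a
minimal sketch that inspects the jar directly. It relies only on the fact
that a jar is a zip archive in which class a.b.C is stored as a/b/C.class.
The helper names jar_contains_class and find_classes_by_simple_name are my
own and are not part of Spark or Hive.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar has an entry for the fully qualified class_name."""
    # A jar is a zip archive; class a.b.C lives at the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


def find_classes_by_simple_name(jar_path, simple_name):
    """List fully qualified names of packaged classes with the given simple name.

    Classes in the default (unnamed) package are not matched.
    """
    suffix = "/" + simple_name + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(
            name[: -len(".class")].replace("/", ".")
            for name in jar.namelist()
            if name.endswith(suffix)
        )
```

Calling jar_contains_class("parquet-hive-bundle.jar",
"parquet.hive.serde.ParquetHiveSerDe") (the jar path is a placeholder for
wherever the jar lives) shows whether the name recorded in the table metadata
is in the jar at all, and find_classes_by_simple_name(...,
"ParquetHiveSerDe") lists the package names the jar does provide.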

A table with the following I/O formats and SerDe is read fine by
Spark SQL 1.3:

inputFormat=org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat,
outputFormat=org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat,
serializationLib=org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe

Thanks.
Deb

On Sat, Jun 20, 2015 at 12:21 AM, Debasish Das <de...@gmail.com>
wrote:

> Hi,
>
> I have some Impala-created Parquet tables which Hive 0.13.2 can read fine.
>
> Now when I try to read the same table using Spark SQL 1.3, I get an
> exception saying that class parquet.hive.serde.ParquetHiveSerDe was not
> found.
>
> I am assuming that Hive puts parquet-hive-bundle.jar somewhere on its
> classpath. I tried adding parquet-hive-bundle.jar as an auxiliary jar in
> spark-1.3/conf/hive-site.xml, but even that did not work.
>
> Any input on fixing this will be really helpful.
>
> Thanks.
> Deb
>