Posted to dev@spark.apache.org by Yana Kadiyska <ya...@gmail.com> on 2014/12/02 18:31:45 UTC

Fwd: [Thrift,1.2 RC] what happened to parquet.hive.serde.ParquetHiveSerDe

Apologies if people get this more than once -- I sent mail to dev@spark
last night and don't see it in the archives. Trying the incubator list
now...wanted to make sure it doesn't get lost in case it's a bug...

---------- Forwarded message ----------
From: Yana Kadiyska <ya...@gmail.com>
Date: Mon, Dec 1, 2014 at 8:10 PM
Subject: [Thrift,1.2 RC] what happened to parquet.hive.serde.ParquetHiveSerDe
To: dev@spark.apache.org


Hi all, apologies if this is not a question for the dev list -- figured
User list might not be appropriate since I'm having trouble with the RC tag.

I just tried deploying the RC and running ThriftServer. I see the following
error:

14/12/01 21:31:42 ERROR UserGroupInformation: PriviledgedActionException as:anonymous (auth:SIMPLE) cause:org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: MetaException(message:java.lang.ClassNotFoundException Class parquet.hive.serde.ParquetHiveSerDe not found)
14/12/01 21:31:42 WARN ThriftCLIService: Error executing statement:
org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: MetaException(message:java.lang.ClassNotFoundException Class parquet.hive.serde.ParquetHiveSerDe not found)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.run(Shim13.scala:192)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:231)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:212)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
    at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
    at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)


I looked at a working installation that I have (built from master a few
weeks ago), and this class used to be included in spark-assembly:

ls *.jar|xargs grep parquet.hive.serde.ParquetHiveSerDe
Binary file spark-assembly-1.2.0-SNAPSHOT-hadoop2.0.0-mr1-cdh4.2.0.jar matches

but with the RC build it's not there?

I tried both the prebuilt CDH drop and later manually built the tag with
the following command:

./make-distribution.sh --tgz -Phive -Dhadoop.version=2.0.0-mr1-cdh4.2.0 -Phive-thriftserver
$JAVA_HOME/bin/jar -tvf spark-assembly-1.2.0-hadoop2.0.0-mr1-cdh4.2.0.jar | grep parquet.hive.serde.ParquetHiveSerDe

comes back empty...
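
For anyone wanting to repeat this check across several assemblies, here is a minimal sketch in Python. It reads each jar's table of contents with the standard zipfile module instead of grepping the binary. The helper names (class_entry, jar_contains_class, jars_containing) are made up for illustration and are not part of Spark or Hive.

```python
import zipfile


def class_entry(class_name):
    """Convert a dotted class name into the path it would have inside a jar."""
    return class_name.replace(".", "/") + ".class"


def jar_contains_class(jar_path, class_name):
    """Return True if the jar's table of contents lists the class file."""
    with zipfile.ZipFile(jar_path) as jar:
        return class_entry(class_name) in jar.namelist()


def jars_containing(jar_paths, class_name):
    """Return the subset of jar_paths that contain class_name, in input order."""
    return [path for path in jar_paths if jar_contains_class(path, class_name)]
```

Calling jars_containing(glob.glob("*.jar"), "parquet.hive.serde.ParquetHiveSerDe") in the lib directory should give the same answer as the ls/grep pipeline above, assuming the class is packaged under its usual path.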

Re: [Thrift,1.2 RC] what happened to parquet.hive.serde.ParquetHiveSerDe

Posted by Michael Armbrust <mi...@databricks.com>.
Here's a fix: https://github.com/apache/spark/pull/3586

On Wed, Dec 3, 2014 at 11:05 AM, Michael Armbrust <mi...@databricks.com>
wrote:

> Thanks for reporting. As a workaround you should be able to SET
> spark.sql.hive.convertMetastoreParquet=false, but I'm going to try to fix
> this before the next RC.

Re: [Thrift,1.2 RC] what happened to parquet.hive.serde.ParquetHiveSerDe

Posted by Michael Armbrust <mi...@databricks.com>.
Thanks for reporting. As a workaround you should be able to SET
spark.sql.hive.convertMetastoreParquet=false, but I'm going to try to fix
this before the next RC.
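
A rough sketch of applying the workaround from a client, assuming you connect to the Thrift server through some Python library that exposes a DB-API 2.0 (PEP 249) cursor. The cursor object and the run_with_workaround helper are assumptions for illustration; only the SET statement itself comes from this thread.

```python
# The SET statement suggested in this thread as a workaround.
WORKAROUND = "SET spark.sql.hive.convertMetastoreParquet=false"


def run_with_workaround(cursor, query):
    """Issue the SET statement, then the query, on the same cursor.

    `cursor` is assumed to be any DB-API 2.0 cursor connected to the
    Thrift server; how it is obtained depends on the client library.
    """
    cursor.execute(WORKAROUND)
    cursor.execute(query)
    return cursor.fetchall()
```

The SET is sent on the same connection as the query, on the assumption that the setting needs to be in effect there before the query runs.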

On Wed, Dec 3, 2014 at 7:09 AM, Yana Kadiyska <ya...@gmail.com>
wrote:

> Thanks Michael, you are correct.
>
> I also opened https://issues.apache.org/jira/browse/SPARK-4702 -- if
> someone can comment on why this might be happening, that would be great.
> This would block me from using 1.2, and it used to work, so I'm a bit
> puzzled. I was hoping it was again a result of the default profile
> switch, but that doesn't seem to be the case.
>
> (P.S. Please advise if this is more appropriate for the user list. I'm
> posting to dev since it's an RC.)

Re: [Thrift,1.2 RC] what happened to parquet.hive.serde.ParquetHiveSerDe

Posted by Yana Kadiyska <ya...@gmail.com>.
Thanks Michael, you are correct.

I also opened https://issues.apache.org/jira/browse/SPARK-4702 -- if
someone can comment on why this might be happening, that would be great.
This would block me from using 1.2, and it used to work, so I'm a bit
puzzled. I was hoping it was again a result of the default profile
switch, but that doesn't seem to be the case.

(P.S. Please advise if this is more appropriate for the user list. I'm
posting to dev since it's an RC.)

On Tue, Dec 2, 2014 at 8:37 PM, Michael Armbrust <mi...@databricks.com>
wrote:

> In Hive 0.13 (the default for Spark 1.2), Parquet support is included, so
> we no longer include the Hive parquet bundle. You can now use the included
> ParquetSerDe: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
>
> If you want to compile Spark 1.2 with Hive 0.12 instead, you can pass
> -Phive-0.12.0, and parquet.hive.serde.ParquetHiveSerDe will be included as
> before.
>
> Michael

Re: [Thrift,1.2 RC] what happened to parquet.hive.serde.ParquetHiveSerDe

Posted by Michael Armbrust <mi...@databricks.com>.
In Hive 0.13 (the default for Spark 1.2), Parquet support is included, so
we no longer include the Hive parquet bundle. You can now use the included
ParquetSerDe: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe

If you want to compile Spark 1.2 with Hive 0.12 instead, you can pass
-Phive-0.12.0, and parquet.hive.serde.ParquetHiveSerDe will be included as
before.

Michael
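
To illustrate the rename, here is a small sketch that rewrites the old SerDe class name to the new one in a DDL string. The two class names are the ones given above; the migrate_serde helper is hypothetical. It only touches the SerDe name, since no other class names are mentioned in this thread.

```python
# Both class names are taken from this thread.
LEGACY_SERDE = "parquet.hive.serde.ParquetHiveSerDe"
HIVE13_SERDE = "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"


def migrate_serde(ddl):
    """Replace references to the legacy SerDe class in a DDL string.

    The new name does not contain the old one as a substring, so
    applying this twice gives the same result as applying it once.
    """
    return ddl.replace(LEGACY_SERDE, HIVE13_SERDE)
```

For example, migrate_serde("ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'") returns the same clause with the org.apache.hadoop.hive.ql.io.parquet.serde name. This is plain text substitution and does not change anything already registered in a metastore.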
