Posted to user@spark.apache.org by "Campagnola, Francesco" <Fr...@anritsu.com> on 2016/09/05 15:25:31 UTC

Spark 2.0.0 Thrift Server problem with Hive metastore

Hi,

In an already working Spark/Hive environment with Spark 1.6 and Hive 1.2.1, with the Hive metastore configured on a Postgres DB, I have upgraded Spark to 2.0.0.
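
The metastore connection is the standard hive-site.xml setup; the two entries below sketch its shape (shown as key/value for brevity) and the values are illustrative, not my literal settings:

javax.jdo.option.ConnectionURL         jdbc:postgresql://<metastore-host>:5432/metastore
javax.jdo.option.ConnectionDriverName  org.postgresql.Driver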

I have started the Thrift Server on YARN, then tried to execute the following command from the Beeline CLI or a JDBC client:
SHOW DATABASES;
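
For reference, the server was launched with the stock script; the flags below reflect a typical YARN setup rather than my exact command line:

sbin/start-thriftserver.sh --master yarn --hiveconf hive.server2.thrift.port=10000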
The SHOW DATABASES command always gives this error on the Spark server side:

spark@spark-test[spark] /home/spark> beeline -u jdbc:hive2://$(hostname):10000 -n spark

Connecting to jdbc:hive2://spark-test:10000
16/09/05 17:41:43 INFO jdbc.Utils: Supplied authorities: spark-test:10000
16/09/05 17:41:43 INFO jdbc.Utils: Resolved authority: spark-test:10000
16/09/05 17:41:43 INFO jdbc.HiveConnection: Will try to open client transport with JDBC Uri: jdbc:hive2://spark-test:10000
Connected to: Spark SQL (version 2.0.0)
Driver: Hive JDBC (version 1.2.1.spark2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1.spark2 by Apache Hive

0: jdbc:hive2://spark-test:10000> show databases;
java.lang.IllegalStateException: Can't overwrite cause with java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericInternalRow cannot be cast to org.apache.spark.sql.catalyst.expressions.UnsafeRow
        at java.lang.Throwable.initCause(Throwable.java:457)
        at org.apache.hive.service.cli.HiveSQLException.toStackTrace(HiveSQLException.java:236)
        at org.apache.hive.service.cli.HiveSQLException.toStackTrace(HiveSQLException.java:236)
        at org.apache.hive.service.cli.HiveSQLException.toCause(HiveSQLException.java:197)
        at org.apache.hive.service.cli.HiveSQLException.<init>(HiveSQLException.java:108)
        at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:256)
        at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:242)
        at org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:365)
        at org.apache.hive.beeline.BufferedRows.<init>(BufferedRows.java:42)
        at org.apache.hive.beeline.BeeLine.print(BeeLine.java:1794)
        at org.apache.hive.beeline.Commands.execute(Commands.java:860)
        at org.apache.hive.beeline.Commands.sql(Commands.java:713)
        at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:973)
        at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:813)
        at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:771)
        at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:484)
        at org.apache.hive.beeline.BeeLine.main(BeeLine.java:467)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 3.0 failed 10 times, most recent failure: Lost task 0.9 in stage 3.0 (TID 12, vertica204): java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericInternalRow cannot be cast to org.apache.spark.sql.catalyst.expressions.UnsafeRow
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$4.apply(SparkPlan.scala:247)
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$4.apply(SparkPlan.scala:240)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:784)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:784)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
        at org.apache.spark.scheduler.Task.run(Task.scala:85)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
        at org.apache.hive.service.cli.HiveSQLException.newInstance(HiveSQLException.java:244)
        at org.apache.hive.service.cli.HiveSQLException.toStackTrace(HiveSQLException.java:210)
        ... 15 more
Error: Error retrieving next row (state=,code=0)
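
For reference, an equivalent minimal JDBC client hits the same error. This is a sketch: the host and user match the Beeline session above, the object name is mine, and hive-jdbc 1.2.1.spark2 is assumed on the classpath:

import java.sql.DriverManager

object ShowDatabasesRepro {
  def main(args: Array[String]): Unit = {
    Class.forName("org.apache.hive.jdbc.HiveDriver")  // same driver Beeline uses
    val conn = DriverManager.getConnection("jdbc:hive2://spark-test:10000", "spark", "")
    val stmt = conn.createStatement()
    val rs = stmt.executeQuery("SHOW DATABASES")
    // rs.next() is where "Error retrieving next row" surfaces on the client side
    while (rs.next()) println(rs.getString(1))
    rs.close(); stmt.close(); conn.close()
  }
}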

The same command works when using Spark 1.6. Could this be an issue in 2.0.0?

Thanks!

RE: Spark 2.0.0 Thrift Server problem with Hive metastore

Posted by "Campagnola, Francesco" <Fr...@anritsu.com>.
The same error occurs when executing any “explain” command:

0: jdbc:hive2://spark-test:10000> explain select 1 as id;
java.lang.IllegalStateException: Can't overwrite cause with java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericInternalRow cannot be cast to org.apache.spark.sql.catalyst.expressions.UnsafeRow
…
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 198.0 failed 10 times, most recent failure: Lost task 0.9 in stage 198.0 (TID 2046, vertica204): java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericInternalRow cannot be cast to org.apache.spark.sql.catalyst.expressions.UnsafeRow
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$4.apply(SparkPlan.scala:247)

I have checked the source code, and it seems this explicit cast is causing the issue:

spark-2.0.0/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala

private def getByteArrayRdd(n: Int = -1): RDD[Array[Byte]] = {
    execute().mapPartitionsInternal { iter =>
      var count = 0
      val buffer = new Array[Byte](4 << 10)  // 4K
      val codec = CompressionCodec.createCodec(SparkEnv.get.conf)
      val bos = new ByteArrayOutputStream()
      val out = new DataOutputStream(codec.compressedOutputStream(bos))
      while (iter.hasNext && (n < 0 || count < n)) {
        // the unconditional cast that throws when the plan emits GenericInternalRow:
        val row = iter.next().asInstanceOf[UnsafeRow]
        …

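To see why that line blows up, here is a toy sketch; the three types below are stand-ins I made up, not Spark's real classes. An iterator statically typed to a common base trait can carry any subtype, and asInstanceOf to a sibling subtype only fails at runtime:

trait InternalRow
class UnsafeRow extends InternalRow
class GenericInternalRow extends InternalRow

object CastRepro {
  def main(args: Array[String]): Unit = {
    val iter: Iterator[InternalRow] = Iterator(new GenericInternalRow)
    // Compiles fine; throws java.lang.ClassCastException at runtime,
    // the same failure mode as in the stack traces above.
    val row = iter.next().asInstanceOf[UnsafeRow]
    println(row)
  }
}
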
I hope this issue will be fixed in an upcoming release…


From: Campagnola, Francesco
Sent: Tuesday, September 6, 2016 09:46
To: 'Jeff Zhang' <zj...@gmail.com>
Cc: user@spark.apache.org
Subject: RE: Spark 2.0.0 Thrift Server problem with Hive metastore

…

Re: Spark 2.0.0 Thrift Server problem with Hive metastore

Posted by Chanh Le <gi...@gmail.com>.
Has anyone used the STS (Spark Thrift Server) of Spark 2.0 in production?
For my part, I am still waiting for compatibility with Parquet files created by Spark 1.6.1.


> On Sep 6, 2016, at 2:46 PM, Campagnola, Francesco <Fr...@anritsu.com> wrote:
> …


RE: Spark 2.0.0 Thrift Server problem with Hive metastore

Posted by "Campagnola, Francesco" <Fr...@anritsu.com>.
I mean that I installed Spark 2.0 in the same environment where the Spark 1.6 Thrift Server was running, then stopped the Spark 1.6 Thrift Server and started the Spark 2.0 one.

If I’m not mistaken, Spark 2.0 should still be compatible with Hive 1.2.1, and no upgrade procedures are required.
The spark-defaults.conf file has not been changed.
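
For reference, that leaves the metastore-related settings at their defaults; the values below are the Spark 2.0.0 defaults as I understand them, not something I set explicitly:

spark.sql.hive.metastore.version  1.2.1
spark.sql.hive.metastore.jars     builtin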

The following commands issued to the Spark 2.0 Thrift Server work:
create database test;
use test;
create table tb_1 (id int);
insert into table tb_1 select t.id from (select 1 as id) t;

All of these commands, on the other hand, return the same error (notably, the failing ones are exactly the statements that must return result rows to the client):
show databases;
show tables;
show partitions tb_1;
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 62.0 failed 10 times, most recent failure: Lost task 0.9 in stage 62.0 (TID 540, vertica204): java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericInternalRow cannot be cast to org.apache.spark.sql.catalyst.expressions.UnsafeRow




From: Jeff Zhang [mailto:zjffdu@gmail.com]
Sent: Tuesday, September 6, 2016 02:50
To: Campagnola, Francesco <Fr...@anritsu.com>
Cc: user@spark.apache.org
Subject: Re: Spark 2.0.0 Thrift Server problem with Hive metastore

How do you upgrade to spark 2.0 ?

On Mon, Sep 5, 2016 at 11:25 PM, Campagnola, Francesco <Fr...@anritsu.com> wrote:
…

Re: Spark 2.0.0 Thrift Server problem with Hive metastore

Posted by Jeff Zhang <zj...@gmail.com>.
How do you upgrade to spark 2.0 ?

On Mon, Sep 5, 2016 at 11:25 PM, Campagnola, Francesco <Francesco.Campagnola@anritsu.com> wrote:

> …



-- 
Best Regards

Jeff Zhang