Posted to user@spark.apache.org by Srikanth <sr...@gmail.com> on 2015/07/14 03:14:49 UTC

HiveThriftServer2.startWithContext error with registerTempTable

Hello,

I want to expose the results of a Spark computation to external tools. I plan to
do this through the Thrift server's JDBC interface by registering the result
DataFrame as a temp table.
I wrote a sample program in spark-shell to test this.

import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2  // needed for startWithContext

val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
import hiveContext.implicits._
HiveThriftServer2.startWithContext(hiveContext)
val myDF = hiveContext.read.format("com.databricks.spark.csv")
  .option("header", "true")
  .load("/datafolder/weblog/pages.csv")
myDF.registerTempTable("temp_table")
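For reference, this is roughly how an external tool could query the temp table over JDBC once the Thrift server is up. This is a minimal sketch, not taken from the thread: it assumes the Thrift server is on the default localhost:10000 with no authentication, and that the Hive JDBC driver is on the client's classpath.

```scala
// Minimal JDBC client sketch (assumes default Thrift server host/port and no auth;
// adjust the URL and credentials for a real deployment).
import java.sql.DriverManager

object TempTableClient {
  def main(args: Array[String]): Unit = {
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    val conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "", "")
    try {
      val rs = conn.createStatement().executeQuery("SELECT * FROM temp_table LIMIT 10")
      val cols = rs.getMetaData.getColumnCount
      while (rs.next()) {
        // Print each row tab-separated, one column at a time
        println((1 to cols).map(i => rs.getString(i)).mkString("\t"))
      }
    } finally {
      conn.close()
    }
  }
}
```

This is the same path Beeline takes, so it reproduces the classloading behavior described below.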


I'm able to see the temp table in Beeline

+-------------+--------------+
|  tableName  | isTemporary  |
+-------------+--------------+
| temp_table  | true         |
| my_table    | false        |
+-------------+--------------+


Now when I issue "select * from temp_table" from Beeline, I see the exception
below in spark-shell:

15/07/13 17:18:27 WARN ThriftCLIService: Error executing statement:
org.apache.hive.service.cli.HiveSQLException: java.lang.ClassNotFoundException: com.databricks.spark.csv.CsvRelation$$anonfun$buildScan$1$$anonfun$1
        at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.run(Shim13.scala:206)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:231)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:218)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

I'm able to read the other table ("my_table") from Beeline, though.
Any suggestions on how to get past this?

This is with the Spark 1.4 pre-built version. spark-shell was started with
--packages to pull in spark-csv.
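For completeness, the launch command was along these lines (a sketch; the package coordinate is the one mentioned later in this thread):

```shell
# Start spark-shell with the spark-csv package on the classpath
spark-shell --packages com.databricks:spark-csv_2.10:1.0.3
```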

Srikanth

Re: HiveThriftServer2.startWithContext error with registerTempTable

Posted by Srikanth <sr...@gmail.com>.
Cheng,

Yes, "select * from temp_table" was working from the spark shell. I was able
to perform some transformations and actions on the DataFrame and print the
results to the console.
HiveThriftServer2.startWithContext was run in the same session.
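Concretely, the in-shell check was along these lines (a sketch; myDF and hiveContext are from the snippet in my first mail, and the exact transformations aren't shown here):

```scala
// Sanity checks run in the same spark-shell session (sketch).
myDF.limit(5).show()                                 // DataFrame is readable directly
hiveContext.sql("SELECT * FROM temp_table").show()   // and via the SQL path
```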

When you say "try the --jars option", are you asking me to pass the spark-csv
jar? I'm already doing that with --packages com.databricks:spark-csv_2.10:1.0.3,
so I'm not sure I follow your point here.

Anyway, I gave it a shot: I downloaded spark-csv_2.10-0.1.jar and started
spark-shell with --jars.
I still get the same exception; I'm pasting it below.

scala> 15/07/16 11:29:22 ERROR SparkExecuteStatementOperation: Error executing query:
java.lang.ClassNotFoundException: com.databricks.spark.csv.CsvRelation$$anonfun$buildScan$1$$anonfun$1
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)

15/07/16 11:29:22 WARN ThriftCLIService: Error executing statement:
org.apache.hive.service.cli.HiveSQLException: java.lang.ClassNotFoundException: com.databricks.spark.csv.CsvRelation$$anonfun$buildScan$1$$anonfun$1
        at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.run(Shim13.scala:206)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:231)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:218)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

Srikanth




On Thu, Jul 16, 2015 at 12:44 AM, Cheng, Hao <ha...@intel.com> wrote:

>  Have you ever tried querying "select * from temp_table" from the spark
> shell? Or can you try the --jars option when starting the spark shell?

RE: HiveThriftServer2.startWithContext error with registerTempTable

Posted by "Cheng, Hao" <ha...@intel.com>.
Have you ever tried querying "select * from temp_table" from the spark shell? Or can you try the --jars option when starting the spark shell?

From: Srikanth [mailto:srikanth.ht@gmail.com]
Sent: Thursday, July 16, 2015 9:36 AM
To: user
Subject: Re: HiveThriftServer2.startWithContext error with registerTempTable

Hello,

Re-sending this to see if I'm second time lucky!
I've not managed to move past this error.

Srikanth

On Mon, Jul 13, 2015 at 9:14 PM, Srikanth <sr...@gmail.com> wrote:


Re: HiveThriftServer2.startWithContext error with registerTempTable

Posted by Srikanth <sr...@gmail.com>.
Hello,

Re-sending this to see if I'm second time lucky!
I've not managed to move past this error.

Srikanth

On Mon, Jul 13, 2015 at 9:14 PM, Srikanth <sr...@gmail.com> wrote: