Posted to user@spark.apache.org by Srikanth <sr...@gmail.com> on 2015/07/14 03:14:49 UTC
HiveThriftServer2.startWithContext error with registerTempTable
Hello,
I want to expose the result of a Spark computation to external tools. I plan to
do this through the Thrift server's JDBC interface by registering the result
DataFrame as a temp table.
I wrote a sample program in spark-shell to test this:
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
import hiveContext.implicits._
HiveThriftServer2.startWithContext(hiveContext)
val myDF = hiveContext.read.format("com.databricks.spark.csv").option("header", "true").load("/datafolder/weblog/pages.csv")
myDF.registerTempTable("temp_table")
I'm able to see the temp table in Beeline:
+-------------+--------------+
| tableName   | isTemporary  |
+-------------+--------------+
| temp_table  | true         |
| my_table    | false        |
+-------------+--------------+
Now when I issue "select * from temp_table" from Beeline, I see the exception
below in spark-shell:
15/07/13 17:18:27 WARN ThriftCLIService: Error executing statement:
org.apache.hive.service.cli.HiveSQLException: java.lang.ClassNotFoundException: com.databricks.spark.csv.CsvRelation$$anonfun$buildScan$1$$anonfun$1
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.run(Shim13.scala:206)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:231)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:218)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
I'm able to read the other table ("my_table") from Beeline, though.
Any suggestions on how to overcome this?
This is with the pre-built Spark 1.4 distribution. spark-shell was started with
--packages to pull in spark-csv.
Srikanth
Re: HiveThriftServer2.startWithContext error with registerTempTable
Posted by Srikanth <sr...@gmail.com>.
Cheng,
Yes, "select * from temp_table" was working from the spark shell. I was able to
perform some transformations and actions on the DataFrame and print the result
on the console.
HiveThriftServer2.startWithContext was run in the same session.
When you say "try the --jars option", are you asking me to pass the spark-csv jar?
I'm already doing this with --packages com.databricks:spark-csv_2.10:1.0.3.
Not sure if I'm missing your point here.
Anyway, I gave it a shot. I downloaded spark-csv_2.10-0.1.jar and started
spark-shell with --jars.
I still get the same exception. I'm pasting the exception below.
15/07/16 11:29:22 ERROR SparkExecuteStatementOperation: Error executing query:
java.lang.ClassNotFoundException: com.databricks.spark.csv.CsvRelation$$anonfun$buildScan$1$$anonfun$1
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
15/07/16 11:29:22 WARN ThriftCLIService: Error executing statement:
org.apache.hive.service.cli.HiveSQLException: java.lang.ClassNotFoundException: com.databricks.spark.csv.CsvRelation$$anonfun$buildScan$1$$anonfun$1
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.run(Shim13.scala:206)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:231)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:218)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
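Since the same query reportedly works from the shell but fails through the Thrift server, one possible way to narrow this down is to check which class loaders in the spark-shell JVM can actually see the class named in the exception. The following is only a diagnostic sketch under that assumption (the thread does not confirm the cause). `visibleFrom` is a hypothetical helper, not a Spark or spark-csv API, and it relies only on the standard `Class.forName` and class loader calls from the JDK.

```scala
// Hypothetical helper, not part of Spark or spark-csv: returns true if
// `className` can be loaded through `loader` (without initializing the class).
def visibleFrom(className: String, loader: ClassLoader): Boolean =
  try {
    Class.forName(className, false, loader)
    true
  } catch {
    case _: ClassNotFoundException => false
  }

// Class name copied from the ClassNotFoundException in the trace.
val missing = "com.databricks.spark.csv.CsvRelation$$anonfun$buildScan$1$$anonfun$1"

// Loaders to compare: the one attached to the current (REPL) thread and
// the JVM's system class loader.
val loaders = Seq(
  "context" -> Thread.currentThread.getContextClassLoader,
  "system"  -> ClassLoader.getSystemClassLoader
)

for ((name, loader) <- loaders)
  println(s"$name class loader sees the class: ${visibleFrom(missing, loader)}")
```

If the two loaders disagree, that would at least be consistent with the jar being visible to the REPL but not to whatever loader the Thrift server's statement execution uses. It would not by itself prove it.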
Srikanth
On Thu, Jul 16, 2015 at 12:44 AM, Cheng, Hao <ha...@intel.com> wrote:
> Have you tried running the query "select * from temp_table" from the spark
> shell? Or can you try the --jars option when starting the spark shell?
>
> From: Srikanth [mailto:srikanth.ht@gmail.com]
> Sent: Thursday, July 16, 2015 9:36 AM
> To: user
> Subject: Re: HiveThriftServer2.startWithContext error with registerTempTable
>
> Hello,
>
> Re-sending this to see if I'm second time lucky!
> I've not managed to move past this error.
>
> Srikanth
>
RE: HiveThriftServer2.startWithContext error with registerTempTable
Posted by "Cheng, Hao" <ha...@intel.com>.
Have you tried running the query "select * from temp_table" from the spark shell? Or can you try the --jars option when starting the spark shell?
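The first check suggested here could look roughly like this in the same spark-shell session. It is only a sketch: it assumes the `hiveContext` value and the `temp_table` registration from the original post are still in scope, and it uses only `HiveContext.sql`, `DataFrame.show` and `DataFrame.count`.

```scala
// Run inside the spark-shell session that registered temp_table.
// Assumes `hiveContext` from the original post is still in scope.
val check = hiveContext.sql("select * from temp_table")

// Prints up to 5 rows if the CSV relation can be scanned from the shell.
check.show(5)

// Forces a full scan through the spark-csv relation and prints the row count.
println(check.count())
```

If this succeeds in the shell while the same statement fails from Beeline, the problem is more likely in how the Thrift server executes the query than in the CSV data source itself.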
From: Srikanth [mailto:srikanth.ht@gmail.com]
Sent: Thursday, July 16, 2015 9:36 AM
To: user
Subject: Re: HiveThriftServer2.startWithContext error with registerTempTable
Hello,
Re-sending this to see if I'm second time lucky!
I've not managed to move past this error.
Srikanth
Re: HiveThriftServer2.startWithContext error with registerTempTable
Posted by Srikanth <sr...@gmail.com>.
Hello,
Re-sending this to see if I'm second time lucky!
I've not managed to move past this error.
Srikanth