Posted to user@spark.apache.org by ÐΞ€ρ@Ҝ (๏̯͡๏) <de...@gmail.com> on 2015/03/26 08:56:36 UTC

Hive Table not found from Spark SQL

I have a Hive table named dw_bid. When I run hive from the command prompt and
run describe dw_bid, it works.

I want to join an Avro file (table) in HDFS with this Hive dw_bid table, and I
refer to it as dw_bid from my Spark SQL program; however, I see

15/03/26 00:31:01 INFO HiveMetaStore.audit: ugi=dvasthimal ip=unknown-ip-addr cmd=get_table : db=default tbl=dw_bid
15/03/26 00:31:01 ERROR metadata.Hive: NoSuchObjectException(message:default.dw_bid table not found)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)


Code:

    val successDetail_S1 = sqlContext.avroFile(input)
    successDetail_S1.registerTempTable("sojsuccessevents1")
    val countS1 = sqlContext.sql("select
guid,sessionKey,sessionStartDate,sojDataDate,seqNum,eventTimestamp,siteId,successEventType,sourceType,itemId,"
+
        " shopCartId,b.transaction_Id as transactionId,offerId,b.bdr_id as
userId,priorPage1SeqNum,priorPage1PageId,exclWMSearchAttemptSeqNum,exclPriorSearchPageId,"
+
        "
exclPriorSearchSeqNum,exclPriorSearchCategory,exclPriorSearchL1,exclPriorSearchL2,currentImpressionId,sourceImpressionId,exclPriorSearchSqr,exclPriorSearchSort,"
+
        " isDuplicate,b.bid_date as
transactionDate,auctionTypeCode,isBin,leafCategoryId,itemSiteId,b.qty_bid
as bidQuantity," +
    " b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as
 bidAmtUsd,offerQuantity,offerAmountUsd,offerCreateDate,buyerSegment,buyerCountryId,sellerId,sellerCountryId,"
+
    " sellerStdLevel,cssSellerLevel,a.experimentChannel" +
    " from sojsuccessevents1 a join dw_bid b " +
    " on a.itemId = b.item_id  and  a.transactionId =  b.transaction_id " +
    " where b.bid_type_code IN (1,9) AND b.bdr_id > 0 AND ( b.bid_flags &
32) = 0 and lower(a.successEventType) IN ('bid','bin')")
    println("countS1.first:" + countS1.first)



Any suggestions on how to refer to a Hive table from Spark SQL?
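
For reference, a minimal sketch of the kind of setup involved, assuming Spark 1.3 with spark-avro on the classpath (the context name, path, and the trimmed-down query are illustrative, not the exact code above):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext
    import com.databricks.spark.avro._

    // Sketch only: a HiveContext (rather than a plain SQLContext) reads
    // hive-site.xml and can resolve metastore tables such as dw_bid.
    val sc = new SparkContext(new SparkConf().setAppName("HiveJoinSketch"))
    val hiveContext = new HiveContext(sc)

    // Illustrative path, taken from the submit commands later in this thread.
    val input = "/user/dvasthimal/epdatasets/successdetail1/part-r-00000.avro"
    val successDetail_S1 = hiveContext.avroFile(input)
    successDetail_S1.registerTempTable("sojsuccessevents1")

    // Both the registered temp table and the Hive metastore table should now
    // be visible to SQL:
    val joined = hiveContext.sql(
      "select a.itemId, b.bid_date from sojsuccessevents1 a join dw_bid b on a.itemId = b.item_id")
    println("joined.first:" + joined.first)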
-- 

Deepak

RE: Hive Table not found from Spark SQL

Posted by "Cheng, Hao" <ha...@intel.com>.
1) It seems only in #2 the hive-site.xml was loaded correctly (it knows about the MySQL driver, right?). #1 and #3 didn't load the correct hive-site.xml and actually ran with the default configuration (an empty database / metastore was created).

2) In yarn-cluster mode, the driver is probably launched on a machine other than the one where you started the application, so the --driver-class-path option is useless; you'd better always try --jars.
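
For example, something along these lines (a sketch only; the /path/to/... segments are placeholders for the jar locations in your commands, and --files is assumed here to ship hive-site.xml with the YARN application, which is worth verifying on your version):

    # Sketch with placeholder paths, not a verified command.
    ./bin/spark-submit --master yarn-cluster \
      --jars /path/to/mysql-connector-java-5.1.34.jar,/path/to/datanucleus-api-jdo-3.2.6.jar,/path/to/datanucleus-core-3.2.10.jar,/path/to/datanucleus-rdbms-3.2.9.jar \
      --files $SPARK_HOME/conf/hive-site.xml \
      --class com.ebay.ep.poc.spark.reporting.SparkApp spark_reporting-1.0-SNAPSHOT.jar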

Sorry, I am not super familiar with the YARN stuff; just let me know how you solve the problem.

From: Denny Lee [mailto:denny.g.lee@gmail.com]
Sent: Saturday, March 28, 2015 12:06 AM
To: ÐΞ€ρ@Ҝ (๏̯͡๏); Michael Armbrust
Cc: user
Subject: Re: Hive Table not found from Spark SQL

Upon reviewing your other thread, could you confirm that the Hive metastore you connect to via Hive is a MySQL database? And to also confirm: when you're running spark-shell and doing a "show tables" statement, are you getting the same error?
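
A quick way to run that check from spark-shell, as a sketch (assuming a Spark 1.3 build with Hive support, where the shell's sqlContext is a HiveContext):

    // Sketch: with a working metastore connection, dw_bid should appear here.
    sqlContext.sql("show tables").collect().foreach(println)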

On Fri, Mar 27, 2015 at 6:08 AM ÐΞ€ρ@Ҝ (๏̯͡๏) <de...@gmail.com> wrote:
I tried the following

1)

./bin/spark-submit -v --master yarn-cluster --driver-class-path /home/dvasthimal/spark1.3/mysql-connector-java-5.1.34.jar:/apache/hadoop/share/hadoop/common/hadoop-common-2.4.1-EBAY-2.jar:/apache/hadoop/lib/hadoop-lzo-0.6.0.jar:/apache/hadoop-2.4.1-2.1.3.0-2-EBAY/share/hadoop/yarn/lib/guava-11.0.2.jar:$SPARK_HOME/conf/hive-site.xml  --jars /home/dvasthimal/spark1.3/spark-avro_2.10-1.0.0.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar --num-executors 1 --driver-memory 4g --driver-java-options "-XX:MaxPermSize=2G" --executor-memory 2g --executor-cores 1 --queue hdmi-express --class com.ebay.ep.poc.spark.reporting.SparkApp spark_reporting-1.0-SNAPSHOT.jar startDate=2015-02-16 endDate=2015-02-16 input=/user/dvasthimal/epdatasets/successdetail1/part-r-00000.avro subcommand=successevents2 output=/user/dvasthimal/epdatasets/successdetail2


This throws dw_bid not found. It looks like Spark SQL is unable to read my existing Hive metastore, creates its own, and hence complains that the table is not found.


2)

./bin/spark-submit -v --master yarn-cluster --driver-class-path /home/dvasthimal/spark1.3/mysql-connector-java-5.1.34.jar:/apache/hadoop/share/hadoop/common/hadoop-common-2.4.1-EBAY-2.jar:/apache/hadoop/lib/hadoop-lzo-0.6.0.jar:/apache/hadoop-2.4.1-2.1.3.0-2-EBAY/share/hadoop/yarn/lib/guava-11.0.2.jar  --jars /home/dvasthimal/spark1.3/spark-avro_2.10-1.0.0.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar:$SPARK_HOME/conf/hive-site.xml --num-executors 1 --driver-memory 4g --driver-java-options "-XX:MaxPermSize=2G" --executor-memory 2g --executor-cores 1 --queue hdmi-express --class com.ebay.ep.poc.spark.reporting.SparkApp spark_reporting-1.0-SNAPSHOT.jar startDate=2015-02-16 endDate=2015-02-16 input=/user/dvasthimal/epdatasets/successdetail1/part-r-00000.avro subcommand=successevents2 output=/user/dvasthimal/epdatasets/successdetail2

This time I do not get the above error; however, I get a MySQL driver not found exception. It looks like this happens even before it is able to communicate with Hive.


Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the "BONECP" plugin to create a ConnectionPool gave an error : The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
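
One way to confirm whether the MySQL driver jar actually reached the driver's classpath is a check like the following (a hypothetical snippet, not part of the original application):

    // Hypothetical classpath probe; run early in the driver code.
    try {
      Class.forName("com.mysql.jdbc.Driver")
      println("com.mysql.jdbc.Driver is on the classpath")
    } catch {
      case _: ClassNotFoundException =>
        println("com.mysql.jdbc.Driver is NOT on the classpath")
    }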

In both of the above cases, I do have hive-site.xml in the Spark conf folder.

3)
./bin/spark-submit -v --master yarn-cluster --driver-class-path /home/dvasthimal/spark1.3/mysql-connector-java-5.1.34.jar:/apache/hadoop/share/hadoop/common/hadoop-common-2.4.1-EBAY-2.jar:/apache/hadoop/lib/hadoop-lzo-0.6.0.jar:/apache/hadoop-2.4.1-2.1.3.0-2-EBAY/share/hadoop/yarn/lib/guava-11.0.2.jar  --jars /home/dvasthimal/spark1.3/spark-avro_2.10-1.0.0.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar --num-executors 1 --driver-memory 4g --driver-java-options "-XX:MaxPermSize=2G" --executor-memory 2g --executor-cores 1 --queue hdmi-express --class com.ebay.ep.poc.spark.reporting.SparkApp spark_reporting-1.0-SNAPSHOT.jar startDate=2015-02-16 endDate=2015-02-16 input=/user/dvasthimal/epdatasets/successdetail1/part-r-00000.avro subcommand=successevents2 output=/user/dvasthimal/epdatasets/successdetail2

I do not specify hive-site.xml in --jars or --driver-class-path. It is present in the spark/conf folder as per https://spark.apache.org/docs/1.3.0/sql-programming-guide.html#hive-tables.

In this case I get the same error as in #1: dw_bid table not found.

I want Spark SQL to know that there are tables in Hive and to read that data. As per the guide, it looks like Spark SQL has that support.
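
A minimal sketch of verifying which metastore the context actually connected to, assuming a HiveContext as in the guide (names illustrative; if the connection URL printed is a local Derby one rather than the MySQL URL from hive-site.xml, Spark created its own empty metastore):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("MetastoreCheck"))
    val hiveContext = new HiveContext(sc)
    // Standard Hive config keys; the values reveal which metastore is in use.
    hiveContext.sql("SET javax.jdo.option.ConnectionURL").collect().foreach(println)
    hiveContext.sql("SET hive.metastore.uris").collect().foreach(println)
    // With the right metastore, dw_bid should be listed:
    hiveContext.sql("show tables").collect().foreach(println)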

Please suggest.

Regards,
Deepak


On Thu, Mar 26, 2015 at 9:01 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <de...@gmail.com> wrote:
Stack Trace:

15/03/26 08:25:42 INFO ql.Driver: OK
15/03/26 08:25:42 INFO log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
15/03/26 08:25:42 INFO log.PerfLogger: </PERFLOG method=releaseLocks start=1427383542966 end=1427383542966 duration=0 from=org.apache.hadoop.hive.ql.Driver>
15/03/26 08:25:42 INFO log.PerfLogger: </PERFLOG method=Driver.run start=1427383535203 end=1427383542966 duration=7763 from=org.apache.hadoop.hive.ql.Driver>
15/03/26 08:25:42 INFO metastore.HiveMetaStore: 0: get_tables: db=default pat=.*
15/03/26 08:25:42 INFO HiveMetaStore.audit: ugi=dvasthimal ip=unknown-ip-addr cmd=get_tables: db=default pat=.*
15/03/26 08:25:43 INFO parse.ParseDriver: Parsing command: insert overwrite table sojsuccessevents2_spark select guid,sessionKey,sessionStartDate,sojDataDate,seqNum,eventTimestamp,siteId,successEventType,sourceType,itemId, shopCartId,b.transaction_Id as transactionId,offerId,b.bdr_id as userId,priorPage1SeqNum,priorPage1PageId,exclWMSearchAttemptSeqNum,exclPriorSearchPageId, exclPriorSearchSeqNum,exclPriorSearchCategory,exclPriorSearchL1,exclPriorSearchL2,currentImpressionId,sourceImpressionId,exclPriorSearchSqr,exclPriorSearchSort, isDuplicate,b.bid_date as transactionDate,auctionTypeCode,isBin,leafCategoryId,itemSiteId,b.qty_bid as bidQuantity, b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as  bidAmtUsd,offerQuantity,offerAmountUsd,offerCreateDate,buyerSegment,buyerCountryId,sellerId,sellerCountryId, sellerStdLevel,cssSellerLevel,a.experimentChannel from sojsuccessevents1 a join dw_bid b  on a.itemId = b.item_id  and  a.transactionId =  b.transaction_id  where b.auct_end_dt >= '2015-02-16' AND b.bid_dt >= '2015-02-16'  AND b.bid_type_code IN (1,9) AND b.bdr_id > 0 AND ( b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')
15/03/26 08:25:43 INFO parse.ParseDriver: Parse Completed
15/03/26 08:25:43 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=sojsuccessevents2_spark
15/03/26 08:25:43 INFO HiveMetaStore.audit: ugi=dvasthimal ip=unknown-ip-addr cmd=get_table : db=default tbl=sojsuccessevents2_spark
15/03/26 08:25:44 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=dw_bid
15/03/26 08:25:44 INFO HiveMetaStore.audit: ugi=dvasthimal ip=unknown-ip-addr cmd=get_table : db=default tbl=dw_bid
15/03/26 08:25:44 ERROR metadata.Hive: NoSuchObjectException(message:default.dw_bid table not found)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
at com.sun.proxy.$Proxy31.get_table(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:997)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at com.sun.proxy.$Proxy32.getTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:976)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:950)
at org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:180)
at org.apache.spark.sql.hive.HiveContext$$anon$1.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$lookupRelation(HiveContext.scala:252)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:161)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:161)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:161)
at org.apache.spark.sql.hive.HiveContext$$anon$1.lookupRelation(HiveContext.scala:252)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:175)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:187)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:182)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:50)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:186)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:194)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:177)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:182)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:172)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
at scala.collection.immutable.List.foldLeft(List.scala:84)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:1071)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:1071)
at org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:1069)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:92)
at com.ebay.ep.poc.spark.reporting.process.service.HadoopSuccessEvents2Service.execute(HadoopSuccessEvents2Service.scala:32)
at com.ebay.ep.poc.spark.reporting.SparkApp$.main(SparkApp.scala:30)
at com.ebay.ep.poc.spark.reporting.SparkApp.main(SparkApp.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480)

15/03/26 08:25:44 ERROR yarn.ApplicationMaster: User class threw exception: no such table List(dw_bid); line 1 pos 843
org.apache.spark.sql.AnalysisException: no such table List(dw_bid); line 1 pos 843
at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:178)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:187)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:182)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:50)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:186)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:194)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:177)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:182)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:172)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
at scala.collection.immutable.List.foldLeft(List.scala:84)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:1071)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:1071)
at org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:1069)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:92)
at com.ebay.ep.poc.spark.reporting.process.service.HadoopSuccessEvents2Service.execute(HadoopSuccessEvents2Service.scala:32)
at com.ebay.ep.poc.spark.reporting.SparkApp$.main(SparkApp.scala:30)
at com.ebay.ep.poc.spark.reporting.SparkApp.main(SparkApp.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480)
15/03/26 08:25:44 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: no such table List(dw_bid); line 1 pos 843)
15/03/26 08:25:44 INFO yarn.ApplicationMaster: Invoking sc stop from shutdown hook


On Thu, Mar 26, 2015 at 8:58 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <de...@gmail.com> wrote:
Hello Michael,
Thanks for your time.

1. show tables from the Spark program returns nothing.
2. What entities are you talking about? (I am actually new to Hive as well)


On Thu, Mar 26, 2015 at 8:35 PM, Michael Armbrust <mi...@databricks.com> wrote:
What does "show tables" return?  You can also run "SET <optionName>" to make sure that entries from your hive-site.xml are being read correctly.

On Thu, Mar 26, 2015 at 4:02 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <de...@gmail.com> wrote:
I have a table dw_bid that was created in Hive and has nothing to do with Spark. I have data in Avro that I want to join with the dw_bid table; this join needs to be done using Spark SQL. However, for some reason Spark says the dw_bid table does not exist. How do I tell Spark that dw_bid is a table created in Hive, and have it read it?


Query that is run from Spark SQL
==============================
 insert overwrite table sojsuccessevents2_spark select guid,sessionKey,sessionStartDate,sojDataDate,seqNum,eventTimestamp,siteId,successEventType,sourceType,itemId, shopCartId,b.transaction_Id as transactionId,offerId,b.bdr_id as userId,priorPage1SeqNum,priorPage1PageId,exclWMSearchAttemptSeqNum,exclPriorSearchPageId, exclPriorSearchSeqNum,exclPriorSearchCategory,exclPriorSearchL1,exclPriorSearchL2,currentImpressionId,sourceImpressionId,exclPriorSearchSqr,exclPriorSearchSort, isDuplicate,b.bid_date as transactionDate,auctionTypeCode,isBin,leafCategoryId,itemSiteId,b.qty_bid as bidQuantity, b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as  bidAmtUsd,offerQuantity,offerAmountUsd,offerCreateDate,buyerSegment,buyerCountryId,sellerId,sellerCountryId, sellerStdLevel,cssSellerLevel,a.experimentChannel from sojsuccessevents1 a join dw_bid b  on a.itemId = b.item_id  and  a.transactionId =  b.transaction_id  where b.auct_end_dt >= '2015-02-16' AND b.bid_dt >= '2015-02-16'  AND b.bid_type_code IN (1,9) AND b.bdr_id > 0 AND ( b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')


If I create sojsuccessevents2_spark from the Hive command line and run the above command from the Spark SQL program, I get the error "sojsuccessevents2_spark table not found".

Hence I dropped the table in Hive and ran create table sojsuccessevents2_spark from Spark SQL before running the above command, and it works until it hits the next roadblock: "dw_bid table not found".

This makes me believe that Spark for some reason is not able to read/understand tables created outside Spark. I did copy /apache/hive/conf/hive-site.xml into the Spark conf directory.

Please suggest.


Logs
———
15/03/26 03:50:40 INFO HiveMetaStore.audit: ugi=dvasthimal ip=unknown-ip-addr cmd=get_table : db=default tbl=dw_bid
15/03/26 03:50:40 ERROR metadata.Hive: NoSuchObjectException(message:default.dw_bid table not found)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)



15/03/26 03:50:40 ERROR yarn.ApplicationMaster: User class threw exception: no such table List(dw_bid); line 1 pos 843
org.apache.spark.sql.AnalysisException: no such table List(dw_bid); line 1 pos 843
at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:178)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:187)



Regards,
Deepak


On Thu, Mar 26, 2015 at 4:27 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <de...@gmail.com> wrote:
I have this query

 insert overwrite table sojsuccessevents2_spark select guid,sessionKey,sessionStartDate,sojDataDate,seqNum,eventTimestamp,siteId,successEventType,sourceType,itemId, shopCartId,b.transaction_Id as transactionId,offerId,b.bdr_id as userId,priorPage1SeqNum,priorPage1PageId,exclWMSearchAttemptSeqNum,exclPriorSearchPageId, exclPriorSearchSeqNum,exclPriorSearchCategory,exclPriorSearchL1,exclPriorSearchL2,currentImpressionId,sourceImpressionId,exclPriorSearchSqr,exclPriorSearchSort, isDuplicate,b.bid_date as transactionDate,auctionTypeCode,isBin,leafCategoryId,itemSiteId,b.qty_bid as bidQuantity, b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as  bidAmtUsd,offerQuantity,offerAmountUsd,offerCreateDate,buyerSegment,buyerCountryId,sellerId,sellerCountryId, sellerStdLevel,cssSellerLevel,a.experimentChannel from sojsuccessevents1 a join dw_bid b  on a.itemId = b.item_id  and  a.transactionId =  b.transaction_id  where b.auct_end_dt >= '2015-02-16' AND b.bid_dt >= '2015-02-16'  AND b.bid_type_code IN (1,9) AND b.bdr_id > 0 AND ( b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')


If I create sojsuccessevents2_spark from the Hive command line and run the above command from the Spark SQL program, I get the error "sojsuccessevents2_spark table not found".

Hence I dropped the table in Hive and ran create table sojsuccessevents2_spark from Spark SQL before running the above command, and it works until it hits the next roadblock: "dw_bid table not found".

This makes me believe that Spark for some reason is not able to read/understand tables created outside Spark. I did copy /apache/hive/conf/hive-site.xml into the Spark conf directory.

Please suggest.

Regards,
Deepak


On Thu, Mar 26, 2015 at 1:26 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <de...@gmail.com> wrote:
I have a Hive table named dw_bid. When I run hive from the command prompt and run describe dw_bid, it works.

I want to join an Avro file (table) in HDFS with this Hive dw_bid table, and I refer to it as dw_bid from my Spark SQL program; however, I see

15/03/26 00:31:01 INFO HiveMetaStore.audit: ugi=dvasthimal ip=unknown-ip-addr cmd=get_table : db=default tbl=dw_bid
15/03/26 00:31:01 ERROR metadata.Hive: NoSuchObjectException(message:default.dw_bid table not found)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)


Code:

    val successDetail_S1 = sqlContext.avroFile(input)
    successDetail_S1.registerTempTable("sojsuccessevents1")
    val countS1 = sqlContext.sql("select guid,sessionKey,sessionStartDate,sojDataDate,seqNum,eventTimestamp,siteId,successEventType,sourceType,itemId," +
        " shopCartId,b.transaction_Id as transactionId,offerId,b.bdr_id as userId,priorPage1SeqNum,priorPage1PageId,exclWMSearchAttemptSeqNum,exclPriorSearchPageId," +
        " exclPriorSearchSeqNum,exclPriorSearchCategory,exclPriorSearchL1,exclPriorSearchL2,currentImpressionId,sourceImpressionId,exclPriorSearchSqr,exclPriorSearchSort," +
        " isDuplicate,b.bid_date as transactionDate,auctionTypeCode,isBin,leafCategoryId,itemSiteId,b.qty_bid as bidQuantity," +
    " b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as  bidAmtUsd,offerQuantity,offerAmountUsd,offerCreateDate,buyerSegment,buyerCountryId,sellerId,sellerCountryId," +
    " sellerStdLevel,cssSellerLevel,a.experimentChannel" +
    " from sojsuccessevents1 a join dw_bid b " +
    " on a.itemId = b.item_id  and  a.transactionId =  b.transaction_id " +
    " where b.bid_type_code IN (1,9) AND b.bdr_id > 0 AND ( b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')")
    println("countS1.first:" + countS1.first)



Any suggestions on how to refer to a Hive table from Spark SQL?
--

Deepak






Re: Hive Table not found from Spark SQL

Posted by Denny Lee <de...@gmail.com>.
Upon reviewing your other thread, could you confirm that the Hive metastore you connect to via Hive is a MySQL database? And to also confirm: when you're running spark-shell and doing a "show tables" statement, are you getting the same error?


On Fri, Mar 27, 2015 at 6:08 AM ÐΞ€ρ@Ҝ (๏̯͡๏) <de...@gmail.com> wrote:

> I tried the following
>
> 1)
>
> ./bin/spark-submit -v --master yarn-cluster --driver-class-path
> /home/dvasthimal/spark1.3/mysql-connector-java-5.1.34.jar:/apache/hadoop/share/hadoop/common/hadoop-common-2.4.1-EBAY-2.jar:/apache/hadoop/lib/hadoop-lzo-0.6.0.jar:/apache/hadoop-2.4.1-2.1.3.0-2-EBAY/share/hadoop/yarn/lib/guava-11.0.2.jar:
> *$SPARK_HOME/conf/hive-site.xml*  --jars
> /home/dvasthimal/spark1.3/spark-avro_2.10-1.0.0.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar
> --num-executors 1 --driver-memory 4g --driver-java-options
> "-XX:MaxPermSize=2G" --executor-memory 2g --executor-cores 1 --queue
> hdmi-express --class com.ebay.ep.poc.spark.reporting.SparkApp
> spark_reporting-1.0-SNAPSHOT.jar startDate=2015-02-16 endDate=2015-02-16
> input=/user/dvasthimal/epdatasets/successdetail1/part-r-00000.avro
> subcommand=successevents2 output=/user/dvasthimal/epdatasets/successdetail2
>
>
> This throws dw_bid not found. Looks like Spark SQL is unable to read my
> existing Hive metastore and creates its own and hence complains that table
> is not found.
>
>
> 2)
>
> ./bin/spark-submit -v --master yarn-cluster --driver-class-path
> /home/dvasthimal/spark1.3/mysql-connector-java-5.1.34.jar:/apache/hadoop/share/hadoop/common/hadoop-common-2.4.1-EBAY-2.jar:/apache/hadoop/lib/hadoop-lzo-0.6.0.jar:/apache/hadoop-2.4.1-2.1.3.0-2-EBAY/share/hadoop/yarn/lib/guava-11.0.2.jar  --jars
> /home/dvasthimal/spark1.3/spark-avro_2.10-1.0.0.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar:
> *$SPARK_HOME/conf/hive-site.xml* --num-executors 1 --driver-memory 4g
> --driver-java-options "-XX:MaxPermSize=2G" --executor-memory 2g
> --executor-cores 1 --queue hdmi-express --class
> com.ebay.ep.poc.spark.reporting.SparkApp spark_reporting-1.0-SNAPSHOT.jar
> startDate=2015-02-16 endDate=2015-02-16
> input=/user/dvasthimal/epdatasets/successdetail1/part-r-00000.avro
> subcommand=successevents2 output=/user/dvasthimal/epdatasets/successdetail2
>
> This time i do not get above error, however i get MySQL driver not found
> exception. Looks like this is even before its able to communicate to Hive.
>
> Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke
> the "BONECP" plugin to create a ConnectionPool gave an error : The
> specified datastore driver ("com.mysql.jdbc.Driver") was not found in the
> CLASSPATH. Please check your CLASSPATH specification, and the name of the
> driver.
>
> In both above cases, i do have hive-site.xml in Spark/conf folder.
>
> 3)
> ./bin/spark-submit -v --master yarn-cluster --driver-class-path
> /home/dvasthimal/spark1.3/mysql-connector-java-5.1.34.jar:/apache/hadoop/share/hadoop/common/hadoop-common-2.4.1-EBAY-2.jar:/apache/hadoop/lib/hadoop-lzo-0.6.0.jar:/apache/hadoop-2.4.1-2.1.3.0-2-EBAY/share/hadoop/yarn/lib/guava-11.0.2.jar  --jars
> /home/dvasthimal/spark1.3/spark-avro_2.10-1.0.0.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar--num-executors
> 1 --driver-memory 4g --driver-java-options "-XX:MaxPermSize=2G"
> --executor-memory 2g --executor-cores 1 --queue hdmi-express --class
> com.ebay.ep.poc.spark.reporting.SparkApp spark_reporting-1.0-SNAPSHOT.jar
> startDate=2015-02-16 endDate=2015-02-16
> input=/user/dvasthimal/epdatasets/successdetail1/part-r-00000.avro
> subcommand=successevents2 output=/user/dvasthimal/epdatasets/successdetail2
>
> I do not specify hive-site.xml in --jars or --driver-class-path. Its
> present in spark/conf folder as per
> https://spark.apache.org/docs/1.3.0/sql-programming-guide.html#hive-tables
> .
>
> In this case i get same error as #1. dw_bid table not found.
>
> I want Spark SQL to know that there are tables in Hive and read that data.
> As per guide it looks like Spark SQL has that support.
>
> Please suggest.
>
> Regards,
> Deepak
>
>
> On Thu, Mar 26, 2015 at 9:01 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <de...@gmail.com>
> wrote:
>
>> Stack Trace:
>>
>> 15/03/26 08:25:42 INFO ql.Driver: OK
>> 15/03/26 08:25:42 INFO log.PerfLogger: <PERFLOG method=releaseLocks
>> from=org.apache.hadoop.hive.ql.Driver>
>> 15/03/26 08:25:42 INFO log.PerfLogger: </PERFLOG method=releaseLocks
>> start=1427383542966 end=1427383542966 duration=0
>> from=org.apache.hadoop.hive.ql.Driver>
>> 15/03/26 08:25:42 INFO log.PerfLogger: </PERFLOG method=Driver.run
>> start=1427383535203 end=1427383542966 duration=7763
>> from=org.apache.hadoop.hive.ql.Driver>
>> 15/03/26 08:25:42 INFO metastore.HiveMetaStore: 0: get_tables: db=default
>> pat=.*
>> 15/03/26 08:25:42 INFO HiveMetaStore.audit: ugi=dvasthimal
>> ip=unknown-ip-addr cmd=get_tables: db=default pat=.*
>> 15/03/26 08:25:43 INFO parse.ParseDriver: Parsing command: insert
>> overwrite table sojsuccessevents2_spark select
>> guid,sessionKey,sessionStartDate,sojDataDate,seqNum,eventTimestamp,siteId,successEventType,sourceType,itemId,
>> shopCartId,b.transaction_Id as transactionId,offerId,b.bdr_id as
>> userId,priorPage1SeqNum,priorPage1PageId,exclWMSearchAttemptSeqNum,exclPriorSearchPageId,
>> exclPriorSearchSeqNum,exclPriorSearchCategory,exclPriorSearchL1,exclPriorSearchL2,currentImpressionId,sourceImpressionId,exclPriorSearchSqr,exclPriorSearchSort,
>> isDuplicate,b.bid_date as
>> transactionDate,auctionTypeCode,isBin,leafCategoryId,itemSiteId,b.qty_bid
>> as bidQuantity, b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as
>>  bidAmtUsd,offerQuantity,offerAmountUsd,offerCreateDate,buyerSegment,buyerCountryId,sellerId,sellerCountryId,
>> sellerStdLevel,cssSellerLevel,a.experimentChannel from sojsuccessevents1 a
>> join dw_bid b  on a.itemId = b.item_id  and  a.transactionId =
>>  b.transaction_id  where b.auct_end_dt >= '2015-02-16' AND b.bid_dt >=
>> '2015-02-16'  AND b.bid_type_code IN (1,9) AND b.bdr_id > 0 AND (
>> b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')
>> 15/03/26 08:25:43 INFO parse.ParseDriver: Parse Completed
>> 15/03/26 08:25:43 INFO metastore.HiveMetaStore: 0: get_table : db=default
>> tbl=sojsuccessevents2_spark
>> 15/03/26 08:25:43 INFO HiveMetaStore.audit: ugi=dvasthimal
>> ip=unknown-ip-addr cmd=get_table : db=default tbl=sojsuccessevents2_spark
>> 15/03/26 08:25:44 INFO metastore.HiveMetaStore: 0: get_table : db=default
>> tbl=dw_bid
>> 15/03/26 08:25:44 INFO HiveMetaStore.audit: ugi=dvasthimal
>> ip=unknown-ip-addr cmd=get_table : db=default tbl=dw_bid
>> 15/03/26 08:25:44 ERROR metadata.Hive:
>> NoSuchObjectException(message:default.dw_bid table not found)
>> at
>> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>> at
>> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
>> at com.sun.proxy.$Proxy31.get_table(Unknown Source)
>> at
>> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:997)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>> at
>> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
>> at com.sun.proxy.$Proxy32.getTable(Unknown Source)
>> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:976)
>> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:950)
>> at
>> org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:180)
>> at org.apache.spark.sql.hive.HiveContext$$anon$1.org
>> $apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$lookupRelation(HiveContext.scala:252)
>> at
>> org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:161)
>> at
>> org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:161)
>> at scala.Option.getOrElse(Option.scala:120)
>> at
>> org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:161)
>> at
>> org.apache.spark.sql.hive.HiveContext$$anon$1.lookupRelation(HiveContext.scala:252)
>> at
>> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:175)
>> at
>> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:187)
>> at
>> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:182)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:187)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:187)
>> at
>> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:50)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:186)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
>> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>> at
>> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>> at
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>> at
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
>> at
>> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>> at
>> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
>> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>> at
>> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>> at
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>> at
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
>> at
>> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>> at
>> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
>> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>> at
>> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>> at
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>> at
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
>> at
>> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>> at
>> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
>> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>> at
>> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>> at
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>> at
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
>> at
>> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>> at
>> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:194)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:177)
>> at
>> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:182)
>> at
>> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:172)
>> at
>> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
>> at
>> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
>> at
>> scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
>> at scala.collection.immutable.List.foldLeft(List.scala:84)
>> at
>> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
>> at
>> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
>> at scala.collection.immutable.List.foreach(List.scala:318)
>> at
>> org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
>> at
>> org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:1071)
>> at
>> org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:1071)
>> at
>> org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:1069)
>> at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
>> at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
>> at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:92)
>> at
>> com.ebay.ep.poc.spark.reporting.process.service.HadoopSuccessEvents2Service.execute(HadoopSuccessEvents2Service.scala:32)
>> at com.ebay.ep.poc.spark.reporting.SparkApp$.main(SparkApp.scala:30)
>> at com.ebay.ep.poc.spark.reporting.SparkApp.main(SparkApp.scala)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>> at
>> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480)
>>
>> 15/03/26 08:25:44 ERROR yarn.ApplicationMaster: User class threw
>> exception: no such table List(dw_bid); line 1 pos 843
>> org.apache.spark.sql.AnalysisException: no such table List(dw_bid); line
>> 1 pos 843
>> at
>> org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
>> at
>> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:178)
>> at
>> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:187)
>> at
>> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:182)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:187)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:187)
>> at
>> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:50)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:186)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
>> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>> at
>> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>> at
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>> at
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
>> at
>> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>> at
>> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
>> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>> at
>> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>> at
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>> at
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
>> at
>> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>> at
>> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
>> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>> at
>> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>> at
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>> at
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
>> at
>> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>> at
>> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
>> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>> at
>> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>> at
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>> at
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
>> at
>> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>> at
>> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:194)
>> at
>> org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:177)
>> at
>> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:182)
>> at
>> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:172)
>> at
>> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
>> at
>> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
>> at
>> scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
>> at scala.collection.immutable.List.foldLeft(List.scala:84)
>> at
>> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
>> at
>> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
>> at scala.collection.immutable.List.foreach(List.scala:318)
>> at
>> org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
>> at
>> org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:1071)
>> at
>> org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:1071)
>> at
>> org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:1069)
>> at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
>> at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
>> at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:92)
>> at
>> com.ebay.ep.poc.spark.reporting.process.service.HadoopSuccessEvents2Service.execute(HadoopSuccessEvents2Service.scala:32)
>> at com.ebay.ep.poc.spark.reporting.SparkApp$.main(SparkApp.scala:30)
>> at com.ebay.ep.poc.spark.reporting.SparkApp.main(SparkApp.scala)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>> at
>> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480)
>> 15/03/26 08:25:44 INFO yarn.ApplicationMaster: Final app status: FAILED,
>> exitCode: 15, (reason: User class threw exception: no such table
>> List(dw_bid); line 1 pos 843)
>> 15/03/26 08:25:44 INFO yarn.ApplicationMaster: Invoking sc stop from
>> shutdown hook
>>
>>
>> On Thu, Mar 26, 2015 at 8:58 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <de...@gmail.com>
>> wrote:
>>
>>> Hello Michael,
>>> Thanks for your time.
>>>
>>> 1. "show tables" from the Spark program returns nothing.
>>> 2. Which entries are you talking about? (I am actually new to Hive as
>>> well)
>>>
>>>
>>> On Thu, Mar 26, 2015 at 8:35 PM, Michael Armbrust <
>>> michael@databricks.com> wrote:
>>>
>>>> What does "show tables" return?  You can also run "SET <optionName>" to
>>>> make sure that entries from your hive-site.xml are being read correctly.
>>>>
>>>> On Thu, Mar 26, 2015 at 4:02 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <de...@gmail.com>
>>>> wrote:
>>>>
>>>>> I have a table dw_bid that was created in Hive and has nothing to do
>>>>> with Spark.  I have data in Avro that I want to join with the dw_bid
>>>>> table, and this join needs to be done using Spark SQL.  However, for
>>>>> some reason Spark says the dw_bid table does not exist. How do I tell
>>>>> Spark that dw_bid is a table created in Hive, and have it read it?
>>>>>
>>>>>
>>>>> Query that is run from Spark SQL
>>>>> ==============================
>>>>>  insert overwrite table sojsuccessevents2_spark select
>>>>> guid,sessionKey,sessionStartDate,sojDataDate,seqNum,eventTimestamp,siteId,successEventType,sourceType,itemId,
>>>>> shopCartId,b.transaction_Id as transactionId,offerId,b.bdr_id as
>>>>> userId,priorPage1SeqNum,priorPage1PageId,exclWMSearchAttemptSeqNum,exclPriorSearchPageId,
>>>>> exclPriorSearchSeqNum,exclPriorSearchCategory,exclPriorSearchL1,exclPriorSearchL2,currentImpressionId,sourceImpressionId,exclPriorSearchSqr,exclPriorSearchSort,
>>>>> isDuplicate,b.bid_date as
>>>>> transactionDate,auctionTypeCode,isBin,leafCategoryId,itemSiteId,b.qty_bid
>>>>> as bidQuantity, b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as
>>>>>  bidAmtUsd,offerQuantity,offerAmountUsd,offerCreateDate,buyerSegment,buyerCountryId,sellerId,sellerCountryId,
>>>>> sellerStdLevel,cssSellerLevel,a.experimentChannel from sojsuccessevents1 a
>>>>> join dw_bid b  on a.itemId = b.item_id  and  a.transactionId =
>>>>>  b.transaction_id  where b.auct_end_dt >= '2015-02-16' AND b.bid_dt >=
>>>>> '2015-02-16'  AND b.bid_type_code IN (1,9) AND b.bdr_id > 0 AND (
>>>>> b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')
>>>>>
>>>>>
>>>>> If I create sojsuccessevents2_spark from the Hive command line and run
>>>>> the above command from the Spark SQL program, I get the error
>>>>> "sojsuccessevents2_spark table not found".
>>>>>
>>>>> Hence I dropped the table from Hive and ran "create table
>>>>> sojsuccessevents2_spark" from Spark SQL before running the above
>>>>> command; it works until it hits the next roadblock, "dw_bid table not
>>>>> found".
>>>>>
>>>>> This makes me believe that Spark for some reason is not able to
>>>>> read/understand tables created outside Spark. I did copy
>>>>> /apache/hive/conf/hive-site.xml into the Spark conf directory.
>>>>>
>>>>> Please suggest.
>>>>>
>>>>>
>>>>> Logs
>>>>> ———
>>>>> 15/03/26 03:50:40 INFO HiveMetaStore.audit: ugi=dvasthimal
>>>>> ip=unknown-ip-addr cmd=get_table : db=default tbl=dw_bid
>>>>> 15/03/26 03:50:40 ERROR metadata.Hive:
>>>>> NoSuchObjectException(message:default.dw_bid table not found)
>>>>> at
>>>>> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
>>>>>
>>>>>
>>>>>
>>>>> 15/03/26 03:50:40 ERROR yarn.ApplicationMaster: User class threw
>>>>> exception: no such table List(dw_bid); line 1 pos 843
>>>>> org.apache.spark.sql.AnalysisException: no such table List(dw_bid);
>>>>> line 1 pos 843
>>>>> at
>>>>> org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
>>>>> at
>>>>> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:178)
>>>>> at
>>>>> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:187)
>>>>>
>>>>>
>>>>>
>>>>> Regards,
>>>>> Deepak
>>>>>
>>>>>
>>>>> On Thu, Mar 26, 2015 at 4:27 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <de...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I have this query
>>>>>>
>>>>>>  insert overwrite table sojsuccessevents2_spark select
>>>>>> guid,sessionKey,sessionStartDate,sojDataDate,seqNum,eventTimestamp,siteId,successEventType,sourceType,itemId,
>>>>>> shopCartId,b.transaction_Id as transactionId,offerId,b.bdr_id as
>>>>>> userId,priorPage1SeqNum,priorPage1PageId,exclWMSearchAttemptSeqNum,exclPriorSearchPageId,
>>>>>> exclPriorSearchSeqNum,exclPriorSearchCategory,exclPriorSearchL1,exclPriorSearchL2,currentImpressionId,sourceImpressionId,exclPriorSearchSqr,exclPriorSearchSort,
>>>>>> isDuplicate,b.bid_date as
>>>>>> transactionDate,auctionTypeCode,isBin,leafCategoryId,itemSiteId,b.qty_bid
>>>>>> as bidQuantity, b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as
>>>>>>  bidAmtUsd,offerQuantity,offerAmountUsd,offerCreateDate,buyerSegment,buyerCountryId,sellerId,sellerCountryId,
>>>>>> sellerStdLevel,cssSellerLevel,a.experimentChannel from sojsuccessevents1 a *join
>>>>>> dw_bid b*  on a.itemId = b.item_id  and  a.transactionId =
>>>>>>  b.transaction_id  where b.auct_end_dt >= '2015-02-16' AND b.bid_dt >=
>>>>>> '2015-02-16'  AND b.bid_type_code IN (1,9) AND b.bdr_id > 0 AND (
>>>>>> b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')
>>>>>>
>>>>>>
>>>>>> If I create sojsuccessevents2_spark from the Hive command line and
>>>>>> run the above command from the Spark SQL program, I get the error
>>>>>> "sojsuccessevents2_spark table not found".
>>>>>>
>>>>>> Hence I dropped the table from Hive and ran "create table
>>>>>> sojsuccessevents2_spark" from Spark SQL before running the above
>>>>>> command; it works until it hits the next roadblock, "*dw_bid table
>>>>>> not found"*
>>>>>>
>>>>>> This makes me believe that Spark for some reason is not able to
>>>>>> read/understand tables created outside Spark. I did copy
>>>>>> /apache/hive/conf/hive-site.xml into the Spark conf directory.
>>>>>>
>>>>>> Please suggest.
>>>>>>
>>>>>> Regards,
>>>>>> Deepak
>>>>>>
>>>>>>
>>>>>> On Thu, Mar 26, 2015 at 1:26 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <de...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I have a Hive table named dw_bid; when I run hive from the command
>>>>>>> prompt and run "describe dw_bid", it works.
>>>>>>>
>>>>>>> I want to join an Avro file (table) in HDFS with this Hive dw_bid
>>>>>>> table, and I refer to it as dw_bid from the Spark SQL program;
>>>>>>> however, I see
>>>>>>>
>>>>>>> 15/03/26 00:31:01 INFO HiveMetaStore.audit: ugi=dvasthimal
>>>>>>> ip=unknown-ip-addr cmd=get_table : db=default tbl=dw_bid
>>>>>>> 15/03/26 00:31:01 ERROR metadata.Hive:
>>>>>>> NoSuchObjectException(message:default.dw_bid table not found)
>>>>>>> at
>>>>>>> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>> at
>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>>>
>>>>>>>
>>>>>>> Code:
>>>>>>>
>>>>>>>     val successDetail_S1 = sqlContext.avroFile(input)
>>>>>>>     successDetail_S1.registerTempTable("sojsuccessevents1")
>>>>>>>     val countS1 = sqlContext.sql("select
>>>>>>> guid,sessionKey,sessionStartDate,sojDataDate,seqNum,eventTimestamp,siteId,successEventType,sourceType,itemId,"
>>>>>>> +
>>>>>>>         " shopCartId,b.transaction_Id as
>>>>>>> transactionId,offerId,b.bdr_id as
>>>>>>> userId,priorPage1SeqNum,priorPage1PageId,exclWMSearchAttemptSeqNum,exclPriorSearchPageId,"
>>>>>>> +
>>>>>>>         "
>>>>>>> exclPriorSearchSeqNum,exclPriorSearchCategory,exclPriorSearchL1,exclPriorSearchL2,currentImpressionId,sourceImpressionId,exclPriorSearchSqr,exclPriorSearchSort,"
>>>>>>> +
>>>>>>>         " isDuplicate,b.bid_date as
>>>>>>> transactionDate,auctionTypeCode,isBin,leafCategoryId,itemSiteId,b.qty_bid
>>>>>>> as bidQuantity," +
>>>>>>>     " b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as
>>>>>>>  bidAmtUsd,offerQuantity,offerAmountUsd,offerCreateDate,buyerSegment,buyerCountryId,sellerId,sellerCountryId,"
>>>>>>> +
>>>>>>>     " sellerStdLevel,cssSellerLevel,a.experimentChannel" +
>>>>>>>     " from sojsuccessevents1 a join dw_bid b " +
>>>>>>>     " on a.itemId = b.item_id  and  a.transactionId =
>>>>>>>  b.transaction_id " +
>>>>>>>     " where b.bid_type_code IN (1,9) AND b.bdr_id > 0 AND (
>>>>>>> b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')")
>>>>>>>     println("countS1.first:" + countS1.first)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Any suggestions on how to refer to a Hive table from Spark SQL?
>>>>>>> --
>>>>>>>
>>>>>>> Deepak
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Deepak
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Deepak
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Deepak
>>>
>>>
>>
>>
>> --
>> Deepak
>>
>>
>
>
> --
> Deepak
>
>

Re: Hive Table not from from Spark SQL

Posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com.
I tried the following

1)

./bin/spark-submit -v --master yarn-cluster --driver-class-path
/home/dvasthimal/spark1.3/mysql-connector-java-5.1.34.jar:/apache/hadoop/share/hadoop/common/hadoop-common-2.4.1-EBAY-2.jar:/apache/hadoop/lib/hadoop-lzo-0.6.0.jar:/apache/hadoop-2.4.1-2.1.3.0-2-EBAY/share/hadoop/yarn/lib/guava-11.0.2.jar:
*$SPARK_HOME/conf/hive-site.xml*  --jars
/home/dvasthimal/spark1.3/spark-avro_2.10-1.0.0.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar
--num-executors 1 --driver-memory 4g --driver-java-options
"-XX:MaxPermSize=2G" --executor-memory 2g --executor-cores 1 --queue
hdmi-express --class com.ebay.ep.poc.spark.reporting.SparkApp
spark_reporting-1.0-SNAPSHOT.jar startDate=2015-02-16 endDate=2015-02-16
input=/user/dvasthimal/epdatasets/successdetail1/part-r-00000.avro
subcommand=successevents2 output=/user/dvasthimal/epdatasets/successdetail2


This throws "dw_bid table not found". It looks like Spark SQL is unable to
read my existing Hive metastore, creates its own empty one instead, and
hence complains that the table is not found.
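
One workaround I plan to try next, so the job does not depend on
hive-site.xml reaching the driver container at all, is to point the
HiveContext at the remote metastore programmatically. A minimal sketch
(the thrift URI is a placeholder for whatever hive.metastore.uris says in
/apache/hive/conf/hive-site.xml):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("HiveMetastoreCheck"))
    val hiveContext = new HiveContext(sc)
    // Placeholder host/port -- copy the real value of hive.metastore.uris
    // from hive-site.xml. With a thrift metastore URI set, the driver
    // should not need direct JDBC access to the metastore database.
    hiveContext.setConf("hive.metastore.uris", "thrift://metastore-host:9083")
    hiveContext.sql("show tables").collect().foreach(println)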


2)

./bin/spark-submit -v --master yarn-cluster --driver-class-path
/home/dvasthimal/spark1.3/mysql-connector-java-5.1.34.jar:/apache/hadoop/share/hadoop/common/hadoop-common-2.4.1-EBAY-2.jar:/apache/hadoop/lib/hadoop-lzo-0.6.0.jar:/apache/hadoop-2.4.1-2.1.3.0-2-EBAY/share/hadoop/yarn/lib/guava-11.0.2.jar
 --jars
/home/dvasthimal/spark1.3/spark-avro_2.10-1.0.0.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar:
*$SPARK_HOME/conf/hive-site.xml* --num-executors 1 --driver-memory 4g
--driver-java-options "-XX:MaxPermSize=2G" --executor-memory 2g
--executor-cores 1 --queue hdmi-express --class
com.ebay.ep.poc.spark.reporting.SparkApp spark_reporting-1.0-SNAPSHOT.jar
startDate=2015-02-16 endDate=2015-02-16
input=/user/dvasthimal/epdatasets/successdetail1/part-r-00000.avro
subcommand=successevents2 output=/user/dvasthimal/epdatasets/successdetail2

This time I do not get the above error; however, I get a "MySQL driver not
found" exception. It looks like this happens even before Spark is able to
communicate with the Hive metastore.

Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke
the "BONECP" plugin to create a ConnectionPool gave an error : The
specified datastore driver ("com.mysql.jdbc.Driver") was not found in the
CLASSPATH. Please check your CLASSPATH specification, and the name of the
driver.

In both of the above cases, I do have hive-site.xml in the Spark conf folder.
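
To tell the two failure modes apart (hive-site.xml not loaded vs. MySQL
driver missing), a quick check I can run at the top of the driver's main
method, before any Hive access (nothing Spark-specific, just a classpath
probe):

    // Succeeds only if the MySQL connector jar is on the driver classpath;
    // this is the same class BoneCP complains about in case 2.
    try {
      Class.forName("com.mysql.jdbc.Driver")
      println("com.mysql.jdbc.Driver found on driver classpath")
    } catch {
      case _: ClassNotFoundException =>
        println("com.mysql.jdbc.Driver missing from driver classpath")
    }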

3)
./bin/spark-submit -v --master yarn-cluster --driver-class-path
/home/dvasthimal/spark1.3/mysql-connector-java-5.1.34.jar:/apache/hadoop/share/hadoop/common/hadoop-common-2.4.1-EBAY-2.jar:/apache/hadoop/lib/hadoop-lzo-0.6.0.jar:/apache/hadoop-2.4.1-2.1.3.0-2-EBAY/share/hadoop/yarn/lib/guava-11.0.2.jar
 --jars
/home/dvasthimal/spark1.3/spark-avro_2.10-1.0.0.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar
--num-executors 1 --driver-memory 4g --driver-java-options "-XX:MaxPermSize=2G"
--executor-memory 2g --executor-cores 1 --queue hdmi-express --class
com.ebay.ep.poc.spark.reporting.SparkApp spark_reporting-1.0-SNAPSHOT.jar
startDate=2015-02-16 endDate=2015-02-16
input=/user/dvasthimal/epdatasets/successdetail1/part-r-00000.avro
subcommand=successevents2 output=/user/dvasthimal/epdatasets/successdetail2

I do not specify hive-site.xml in --jars or --driver-class-path. It's
present in the spark/conf folder, as per
https://spark.apache.org/docs/1.3.0/sql-programming-guide.html#hive-tables.

In this case I get the same error as in #1: dw_bid table not found.

I want Spark SQL to know that there are tables in Hive and to read that
data. As per the guide, it looks like Spark SQL supports this.
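
For reference, this is how I am running Michael's two checks from inside
the program (a sketch; assumes sc is the SparkContext the app already
creates):

    val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
    // Lists whatever tables the catalog actually sees.
    hiveContext.sql("show tables").collect().foreach(println)
    // Should print the MySQL JDBC URL from hive-site.xml; a jdbc:derby
    // URL here would mean Spark fell back to an empty local metastore.
    hiveContext.sql("SET javax.jdo.option.ConnectionURL").collect().foreach(println)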

Please suggest.

Regards,
Deepak


On Thu, Mar 26, 2015 at 9:01 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <de...@gmail.com> wrote:

> Stack Trace:
>
> 15/03/26 08:25:42 INFO ql.Driver: OK
> 15/03/26 08:25:42 INFO log.PerfLogger: <PERFLOG method=releaseLocks
> from=org.apache.hadoop.hive.ql.Driver>
> 15/03/26 08:25:42 INFO log.PerfLogger: </PERFLOG method=releaseLocks
> start=1427383542966 end=1427383542966 duration=0
> from=org.apache.hadoop.hive.ql.Driver>
> 15/03/26 08:25:42 INFO log.PerfLogger: </PERFLOG method=Driver.run
> start=1427383535203 end=1427383542966 duration=7763
> from=org.apache.hadoop.hive.ql.Driver>
> 15/03/26 08:25:42 INFO metastore.HiveMetaStore: 0: get_tables: db=default
> pat=.*
> 15/03/26 08:25:42 INFO HiveMetaStore.audit: ugi=dvasthimal
> ip=unknown-ip-addr cmd=get_tables: db=default pat=.*
> 15/03/26 08:25:43 INFO parse.ParseDriver: Parsing command: insert
> overwrite table sojsuccessevents2_spark select
> guid,sessionKey,sessionStartDate,sojDataDate,seqNum,eventTimestamp,siteId,successEventType,sourceType,itemId,
> shopCartId,b.transaction_Id as transactionId,offerId,b.bdr_id as
> userId,priorPage1SeqNum,priorPage1PageId,exclWMSearchAttemptSeqNum,exclPriorSearchPageId,
> exclPriorSearchSeqNum,exclPriorSearchCategory,exclPriorSearchL1,exclPriorSearchL2,currentImpressionId,sourceImpressionId,exclPriorSearchSqr,exclPriorSearchSort,
> isDuplicate,b.bid_date as
> transactionDate,auctionTypeCode,isBin,leafCategoryId,itemSiteId,b.qty_bid
> as bidQuantity, b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as
>  bidAmtUsd,offerQuantity,offerAmountUsd,offerCreateDate,buyerSegment,buyerCountryId,sellerId,sellerCountryId,
> sellerStdLevel,cssSellerLevel,a.experimentChannel from sojsuccessevents1 a
> join dw_bid b  on a.itemId = b.item_id  and  a.transactionId =
>  b.transaction_id  where b.auct_end_dt >= '2015-02-16' AND b.bid_dt >=
> '2015-02-16'  AND b.bid_type_code IN (1,9) AND b.bdr_id > 0 AND (
> b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')
> 15/03/26 08:25:43 INFO parse.ParseDriver: Parse Completed
> 15/03/26 08:25:43 INFO metastore.HiveMetaStore: 0: get_table : db=default
> tbl=sojsuccessevents2_spark
> 15/03/26 08:25:43 INFO HiveMetaStore.audit: ugi=dvasthimal
> ip=unknown-ip-addr cmd=get_table : db=default tbl=sojsuccessevents2_spark
> 15/03/26 08:25:44 INFO metastore.HiveMetaStore: 0: get_table : db=default
> tbl=dw_bid
> 15/03/26 08:25:44 INFO HiveMetaStore.audit: ugi=dvasthimal
> ip=unknown-ip-addr cmd=get_table : db=default tbl=dw_bid
> 15/03/26 08:25:44 ERROR metadata.Hive:
> NoSuchObjectException(message:default.dw_bid table not found)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
> at com.sun.proxy.$Proxy31.get_table(Unknown Source)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:997)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
> at com.sun.proxy.$Proxy32.getTable(Unknown Source)
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:976)
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:950)
> at
> org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:180)
> at org.apache.spark.sql.hive.HiveContext$$anon$1.org
> $apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$lookupRelation(HiveContext.scala:252)
> at
> org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:161)
> at
> org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:161)
> at scala.Option.getOrElse(Option.scala:120)
> at
> org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:161)
> at
> org.apache.spark.sql.hive.HiveContext$$anon$1.lookupRelation(HiveContext.scala:252)
> at
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:175)
> at
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:187)
> at
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:182)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:187)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:187)
> at
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:50)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:186)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:194)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:177)
> at
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:182)
> at
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:172)
> at
> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
> at
> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
> at
> scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
> at scala.collection.immutable.List.foldLeft(List.scala:84)
> at
> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
> at
> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
> at scala.collection.immutable.List.foreach(List.scala:318)
> at
> org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
> at
> org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:1071)
> at
> org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:1071)
> at
> org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:1069)
> at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
> at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
> at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:92)
> at
> com.ebay.ep.poc.spark.reporting.process.service.HadoopSuccessEvents2Service.execute(HadoopSuccessEvents2Service.scala:32)
> at com.ebay.ep.poc.spark.reporting.SparkApp$.main(SparkApp.scala:30)
> at com.ebay.ep.poc.spark.reporting.SparkApp.main(SparkApp.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480)
>
> 15/03/26 08:25:44 ERROR yarn.ApplicationMaster: User class threw
> exception: no such table List(dw_bid); line 1 pos 843
> org.apache.spark.sql.AnalysisException: no such table List(dw_bid); line 1
> pos 843
> at
> org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
> at
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:178)
> at
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:187)
> at
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:182)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:187)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:187)
> at
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:50)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:186)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
> at
> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:194)
> at
> org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:177)
> at
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:182)
> at
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:172)
> at
> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
> at
> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
> at
> scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
> at scala.collection.immutable.List.foldLeft(List.scala:84)
> at
> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
> at
> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
> at scala.collection.immutable.List.foreach(List.scala:318)
> at
> org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
> at
> org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:1071)
> at
> org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:1071)
> at
> org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:1069)
> at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
> at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
> at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:92)
> at
> com.ebay.ep.poc.spark.reporting.process.service.HadoopSuccessEvents2Service.execute(HadoopSuccessEvents2Service.scala:32)
> at com.ebay.ep.poc.spark.reporting.SparkApp$.main(SparkApp.scala:30)
> at com.ebay.ep.poc.spark.reporting.SparkApp.main(SparkApp.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480)
> 15/03/26 08:25:44 INFO yarn.ApplicationMaster: Final app status: FAILED,
> exitCode: 15, (reason: User class threw exception: no such table
> List(dw_bid); line 1 pos 843)
> 15/03/26 08:25:44 INFO yarn.ApplicationMaster: Invoking sc stop from
> shutdown hook
>
>
> On Thu, Mar 26, 2015 at 8:58 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <de...@gmail.com>
> wrote:
>
>> Hello Michael,
>> Thanks for your time.
>>
>> 1. "show tables" from the Spark program returns nothing.
>> 2. Which entries are you talking about? (I am actually new to Hive as
>> well)
>>
>>
>> On Thu, Mar 26, 2015 at 8:35 PM, Michael Armbrust <michael@databricks.com
>> > wrote:
>>
>>> What does "show tables" return?  You can also run "SET <optionName>" to
>>> make sure that entries from your hive-site.xml are being read correctly.
>>>
>>> On Thu, Mar 26, 2015 at 4:02 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <de...@gmail.com>
>>> wrote:
>>>
>>>> I have a table dw_bid that was created in Hive and has nothing to do
>>>> with Spark.  I have data in Avro that I want to join with the dw_bid
>>>> table, and this join needs to be done using Spark SQL.  However, for
>>>> some reason Spark says the dw_bid table does not exist. How do I tell
>>>> Spark that dw_bid is a table created in Hive, and have it read it?
>>>>
>>>>
>>>> Query that is run from Spark SQL
>>>> ==============================
>>>>  insert overwrite table sojsuccessevents2_spark select
>>>> guid,sessionKey,sessionStartDate,sojDataDate,seqNum,eventTimestamp,siteId,successEventType,sourceType,itemId,
>>>> shopCartId,b.transaction_Id as transactionId,offerId,b.bdr_id as
>>>> userId,priorPage1SeqNum,priorPage1PageId,exclWMSearchAttemptSeqNum,exclPriorSearchPageId,
>>>> exclPriorSearchSeqNum,exclPriorSearchCategory,exclPriorSearchL1,exclPriorSearchL2,currentImpressionId,sourceImpressionId,exclPriorSearchSqr,exclPriorSearchSort,
>>>> isDuplicate,b.bid_date as
>>>> transactionDate,auctionTypeCode,isBin,leafCategoryId,itemSiteId,b.qty_bid
>>>> as bidQuantity, b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as
>>>>  bidAmtUsd,offerQuantity,offerAmountUsd,offerCreateDate,buyerSegment,buyerCountryId,sellerId,sellerCountryId,
>>>> sellerStdLevel,cssSellerLevel,a.experimentChannel from sojsuccessevents1 a
>>>> join dw_bid b  on a.itemId = b.item_id  and  a.transactionId =
>>>>  b.transaction_id  where b.auct_end_dt >= '2015-02-16' AND b.bid_dt >=
>>>> '2015-02-16'  AND b.bid_type_code IN (1,9) AND b.bdr_id > 0 AND (
>>>> b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')
>>>>
>>>>
>>>> If I create sojsuccessevents2_spark from the Hive command line and run
>>>> the above command from the Spark SQL program, I get the error
>>>> "sojsuccessevents2_spark table not found".
>>>>
>>>> Hence I dropped the table from Hive and ran "create table
>>>> sojsuccessevents2_spark" from Spark SQL before running the above
>>>> command; it works until it hits the next roadblock, "dw_bid table not
>>>> found".
>>>>
>>>> This makes me believe that Spark for some reason is not able to
>>>> read/understand tables created outside Spark. I did copy
>>>> /apache/hive/conf/hive-site.xml into the Spark conf directory.
>>>>
>>>> Please suggest.
>>>>
>>>>
>>>> Logs
>>>> ———
>>>> 15/03/26 03:50:40 INFO HiveMetaStore.audit: ugi=dvasthimal
>>>> ip=unknown-ip-addr cmd=get_table : db=default tbl=dw_bid
>>>> 15/03/26 03:50:40 ERROR metadata.Hive:
>>>> NoSuchObjectException(message:default.dw_bid table not found)
>>>> at
>>>> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
>>>>
>>>>
>>>>
>>>> 15/03/26 03:50:40 ERROR yarn.ApplicationMaster: User class threw
>>>> exception: no such table List(dw_bid); line 1 pos 843
>>>> org.apache.spark.sql.AnalysisException: no such table List(dw_bid);
>>>> line 1 pos 843
>>>> at
>>>> org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
>>>> at
>>>> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:178)
>>>> at
>>>> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:187)
>>>>
>>>>
>>>>
>>>> Regards,
>>>> Deepak
>>>>
>>>>
>>>> On Thu, Mar 26, 2015 at 4:27 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <de...@gmail.com>
>>>> wrote:
>>>>
>>>>> I have this query
>>>>>
>>>>>  insert overwrite table sojsuccessevents2_spark select
>>>>> guid,sessionKey,sessionStartDate,sojDataDate,seqNum,eventTimestamp,siteId,successEventType,sourceType,itemId,
>>>>> shopCartId,b.transaction_Id as transactionId,offerId,b.bdr_id as
>>>>> userId,priorPage1SeqNum,priorPage1PageId,exclWMSearchAttemptSeqNum,exclPriorSearchPageId,
>>>>> exclPriorSearchSeqNum,exclPriorSearchCategory,exclPriorSearchL1,exclPriorSearchL2,currentImpressionId,sourceImpressionId,exclPriorSearchSqr,exclPriorSearchSort,
>>>>> isDuplicate,b.bid_date as
>>>>> transactionDate,auctionTypeCode,isBin,leafCategoryId,itemSiteId,b.qty_bid
>>>>> as bidQuantity, b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as
>>>>>  bidAmtUsd,offerQuantity,offerAmountUsd,offerCreateDate,buyerSegment,buyerCountryId,sellerId,sellerCountryId,
>>>>> sellerStdLevel,cssSellerLevel,a.experimentChannel from sojsuccessevents1 a *join
>>>>> dw_bid b*  on a.itemId = b.item_id  and  a.transactionId =
>>>>>  b.transaction_id  where b.auct_end_dt >= '2015-02-16' AND b.bid_dt >=
>>>>> '2015-02-16'  AND b.bid_type_code IN (1,9) AND b.bdr_id > 0 AND (
>>>>> b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')
>>>>>
>>>>>
>>>>> If I create sojsuccessevents2_spark from the Hive command line and
>>>>> run the above command from the Spark SQL program, I get the error
>>>>> "sojsuccessevents2_spark table not found".
>>>>>
>>>>> Hence I dropped the table from Hive and ran "create table
>>>>> sojsuccessevents2_spark" from Spark SQL before running the above
>>>>> command; it works until it hits the next roadblock, "*dw_bid table
>>>>> not found"*
>>>>>
>>>>> This makes me believe that Spark for some reason is not able to
>>>>> read/understand tables created outside Spark. I did copy
>>>>> /apache/hive/conf/hive-site.xml into the Spark conf directory.
>>>>>
>>>>> Please suggest.
>>>>>
>>>>> Regards,
>>>>> Deepak
>>>>>
>>>>>
>>>>> On Thu, Mar 26, 2015 at 1:26 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <de...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I have a Hive table named dw_bid; when I run hive from the command
>>>>>> prompt and run "describe dw_bid", it works.
>>>>>>
>>>>>> I want to join an Avro file (table) in HDFS with this Hive dw_bid
>>>>>> table, and I refer to it as dw_bid from the Spark SQL program;
>>>>>> however, I see
>>>>>>
>>>>>> 15/03/26 00:31:01 INFO HiveMetaStore.audit: ugi=dvasthimal
>>>>>> ip=unknown-ip-addr cmd=get_table : db=default tbl=dw_bid
>>>>>> 15/03/26 00:31:01 ERROR metadata.Hive:
>>>>>> NoSuchObjectException(message:default.dw_bid table not found)
>>>>>> at
>>>>>> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>> at
>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>>
>>>>>>
>>>>>> Code:
>>>>>>
>>>>>>     val successDetail_S1 = sqlContext.avroFile(input)
>>>>>>     successDetail_S1.registerTempTable("sojsuccessevents1")
>>>>>>     val countS1 = sqlContext.sql("select
>>>>>> guid,sessionKey,sessionStartDate,sojDataDate,seqNum,eventTimestamp,siteId,successEventType,sourceType,itemId,"
>>>>>> +
>>>>>>         " shopCartId,b.transaction_Id as
>>>>>> transactionId,offerId,b.bdr_id as
>>>>>> userId,priorPage1SeqNum,priorPage1PageId,exclWMSearchAttemptSeqNum,exclPriorSearchPageId,"
>>>>>> +
>>>>>>         "
>>>>>> exclPriorSearchSeqNum,exclPriorSearchCategory,exclPriorSearchL1,exclPriorSearchL2,currentImpressionId,sourceImpressionId,exclPriorSearchSqr,exclPriorSearchSort,"
>>>>>> +
>>>>>>         " isDuplicate,b.bid_date as
>>>>>> transactionDate,auctionTypeCode,isBin,leafCategoryId,itemSiteId,b.qty_bid
>>>>>> as bidQuantity," +
>>>>>>     " b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as
>>>>>>  bidAmtUsd,offerQuantity,offerAmountUsd,offerCreateDate,buyerSegment,buyerCountryId,sellerId,sellerCountryId,"
>>>>>> +
>>>>>>     " sellerStdLevel,cssSellerLevel,a.experimentChannel" +
>>>>>>     " from sojsuccessevents1 a join dw_bid b " +
>>>>>>     " on a.itemId = b.item_id  and  a.transactionId =
>>>>>>  b.transaction_id " +
>>>>>>     " where b.bid_type_code IN (1,9) AND b.bdr_id > 0 AND (
>>>>>> b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')")
>>>>>>     println("countS1.first:" + countS1.first)
>>>>>>
>>>>>>
>>>>>>
>>>>>> Any suggestions on how to refer to a Hive table from Spark SQL?
>>>>>> --
>>>>>>
>>>>>> Deepak
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Deepak
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Deepak
>>>>
>>>>
>>>
>>
>>
>> --
>> Deepak
>>
>>
>
>
> --
> Deepak
>
>


-- 
Deepak

Re: Hive Table not from from Spark SQL

Posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com.
Stack Trace:

15/03/26 08:25:42 INFO ql.Driver: OK
15/03/26 08:25:42 INFO log.PerfLogger: <PERFLOG method=releaseLocks
from=org.apache.hadoop.hive.ql.Driver>
15/03/26 08:25:42 INFO log.PerfLogger: </PERFLOG method=releaseLocks
start=1427383542966 end=1427383542966 duration=0
from=org.apache.hadoop.hive.ql.Driver>
15/03/26 08:25:42 INFO log.PerfLogger: </PERFLOG method=Driver.run
start=1427383535203 end=1427383542966 duration=7763
from=org.apache.hadoop.hive.ql.Driver>
15/03/26 08:25:42 INFO metastore.HiveMetaStore: 0: get_tables: db=default
pat=.*
15/03/26 08:25:42 INFO HiveMetaStore.audit: ugi=dvasthimal
ip=unknown-ip-addr cmd=get_tables: db=default pat=.*
15/03/26 08:25:43 INFO parse.ParseDriver: Parsing command: insert overwrite
table sojsuccessevents2_spark select
guid,sessionKey,sessionStartDate,sojDataDate,seqNum,eventTimestamp,siteId,successEventType,sourceType,itemId,
shopCartId,b.transaction_Id as transactionId,offerId,b.bdr_id as
userId,priorPage1SeqNum,priorPage1PageId,exclWMSearchAttemptSeqNum,exclPriorSearchPageId,
exclPriorSearchSeqNum,exclPriorSearchCategory,exclPriorSearchL1,exclPriorSearchL2,currentImpressionId,sourceImpressionId,exclPriorSearchSqr,exclPriorSearchSort,
isDuplicate,b.bid_date as
transactionDate,auctionTypeCode,isBin,leafCategoryId,itemSiteId,b.qty_bid
as bidQuantity, b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as
 bidAmtUsd,offerQuantity,offerAmountUsd,offerCreateDate,buyerSegment,buyerCountryId,sellerId,sellerCountryId,
sellerStdLevel,cssSellerLevel,a.experimentChannel from sojsuccessevents1 a
join dw_bid b  on a.itemId = b.item_id  and  a.transactionId =
 b.transaction_id  where b.auct_end_dt >= '2015-02-16' AND b.bid_dt >=
'2015-02-16'  AND b.bid_type_code IN (1,9) AND b.bdr_id > 0 AND (
b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')
15/03/26 08:25:43 INFO parse.ParseDriver: Parse Completed
15/03/26 08:25:43 INFO metastore.HiveMetaStore: 0: get_table : db=default
tbl=sojsuccessevents2_spark
15/03/26 08:25:43 INFO HiveMetaStore.audit: ugi=dvasthimal
ip=unknown-ip-addr cmd=get_table : db=default tbl=sojsuccessevents2_spark
15/03/26 08:25:44 INFO metastore.HiveMetaStore: 0: get_table : db=default
tbl=dw_bid
15/03/26 08:25:44 INFO HiveMetaStore.audit: ugi=dvasthimal
ip=unknown-ip-addr cmd=get_table : db=default tbl=dw_bid
15/03/26 08:25:44 ERROR metadata.Hive:
NoSuchObjectException(message:default.dw_bid table not found)
at
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
at com.sun.proxy.$Proxy31.get_table(Unknown Source)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:997)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at com.sun.proxy.$Proxy32.getTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:976)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:950)
at
org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:180)
at org.apache.spark.sql.hive.HiveContext$$anon$1.org
$apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$lookupRelation(HiveContext.scala:252)
at
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:161)
at
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:161)
at scala.Option.getOrElse(Option.scala:120)
at
org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:161)
at
org.apache.spark.sql.hive.HiveContext$$anon$1.lookupRelation(HiveContext.scala:252)
at
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:175)
at
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:187)
at
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:182)
at
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:187)
at
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:187)
at
org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:50)
at
org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:186)
at
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at
scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at
org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
at
org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
at
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at
scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at
org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
at
org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
at
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at
scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at
org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
at
org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
at
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:194)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:177)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:182)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:172)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
at scala.collection.immutable.List.foldLeft(List.scala:84)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:1071)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:1071)
at org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:1069)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:92)
at com.ebay.ep.poc.spark.reporting.process.service.HadoopSuccessEvents2Service.execute(HadoopSuccessEvents2Service.scala:32)
at com.ebay.ep.poc.spark.reporting.SparkApp$.main(SparkApp.scala:30)
at com.ebay.ep.poc.spark.reporting.SparkApp.main(SparkApp.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480)

15/03/26 08:25:44 ERROR yarn.ApplicationMaster: User class threw exception: no such table List(dw_bid); line 1 pos 843
org.apache.spark.sql.AnalysisException: no such table List(dw_bid); line 1 pos 843
at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:178)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:187)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:182)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:50)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:186)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:192)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:207)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:236)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:194)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:177)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:182)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:172)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
at scala.collection.immutable.List.foldLeft(List.scala:84)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:1071)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:1071)
at org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:1069)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:92)
at com.ebay.ep.poc.spark.reporting.process.service.HadoopSuccessEvents2Service.execute(HadoopSuccessEvents2Service.scala:32)
at com.ebay.ep.poc.spark.reporting.SparkApp$.main(SparkApp.scala:30)
at com.ebay.ep.poc.spark.reporting.SparkApp.main(SparkApp.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480)
15/03/26 08:25:44 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: no such table List(dw_bid); line 1 pos 843)
15/03/26 08:25:44 INFO yarn.ApplicationMaster: Invoking sc stop from shutdown hook


-- 
Deepak

Re: Hive Table not from from Spark SQL

Posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com.
Hello Michael,
Thanks for your time.

1. "show tables" from the Spark program returns nothing.
2. Which entries are you referring to? (I am actually new to Hive as well.)
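
For reference, this is roughly how I am checking it (a minimal sketch,
assuming "sc" is the SparkContext my job already creates):

    import org.apache.spark.sql.hive.HiveContext

    val sqlContext = new HiveContext(sc)
    // Should list every table in the current (default) database that Spark
    // SQL can see; for me it prints nothing, while the hive CLI shows dw_bid.
    sqlContext.sql("show tables").collect().foreach(println)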


-- 
Deepak

Re: Hive Table not from from Spark SQL

Posted by Michael Armbrust <mi...@databricks.com>.
What does "show tables" return?  You can also run "SET <optionName>" to
make sure that entries from you hive site are being read correctly.
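
For example, something like the following (a rough sketch; substitute
whichever properties your hive-site.xml actually sets, such as the metastore
URI or the JDO connection URL):

    // Each "SET <key>" query returns a key=value row showing the
    // configuration value the running context actually picked up.
    sqlContext.sql("SET hive.metastore.uris").collect().foreach(println)
    sqlContext.sql("SET javax.jdo.option.ConnectionURL").collect().foreach(println)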


Re: Hive Table not from from Spark SQL

Posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com.
I have a table dw_bid that was created in Hive and has nothing to do with
Spark.  I have data in Avro that I want to join with the dw_bid table, and
this join needs to be done using Spark SQL.  However, for some reason Spark
says the dw_bid table does not exist. How do I tell Spark that dw_bid is a
table created in Hive, so that it can be read?
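
For context, this is essentially how my program sets things up (a simplified
sketch, not the full job):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("SparkApp"))
    // A HiveContext should read conf/hive-site.xml and talk to the shared
    // Hive metastore, making existing Hive tables visible to Spark SQL.
    val sqlContext = new HiveContext(sc)
    // Works from the hive CLI, but fails here with NoSuchObjectException:
    sqlContext.sql("describe dw_bid").collect().foreach(println)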


Query that is run from Spark SQL
==============================
insert overwrite table sojsuccessevents2_spark
select guid, sessionKey, sessionStartDate, sojDataDate, seqNum, eventTimestamp,
       siteId, successEventType, sourceType, itemId, shopCartId,
       b.transaction_Id as transactionId, offerId, b.bdr_id as userId,
       priorPage1SeqNum, priorPage1PageId, exclWMSearchAttemptSeqNum,
       exclPriorSearchPageId, exclPriorSearchSeqNum, exclPriorSearchCategory,
       exclPriorSearchL1, exclPriorSearchL2, currentImpressionId,
       sourceImpressionId, exclPriorSearchSqr, exclPriorSearchSort, isDuplicate,
       b.bid_date as transactionDate, auctionTypeCode, isBin, leafCategoryId,
       itemSiteId, b.qty_bid as bidQuantity,
       b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as bidAmtUsd,
       offerQuantity, offerAmountUsd, offerCreateDate, buyerSegment,
       buyerCountryId, sellerId, sellerCountryId, sellerStdLevel,
       cssSellerLevel, a.experimentChannel
from sojsuccessevents1 a
join dw_bid b
  on a.itemId = b.item_id and a.transactionId = b.transaction_id
where b.auct_end_dt >= '2015-02-16'
  and b.bid_dt >= '2015-02-16'
  and b.bid_type_code IN (1,9)
  and b.bdr_id > 0
  and (b.bid_flags & 32) = 0
  and lower(a.successEventType) IN ('bid','bin')


If I create sojsuccessevents2_spark from the hive command line and then run
the above command from the Spark SQL program, I get the error
"sojsuccessevents2_spark table not found".

Hence I dropped the table in Hive and ran create table
sojsuccessevents2_spark from Spark SQL before running the above command, and
it works until it hits the next road block: "dw_bid table not found".

This makes me believe that Spark for some reason is not able to
read/understand tables created outside Spark. I did copy
/apache/hive/conf/hive-site.xml into the Spark conf directory.

Please suggest.


Logs
———
15/03/26 03:50:40 INFO HiveMetaStore.audit: ugi=dvasthimal ip=unknown-ip-addr cmd=get_table : db=default tbl=dw_bid
15/03/26 03:50:40 ERROR metadata.Hive: NoSuchObjectException(message:default.dw_bid table not found)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)

15/03/26 03:50:40 ERROR yarn.ApplicationMaster: User class threw exception: no such table List(dw_bid); line 1 pos 843
org.apache.spark.sql.AnalysisException: no such table List(dw_bid); line 1 pos 843
at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:178)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:187)



Regards,
Deepak


-- 
Deepak

Re: Hive Table not from from Spark SQL

Posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com.
I have this query:

insert overwrite table sojsuccessevents2_spark
select guid, sessionKey, sessionStartDate, sojDataDate, seqNum, eventTimestamp,
       siteId, successEventType, sourceType, itemId, shopCartId,
       b.transaction_Id as transactionId, offerId, b.bdr_id as userId,
       priorPage1SeqNum, priorPage1PageId, exclWMSearchAttemptSeqNum,
       exclPriorSearchPageId, exclPriorSearchSeqNum, exclPriorSearchCategory,
       exclPriorSearchL1, exclPriorSearchL2, currentImpressionId,
       sourceImpressionId, exclPriorSearchSqr, exclPriorSearchSort, isDuplicate,
       b.bid_date as transactionDate, auctionTypeCode, isBin, leafCategoryId,
       itemSiteId, b.qty_bid as bidQuantity,
       b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as bidAmtUsd,
       offerQuantity, offerAmountUsd, offerCreateDate, buyerSegment,
       buyerCountryId, sellerId, sellerCountryId, sellerStdLevel,
       cssSellerLevel, a.experimentChannel
from sojsuccessevents1 a
join dw_bid b
  on a.itemId = b.item_id and a.transactionId = b.transaction_id
where b.auct_end_dt >= '2015-02-16'
  and b.bid_dt >= '2015-02-16'
  and b.bid_type_code IN (1,9)
  and b.bdr_id > 0
  and (b.bid_flags & 32) = 0
  and lower(a.successEventType) IN ('bid','bin')


If I create sojsuccessevents2_spark from the hive command line and then run
the above command from the Spark SQL program, I get the error
"sojsuccessevents2_spark table not found".

Hence I dropped the table in Hive and ran create table
sojsuccessevents2_spark from Spark SQL before running the above command, and
it works until it hits the next road block: "dw_bid table not found".

This makes me believe that Spark for some reason is not able to
read/understand tables created outside Spark. I did copy
/apache/hive/conf/hive-site.xml into the Spark conf directory.
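
One thing I can still try (a hedged sketch; the thrift URI below is only a
placeholder for whatever our hive-site.xml actually points at) is to set the
metastore location on the context explicitly, in case the file is not being
picked up by the driver:

    // Placeholder host/port -- replace with the real metastore URI.
    sqlContext.setConf("hive.metastore.uris", "thrift://metastore-host:9083")
    sqlContext.sql("show tables").collect().foreach(println)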

Please suggest.

Regards,
Deepak


-- 
Deepak