Posted to common-user@hadoop.apache.org by Divya Gehlot <di...@gmail.com> on 2016/03/01 05:27:53 UTC

[ERROR]: Spark 1.5.2 + Hbase 1.1 + Hive 1.2 + HbaseIntegration

Hi,
I am getting an error when I try to query a Hive table (created through the
HBase integration) from Spark.

Steps I followed:
*Hive table creation code*:
CREATE EXTERNAL TABLE IF NOT EXISTS TEST(NAME STRING,AGE INT)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,0:AGE")
TBLPROPERTIES ("hbase.table.name" = "TEST",
"hbase.mapred.output.outputtable" = "TEST");


*DESCRIBE TEST ;*
col_name    data_type    comment
name        string       from deserializer
age         int          from deserializer


*Spark Code :*
import org.apache.spark._
import org.apache.spark.sql._

val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
hiveContext.sql("from TEST SELECT  NAME").collect.foreach(println)


*Starting Spark shell*
spark-shell --jars
/usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar
--driver-class-path
/usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar
--packages com.databricks:spark-csv_2.10:1.3.0  --master yarn-client -i
/TestDivya/Spark/InstrumentCopyToHDFSHive.scala

*Stack trace*:

> SQL context available as sqlContext.
> Loading /TestDivya/Spark/InstrumentCopyToHDFSHive.scala...
> import org.apache.spark._
> import org.apache.spark.sql._
> 16/02/29 23:09:29 INFO HiveContext: Initializing execution hive, version
> 1.2.1
> 16/02/29 23:09:29 INFO ClientWrapper: Inspected Hadoop version:
> 2.7.1.2.3.4.0-3485
> 16/02/29 23:09:29 INFO ClientWrapper: Loaded
> org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version
> 2.7.1.2.3.4.0-3485
> 16/02/29 23:09:29 INFO HiveContext: default warehouse location is
> /user/hive/warehouse
> 16/02/29 23:09:29 INFO HiveContext: Initializing HiveMetastoreConnection
> version 1.2.1 using Spark classes.
> 16/02/29 23:09:29 INFO ClientWrapper: Inspected Hadoop version:
> 2.7.1.2.3.4.0-3485
> 16/02/29 23:09:29 INFO ClientWrapper: Loaded
> org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version
> 2.7.1.2.3.4.0-3485
> 16/02/29 23:09:30 WARN NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 16/02/29 23:09:30 INFO metastore: Trying to connect to metastore with URI
> thrift://ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal:9083
> 16/02/29 23:09:30 INFO metastore: Connected to metastore.
> 16/02/29 23:09:30 WARN DomainSocketFactory: The short-circuit local reads
> feature cannot be used because libhadoop cannot be loaded.
> 16/02/29 23:09:31 INFO SessionState: Created local directory:
> /tmp/1bf53785-f7c8-406d-a733-a5858ccb2d16_resources
> 16/02/29 23:09:31 INFO SessionState: Created HDFS directory:
> /tmp/hive/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16
> 16/02/29 23:09:31 INFO SessionState: Created local directory:
> /tmp/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16
> 16/02/29 23:09:31 INFO SessionState: Created HDFS directory:
> /tmp/hive/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16/_tmp_space.db
> hiveContext: org.apache.spark.sql.hive.HiveContext =
> org.apache.spark.sql.hive.HiveContext@10b14f32
> 16/02/29 23:09:32 INFO ParseDriver: Parsing command: from TEST SELECT  NAME
> 16/02/29 23:09:32 INFO ParseDriver: Parse Completed
> 16/02/29 23:09:33 INFO deprecation: mapred.map.tasks is deprecated.
> Instead, use mapreduce.job.maps
> 16/02/29 23:09:33 INFO MemoryStore: ensureFreeSpace(468352) called with
> curMem=0, maxMem=556038881
> 16/02/29 23:09:33 INFO MemoryStore: Block broadcast_0 stored as values in
> memory (estimated size 457.4 KB, free 529.8 MB)
> 16/02/29 23:09:33 INFO MemoryStore: ensureFreeSpace(49454) called with
> curMem=468352, maxMem=556038881
> 16/02/29 23:09:33 INFO MemoryStore: Block broadcast_0_piece0 stored as
> bytes in memory (estimated size 48.3 KB, free 529.8 MB)
> 16/02/29 23:09:33 INFO BlockManagerInfo: Added broadcast_0_piece0 in
> memory on xxx.xx.xx.xxx:37784 (size: 48.3 KB, free: 530.2 MB)
> 16/02/29 23:09:33 INFO SparkContext: Created broadcast 0 from collect at
> <console>:30
> 16/02/29 23:09:34 INFO HBaseStorageHandler: Configuring input job
> properties
> 16/02/29 23:09:34 INFO RecoverableZooKeeper: Process
> identifier=hconnection-0x26fa89a2 connecting to ZooKeeper
> ensemble=localhost:2181
> 16/02/29 23:09:34 INFO ZooKeeper: Client
> environment:zookeeper.version=3.4.6-3485--1, built on 12/16/2015 02:35 GMT
> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:host.name
> =ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal
> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.version=1.7.0_67
> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.vendor=Oracle
> Corporation
> 16/02/29 23:09:34 INFO ZooKeeper: Client
> environment:java.home=/usr/jdk64/jdk1.7.0_67/jre
> 16/02/29 23:09:34 INFO ZooKeeper: Client
> environment:java.class.path=/usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar:/usr/hdp/current/spark-thriftserver/conf/:/usr/hdp/2.3.4.0-3485/spark/lib/spark-assembly-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-api-jdo-3.2.6.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-core-3.2.10.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-rdbms-3.2.9.jar:/usr/hdp/current/hadoop-client/conf/
> 16/02/29 23:09:34 INFO ZooKeeper: Client
> environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.io.tmpdir=/tmp
> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.compiler=<NA>
> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:os.name=Linux
> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:os.arch=amd64
> 16/02/29 23:09:34 INFO ZooKeeper: Client
> environment:os.version=3.10.0-229.el7.x86_64
> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:user.name=hdfs
> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:user.home=/home/hdfs
> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:user.dir=/home/hdfs
> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection,
> connectString=localhost:2181 sessionTimeout=90000
> watcher=hconnection-0x26fa89a20x0, quorum=localhost:2181, baseZNode=/hbase
> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server
> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL
> (unknown error)
> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established to
> localhost/0:0:0:0:0:0:0:1:2181, initiating session
> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete on
> server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x3532fb70ba20034,
> negotiated timeout = 40000
> 16/02/29 23:09:34 WARN TableInputFormatBase: You are using an HTable
> instance that relies on an HBase-managed Connection. This is usually due to
> directly creating an HTable, which is deprecated. Instead, you should
> create a Connection object and then request a Table instance from it. If
> you don't need the Table instance for your own use, you should instead use
> the TableInputFormatBase.initalizeTable method directly.
> 16/02/29 23:09:34 INFO TableInputFormatBase: Creating an additional
> unmanaged connection because user provided one can't be used for
> administrative actions. We'll close it when we close out the table.
> 16/02/29 23:09:34 INFO RecoverableZooKeeper: Process
> identifier=hconnection-0x6fd74d35 connecting to ZooKeeper
> ensemble=localhost:2181
> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection,
> connectString=localhost:2181 sessionTimeout=90000
> watcher=hconnection-0x6fd74d350x0, quorum=localhost:2181, baseZNode=/hbase
> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server
> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL
> (unknown error)
> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established to
> localhost/0:0:0:0:0:0:0:1:2181, initiating session
> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete on
> server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x3532fb70ba20035,
> negotiated timeout = 40000
> 16/02/29 23:09:34 INFO RegionSizeCalculator: Calculating region sizes for
> table "TEST".
> 16/02/29 23:10:23 INFO RpcRetryingCaller: Call exception, tries=10,
> retries=35, started=48318 ms ago, cancelled=false, msg=
> 16/02/29 23:10:43 INFO RpcRetryingCaller: Call exception, tries=11,
> retries=35, started=68524 ms ago, cancelled=false, msg=
> 16/02/29 23:11:03 INFO RpcRetryingCaller: Call exception, tries=12,
> retries=35, started=88617 ms ago, cancelled=false, msg=
> 16/02/29 23:11:23 INFO RpcRetryingCaller: Call exception, tries=13,
> retries=35, started=108676 ms ago, cancelled=false, msg=
> 16/02/29 23:11:43 INFO RpcRetryingCaller: Call exception, tries=14,
> retries=35, started=128747 ms ago, cancelled=false, msg=
> 16/02/29 23:12:03 INFO RpcRetryingCaller: Call exception, tries=15,
> retries=35, started=148938 ms ago, cancelled=false, msg=
> 16/02/29 23:12:23 INFO RpcRetryingCaller: Call exception, tries=16,
> retries=35, started=168942 ms ago, cancelled=false, msg=
> 16/02/29 23:12:43 INFO RpcRetryingCaller: Call exception, tries=17,
> retries=35, started=188975 ms ago, cancelled=false, msg=
> Trace :



Could somebody help me resolve this error?
I would really appreciate the help.


Thanks,
Divya

Re: [ERROR]: Spark 1.5.2 + Hbase 1.1 + Hive 1.2 + HbaseIntegration

Posted by Teng Qiu <te...@gmail.com>.
Also make sure that hbase-site.xml is on the classpath on all nodes, master
and workers, and on the client as well.

Normally I put it into $SPARK_HOME/conf/, so the Spark cluster starts up
with this conf file in place.
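
For example, something like this (just a sketch, not verified on your
cluster; /etc/hbase/conf/hbase-site.xml is the usual HDP location, adjust
the path to your layout):

# copy HBase's client config into Spark's conf dir on every node
cp /etc/hbase/conf/hbase-site.xml $SPARK_HOME/conf/

# or ship it per job and put its directory on the driver classpath
spark-shell --files /etc/hbase/conf/hbase-site.xml \
  --driver-class-path /etc/hbase/conf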

BTW @Ted, did you try inserting into an HBase table with Spark's
HiveContext? I ran into this issue:
https://issues.apache.org/jira/browse/SPARK-6628

There is a patch available: https://issues.apache.org/jira/browse/HIVE-11166


Re: [ERROR]: Spark 1.5.2 + Hbase 1.1 + Hive 1.2 + HbaseIntegration

Posted by Ted Yu <yu...@gmail.com>.
16/03/01 01:36:31 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0,
ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal):
java.lang.RuntimeException: hbase-default.xml file seems to be for an older
version of HBase (null), this version is 1.1.2.2.3.4.0-3485

The above was likely caused by some component being built against a
different release of HBase.

Try setting "hbase.defaults.for.version.skip" to true.
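
For reference, a minimal hbase-site.xml entry for that (assuming you edit
the file by hand rather than through Ambari):

<property>
  <name>hbase.defaults.for.version.skip</name>
  <value>true</value>
</property>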

Cheers


Re: [ERROR]: Spark 1.5.2 + Hbase 1.1 + Hive 1.2 + HbaseIntegration

Posted by Ted Yu <yu...@gmail.com>.
16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection,
connectString=localhost:2181 sessionTimeout=90000
watcher=hconnection-0x26fa89a20x0, quorum=localhost:2181, baseZNode=/hbase

Since baseZNode didn't match what you set in hbase-site.xml, the likely
cause is that hbase-site.xml was inaccessible to your Spark job.

Please add it to your classpath.
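
For example (a sketch; adjust /etc/hbase/conf to wherever your
hbase-site.xml lives):

spark-shell --master yarn-client \
  --driver-class-path /etc/hbase/conf \
  --conf spark.executor.extraClassPath=/etc/hbase/conf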


Re: [ERROR]: Spark 1.5.2 + Hbase 1.1 + Hive 1.2 + HbaseIntegration

Posted by Ted Yu <yu...@gmail.com>.
16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server
localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL
(unknown error)

Is your cluster a secure cluster?

bq. Trace :

Was there any output after 'Trace :'?

Was hbase-site.xml accessible to your Spark job?

Thanks

>> retries=35, started=168942 ms ago, cancelled=false, msg=
>> 16/02/29 23:12:43 INFO RpcRetryingCaller: Call exception, tries=17,
>> retries=35, started=188975 ms ago, cancelled=false, msg=
>> Trace :
>
>
>
> Could somebody help me in resolving this error?
> Would really appreciate the help.
>
>
> Thanks,
> Divya
>
>
>
>
>
>

Re: [ERROR]: Spark 1.5.2 + Hbase 1.1 + Hive 1.2 + HbaseIntegration

Posted by Teng Qiu <te...@gmail.com>.
Forwarding you these mails; hope they can help you. You can take a look
at this post: http://www.abcn.net/2014/07/lighting-spark-with-hbase-full-edition.html

2016-03-04 3:30 GMT+01:00 Divya Gehlot <di...@gmail.com>:
> Hi Teng,
>
> Thanks for the link you shared; it helped me figure out the missing
> dependency.
> I was missing hbase-hadoop-compat.jar.
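
A quick way to confirm that jar is now visible, sketched here as an
illustrative spark-shell check (CompatibilityFactory is one of the classes
that ships in the hbase-hadoop-compat module):

// Throws ClassNotFoundException if hbase-hadoop-compat.jar is still
// missing from the driver classpath.
Class.forName("org.apache.hadoop.hbase.CompatibilityFactory")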
>
>
>
>
>
> Thanks a lot,
>
> Divya
>
> On 2 March 2016 at 17:05, Teng Qiu <te...@gmail.com> wrote:
>>
>> Hi, maybe the dependencies described in
>> http://www.abcn.net/2014/07/lighting-spark-with-hbase-full-edition.html
>> can help; add the hive-hbase-handler jar as well for Hive integration in
>> Spark.
>>
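
A related sanity check, sketched below: HBaseStorageHandler is the class the
DDL above names, so if loading it fails, the hive-hbase-handler jar is not on
the classpath.

// Throws ClassNotFoundException if hive-hbase-handler.jar is missing.
Class.forName("org.apache.hadoop.hive.hbase.HBaseStorageHandler")
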
>> 2016-03-02 2:19 GMT+01:00 Divya Gehlot <di...@gmail.com>:
>> > Hello Teng,
>> > As you can see in the chain email,
>> > I am facing lots of issues while trying to connect to the HBase-registered
>> > Hive table.
>> > Could you please help me with the list of jars which need to be placed in
>> > the Spark classpath?
>> > Would be very grateful if you could send me the steps to follow.
>> > Would really appreciate the help.
>> > Thanks,
>> > Divya
>> >
>> > On Mar 2, 2016 4:50 AM, "Teng Qiu" <te...@gmail.com> wrote:
>> >>
>> >> And also make sure that hbase-site.xml is on your classpath on all
>> >> nodes, both master and workers, and also the client.
>> >>
>> >> Normally I put it into $SPARK_HOME/conf/; then the Spark cluster will
>> >> be started with this conf file.
>> >>
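
If dropping hbase-site.xml into $SPARK_HOME/conf/ is not an option, a rough
alternative is sketched below with placeholder quorum hosts and znode parent;
the property names are standard HBase client keys, though whether values set
this way reach the HBase storage handler depends on the deployment.

// Minimal sketch: set the ZooKeeper connection details directly on the
// HiveContext before querying (zk1..zk3 and /hbase-unsecure are placeholders).
hiveContext.setConf("hbase.zookeeper.quorum", "zk1.example.com,zk2.example.com,zk3.example.com")
hiveContext.setConf("hbase.zookeeper.property.clientPort", "2181")
hiveContext.setConf("zookeeper.znode.parent", "/hbase-unsecure")
hiveContext.sql("from TEST SELECT NAME").collect.foreach(println)
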
>> >> Btw. @Ted, did you try inserting into an HBase table with Spark's
>> >> HiveContext? I got this issue:
>> >> https://issues.apache.org/jira/browse/SPARK-6628
>> >>
>> >> and there is a patch available:
>> >> https://issues.apache.org/jira/browse/HIVE-11166
>> >>
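
For context, the write path SPARK-6628 is about looks roughly like the
sketch below; SOME_SOURCE is a hypothetical existing Hive table, and on
Spark 1.5.x this may fail without the HIVE-11166 patch.

// Hypothetical insert into the HBase-backed Hive table via HiveContext.
hiveContext.sql("INSERT INTO TABLE TEST SELECT NAME, AGE FROM SOME_SOURCE")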
>> >>
>> >> 2016-03-01 15:16 GMT+01:00 Ted Yu <yu...@gmail.com>:
>> >> > 16/03/01 01:36:31 WARN TaskSetManager: Lost task 0.0 in stage 0.0
>> >> > (TID
>> >> > 0,
>> >> > ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal):
>> >> > java.lang.RuntimeException: hbase-default.xml file seems to be for an
>> >> > older
>> >> > version of HBase (null), this version is 1.1.2.2.3.4.0-3485
>> >> >
>> >> > The above was likely caused by some component being built with a
>> >> > different release of HBase.
>> >> >
>> >> > Try setting "hbase.defaults.for.version.skip" to true.
>> >> >
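
One way to try that setting from the spark-shell is sketched below; the
property name is the real HBase key, but whether setting it on the driver's
Hadoop configuration alone is enough (the executors must see it too) is an
assumption to verify.

// Sketch: skip the hbase-default.xml version check on this configuration.
sc.hadoopConfiguration.set("hbase.defaults.for.version.skip", "true")
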
>> >> > Cheers
>> >> >
>> >> >
>> >> > On Mon, Feb 29, 2016 at 9:12 PM, Ted Yu <yu...@gmail.com> wrote:
>> >> >>
>> >> >> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection,
>> >> >> connectString=localhost:2181 sessionTimeout=90000
>> >> >> watcher=hconnection-0x26fa89a20x0, quorum=localhost:2181,
>> >> >> baseZNode=/hbase
>> >> >>
>> >> >> Since baseZNode didn't match what you set in hbase-site.xml, the cause
>> >> >> was likely that hbase-site.xml was inaccessible to your Spark job.
>> >> >>
>> >> >> Please add it to your classpath.
>> >> >>
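
A quick sanity check for this, runnable in the same spark-shell session, is
sketched below; getResource returns null when hbase-site.xml is not on the
driver classpath.

// Sketch: print where (or whether) the driver can see hbase-site.xml.
val url = getClass.getClassLoader.getResource("hbase-site.xml")
println(if (url == null) "hbase-site.xml NOT on the classpath" else s"found: $url")
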
>> >> >> On Mon, Feb 29, 2016 at 8:42 PM, Ted Yu <yu...@gmail.com> wrote:
>> >> >>>
>> >> >>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to
>> >> >>> server
>> >> >>> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate
>> >> >>> using
>> >> >>> SASL
>> >> >>> (unknown error)
>> >> >>>
>> >> >>> Is your cluster a secure cluster?
>> >> >>>
>> >> >>> bq. Trace :
>> >> >>>
>> >> >>> Was there any output after 'Trace :'?
>> >> >>>
>> >> >>> Was hbase-site.xml accessible to your Spark job?
>> >> >>>
>> >> >>> Thanks
>> >> >>>
>> >> >>> On Mon, Feb 29, 2016 at 8:27 PM, Divya Gehlot
>> >> >>> <di...@gmail.com>
>> >> >>> wrote:
>> >> >>>>
>> >> >>>> Hi,
>> >> >>>> I am getting an error when I am trying to connect to a Hive table
>> >> >>>> (which is being created through HbaseIntegration) in Spark.
>> >> >>>>
>> >> >>>> Steps I followed :
>> >> >>>> Hive Table creation code  :
>> >> >>>> CREATE EXTERNAL TABLE IF NOT EXISTS TEST(NAME STRING,AGE INT)
>> >> >>>> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>> >> >>>> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,0:AGE")
>> >> >>>> TBLPROPERTIES ("hbase.table.name" = "TEST",
>> >> >>>> "hbase.mapred.output.outputtable" = "TEST");
>> >> >>>>
>> >> >>>>
>> >> >>>> DESCRIBE TEST ;
>> >> >>>> col_name    data_type    comment
>> >> >>>> name            string         from deserializer
>> >> >>>> age               int             from deserializer
>> >> >>>>
>> >> >>>>
>> >> >>>> Spark Code :
>> >> >>>> import org.apache.spark._
>> >> >>>> import org.apache.spark.sql._
>> >> >>>>
>> >> >>>> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>> >> >>>> hiveContext.sql("from TEST SELECT  NAME").collect.foreach(println)
>> >> >>>>
>> >> >>>>
>> >> >>>> Starting Spark shell
>> >> >>>> spark-shell --jars
>> >> >>>>
>> >> >>>>
>> >> >>>> /usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar
>> >> >>>> --driver-class-path
>> >> >>>>
>> >> >>>>
>> >> >>>> /usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar
>> >> >>>> --packages com.databricks:spark-csv_2.10:1.3.0  --master
>> >> >>>> yarn-client
>> >> >>>> -i
>> >> >>>> /TestDivya/Spark/InstrumentCopyToHDFSHive.scala
>> >> >>>>
>> >> >>>> Stack Trace :
>> >> >>>>
>> >> >>>>> Stack SQL context available as sqlContext.
>> >> >>>>> Loading /TestDivya/Spark/InstrumentCopyToHDFSHive.scala...
>> >> >>>>> import org.apache.spark._
>> >> >>>>> import org.apache.spark.sql._
>> >> >>>>> 16/02/29 23:09:29 INFO HiveContext: Initializing execution hive,
>> >> >>>>> version 1.2.1
>> >> >>>>> 16/02/29 23:09:29 INFO ClientWrapper: Inspected Hadoop version:
>> >> >>>>> 2.7.1.2.3.4.0-3485
>> >> >>>>> 16/02/29 23:09:29 INFO ClientWrapper: Loaded
>> >> >>>>> org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version
>> >> >>>>> 2.7.1.2.3.4.0-3485
>> >> >>>>> 16/02/29 23:09:29 INFO HiveContext: default warehouse location is
>> >> >>>>> /user/hive/warehouse
>> >> >>>>> 16/02/29 23:09:29 INFO HiveContext: Initializing
>> >> >>>>> HiveMetastoreConnection version 1.2.1 using Spark classes.
>> >> >>>>> 16/02/29 23:09:29 INFO ClientWrapper: Inspected Hadoop version:
>> >> >>>>> 2.7.1.2.3.4.0-3485
>> >> >>>>> 16/02/29 23:09:29 INFO ClientWrapper: Loaded
>> >> >>>>> org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version
>> >> >>>>> 2.7.1.2.3.4.0-3485
>> >> >>>>> 16/02/29 23:09:30 WARN NativeCodeLoader: Unable to load
>> >> >>>>> native-hadoop
>> >> >>>>> library for your platform... using builtin-java classes where
>> >> >>>>> applicable
>> >> >>>>> 16/02/29 23:09:30 INFO metastore: Trying to connect to metastore
>> >> >>>>> with
>> >> >>>>> URI
>> >> >>>>> thrift://ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal:9083
>> >> >>>>> 16/02/29 23:09:30 INFO metastore: Connected to metastore.
>> >> >>>>> 16/02/29 23:09:30 WARN DomainSocketFactory: The short-circuit
>> >> >>>>> local
>> >> >>>>> reads feature cannot be used because libhadoop cannot be loaded.
>> >> >>>>> 16/02/29 23:09:31 INFO SessionState: Created local directory:
>> >> >>>>> /tmp/1bf53785-f7c8-406d-a733-a5858ccb2d16_resources
>> >> >>>>> 16/02/29 23:09:31 INFO SessionState: Created HDFS directory:
>> >> >>>>> /tmp/hive/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16
>> >> >>>>> 16/02/29 23:09:31 INFO SessionState: Created local directory:
>> >> >>>>> /tmp/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16
>> >> >>>>> 16/02/29 23:09:31 INFO SessionState: Created HDFS directory:
>> >> >>>>> /tmp/hive/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16/_tmp_space.db
>> >> >>>>> hiveContext: org.apache.spark.sql.hive.HiveContext =
>> >> >>>>> org.apache.spark.sql.hive.HiveContext@10b14f32
>> >> >>>>> 16/02/29 23:09:32 INFO ParseDriver: Parsing command: from TEST
>> >> >>>>> SELECT
>> >> >>>>> NAME
>> >> >>>>> 16/02/29 23:09:32 INFO ParseDriver: Parse Completed
>> >> >>>>> 16/02/29 23:09:33 INFO deprecation: mapred.map.tasks is
>> >> >>>>> deprecated.
>> >> >>>>> Instead, use mapreduce.job.maps
>> >> >>>>> 16/02/29 23:09:33 INFO MemoryStore: ensureFreeSpace(468352)
>> >> >>>>> called
>> >> >>>>> with
>> >> >>>>> curMem=0, maxMem=556038881
>> >> >>>>> 16/02/29 23:09:33 INFO MemoryStore: Block broadcast_0 stored as
>> >> >>>>> values
>> >> >>>>> in memory (estimated size 457.4 KB, free 529.8 MB)
>> >> >>>>> 16/02/29 23:09:33 INFO MemoryStore: ensureFreeSpace(49454) called
>> >> >>>>> with
>> >> >>>>> curMem=468352, maxMem=556038881
>> >> >>>>> 16/02/29 23:09:33 INFO MemoryStore: Block broadcast_0_piece0
>> >> >>>>> stored
>> >> >>>>> as
>> >> >>>>> bytes in memory (estimated size 48.3 KB, free 529.8 MB)
>> >> >>>>> 16/02/29 23:09:33 INFO BlockManagerInfo: Added broadcast_0_piece0
>> >> >>>>> in
>> >> >>>>> memory on xxx.xx.xx.xxx:37784 (size: 48.3 KB, free: 530.2 MB)
>> >> >>>>> 16/02/29 23:09:33 INFO SparkContext: Created broadcast 0 from
>> >> >>>>> collect
>> >> >>>>> at <console>:30
>> >> >>>>> 16/02/29 23:09:34 INFO HBaseStorageHandler: Configuring input job
>> >> >>>>> properties
>> >> >>>>> 16/02/29 23:09:34 INFO RecoverableZooKeeper: Process
>> >> >>>>> identifier=hconnection-0x26fa89a2 connecting to ZooKeeper
>> >> >>>>> ensemble=localhost:2181
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:zookeeper.version=3.4.6-3485--1, built on 12/16/2015
>> >> >>>>> 02:35 GMT
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> environment:host.name=ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:java.version=1.7.0_67
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:java.vendor=Oracle
>> >> >>>>> Corporation
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:java.home=/usr/jdk64/jdk1.7.0_67/jre
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> environment:java.class.path=/usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar:/usr/hdp/current/spark-thriftserver/conf/:/usr/hdp/2.3.4.0-3485/spark/lib/spark-assembly-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-api-jdo-3.2.6.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-core-3.2.10.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-rdbms-3.2.9.jar:/usr/hdp/current/hadoop-client/conf/
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:java.io.tmpdir=/tmp
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:java.compiler=<NA>
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:os.name=Linux
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:os.arch=amd64
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:os.version=3.10.0-229.el7.x86_64
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:user.name=hdfs
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:user.home=/home/hdfs
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> >> >>>>> environment:user.dir=/home/hdfs
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection,
>> >> >>>>> connectString=localhost:2181 sessionTimeout=90000
>> >> >>>>> watcher=hconnection-0x26fa89a20x0, quorum=localhost:2181,
>> >> >>>>> baseZNode=/hbase
>> >> >>>>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to
>> >> >>>>> server
>> >> >>>>> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate
>> >> >>>>> using SASL
>> >> >>>>> (unknown error)
>> >> >>>>> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established
>> >> >>>>> to
>> >> >>>>> localhost/0:0:0:0:0:0:0:1:2181, initiating session
>> >> >>>>> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete
>> >> >>>>> on
>> >> >>>>> server localhost/0:0:0:0:0:0:0:1:2181, sessionid =
>> >> >>>>> 0x3532fb70ba20034,
>> >> >>>>> negotiated timeout = 40000
>> >> >>>>> 16/02/29 23:09:34 WARN TableInputFormatBase: You are using an
>> >> >>>>> HTable
>> >> >>>>> instance that relies on an HBase-managed Connection. This is
>> >> >>>>> usually
>> >> >>>>> due to
>> >> >>>>> directly creating an HTable, which is deprecated. Instead, you
>> >> >>>>> should create
>> >> >>>>> a Connection object and then request a Table instance from it. If
>> >> >>>>> you don't
>> >> >>>>> need the Table instance for your own use, you should instead use
>> >> >>>>> the
>> >> >>>>> TableInputFormatBase.initalizeTable method directly.
>> >> >>>>> 16/02/29 23:09:34 INFO TableInputFormatBase: Creating an
>> >> >>>>> additional
>> >> >>>>> unmanaged connection because user provided one can't be used for
>> >> >>>>> administrative actions. We'll close it when we close out the
>> >> >>>>> table.
>> >> >>>>> 16/02/29 23:09:34 INFO RecoverableZooKeeper: Process
>> >> >>>>> identifier=hconnection-0x6fd74d35 connecting to ZooKeeper
>> >> >>>>> ensemble=localhost:2181
>> >> >>>>> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection,
>> >> >>>>> connectString=localhost:2181 sessionTimeout=90000
>> >> >>>>> watcher=hconnection-0x6fd74d350x0, quorum=localhost:2181,
>> >> >>>>> baseZNode=/hbase
>> >> >>>>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to
>> >> >>>>> server
>> >> >>>>> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate
>> >> >>>>> using SASL
>> >> >>>>> (unknown error)
>> >> >>>>> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established
>> >> >>>>> to
>> >> >>>>> localhost/0:0:0:0:0:0:0:1:2181, initiating session
>> >> >>>>> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete
>> >> >>>>> on
>> >> >>>>> server localhost/0:0:0:0:0:0:0:1:2181, sessionid =
>> >> >>>>> 0x3532fb70ba20035,
>> >> >>>>> negotiated timeout = 40000
>> >> >>>>> 16/02/29 23:09:34 INFO RegionSizeCalculator: Calculating region
>> >> >>>>> sizes
>> >> >>>>> for table "TEST".
>> >> >>>>> 16/02/29 23:10:23 INFO RpcRetryingCaller: Call exception,
>> >> >>>>> tries=10,
>> >> >>>>> retries=35, started=48318 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:10:43 INFO RpcRetryingCaller: Call exception,
>> >> >>>>> tries=11,
>> >> >>>>> retries=35, started=68524 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:11:03 INFO RpcRetryingCaller: Call exception,
>> >> >>>>> tries=12,
>> >> >>>>> retries=35, started=88617 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:11:23 INFO RpcRetryingCaller: Call exception,
>> >> >>>>> tries=13,
>> >> >>>>> retries=35, started=108676 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:11:43 INFO RpcRetryingCaller: Call exception,
>> >> >>>>> tries=14,
>> >> >>>>> retries=35, started=128747 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:12:03 INFO RpcRetryingCaller: Call exception,
>> >> >>>>> tries=15,
>> >> >>>>> retries=35, started=148938 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:12:23 INFO RpcRetryingCaller: Call exception,
>> >> >>>>> tries=16,
>> >> >>>>> retries=35, started=168942 ms ago, cancelled=false, msg=
>> >> >>>>> 16/02/29 23:12:43 INFO RpcRetryingCaller: Call exception,
>> >> >>>>> tries=17,
>> >> >>>>> retries=35, started=188975 ms ago, cancelled=false, msg=
>> >> >>>>> Trace :
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>> Could somebody help me in resolving this error?
>> >> >>>> Would really appreciate the help.
>> >> >>>>
>> >> >>>>
>> >> >>>> Thanks,
>> >> >>>> Divya
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>
>> >> >>
>> >> >
>
>

2016-04-08 8:45 GMT+02:00 Wojciech Indyk <wo...@gmail.com>:
> Hello Divya!
> Have you solved the problem?
> I suppose the log comes from the driver. You also need to look at the logs
> on the worker JVMs; there can be an exception or something there.
> Do you have Kerberos on your cluster? It could be similar to the problem in
> http://issues.apache.org/jira/browse/SPARK-14115
>
> Based on your logs:
>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server
>> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL
>> (unknown error)
>> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established to
>> localhost/0:0:0:0:0:0:0:1:2181, initiating session
>> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete on
>> server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x3532fb70ba20035,
>
> Maybe there is a problem with the RPC calls to the regions going over IPv6
> (but I am just guessing).
>
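
That guess can be checked from the spark-shell; the log above shows
"localhost" resolving to the IPv6 loopback (0:0:0:0:0:0:0:1), and the sketch
below prints what the JVM resolves it to. If IPv6 turns out to be the issue,
the standard JVM flag -Djava.net.preferIPv4Stack=true can be passed through
the driver and executor Java options.

// Sketch: print every address this JVM resolves "localhost" to.
import java.net.InetAddress
InetAddress.getAllByName("localhost").foreach(a => println(a.getHostAddress))
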
> --
> Kind regards/ Pozdrawiam,
> Wojciech Indyk
> http://datacentric.pl
>
>
> 2016-03-01 5:27 GMT+01:00 Divya Gehlot <di...@gmail.com>:
>> Hi,
>> I am getting an error when I am trying to connect to a Hive table (which is
>> being created through HbaseIntegration) in Spark.
>>
>> Steps I followed :
>> *Hive Table creation code  *:
>> CREATE EXTERNAL TABLE IF NOT EXISTS TEST(NAME STRING,AGE INT)
>> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,0:AGE")
>> TBLPROPERTIES ("hbase.table.name" = "TEST",
>> "hbase.mapred.output.outputtable" = "TEST");
>>
>>
>> *DESCRIBE TEST ;*
>> col_name    data_type    comment
>> name            string         from deserializer
>> age               int             from deserializer
>>
>>
>> *Spark Code :*
>> import org.apache.spark._
>> import org.apache.spark.sql._
>>
>> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>> hiveContext.sql("from TEST SELECT  NAME").collect.foreach(println)
>>
>>
>> *Starting Spark shell*
>> spark-shell --jars
>> /usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar
>> --driver-class-path
>> /usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar
>> --packages com.databricks:spark-csv_2.10:1.3.0  --master yarn-client -i
>> /TestDivya/Spark/InstrumentCopyToHDFSHive.scala
>>
>> *Stack Trace* :
>>
>> Stack SQL context available as sqlContext.
>>> Loading /TestDivya/Spark/InstrumentCopyToHDFSHive.scala...
>>> import org.apache.spark._
>>> import org.apache.spark.sql._
>>> 16/02/29 23:09:29 INFO HiveContext: Initializing execution hive, version
>>> 1.2.1
>>> 16/02/29 23:09:29 INFO ClientWrapper: Inspected Hadoop version:
>>> 2.7.1.2.3.4.0-3485
>>> 16/02/29 23:09:29 INFO ClientWrapper: Loaded
>>> org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version
>>> 2.7.1.2.3.4.0-3485
>>> 16/02/29 23:09:29 INFO HiveContext: default warehouse location is
>>> /user/hive/warehouse
>>> 16/02/29 23:09:29 INFO HiveContext: Initializing HiveMetastoreConnection
>>> version 1.2.1 using Spark classes.
>>> 16/02/29 23:09:29 INFO ClientWrapper: Inspected Hadoop version:
>>> 2.7.1.2.3.4.0-3485
>>> 16/02/29 23:09:29 INFO ClientWrapper: Loaded
>>> org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version
>>> 2.7.1.2.3.4.0-3485
>>> 16/02/29 23:09:30 WARN NativeCodeLoader: Unable to load native-hadoop
>>> library for your platform... using builtin-java classes where applicable
>>> 16/02/29 23:09:30 INFO metastore: Trying to connect to metastore with URI
>>> thrift://ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal:9083
>>> 16/02/29 23:09:30 INFO metastore: Connected to metastore.
>>> 16/02/29 23:09:30 WARN DomainSocketFactory: The short-circuit local reads
>>> feature cannot be used because libhadoop cannot be loaded.
>>> 16/02/29 23:09:31 INFO SessionState: Created local directory:
>>> /tmp/1bf53785-f7c8-406d-a733-a5858ccb2d16_resources
>>> 16/02/29 23:09:31 INFO SessionState: Created HDFS directory:
>>> /tmp/hive/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16
>>> 16/02/29 23:09:31 INFO SessionState: Created local directory:
>>> /tmp/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16
>>> 16/02/29 23:09:31 INFO SessionState: Created HDFS directory:
>>> /tmp/hive/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16/_tmp_space.db
>>> hiveContext: org.apache.spark.sql.hive.HiveContext =
>>> org.apache.spark.sql.hive.HiveContext@10b14f32
>>> 16/02/29 23:09:32 INFO ParseDriver: Parsing command: from TEST SELECT  NAME
>>> 16/02/29 23:09:32 INFO ParseDriver: Parse Completed
>>> 16/02/29 23:09:33 INFO deprecation: mapred.map.tasks is deprecated.
>>> Instead, use mapreduce.job.maps
>>> 16/02/29 23:09:33 INFO MemoryStore: ensureFreeSpace(468352) called with
>>> curMem=0, maxMem=556038881
>>> 16/02/29 23:09:33 INFO MemoryStore: Block broadcast_0 stored as values in
>>> memory (estimated size 457.4 KB, free 529.8 MB)
>>> 16/02/29 23:09:33 INFO MemoryStore: ensureFreeSpace(49454) called with
>>> curMem=468352, maxMem=556038881
>>> 16/02/29 23:09:33 INFO MemoryStore: Block broadcast_0_piece0 stored as
>>> bytes in memory (estimated size 48.3 KB, free 529.8 MB)
>>> 16/02/29 23:09:33 INFO BlockManagerInfo: Added broadcast_0_piece0 in
>>> memory on xxx.xx.xx.xxx:37784 (size: 48.3 KB, free: 530.2 MB)
>>> 16/02/29 23:09:33 INFO SparkContext: Created broadcast 0 from collect at
>>> <console>:30
>>> 16/02/29 23:09:34 INFO HBaseStorageHandler: Configuring input job
>>> properties
>>> 16/02/29 23:09:34 INFO RecoverableZooKeeper: Process
>>> identifier=hconnection-0x26fa89a2 connecting to ZooKeeper
>>> ensemble=localhost:2181
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>>> environment:zookeeper.version=3.4.6-3485--1, built on 12/16/2015 02:35 GMT
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:host.name
>>> =ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.version=1.7.0_67
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.vendor=Oracle
>>> Corporation
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>>> environment:java.home=/usr/jdk64/jdk1.7.0_67/jre
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>>> environment:java.class.path=/usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar:/usr/hdp/current/spark-thriftserver/conf/:/usr/hdp/2.3.4.0-3485/spark/lib/spark-assembly-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-api-jdo-3.2.6.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-core-3.2.10.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-rdbms-3.2.9.jar:/usr/hdp/current/hadoop-client/conf/
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>>> environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.io.tmpdir=/tmp
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.compiler=<NA>
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:os.name=Linux
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:os.arch=amd64
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>>> environment:os.version=3.10.0-229.el7.x86_64
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:user.name=hdfs
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:user.home=/home/hdfs
>>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:user.dir=/home/hdfs
>>> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection,
>>> connectString=localhost:2181 sessionTimeout=90000
>>> watcher=hconnection-0x26fa89a20x0, quorum=localhost:2181, baseZNode=/hbase
>>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server
>>> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL
>>> (unknown error)
>>> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established to
>>> localhost/0:0:0:0:0:0:0:1:2181, initiating session
>>> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete on
>>> server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x3532fb70ba20034,
>>> negotiated timeout = 40000
>>> 16/02/29 23:09:34 WARN TableInputFormatBase: You are using an HTable
>>> instance that relies on an HBase-managed Connection. This is usually due to
>>> directly creating an HTable, which is deprecated. Instead, you should
>>> create a Connection object and then request a Table instance from it. If
>>> you don't need the Table instance for your own use, you should instead use
>>> the TableInputFormatBase.initalizeTable method directly.
>>> 16/02/29 23:09:34 INFO TableInputFormatBase: Creating an additional
>>> unmanaged connection because user provided one can't be used for
>>> administrative actions. We'll close it when we close out the table.
>>> 16/02/29 23:09:34 INFO RecoverableZooKeeper: Process
>>> identifier=hconnection-0x6fd74d35 connecting to ZooKeeper
>>> ensemble=localhost:2181
>>> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection,
>>> connectString=localhost:2181 sessionTimeout=90000
>>> watcher=hconnection-0x6fd74d350x0, quorum=localhost:2181, baseZNode=/hbase
>>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server
>>> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL
>>> (unknown error)
>>> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established to
>>> localhost/0:0:0:0:0:0:0:1:2181, initiating session
>>> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete on
>>> server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x3532fb70ba20035,
>>> negotiated timeout = 40000
>>> 16/02/29 23:09:34 INFO RegionSizeCalculator: Calculating region sizes for
>>> table "TEST".
>>> 16/02/29 23:10:23 INFO RpcRetryingCaller: Call exception, tries=10,
>>> retries=35, started=48318 ms ago, cancelled=false, msg=
>>> 16/02/29 23:10:43 INFO RpcRetryingCaller: Call exception, tries=11,
>>> retries=35, started=68524 ms ago, cancelled=false, msg=
>>> 16/02/29 23:11:03 INFO RpcRetryingCaller: Call exception, tries=12,
>>> retries=35, started=88617 ms ago, cancelled=false, msg=
>>> 16/02/29 23:11:23 INFO RpcRetryingCaller: Call exception, tries=13,
>>> retries=35, started=108676 ms ago, cancelled=false, msg=
>>> 16/02/29 23:11:43 INFO RpcRetryingCaller: Call exception, tries=14,
>>> retries=35, started=128747 ms ago, cancelled=false, msg=
>>> 16/02/29 23:12:03 INFO RpcRetryingCaller: Call exception, tries=15,
>>> retries=35, started=148938 ms ago, cancelled=false, msg=
>>> 16/02/29 23:12:23 INFO RpcRetryingCaller: Call exception, tries=16,
>>> retries=35, started=168942 ms ago, cancelled=false, msg=
>>> 16/02/29 23:12:43 INFO RpcRetryingCaller: Call exception, tries=17,
>>> retries=35, started=188975 ms ago, cancelled=false, msg=
>>> Trace :
>>
>>
>>
>> Could somebody help me in resolving this error?
>> Would really appreciate the help.
>>
>>
>> Thanks,
>> Divya
>

>>> ensemble=localhost:2181
>>> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection,
>>> connectString=localhost:2181 sessionTimeout=90000
>>> watcher=hconnection-0x6fd74d350x0, quorum=localhost:2181, baseZNode=/hbase
>>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server
>>> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL
>>> (unknown error)
>>> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established to
>>> localhost/0:0:0:0:0:0:0:1:2181, initiating session
>>> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete on
>>> server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x3532fb70ba20035,
>>> negotiated timeout = 40000
>>> 16/02/29 23:09:34 INFO RegionSizeCalculator: Calculating region sizes for
>>> table "TEST".
>>> 16/02/29 23:10:23 INFO RpcRetryingCaller: Call exception, tries=10,
>>> retries=35, started=48318 ms ago, cancelled=false, msg=
>>> 16/02/29 23:10:43 INFO RpcRetryingCaller: Call exception, tries=11,
>>> retries=35, started=68524 ms ago, cancelled=false, msg=
>>> 16/02/29 23:11:03 INFO RpcRetryingCaller: Call exception, tries=12,
>>> retries=35, started=88617 ms ago, cancelled=false, msg=
>>> 16/02/29 23:11:23 INFO RpcRetryingCaller: Call exception, tries=13,
>>> retries=35, started=108676 ms ago, cancelled=false, msg=
>>> 16/02/29 23:11:43 INFO RpcRetryingCaller: Call exception, tries=14,
>>> retries=35, started=128747 ms ago, cancelled=false, msg=
>>> 16/02/29 23:12:03 INFO RpcRetryingCaller: Call exception, tries=15,
>>> retries=35, started=148938 ms ago, cancelled=false, msg=
>>> 16/02/29 23:12:23 INFO RpcRetryingCaller: Call exception, tries=16,
>>> retries=35, started=168942 ms ago, cancelled=false, msg=
>>> 16/02/29 23:12:43 INFO RpcRetryingCaller: Call exception, tries=17,
>>> retries=35, started=188975 ms ago, cancelled=false, msg=
>>> Trace :
>>
>>
>>
>> Could somebody help me resolve this error?
>> I would really appreciate the help.
>>
>>
>> Thanks,
>> Divya

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: [ERROR]: Spark 1.5.2 + Hbase 1.1 + Hive 1.2 + HbaseIntegration

Posted by Wojciech Indyk <wo...@gmail.com>.
Hello Divya!
Have you solved the problem?
I suppose the log comes from the driver. You should also look at the
logs on the worker JVMs; there may be an exception or other useful detail there.
Do you have Kerberos on your cluster? If so, this could be similar to the
problem described in http://issues.apache.org/jira/browse/SPARK-14115
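In yarn-client mode the executor logs live on the NodeManagers; once the
application has finished you can pull them with the YARN CLI, for example
(the application id below is only a placeholder, spark-shell prints the
real one at startup):

yarn logs -applicationId application_1456789012345_0001 > executor-logs.txt

and then search that file for the first exception.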

Based on your logs:
> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server
> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL
> (unknown error)
> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established to
> localhost/0:0:0:0:0:0:0:1:2181, initiating session
> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete on
> server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x3532fb70ba20035,

Maybe there is a problem with the RPC calls to the region servers going
over IPv6 (but I am just guessing).
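
One concrete thing I would check first: the client connects to
ensemble=localhost:2181, which is the HBase default quorum. Unless a
ZooKeeper server really runs on the driver node, that usually means
hbase-site.xml is not on the Spark classpath, so the region lookups in
RegionSizeCalculator hang and RpcRetryingCaller keeps retrying. A sketch
of what I would try (the conf path below is the usual HDP layout, adjust
it to your cluster): add the HBase conf directory to both driver and
executor classpaths when starting the shell, e.g.

spark-shell --jars <same jar list as before> \
--driver-class-path /usr/hdp/current/hbase-client/conf:<same jar list as before> \
--conf spark.executor.extraClassPath=/usr/hdp/current/hbase-client/conf \
--master yarn-client -i /TestDivya/Spark/InstrumentCopyToHDFSHive.scala

Alternatively, you can try setting the quorum explicitly in the session
before querying the table; whether this reaches the HBase storage handler
depends on how Hive builds the job configuration, so treat it as an
experiment (the hostnames below are placeholders for your ZooKeeper nodes):

val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
hiveContext.setConf("hbase.zookeeper.quorum", "zk1.example.com,zk2.example.com,zk3.example.com")
hiveContext.setConf("hbase.zookeeper.property.clientPort", "2181")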

--
Kind regards/ Pozdrawiam,
Wojciech Indyk
http://datacentric.pl


2016-03-01 5:27 GMT+01:00 Divya Gehlot <di...@gmail.com>:
> Hi,
> I am getting error when I am trying to connect hive table (which is being
> created through HbaseIntegration) in spark
>
> Steps I followed :
> *Hive Table creation code  *:
> CREATE EXTERNAL TABLE IF NOT EXISTS TEST(NAME STRING,AGE INT)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,0:AGE")
> TBLPROPERTIES ("hbase.table.name" = "TEST",
> "hbase.mapred.output.outputtable" = "TEST");
>
>
> *DESCRIBE TEST ;*
> col_name    data_type    comment
> name            string         from deserializer
> age               int             from deserializer
>
>
> *Spark Code :*
> import org.apache.spark._
> import org.apache.spark.sql._
>
> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
> hiveContext.sql("from TEST SELECT  NAME").collect.foreach(println)
>
>
> *Starting Spark shell*
> spark-shell --jars
> /usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar
> --driver-class-path
> /usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar
> --packages com.databricks:spark-csv_2.10:1.3.0  --master yarn-client -i
> /TestDivya/Spark/InstrumentCopyToHDFSHive.scala
>
> *Stack Trace* :
>
> Stack SQL context available as sqlContext.
>> Loading /TestDivya/Spark/InstrumentCopyToHDFSHive.scala...
>> import org.apache.spark._
>> import org.apache.spark.sql._
>> 16/02/29 23:09:29 INFO HiveContext: Initializing execution hive, version
>> 1.2.1
>> 16/02/29 23:09:29 INFO ClientWrapper: Inspected Hadoop version:
>> 2.7.1.2.3.4.0-3485
>> 16/02/29 23:09:29 INFO ClientWrapper: Loaded
>> org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version
>> 2.7.1.2.3.4.0-3485
>> 16/02/29 23:09:29 INFO HiveContext: default warehouse location is
>> /user/hive/warehouse
>> 16/02/29 23:09:29 INFO HiveContext: Initializing HiveMetastoreConnection
>> version 1.2.1 using Spark classes.
>> 16/02/29 23:09:29 INFO ClientWrapper: Inspected Hadoop version:
>> 2.7.1.2.3.4.0-3485
>> 16/02/29 23:09:29 INFO ClientWrapper: Loaded
>> org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version
>> 2.7.1.2.3.4.0-3485
>> 16/02/29 23:09:30 WARN NativeCodeLoader: Unable to load native-hadoop
>> library for your platform... using builtin-java classes where applicable
>> 16/02/29 23:09:30 INFO metastore: Trying to connect to metastore with URI
>> thrift://ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal:9083
>> 16/02/29 23:09:30 INFO metastore: Connected to metastore.
>> 16/02/29 23:09:30 WARN DomainSocketFactory: The short-circuit local reads
>> feature cannot be used because libhadoop cannot be loaded.
>> 16/02/29 23:09:31 INFO SessionState: Created local directory:
>> /tmp/1bf53785-f7c8-406d-a733-a5858ccb2d16_resources
>> 16/02/29 23:09:31 INFO SessionState: Created HDFS directory:
>> /tmp/hive/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16
>> 16/02/29 23:09:31 INFO SessionState: Created local directory:
>> /tmp/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16
>> 16/02/29 23:09:31 INFO SessionState: Created HDFS directory:
>> /tmp/hive/hdfs/1bf53785-f7c8-406d-a733-a5858ccb2d16/_tmp_space.db
>> hiveContext: org.apache.spark.sql.hive.HiveContext =
>> org.apache.spark.sql.hive.HiveContext@10b14f32
>> 16/02/29 23:09:32 INFO ParseDriver: Parsing command: from TEST SELECT  NAME
>> 16/02/29 23:09:32 INFO ParseDriver: Parse Completed
>> 16/02/29 23:09:33 INFO deprecation: mapred.map.tasks is deprecated.
>> Instead, use mapreduce.job.maps
>> 16/02/29 23:09:33 INFO MemoryStore: ensureFreeSpace(468352) called with
>> curMem=0, maxMem=556038881
>> 16/02/29 23:09:33 INFO MemoryStore: Block broadcast_0 stored as values in
>> memory (estimated size 457.4 KB, free 529.8 MB)
>> 16/02/29 23:09:33 INFO MemoryStore: ensureFreeSpace(49454) called with
>> curMem=468352, maxMem=556038881
>> 16/02/29 23:09:33 INFO MemoryStore: Block broadcast_0_piece0 stored as
>> bytes in memory (estimated size 48.3 KB, free 529.8 MB)
>> 16/02/29 23:09:33 INFO BlockManagerInfo: Added broadcast_0_piece0 in
>> memory on xxx.xx.xx.xxx:37784 (size: 48.3 KB, free: 530.2 MB)
>> 16/02/29 23:09:33 INFO SparkContext: Created broadcast 0 from collect at
>> <console>:30
>> 16/02/29 23:09:34 INFO HBaseStorageHandler: Configuring input job
>> properties
>> 16/02/29 23:09:34 INFO RecoverableZooKeeper: Process
>> identifier=hconnection-0x26fa89a2 connecting to ZooKeeper
>> ensemble=localhost:2181
>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> environment:zookeeper.version=3.4.6-3485--1, built on 12/16/2015 02:35 GMT
>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:host.name
>> =ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal
>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.version=1.7.0_67
>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.vendor=Oracle
>> Corporation
>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> environment:java.home=/usr/jdk64/jdk1.7.0_67/jre
>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> environment:java.class.path=/usr/hdp/2.3.4.0-3485/hive/lib/guava-14.0.1.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hive/lib/htrace-core-3.1.0-incubating.jar,/usr/hdp/2.3.4.0-3485/hive/lib/zookeeper-3.4.6.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/hbase-server.jar:/usr/hdp/current/spark-thriftserver/conf/:/usr/hdp/2.3.4.0-3485/spark/lib/spark-assembly-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-api-jdo-3.2.6.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-core-3.2.10.jar:/usr/hdp/2.3.4.0-3485/spark/lib/datanucleus-rdbms-3.2.9.jar:/usr/hdp/current/hadoop-client/conf/
>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.io.tmpdir=/tmp
>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:java.compiler=<NA>
>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:os.name=Linux
>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:os.arch=amd64
>> 16/02/29 23:09:34 INFO ZooKeeper: Client
>> environment:os.version=3.10.0-229.el7.x86_64
>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:user.name=hdfs
>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:user.home=/home/hdfs
>> 16/02/29 23:09:34 INFO ZooKeeper: Client environment:user.dir=/home/hdfs
>> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection,
>> connectString=localhost:2181 sessionTimeout=90000
>> watcher=hconnection-0x26fa89a20x0, quorum=localhost:2181, baseZNode=/hbase
>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server
>> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL
>> (unknown error)
>> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established to
>> localhost/0:0:0:0:0:0:0:1:2181, initiating session
>> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete on
>> server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x3532fb70ba20034,
>> negotiated timeout = 40000
>> 16/02/29 23:09:34 WARN TableInputFormatBase: You are using an HTable
>> instance that relies on an HBase-managed Connection. This is usually due to
>> directly creating an HTable, which is deprecated. Instead, you should
>> create a Connection object and then request a Table instance from it. If
>> you don't need the Table instance for your own use, you should instead use
>> the TableInputFormatBase.initalizeTable method directly.
>> 16/02/29 23:09:34 INFO TableInputFormatBase: Creating an additional
>> unmanaged connection because user provided one can't be used for
>> administrative actions. We'll close it when we close out the table.
>> 16/02/29 23:09:34 INFO RecoverableZooKeeper: Process
>> identifier=hconnection-0x6fd74d35 connecting to ZooKeeper
>> ensemble=localhost:2181
>> 16/02/29 23:09:34 INFO ZooKeeper: Initiating client connection,
>> connectString=localhost:2181 sessionTimeout=90000
>> watcher=hconnection-0x6fd74d350x0, quorum=localhost:2181, baseZNode=/hbase
>> 16/02/29 23:09:34 INFO ClientCnxn: Opening socket connection to server
>> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL
>> (unknown error)
>> 16/02/29 23:09:34 INFO ClientCnxn: Socket connection established to
>> localhost/0:0:0:0:0:0:0:1:2181, initiating session
>> 16/02/29 23:09:34 INFO ClientCnxn: Session establishment complete on
>> server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x3532fb70ba20035,
>> negotiated timeout = 40000
>> 16/02/29 23:09:34 INFO RegionSizeCalculator: Calculating region sizes for
>> table "TEST".
>> 16/02/29 23:10:23 INFO RpcRetryingCaller: Call exception, tries=10,
>> retries=35, started=48318 ms ago, cancelled=false, msg=
>> 16/02/29 23:10:43 INFO RpcRetryingCaller: Call exception, tries=11,
>> retries=35, started=68524 ms ago, cancelled=false, msg=
>> 16/02/29 23:11:03 INFO RpcRetryingCaller: Call exception, tries=12,
>> retries=35, started=88617 ms ago, cancelled=false, msg=
>> 16/02/29 23:11:23 INFO RpcRetryingCaller: Call exception, tries=13,
>> retries=35, started=108676 ms ago, cancelled=false, msg=
>> 16/02/29 23:11:43 INFO RpcRetryingCaller: Call exception, tries=14,
>> retries=35, started=128747 ms ago, cancelled=false, msg=
>> 16/02/29 23:12:03 INFO RpcRetryingCaller: Call exception, tries=15,
>> retries=35, started=148938 ms ago, cancelled=false, msg=
>> 16/02/29 23:12:23 INFO RpcRetryingCaller: Call exception, tries=16,
>> retries=35, started=168942 ms ago, cancelled=false, msg=
>> 16/02/29 23:12:43 INFO RpcRetryingCaller: Call exception, tries=17,
>> retries=35, started=188975 ms ago, cancelled=false, msg=
>> Trace :
>
>
>
> Could somebody help me resolve this error?
> I would really appreciate the help.
>
>
> Thanks,
> Divya

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org