Posted to user@hive.apache.org by Neil Yalowitz <ne...@gmail.com> on 2012/04/06 15:36:47 UTC

hive trunk (0.9.0) with CDH3u3 - JAR issue?

Hi all,

I need some Hive+HBase functionality that is not currently available in the
hive distribution but is available in hive trunk, so I downloaded the
tar.gz and did an ant build.  Unfortunately, I'm not able to create an
external table using HBaseStorageHandler.  HBase and Hadoop are installed
via ClouderaManager with CDH3u3.

CDH3u3 ships HBase 0.90.4 and Hive 0.7.1, while the current hive-trunk
builds against HBase 0.92.0 and identifies itself as Hive 0.9.0.  I'm
guessing my problem is with the JARs?  I suspect I've configured the wrong
ones, but I'm in "trial-and-error" mode with the JAR versions.  There are
several versions to choose from, and it's difficult to tell whether I
should be using the ones packaged with CDH3u3 or the ones from hive-trunk
(for hbase, zookeeper, guava, hive-hbase-handler, etc.).
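
For what it's worth, one thing that helps me keep the candidates straight
is reading the version off each JAR's filename.  This is just an
illustrative helper (the jar_version name and the /usr/lib/hbase path are
mine, nothing shipped with Hive or CDH):

```shell
# Hypothetical helper: strip the artifact name off a jar filename, leaving
# the version suffix, so CDH3u3 jars and hive-trunk jars compare easily.
jar_version() {
  basename "$1" .jar | sed 's/^[a-z-]*-//'
}

jar_version /usr/lib/hbase/hbase-0.90.4-cdh3u3.jar                    # 0.90.4-cdh3u3
jar_version /usr/local/hive-trunk/build/dist/lib/hbase-0.92.0.jar     # 0.92.0
jar_version /usr/local/hive-trunk/build/dist/lib/zookeeper-3.4.3.jar  # 3.4.3
```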

Has anyone successfully used hive 0.9.0 with HBase from CDH3u3?  Any
suggestions?



I've tried several configurations, but here's the current
hive-trunk/build/dist/conf/hive-site.xml:

<configuration>
<property>
  <name>hive.aux.jars.path</name>

<value>file:///usr/local/hive-trunk/build/dist/lib/hive-contrib-0.9.0-SNAPSHOT.jar,file:///usr/local/hive-trunk/build/dist/lib/hbase-0.92.0.jar,file:///usr/local/hive-trunk/build/dist/lib/hive-hbase-handler-0.9.0-SNAPSHOT.jar,file:///usr/local/hive-trunk/build/dist/lib/zookeeper-3.4.3.jar,file:///usr/local/hive-trunk/build/dist/lib/guava-r09.jar</value>
  <description>These JAR files are available to all users for all
jobs</description>
</property>

<property>
  <name>hbase.zookeeper.quorum</name>
  <value>myzookeeper</value>
</property>

<property>
  <name>hive.zookeeper.client.port</name>
  <value>2181</value>
  <description>The port of zookeeper servers to talk to. This is only
needed for read/write locks.</description>
</property>
[...SNIP...]
</configuration>
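
As an aside on the config above: the same JAR list can also be handed to
the CLI per invocation with the hive script's --auxpath option, which makes
it quicker to trial different JAR sets than editing hive-site.xml each
time.  The paths below are just my build-output layout, not anything
canonical:

```shell
# Alternative to hive.aux.jars.path: pass the auxiliary JARs at launch.
# --auxpath takes a comma-separated list of plain paths (no file:// prefix).
bin/hive --auxpath \
/usr/local/hive-trunk/build/dist/lib/hive-hbase-handler-0.9.0-SNAPSHOT.jar,\
/usr/local/hive-trunk/build/dist/lib/hbase-0.92.0.jar,\
/usr/local/hive-trunk/build/dist/lib/zookeeper-3.4.3.jar,\
/usr/local/hive-trunk/build/dist/lib/guava-r09.jar
```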




...and here's what I'm seeing:

/usr/local/hive-trunk/build/dist # bin/hive
Logging initialized using configuration in
file:/usr/local/hive-trunk/build/dist/conf/hive-log4j.properties
Hive history file=/tmp/root/hive_job_log_root_201204061212_2013977734.txt
hive>
    > CREATE EXTERNAL TABLE myhivetable(uid string, confidence
map<string,string>)
    > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    > WITH SERDEPROPERTIES ("hbase.columns.mapping" = "mycolfam:")
    > TBLPROPERTIES("hbase.table.name" = "myhbasetable");
Interrupting... Be patient, this might take some time.
Press Ctrl+C again to kill JVM
FAILED: Error in metadata: java.lang.RuntimeException: Thread was
interrupted while trying to connect to master.
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask




...the interrupt is a ^C from me, since the process becomes unresponsive.
Log output:




ip-10-0-2-116:/tmp/root # tail -n 50 hive.log
2012-04-06 12:12:10,583 WARN  conf.HiveConf (HiveConf.java:<clinit>(63)) - DEPRECATED: Ignoring hive-default.xml found on the CLASSPATH at /usr/local/hive-trunk/build/dist/conf/hive-default.xml
2012-04-06 12:12:15,065 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.resources" but it cannot be resolved.
2012-04-06 12:12:15,065 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.resources" but it cannot be resolved.
2012-04-06 12:12:15,066 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.runtime" but it cannot be resolved.
2012-04-06 12:12:15,066 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.runtime" but it cannot be resolved.
2012-04-06 12:12:15,067 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.text" but it cannot be resolved.
2012-04-06 12:12:15,067 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.text" but it cannot be resolved.
2012-04-06 12:12:17,566 WARN  zookeeper.ClientCnxnSocket (ClientCnxnSocket.java:readConnectResult(139)) - Connected to an old server; r-o mode will be unavailable
2012-04-06 12:12:44,440 ERROR exec.Task (SessionState.java:printError(397)) - FAILED: Error in metadata: java.lang.RuntimeException: Thread was interrupted while trying to connect to master.
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Thread was interrupted while trying to connect to master.
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:544)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3304)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:241)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1325)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1117)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:951)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:689)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:557)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
Caused by: java.lang.RuntimeException: Thread was interrupted while trying to connect to master.
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:669)
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:106)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:73)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:147)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:396)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:538)
... 17 more

2012-04-06 12:12:44,441 ERROR ql.Driver (SessionState.java:printError(397)) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask



...and HBase master logs:




ip-10-0-2-116:/var/log/hbase # tail -n 16 hbase-cmf-hbase1-MASTER-myhost.log.out
2012-04-06 12:57:29,048 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server listener on 60000: readAndProcess threw exception java.io.EOFException. Count of bytes read: 0
java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:180)
at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
at org.apache.hadoop.io.UTF8.readChars(UTF8.java:216)
at org.apache.hadoop.io.UTF8.readString(UTF8.java:208)
at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:179)
at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:171)
at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processHeader(HBaseServer.java:966)
at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:950)
at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:522)
at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:316)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

Re: hive trunk (0.9.0) with CDH3u3 - JAR issue?

Posted by Neil Yalowitz <ne...@gmail.com>.
Solved.

This required rebuilding the hive-hbase-handler JAR.  To review, the
versions I'm using:

HBase: 0.90.4-cdh3u3
Hadoop: 0.20.2-cdh3u3
Hive: 0.9.0 (trunk as of Apr 6, 2012)

Edited the HBase version in:

ivy/libraries.properties

...and set...

hbase.version=0.90.4

...and then:

ant package
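
To put the whole procedure in one place, the edit is a one-line change,
sketched here as a sed transformation (pin_hbase is just a name I'm using
for illustration):

```shell
# The one-line change to ivy/libraries.properties, expressed as a filter:
# rewrite whatever hbase.version trunk ships with to the CDH3u3-compatible one.
pin_hbase() {
  sed 's/^hbase\.version=.*/hbase.version=0.90.4/'
}

# Stock trunk value in, pinned value out:
echo 'hbase.version=0.92.0' | pin_hbase    # prints: hbase.version=0.90.4

# Applied to a checkout, followed by the repackage:
#   pin_hbase < ivy/libraries.properties > ivy/libraries.properties.pinned
#   mv ivy/libraries.properties.pinned ivy/libraries.properties
#   ant package
```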


During my searches I found several other posts from people struggling with
this problem.  Some suggested that rebuilding the hive-hbase-handler JAR
may not be necessary, but it was the only fix for my issue, since adding
JARs via hive.aux.jars.path did not help.



Neil

