Posted to user@hive.apache.org by Ted Yu <yu...@gmail.com> on 2010/03/26 05:39:48 UTC

Re: create external table in CLI

Adding hive-user.

On Thu, Mar 25, 2010 at 9:38 PM, Ted Yu <yu...@gmail.com> wrote:

> If I use the following command line, I see the RetriesExhaustedException
> I reported earlier this afternoon:
> [root@tyu-linux dist]# bin/hive -hiveconf hbase.master=snv-it-lin-006
> -hiveconf
> hbase.zookeeper.quorum=snv-it-lin-010,snv-it-lin-011,snv-it-lin-012
> -hiveconf hbase.zookeeper.property.clientPort=2181
>
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
> region server 10.10.31.136:60020 for region .META.,,1, row '', but failed
> after 10 attempts.
>
> On Thu, Mar 25, 2010 at 9:18 PM, Ted Yu <yu...@gmail.com> wrote:
>
>> I searched the source tree under trunk and didn't see how the hbase.master
>> parameter is passed to HBaseStorageHandler.java
>>
>> Also it looks like the unit test only specifies hostname for hbase.master:
>> "Z:\kindsight\hive\trunk\hbase-handler\src\test\org\apache\hadoop\hive\hbase\HBaseQTestUtil.java"(55,17):
>> conf.set("hbase.master", "local");
>>
>> BTW I saw a lot of the following exceptions:
>> 2010-03-25 13:24:27,056 WARN  zookeeper.ClientCnxn
>> (ClientCnxn.java:cleanup(1001)) - Ignoring exception during shutdown input
>> java.nio.channels.ClosedChannelException
>>         at
>> sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
>>         at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>         at
>> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:999)
>>         at
>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
>>
>> Thanks
>>
>>
>> On Thu, Mar 25, 2010 at 7:12 PM, Ted Yu <yu...@gmail.com> wrote:
>>
>>> We use distributed mode because the cluster would be deployed to
>>> production.
>>>
>>> I didn't find a zookeeper command-line parameter on
>>> http://wiki.apache.org/hadoop/Hive/HBaseIntegration#Usage
>>> Do I need to use an entry in hbase-site.xml?
>>>
>>> Our hbase cluster is based on 0.20.1
>>> But I used hbase client 0.20.3 against it before and didn't see
>>> RetriesExhaustedException.
>>>
>>> Thanks
>>>
>>>
>>> On Thu, Mar 25, 2010 at 6:07 PM, John Sichi <js...@facebook.com> wrote:
>>>
>>>> Hmmm, what kind of HBase cluster are you running (standalone,
>>>> pseudo-distributed, or distributed)?  If distributed, you may need to pass
>>>> zookeeper connection info instead of hbase.master; I haven't tested out
>>>> distributed yet but will be soon.
>>>>
>>>> JVS
>>>>
>>>> On Mar 25, 2010, at 4:43 PM, Ted Yu wrote:
>>>>
>>>> I removed hbase-site.xml
>>>>
>>>> [root@tyu-linux dist]# bin/hive -hiveconf
>>>> hbase.master=snv-it-lin-006:60000
>>>> Hive history
>>>> file=/tmp/root/hive_job_log_root_201003251614_1493474415.txt
>>>> hive> CREATE EXTERNAL TABLE ruletable(key string, exactmatch_cat string,
>>>> lpm_cat int)
>>>>     >     STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>>>>     >     WITH SERDEPROPERTIES (
>>>>     >     "hbase.key.type" = "string",
>>>>     >     "hbase.columns.mapping" =
>>>> "exactmatch_1.0:category,lpm_1.0:category",
>>>>     >     "hbase.table.name" = "ruletable"
>>>>     >     );
>>>> FAILED: Error in metadata:
>>>> MetaException(message:org.apache.hadoop.hbase.MasterNotRunningException
>>>>         at
>>>> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getMaster(HConnectionManager.java:374)
>>>>         at
>>>> org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:72)
>>>>         at
>>>> org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:63)
>>>>         at
>>>> org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:149)
>>>>         at
>>>> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:280)
>>>>         at
>>>> org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:320)
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:1445)
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:124)
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:107)
>>>>         at
>>>> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:55)
>>>>         at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:630)
>>>>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:504)
>>>>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:382)
>>>>         at
>>>> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:138)
>>>>         at
>>>> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197)
>>>>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:303)
>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>         at
>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>         at
>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>> )
>>>>
>>>> But http://snv-it-lin-006:60010/master.jsp shows everything is normal.
>>>> And hbase.master.port is 60000 in hbase-default.xml
>>>>
>>>> Any suggestions?
>>>>
>>>> Thanks
>>>>
>>>> On Thu, Mar 25, 2010 at 4:21 PM, John Sichi <js...@facebook.com> wrote:
>>>>
>>>>> Don't copy hbase-site.xml; you don't want to set up a new HBase
>>>>> cluster.  Just set your hive conf to point to your existing HBase master:
>>>>>
>>>>> hbase.master=hbase.yoyodyne.com:60000
>>>>>
>>>>> http://wiki.apache.org/hadoop/Hive/HBaseIntegration#Usage
>>>>>
>>>>> JVS
>>>>>
>>>>> On Mar 25, 2010, at 4:16 PM, Ted Yu wrote:
>>>>>
>>>>> I got past that error when I used the CLI from trunk.
>>>>>
>>>>> I copied hbase-site.xml from hbase master machine to trunk/dist/conf.
>>>>>
>>>>> When I issue the following command in CLI:
>>>>> CREATE EXTERNAL TABLE ruletable(key string, exactmatch_cat string,
>>>>> lpm_cat int)
>>>>>     STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>>>>>     WITH SERDEPROPERTIES (
>>>>>     "hbase.key.type" = "string",
>>>>>     "hbase.columns.mapping" =
>>>>> "exactmatch_1.0:category,lpm_1.0:category",
>>>>>     "hbase.table.name" = "ruletable"
>>>>>     );
>>>>>
>>>>> I get:
>>>>>
>>>>> 2010-03-25 15:49:52,557 WARN  client.HConnectionManager$TableServers
>>>>> (HConnectionManager.java:tableExists(411)) - Testing for table existence
>>>>> threw exception
>>>>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
>>>>> contact region server 10.10.31.143:60020 for region .META.,,1, row '',
>>>>> but failed after 10 attempts.
>>>>> Exceptions:
>>>>> java.io.IOException: Call to /10.10.31.143:60020 failed on local
>>>>> exception: java.io.EOFException
>>>>> java.io.IOException: Call to /10.10.31.143:60020 failed on local
>>>>> exception: java.io.EOFException
>>>>> java.io.IOException: Call to /10.10.31.143:60020 failed on local
>>>>> exception: java.io.EOFException
>>>>> java.io.IOException: Call to /10.10.31.143:60020 failed on local
>>>>> exception: java.io.EOFException
>>>>> java.io.IOException: Call to /10.10.31.143:60020 failed on local
>>>>> exception: java.io.EOFException
>>>>> java.io.IOException: Call to /10.10.31.143:60020 failed on local
>>>>> exception: java.io.EOFException
>>>>> java.io.IOException: Call to /10.10.31.143:60020 failed on local
>>>>> exception: java.io.EOFException
>>>>> java.io.IOException: Call to /10.10.31.143:60020 failed on local
>>>>> exception: java.io.EOFException
>>>>> java.io.IOException: Call to /10.10.31.143:60020 failed on local
>>>>> exception: java.io.EOFException
>>>>> java.io.IOException: Call to /10.10.31.143:60020 failed on local
>>>>> exception: java.io.EOFException
>>>>>
>>>>>         at
>>>>> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1048)
>>>>>         at
>>>>> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:55)
>>>>>         at
>>>>> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:28)
>>>>>         at
>>>>> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.listTables(HConnectionManager.java:454)
>>>>>         at
>>>>> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.tableExists(HConnectionManager.java:404)
>>>>>         at
>>>>> org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:113)
>>>>>         at
>>>>> org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:100)
>>>>>         at
>>>>> org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:149)
>>>>>         at
>>>>> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:280)
>>>>>
>>>>> However, I don't see any exception in hbase-hbaseadmin-regionserver.log
>>>>> on 10.10.31.143 within the past 4 hours.
>>>>> No exception in hbase-hbaseadmin-master.log on hbase master machine
>>>>> either.
>>>>>
>>>>> Please comment.
>>>>>
>>>>> On Thu, Mar 25, 2010 at 11:22 AM, Carl Steinbach <ca...@cloudera.com> wrote:
>>>>>
>>>>>> Hi Ted,
>>>>>>
>>>>>> It looks like your copy of Hive does not have the changes that
>>>>>> implemented support for integration with HBase. HBase support was committed
>>>>>> to trunk on March 12th, and currently it is only available on trunk. In
>>>>>> order to use it you need to check out (or update) the source from the svn
>>>>>> repository. Instructions describing how to do this are on the Hive wiki.
>>>>>>
>>>>>> Carl
>>>>>>
>>>>>>
>>>>>> On Thu, Mar 25, 2010 at 9:29 AM, Ted Yu <yu...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> I encountered a Parse Error when creating a table in the CLI:
>>>>>>>
>>>>>>> [root@tyu-linux hive-0.5.0-bin]# bin/hive
>>>>>>> Hive history
>>>>>>> file=/tmp/root/hive_job_log_root_201003240832_1752130264.txt
>>>>>>> hive> SHOW TABLES;
>>>>>>> OK
>>>>>>> Time taken: 8.09 seconds
>>>>>>> hive> CREATE EXTERNAL TABLE users(key string, state string, country
>>>>>>> string,
>>>>>>> country_id int)
>>>>>>>     > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>>>>>>>     >
>>>>>>>     > WITH SERDEPROPERTIES (
>>>>>>>     >
>>>>>>>     > "hbase.key.type" = "string",
>>>>>>>     >
>>>>>>>     > "hbase.columns.mapping" =
>>>>>>> "info:state,info:country,info:country_id",
>>>>>>>     >
>>>>>>>     > "hbase.table.name" = "users"
>>>>>>>     >
>>>>>>>     > );
>>>>>>> FAILED: Parse Error: line 2:0 cannot recognize input 'STORED' in
>>>>>>> table file
>>>>>>> format specification
>>>>>>>
>>>>>>> *Here is the tail of hive.log:*
>>>>>>> 2010-03-24 08:32:07,937 ERROR DataNucleus.Plugin
>>>>>>> (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core"
>>>>>>> requires
>>>>>>> "org.eclipse.text" but it cannot be resolved.
>>>>>>> 2010-03-24 08:34:18,337 ERROR ql.Driver
>>>>>>> (SessionState.java:printError(248))
>>>>>>> - FAILED: Parse Error: line 2:0 cannot recognize input 'STORED' in
>>>>>>> table
>>>>>>> file format specification
>>>>>>>
>>>>>>> org.apache.hadoop.hive.ql.parse.ParseException: line 2:0 cannot
>>>>>>> recognize
>>>>>>> input 'STORED' in table file format specification
>>>>>>>
>>>>>>>         at
>>>>>>>
>>>>>>> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:357)
>>>>>>>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:267)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:320)
>>>>>>>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:312)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:123)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181)
>>>>>>>         at
>>>>>>> org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287)
>>>>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>>>>> Method)
>>>>>>>         at
>>>>>>>
>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>>         at
>>>>>>>
>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>>>>>
>>>>>>> If you know how to fix my query, please share.
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>

RE: create external table in CLI

Posted by John Sichi <js...@facebook.com>.
Hi Ted,

I got our HBase test cluster set up and encountered the same problems you did at first, but I've got it working now.  Here's what I found out (I have updated the wiki accordingly).

1.  You MUST use the same HBase version on the client and the server.  Most likely this is your problem.  I was initially running a 0.20.4-dev version on the cluster and 0.20.3 on the Hive client, and consistently got MasterNotRunningException.  After I rebuilt the handler against 0.20.4-dev and ran with that, the problem went away.  So delete the jars from hbase-handler/lib, replace them with the ones that correspond to your HBase version, and then run ant clean and ant package again.  (If you get build failures, you'll have to upgrade your HBase cluster or figure out how to make the handler backward compatible.)  Browsing the HBase JIRA, I found that the RPC protocol has changed quite a bit, so I guess that is the reason for the incompatibilities.  (It would be nice if the handshake failed fast with an accurate error message instead of saying that the master is not running when in fact it is.)

2.  The only configuration necessary is to set hbase.zookeeper.quorum.  Do not set hbase.master for talking to a cluster.  And you don't need to set hbase.zookeeper.property.clientPort (assuming you are using the default on the zookeeper servers).

3.  Make sure there is no firewall weirdness in between client and server.  I hit this when I tested from my laptop over VPN to the cluster.  In the firewall case, MasterNotRunningException had null for its message, whereas in the inconsistent client/server version case, MasterNotRunningException had the correct IP:port for its message.
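
As a rough sketch of point 2, assuming the ZooKeeper hosts from Ted's earlier command line (substitute your own ensemble), the launch command reduces to a single -hiveconf setting.  The variable names ZK_QUORUM and HIVE_CMD are illustrative only, and the script only prints the command so it can be reviewed before being run from the Hive dist directory:

```shell
#!/bin/sh
# Assumption: these hostnames are the ZooKeeper ensemble from the earlier
# command line in this thread; replace them with your own servers.
ZK_QUORUM="snv-it-lin-010,snv-it-lin-011,snv-it-lin-012"

# Only hbase.zookeeper.quorum is set.  hbase.master is deliberately left out,
# and hbase.zookeeper.property.clientPort is omitted on the assumption that
# the ZooKeeper servers listen on their default port.
HIVE_CMD="bin/hive -hiveconf hbase.zookeeper.quorum=${ZK_QUORUM}"

# Print the command instead of executing it.
echo "$HIVE_CMD"
```

If the ZooKeeper servers use a non-default port, the clientPort setting from the earlier command line would presumably need to be added back.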

Let me know how it goes.

JVS