Posted to issues@hbase.apache.org by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org> on 2010/07/14 00:53:49 UTC

[jira] Updated: (HBASE-2827) HBase Client doesn't handle master failover

     [ https://issues.apache.org/jira/browse/HBASE-2827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg updated HBASE-2827:
---------------------------------------

    Priority: Major  (was: Critical)

Downgrading priority after further investigation.  Still working on this issue, but HBASE-2828 was really the critical patch.

11:47:48 AM Nicolas Spiegelberg: should I push 2828 to the titan hbase, or should we wait for the trunk refresh.  that jira should fix the Exception that we saw on our cluster
11:49:22 AM Kannan: this is the master failover for client?
11:49:48 AM Nicolas Spiegelberg: it's decoupling the HTable from the master
11:50:18 AM Nicolas Spiegelberg: HBaseAdmin is the one that has master failover problems, but a client only uses it when disabling tables, creating new tables, etc
11:50:57 AM Kannan: i was under the impression that HBaseAdmin was the more critical one... but I think you are right.
11:51:08 AM Nicolas Spiegelberg: still needs to be fixed, but our problem was that they used the HBaseAdmin code, which is almost never used and sometimes buggy, inside the HTable code, which is used all the time
11:54:05 AM Nicolas Spiegelberg: I'm still working on 2827, which is the failover.  2828 just relies on zookeeper instead of the master.

> HBase Client doesn't handle master failover
> -------------------------------------------
>
>                 Key: HBASE-2827
>                 URL: https://issues.apache.org/jira/browse/HBASE-2827
>             Project: HBase
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 0.90.0
>            Reporter: Nicolas Spiegelberg
>            Assignee: Nicolas Spiegelberg
>             Fix For: 0.90.0
>
>
> A client on our beta tier was stuck in this exception loop when we brought up a new HMaster after the old one died:
> Exception while trying to connect hBase
> java.lang.reflect.UndeclaredThrowableException
> at $Proxy1.getClusterStatus(Unknown Source)
> at org.apache.hadoop.hbase.client.HBaseAdmin.getClusterStatus(HBaseAdmin.java:912)
> at org.apache.hadoop.hbase.client.HTable.getCurrentNrHRS(HTable.java:170)
> at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:143)
> ...
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
> at java.lang.Thread.run(Thread.java:619)
> Caused by: java.net.SocketTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.18.34.212:60000]
> at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:213)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:406)
> at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:309)
> at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:856)
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:724)
> at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:252)
> ... 20 more
> 12:52:55,863 [pool-4-thread-5182] INFO PersistentUtil:153 - Retry after 1 second...
> From reading the client code: HConnectionManager does not watch ZooKeeper for NodeDeleted and NodeCreated events on /hbase/master, which would explain why the client keeps retrying the old master's address.
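
The missing behaviour described above can be illustrated with a small, self-contained sketch. This is not the HBase or ZooKeeper API: the class MasterAddressTracker, its EventType enum, and the Supplier that stands in for reading the znode are all hypothetical names invented for illustration. The sketch only shows the idea of dropping the cached master address on NodeDeleted and re-reading it on NodeCreated for /hbase/master.

```java
import java.util.Optional;
import java.util.function.Supplier;

/**
 * Hypothetical sketch (not HBase code): caches the master address and
 * updates the cache when the master znode is deleted or re-created.
 */
class MasterAddressTracker {
    /** Stand-ins for the two ZooKeeper event types named in the report. */
    public enum EventType { NODE_CREATED, NODE_DELETED }

    private final String masterZnode;
    // Assumption: stands in for reading the znode's data from ZooKeeper.
    private final Supplier<String> znodeReader;
    // null means "no master currently known".
    private String cachedAddress;

    MasterAddressTracker(String masterZnode, Supplier<String> znodeReader) {
        this.masterZnode = masterZnode;
        this.znodeReader = znodeReader;
        this.cachedAddress = znodeReader.get();
    }

    /** Handles one watch event; events for other paths are ignored. */
    public synchronized void process(EventType type, String path) {
        if (!masterZnode.equals(path)) {
            return;
        }
        switch (type) {
            case NODE_DELETED:
                // Old master is gone: stop handing out its address.
                cachedAddress = null;
                break;
            case NODE_CREATED:
                // A new master registered: pick up its address.
                cachedAddress = znodeReader.get();
                break;
        }
    }

    /** Returns the current master address, or empty during a failover. */
    public synchronized Optional<String> getMasterAddress() {
        return Optional.ofNullable(cachedAddress);
    }
}
```

In a real client, process() would be driven from a ZooKeeper watcher callback rather than called directly. ZooKeeper watches are one-time triggers, so the watch on /hbase/master would also need to be re-registered after each event.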

-- 
This message is automatically generated by JIRA.