Posted to issues@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2010/04/07 20:02:33 UTC

[jira] Commented: (HBASE-2417) HCM.locateRootRegion fails hard on "Connection refused"

    [ https://issues.apache.org/jira/browse/HBASE-2417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854622#action_12854622 ] 

Jean-Daniel Cryans commented on HBASE-2417:
-------------------------------------------

Here's a stack trace:

{code}

java.lang.reflect.UndeclaredThrowableException
        at $Proxy12.getRegionInfo(Unknown Source)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:1013)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:629)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:611)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:762)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:634)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:611)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:762)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:638)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfPuts(HConnectionManager.java:1332)
        at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:629)
        at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:513)
        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:491)
...
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:309)
        at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:839)
        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:716)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:252)
        ... 39 more
{code}

The solution is to catch Throwable the way getRegionServerWithRetries does and convert it. Also, in locateRootRegion we don't want to retry against the same server: if the server is dead, we need to go back to ZK and look up the ROOT location again.
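
To make the idea concrete, here is a rough sketch of that retry loop. It is not the actual HCM patch: RootLocatorSketch, LocationSource, Probe, toIOException and locateWithRetries are hypothetical names made up for illustration, and the ZK lookup and the getRegionInfo RPC are stubbed out as plain interfaces. Converting to IOException is also an assumption about what "convert" should mean here. The point is only that the address is re-read on every attempt and that whatever comes out of the proxy (including UndeclaredThrowableException) is caught and converted instead of escaping.

```java
import java.io.IOException;
import java.lang.reflect.UndeclaredThrowableException;
import java.net.ConnectException;

/** Illustrative sketch only; every name in this class is hypothetical. */
final class RootLocatorSketch {

    /** Stand-in for reading the current ROOT region location from ZooKeeper. */
    public interface LocationSource {
        String currentRootAddress() throws IOException;
    }

    /** Stand-in for the RPC made against the server hosting ROOT. */
    public interface Probe<T> {
        T call(String address) throws Exception;
    }

    /** Unwraps the proxy wrapper and converts anything else to an IOException. */
    static IOException toIOException(Throwable t) {
        if (t instanceof UndeclaredThrowableException && t.getCause() != null) {
            t = t.getCause();
        }
        if (t instanceof IOException) {
            return (IOException) t;
        }
        return new IOException(t);
    }

    /**
     * Tries up to maxRetries times. The address is looked up again on each
     * attempt, so a dead server is not retried once the source has moved on.
     */
    static <T> T locateWithRetries(LocationSource zk, Probe<T> probe,
                                   int maxRetries, long pauseMs) throws IOException {
        IOException last = null;
        for (int tries = 0; tries < maxRetries; tries++) {
            try {
                String address = zk.currentRootAddress(); // fresh lookup each time
                return probe.call(address);
            } catch (Throwable t) {
                // Catch everything, as the comment above suggests, and convert it.
                last = toIOException(t);
            }
            if (tries + 1 < maxRetries) {
                try {
                    Thread.sleep(pauseMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IOException("Interrupted while waiting to retry", e);
                }
            }
        }
        throw new IOException("Failed after " + maxRetries + " attempts", last);
    }

    public static void main(String[] args) throws IOException {
        final int[] lookups = {0};
        // First lookup returns a dead server, later lookups return a live one.
        LocationSource zk = () -> ++lookups[0] == 1 ? "dead:60020" : "live:60020";
        Probe<String> probe = address -> {
            if (address.startsWith("dead")) {
                throw new UndeclaredThrowableException(
                    new ConnectException("Connection refused"));
            }
            return "ROOT@" + address;
        };
        System.out.println(locateWithRetries(zk, probe, 3, 10L));
    }
}
```

In the sketch a "Connection refused" on the first attempt no longer comes straight out to the caller; it only surfaces, wrapped in an IOException, once all attempts are used up.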

> HCM.locateRootRegion fails hard on "Connection refused"
> -------------------------------------------------------
>
>                 Key: HBASE-2417
>                 URL: https://issues.apache.org/jira/browse/HBASE-2417
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3
>            Reporter: Jean-Daniel Cryans
>             Fix For: 0.20.4, 0.21.0
>
>
> While running some tests on replication, I saw that our client does something dumb in HCM.locateRootRegion if it tries to contact a dead region server that held the ROOT region. Will post stack trace in a comment.
> The problem here is that we don't retry at all; the exception comes straight out of HCM like it's the end of the world.
