Posted to issues@hbase.apache.org by "Heng Chen (JIRA)" <ji...@apache.org> on 2015/08/04 16:34:05 UTC

[jira] [Commented] (HBASE-14182) My regionserver change ip. But hmaster still connect to old ip after the rs restart

    [ https://issues.apache.org/jira/browse/HBASE-14182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653730#comment-14653730 ] 

Heng Chen commented on HBASE-14182:
-----------------------------------

I think I found the answer!

RpcClient uses Java's InetAddress class, and InetAddress keeps a cache of <host, ip> pairs.
getAllByName0 is called whenever the IP for a host is requested. The JDK 1.8 source is below:

{code}
private static InetAddress[] getAllByName0 (String host, InetAddress reqAddr, boolean check)
        throws UnknownHostException  {

        /* If it gets here it is presumed to be a hostname */
        /* Cache.get can return: null, unknownAddress, or InetAddress[] */

        /* make sure the connection to the host is allowed, before we
         * give out a hostname
         */
        if (check) {
            SecurityManager security = System.getSecurityManager();
            if (security != null) {
                security.checkConnect(host, -1);
            }
        }

        InetAddress[] addresses = getCachedAddresses(host);

        /* If no entry in cache, then do the host lookup */
        if (addresses == null) {
            addresses = getAddressesFromNameService(host, reqAddr);
        }

        if (addresses == unknown_array)
            throw new UnknownHostException(host);

        return addresses.clone();
    }
{code}

It checks the cache first, and only queries the name service when the host has no cached entry.

So we can't change the RS IP without restarting HMaster.
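
For reference, the lifetime of entries in this cache is controlled by the JDK security property networkaddress.cache.ttl. According to the Java documentation, successful lookups are cached forever when a security manager is installed, and for an implementation-specific period otherwise. Below is a minimal sketch of lowering it, assuming the property is set early in the JVM's life, before any host lookups. The class name and the 30-second value are only illustrative, and I have not verified that this alone fixes the behavior seen in HMaster.

```java
import java.security.Security;

public class DnsCacheTtlSketch {
    /**
     * Sets the TTL (in seconds) for successful name lookups and returns
     * the value that is now stored in the security properties.
     */
    public static String setPositiveTtl(int seconds) {
        // Assumption: this runs before the first InetAddress lookup in the JVM,
        // so the caching policy picks up the new value.
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(seconds));
        return Security.getProperty("networkaddress.cache.ttl");
    }

    public static void main(String[] args) {
        // 30 is an illustrative value, not a recommendation.
        System.out.println("networkaddress.cache.ttl = " + setPositiveTtl(30));
    }
}
```

A short TTL trades extra DNS queries for faster pickup of address changes, so the right value depends on the deployment.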

One solution is to store the IP in ZK and pass it in explicitly when creating the InetAddress instance, instead of resolving the host name again. That would solve the problem.
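
A rough sketch of that idea follows. InetAddress has no public constructor, but InetAddress.getByAddress(String, byte[]) builds an instance from a host name and raw address bytes without consulting the name service. The class and method names here are hypothetical, the ZK read is left out (the IP is passed in as a plain string), and 10.11.21.141 is a made-up "new" IP used only for illustration.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

public class ExplicitAddressSketch {
    /**
     * Builds a socket address from a host name and an IPv4 string obtained
     * elsewhere (for example, read from ZK), without doing a DNS lookup.
     */
    public static InetSocketAddress fromKnownIp(String host, String ipv4, int port)
            throws UnknownHostException {
        String[] parts = ipv4.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ipv4);
        }
        byte[] raw = new byte[4];
        for (int i = 0; i < 4; i++) {
            int octet = Integer.parseInt(parts[i]);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("Bad octet in: " + ipv4);
            }
            raw[i] = (byte) octet;
        }
        // getByAddress uses the given bytes as-is; no name service or cache lookup.
        InetAddress addr = InetAddress.getByAddress(host, raw);
        return new InetSocketAddress(addr, port);
    }

    public static void main(String[] args) throws UnknownHostException {
        // Hypothetical new IP for the region server named in the issue.
        InetSocketAddress isa =
            fromKnownIp("dx-ape-regionserver1-online", "10.11.21.141", 60020);
        System.out.println(isa.getHostName() + " -> " + isa.getAddress().getHostAddress());
    }
}
```

The point of the design is that the address used for the connection comes from whatever the RS last published, so a stale cache entry on the master side no longer matters.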



> My regionserver change ip. But hmaster still connect to old ip after the rs restart
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-14182
>                 URL: https://issues.apache.org/jira/browse/HBASE-14182
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.98.6
>            Reporter: Heng Chen
>
> I use Docker to deploy my HBase cluster, and the RS IP changed. After restarting this RS, the HMaster web UI shows that it connected to HMaster, but its region count is still zero after a long time. I checked the HMaster log and found that the master still uses the old IP to connect to this RS.
> This is the HMaster log:
> PS: 10.11.21.140 is the old IP of RS dx-ape-regionserver1-online
> {code}
> 2015-08-04 17:24:00,081 INFO  [AM.ZK.Worker-pool2-t14141] master.AssignmentManager: Assigning solar_image,\x01Y\x8E\xA3y,1434968237206.4a1bdeec85b9f55b962596f9fb2cd07f. to dx-ape-regionserver1-online,60020,1438679950072
> 2015-08-04 17:24:06,800 WARN  [AM.ZK.Worker-pool2-t14133] master.AssignmentManager: Failed assignment of solar_image,\x00\x94\x09\x8D\x95,1430991781025.b0f5b755f443d41cf306026a60675020. to dx-ape-regionserver1-online,60020,1438679950072, trying to assign elsewhere instead; try=3 of 10
> java.net.ConnectException: Connection timed out
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
>         at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:578)
>         at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:868)
>         at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1543)
>         at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442)
>         at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
>         at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
>         at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964)
>         at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:671)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2097)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1577)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1550)
>         at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:104)
>         at org.apache.hadoop.hbase.master.AssignmentManager.handleRegion(AssignmentManager.java:999)
>         at org.apache.hadoop.hbase.master.AssignmentManager$6.run(AssignmentManager.java:1447)
>         at org.apache.hadoop.hbase.master.AssignmentManager$3.run(AssignmentManager.java:1260)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> 2015-08-04 17:24:06,801 WARN  [AM.ZK.Worker-pool2-t14140] master.AssignmentManager: Failed assignment of solar_image,\x00(.\xE7\xB1L,1430024620929.534025fcf4cae5516513b9c9a4cf73dc. to dx-ape-regionserver1-online,60020,1438679950072, trying to assign elsewhere instead; try=2 of 10
> java.net.ConnectException: Call to dx-ape-regionserver1-online/10.11.21.140:60020 failed on connection exception: java.net.ConnectException: Connection timed out
>         at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1483)
>         at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1461)
>         at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
>         at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
>         at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964)
>         at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:671)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2097)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1577)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1550)
>         at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:104)
>         at org.apache.hadoop.hbase.master.AssignmentManager.handleRegion(AssignmentManager.java:999)
>         at org.apache.hadoop.hbase.master.AssignmentManager$6.run(AssignmentManager.java:1447)
>         at org.apache.hadoop.hbase.master.AssignmentManager$3.run(AssignmentManager.java:1260)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.net.ConnectException: Connection timed out
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
>         at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:578)
>         at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:868)
>         at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1543)
>         at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442)
>         ... 16 more
> {code}


