Posted to user@hbase.apache.org by Rural Hunter <ru...@gmail.com> on 2014/02/20 09:59:56 UTC

Long-running client cannot reconnect to HBase server after the server is restarted

Hi,

I'm using HBase 0.96.1.1 and have run into a problem with my Java client. 
The client is a long-running application that connects to the HBase server 
from time to time. After the server is stopped and started again, the 
client cannot reconnect to it.

The client prints these log messages continuously:
2014-02-20 16:19:20 INFO main-SendThread(test-ubt:2181) 
org.apache.zookeeper.ClientCnxn - Socket connection established to 
test-ubt/192.168.1.99:2181, initiating session
2014-02-20 16:19:20 INFO main-SendThread(test-ubt:2181) 
org.apache.zookeeper.ClientCnxn - Unable to read additional data from 
server sessionid 0x144402c05020006, likely server has closed socket, 
closing socket connection and attempting reconnect
2014-02-20 16:19:22 INFO main-SendThread(test-ubt:2181) 
org.apache.zookeeper.ClientCnxn - Opening socket connection to server 
test-ubt/192.168.1.99:2181. Will not attempt to authenticate using SASL 
(unknown error)
2014-02-20 16:19:22 INFO main-SendThread(test-ubt:2181) 
org.apache.zookeeper.ClientCnxn - Socket connection established to 
test-ubt/192.168.1.99:2181, initiating session
2014-02-20 16:19:22 INFO main-SendThread(test-ubt:2181) 
org.apache.zookeeper.ClientCnxn - Unable to read additional data from 
server sessionid 0x144402c05020006, likely server has closed socket, 
closing socket connection and attempting reconnect
2014-02-20 16:19:24 INFO main-SendThread(test-ubt:2181) 
org.apache.zookeeper.ClientCnxn - Opening socket connection to server 
test-ubt/192.168.1.99:2181. Will not attempt to authenticate using SASL 
(unknown error)
2014-02-20 16:19:24 INFO main-SendThread(test-ubt:2181) 
org.apache.zookeeper.ClientCnxn - Socket connection established to 
test-ubt/192.168.1.99:2181, initiating session
2014-02-20 16:19:24 INFO main-SendThread(test-ubt:2181) 
org.apache.zookeeper.ClientCnxn - Unable to read additional data from 
server sessionid 0x144402c05020006, likely server has closed socket, 
closing socket connection and attempting reconnect

Meanwhile, the ZooKeeper log looks like this:
2014-02-20 16:19:21,113 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] 
server.NIOServerCnxn: Closed socket connection for client 
/192.168.1.166:50996 (no session established for client)
2014-02-20 16:19:21,921 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] 
server.NIOServerCnxnFactory: Accepted socket connection from 
/192.168.1.166:50997
2014-02-20 16:19:21,922 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] 
server.ZooKeeperServer: Refusing session request for client 
/192.168.1.166:50997 as it has seen zxid 0x14f our last zxid is 0x86
  client must try another server
2014-02-20 16:19:21,922 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] 
server.NIOServerCnxn: Closed socket connection for client 
/192.168.1.166:50997 (no session established for client)
2014-02-20 16:19:22,604 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] 
server.NIOServerCnxnFactory: Accepted socket connection from 
/192.168.1.166:51000
2014-02-20 16:19:22,604 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] 
server.ZooKeeperServer: Refusing session request for client 
/192.168.1.166:51000 as it has seen zxid 0x133 our last zxid is 0x86
  client must try another server
2014-02-20 16:19:22,604 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] 
server.NIOServerCnxn: Closed socket connection for client 
/192.168.1.166:51000 (no session established for client)
2014-02-20 16:19:23,952 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] 
server.NIOServerCnxnFactory: Accepted socket connection from 
/192.168.1.166:51001
2014-02-20 16:19:23,952 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] 
server.ZooKeeperServer: Refusing session request for client 
/192.168.1.166:51001 as it has seen zxid 0x14f our last zxid is 0x86
  client must try another server
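
The "Refusing session request" lines carry two hexadecimal transaction ids (zxids): the last one the client has seen and the last one the server has processed. As far as I can tell from the message, the server refuses the session because the client's value is the larger of the two. The sketch below only illustrates that comparison on one of the log lines above (joined onto a single line); the class and method names are made up for illustration and are not part of ZooKeeper or HBase.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a ZooKeeper API: it only re-does the comparison
// that the server log message describes.
public class ZxidCheck {
    // Matches the zxid part of the "Refusing session request" log line.
    private static final Pattern REFUSAL = Pattern.compile(
        "has seen zxid 0x([0-9a-fA-F]+) our last zxid is 0x([0-9a-fA-F]+)");

    /** Parses a hexadecimal zxid such as "0x14f" into a long. */
    public static long parseZxid(String hex) {
        String digits = hex.startsWith("0x") ? hex.substring(2) : hex;
        return Long.parseUnsignedLong(digits, 16);
    }

    /** Returns true when the line reports a client zxid greater than the server's. */
    public static boolean clientIsAhead(String logLine) {
        Matcher m = REFUSAL.matcher(logLine);
        if (!m.find()) {
            return false; // not a refusal line
        }
        long clientZxid = Long.parseUnsignedLong(m.group(1), 16);
        long serverZxid = Long.parseUnsignedLong(m.group(2), 16);
        return clientZxid > serverZxid;
    }

    public static void main(String[] args) {
        String line = "Refusing session request for client /192.168.1.166:50997 "
            + "as it has seen zxid 0x14f our last zxid is 0x86";
        System.out.println(clientIsAhead(line));
    }
}
```

For the values in the log, 0x14f is 335 and 0x86 is 134, so the client is ahead of the restarted server on every attempt shown.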

I even create a new Configuration in the client logic when 
HConnection.getTable() fails, but it doesn't help:
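import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;

// hbaseQuorum holds the ZooKeeper quorum host list and is defined elsewhere in the client
Configuration myConf = HBaseConfiguration.create();
myConf.set("hbase.zookeeper.quorum", hbaseQuorum);
myConf.set("hbase.client.retries.number", "3");
myConf.set("hbase.client.pause", "1000");
myConf.set("zookeeper.recovery.retry", "1");
HConnection hbase = HConnectionManager.createConnection(myConf);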

The problem is resolved if I restart the client. One more thing: the 
problem only happens if the server stays down long enough, for example 
several hours. If I restart the server right after it is stopped, the 
client reconnects without any problem.

My question is: if the server stays down for a long time, what should I 
do on the long-running client side so that it can reconnect to the server 
when it starts again?