Posted to dev@zookeeper.apache.org by Goteti Bhanu <ud...@yahoo-inc.com> on 2010/03/16 10:57:35 UTC
Zookeeper issues
Hi,
We are having issues with our web interface (WI) to ZooKeeper. The WI is a Java app, deployed in Tomcat, that shows a (mostly read-only) UI for the ZK tree in our cluster. The app was deployed quite some time ago, but recently it has been going down frequently. While the website is down, the following errors appear continuously in catalina.out.
<catalina.out>
[INFO] SyncZKTreeWatcher - [registerWatches] About to register watch for '/llf/version-1.0/serverNodes/llf-server/nodes/b3091258:8085/connected', recursive: false
[INFO] SyncZKTreeWatcher - Path: null, type: None, state: Disconnected
[INFO] SyncZKTreeWatcher - Path: null, type: None, state: SyncConnected
[WARN] SyncZKTreeWatcher - [run] Exception while handling event WatchedEvent: Znode change. Path: /llf/version-1.0/serverNodes/llf-server/nodes/b3091258:8085/connected Type: NodeChildrenChanged <org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss>org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:809)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:837)
at com.yahoo.cluster.zookeeper.SyncZKTreeWatcher.getNodeValue(SyncZKTreeWatcher.java:465)
at com.yahoo.cluster.zookeeper.SyncZKTreeWatcher.registerWatches(SyncZKTreeWatcher.java:376)
at com.yahoo.cluster.zookeeper.SyncZKTreeWatcher.access$100(SyncZKTreeWatcher.java:37)
at com.yahoo.cluster.zookeeper.SyncZKTreeWatcher$1.run(SyncZKTreeWatcher.java:169)
at java.lang.Thread.run(Thread.java:619)
</catalina.out>
In another log, we are getting the following errors continuously.
<AlertReceiver.log>
java.net.SocketException: Transport endpoint is not connected
at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:935)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:901)
2010 03 16 09:49:55,645 org.apache.zookeeper.ClientCnxn ZK Event Handler-SendThread Attempting connection to server b3091279.crawl.yahoo.net/67.195.112.40:22801
2010 03 16 09:49:55,646 org.apache.zookeeper.ClientCnxn ZK Event Handler-SendThread Priming connection to java.nio.channels.SocketChannel[connected local=/67.195.37.119:38793 remote=b3091279.crawl.yahoo.net/67.195.112.40:22801]
2010 03 16 09:49:55,739 org.apache.zookeeper.ClientCnxn ZK Event Handler-SendThread Server connection successful
2010 03 16 09:49:55,775 org.apache.zookeeper.ClientCnxn ZK Event Handler-SendThread Exception closing session 0xe727001201aa32ab to sun.nio.ch.SelectionKeyImpl@3ef92f5e
java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:632)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:876)
2010 03 16 09:49:55,775 org.apache.zookeeper.ClientCnxn ZK Event Handler-SendThread Ignoring exception during shutdown input
</AlertReceiver.log>
What do these errors indicate? Restarting Tomcat always helps, but since the problem keeps recurring, we want to understand why it is happening. Is there a known ZK issue that could cause such behavior?
Thanks
Udaya
Re: Zookeeper issues
Posted by Patrick Hunt <ph...@apache.org>.
It looks like you are losing connectivity between your client and server.
You'd need to provide some basic information for us to do any debugging
- for example, what version of the server/client are you using?
The first "connection loss" exception is probably due to the fact that
you got disconnected while a request (getData) was in progress. However,
it's not totally clear to me, as you don't have date/time in that log, so
it's hard for me to have context.
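For idempotent reads such as getData, one common way to tolerate a brief disconnect is to retry the call a bounded number of times. The following is only a sketch under stated assumptions: RetryOnConnectionLoss and its parameters are hypothetical names, not part of the ZooKeeper API, and the caller supplies the test for "this was a connection loss" (with the Java client, that would presumably be a check for KeeperException.ConnectionLossException, as seen in the stack trace above). It does not help with an expired session, which needs a new client.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

// Hypothetical helper, not part of the ZooKeeper client library.
class RetryOnConnectionLoss {

    /**
     * Runs the read, retrying only when the caller-supplied predicate says the
     * failure was a connection loss. Intended for idempotent reads only.
     */
    static <T> T call(Callable<T> read, Predicate<Exception> isConnectionLoss,
                      int maxAttempts, long backoffMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return read.call();
            } catch (Exception e) {
                // Give up on the last attempt or on any other kind of failure.
                if (attempt >= maxAttempts || !isConnectionLoss.test(e)) {
                    throw e;
                }
                // Simple linear backoff to give the client time to reconnect.
                Thread.sleep(backoffMs * attempt);
            }
        }
    }
}
```

A caller would wrap the read in a lambda and pass a predicate for the exception type it treats as a connection loss. If the retries are exhausted, the last exception is rethrown unchanged, so existing error handling still sees it.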
I suspect the second error is due to session expiration, either that or
you are exceeding the "maxClientCnxns" limit (10 by default) from a single
host. There isn't enough log detail for me to be sure.
Could it be that you are creating a ZK client connection for each
operation of your "web interface"? (Or maybe not closing the connection
in all cases afterward?) If this is the case, you are more than likely
hitting the "maxClientCnxns" limit. Try grepping the server logs (all of
them, on all servers) for "Too many connections from" and see if that
shows up at all.
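The log check suggested above can be sketched in a few lines of Java. This is only an illustration: the class name TooManyConnectionsScan is made up, and the code assumes the client address is the first whitespace-delimited token after the phrase "Too many connections from", which may not match your server version's exact log layout.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts rejected-connection messages per client address.
class TooManyConnectionsScan {

    // Assumption: the address is the token right after the phrase.
    private static final Pattern REJECTED =
            Pattern.compile("Too many connections from\\s+(\\S+)");

    static Map<String, Integer> countByClient(List<String> logLines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher m = REJECTED.matcher(line);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path of each server's log file as an argument.
        for (String path : args) {
            List<String> lines = Files.readAllLines(Paths.get(path));
            System.out.println(path + ": " + countByClient(lines));
        }
    }
}
```

If the Tomcat host shows up with a large count on any server, that would support the theory that the web interface is opening more connections than the per-host limit allows.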
Try using the "4 letter words" against your servers to see what your
connection status looks like - in particular the "stat" and "dump"
commands (see http://bit.ly/dglVld). Hopefully you are using these for
monitoring; if so, you might be able to look back at historical
information on this.
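The four-letter words are sent as plain text to the server's client port, and the server replies and closes the connection. A minimal Java sketch of that exchange follows; the class name FourLetterWord, the default host and port, and the 5-second timeout are illustrative assumptions, so substitute your own server's host and client port.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: sends a four-letter command and returns the reply.
class FourLetterWord {

    static String send(String host, int port, String command, int timeoutMs)
            throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);

            OutputStream out = socket.getOutputStream();
            out.write(command.getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // The server writes its reply and then closes the connection,
            // so read until end of stream.
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                reply.write(buf, 0, n);
            }
            return new String(reply.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder defaults; pass your server's host and client port.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2181;
        for (String cmd : new String[] {"stat", "dump"}) {
            System.out.println("== " + cmd + " ==");
            System.out.println(send(host, port, cmd, 5000));
        }
    }
}
```

Running this periodically against every server and keeping the output gives the kind of historical record mentioned above, for example how many connections come from the Tomcat host over time.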
Patrick