Posted to user@zookeeper.apache.org by Mahadev Konar <ma...@yahoo-inc.com> on 2010/10/04 18:52:52 UTC

Re: possible bug in zookeeper ?

Hi Yatir,

  Any update on this? Are you still struggling with this problem?

Thanks
mahadev

On 9/15/10 12:56 AM, "Yatir Ben Shlomo" <ya...@outbrain.com> wrote:

> Thanks to all who replied, I appreciate your efforts:
> 
> 1. There is no connection problem from the client machine:
> (ob1078)(tomcat@cass3:~)$ echo ruok | nc zook1 2181
> imok
> (ob1078)(tomcat@cass3:~)$ echo ruok | nc zook2 2181
> imok
> (ob1078)(tomcat@cass3:~)$ echo ruok | nc zook3 2181
> imok
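
For anyone who wants to run the same ruok check from inside the JVM rather than from the shell, it could look roughly like the sketch below. The class FourLetterProbe, its method names, and the timeout parameter are made up for illustration and are not part of the ZooKeeper client API. The sketch sends the four-letter word over a plain socket and treats any I/O failure as "not ok".

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ZooKeeper: mirrors `echo ruok | nc host port`.
final class FourLetterProbe {

    // A healthy server answers the "ruok" command with "imok".
    static boolean isOk(String reply) {
        return reply != null && "imok".equals(reply.trim());
    }

    // Sends "ruok" and reads the reply until the server closes the connection.
    // Returns false on any I/O error (refused, timed out, unresolved host).
    static boolean ruok(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);
            OutputStream out = socket.getOutputStream();
            out.write("ruok".getBytes(StandardCharsets.US_ASCII));
            out.flush();
            InputStream in = socket.getInputStream();
            String reply = new String(in.readAllBytes(), StandardCharsets.US_ASCII);
            return isOk(reply);
        } catch (IOException e) {
            return false;
        }
    }
}
```

Calling FourLetterProbe.ruok("zook1", 2181, 2000) for each of the three servers should give the same answer as the nc commands, but from the same process and network stack that the client itself uses.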
> 
> 2. Unfortunately, I have already tried switching to the new jar, but it does
> not seem to be backward compatible.
> The QuorumPeerConfig class no longer has the following field:
> protected int clientPort;
> It was replaced by InetSocketAddress clientPortAddress in the new jar,
> so I am getting a java.lang.NoSuchFieldError exception...
> 
> 3. I looked at the ClientCnxn.java code.
> It seems that the logic for iterating over the available servers
> (nextAddrToTry++) is used only inside the startConnect() function, not in
> the finishConnect() function or anywhere else.
> 
> Possibly something along these lines is happening:
> an exception thrown inside the finishConnect() function triggers
> the cleanup() function, which in turn throws another exception.
> Nowhere in this code path is nextAddrToTry++ applied.
> Does this make sense to anyone?
> Thanks
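
To make the expectation in point 3 concrete, here is a minimal sketch of the round-robin behaviour described there. It is not the actual ClientCnxn code: the class ServerRotation and its methods are hypothetical, and only the counter name nextAddrToTry is borrowed from the message above. The point it illustrates is that the index advances on every attempt, whether or not the attempt succeeds, so a refused connection cannot pin the client to one server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; this is not how ClientCnxn is implemented.
final class ServerRotation {
    private final List<String> servers;
    private int nextAddrToTry = 0;

    ServerRotation(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("need at least one server");
        }
        this.servers = new ArrayList<>(servers);
    }

    // Returns the next candidate and always advances the index, wrapping around.
    String next() {
        String addr = servers.get(nextAddrToTry);
        nextAddrToTry = (nextAddrToTry + 1) % servers.size();
        return addr;
    }

    // Tries each server at most once per call. tryConnect stands in for the
    // real connect logic and returns true when the connection is established.
    // Returns the server that accepted, or null if all of them refused.
    String connect(Predicate<String> tryConnect) {
        for (int attempt = 0; attempt < servers.size(); attempt++) {
            String candidate = next();
            if (tryConnect.test(candidate)) {
                return candidate;
            }
        }
        return null;
    }
}
```

With the server list zook1:2181, zook2:2181, zook3:2181 and zook3 down, a rotation that is currently pointing at zook3 moves on to zook1 on the next attempt instead of retrying zook3 indefinitely.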
> 
> -----Original Message-----
> From: Patrick Hunt [mailto:phunt@apache.org]
> Sent: Tuesday, September 14, 2010 6:20 PM
> To: zookeeper-user@hadoop.apache.org
> Subject: Re: possible bug in zookeeper ?
> 
> That is unusual. I don't recall anyone reporting a similar issue, and
> looking at the code I don't see any issues off hand. Can you try the
> following?
> 
> 1) On that particular zk client machine, resolve the hosts zook1/zook2/zook3.
> What IP addresses do they resolve to? (Try dig.)
> 2) Try running the client using the 3.3.1 jar file (just replace the jar on
> the client). It includes more log4j information; turn on DEBUG or TRACE
> logging.
> 
> Patrick
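
As a complement to dig, the same resolution check can be done from Java, which shows what the JVM itself resolves. The JVM resolver may consult /etc/hosts and cache results, so its answer can differ from dig. The sketch below is only an illustration: ConnectStringCheck and its methods are hypothetical names, not part of the ZooKeeper API. It splits a connect string of the form host:port,host:port into host names and resolves each one with InetAddress.getAllByName.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the ZooKeeper API.
final class ConnectStringCheck {

    // Extracts the host names from a "host:port,host:port" connect string.
    static List<String> hostsOf(String connectString) {
        List<String> hosts = new ArrayList<>();
        for (String entry : connectString.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            int colon = trimmed.lastIndexOf(':');
            hosts.add(colon < 0 ? trimmed : trimmed.substring(0, colon));
        }
        return hosts;
    }

    // Returns every address the JVM resolves for the host, or an empty list.
    static List<String> resolve(String host) {
        List<String> addresses = new ArrayList<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(host)) {
                addresses.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // Leave the list empty: the name does not resolve on this machine.
        }
        return addresses;
    }
}
```

Running resolve on each entry of hostsOf("zook1:2181,zook2:2181,zook3:2181") on the client machine would show whether any two names map to the same address, or whether one does not resolve at all.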
> 
> On Tue, Sep 14, 2010 at 8:44 AM, Yatir Ben Shlomo <ya...@outbrain.com> wrote:
> 
>> zook1:2181,zook2:2181,zook3:2181
>> 
>> 
>> -----Original Message-----
>> From: Ted Dunning [mailto:ted.dunning@gmail.com]
>> Sent: Tuesday, September 14, 2010 4:11 PM
>> To: zookeeper-user@hadoop.apache.org
>> Subject: Re: possible bug in zookeeper ?
>> 
>> What was the list of servers that was given originally to open the
>> connection to ZK?
>> 
>> On Tue, Sep 14, 2010 at 6:15 AM, Yatir Ben Shlomo <yatirb@outbrain.com> wrote:
>> 
>>> Hi, I am using SolrCloud, which uses an ensemble of 3 ZooKeeper instances.
>>> 
>>> I am performing survivability tests:
>>> after taking one of the ZooKeeper instances down, I would expect the client
>>> to use a different ZooKeeper server instance.
>>> 
>>> But as you can see in the logs below,
>>> depending on which instance I choose to take down (in my case, the last
>>> one in the list of ZooKeeper servers),
>>> the client constantly insists on the same ZooKeeper server
>>> (Attempting connection to server zook3/192.168.252.78:2181)
>>> and does not switch to a different one.
>>> The problem seems to arise from ClientCnxn.java.
>>> Does anyone have an idea about this?
>>> 
>>> SolrCloud currently uses zookeeper-3.2.2.jar.
>>> Is this a known bug that was fixed in later versions (3.3.1)?
>>> 
>>> Thanks in advance,
>>> Yatir
>>> 
>>> 
>>> Logs:
>>> 
>>> Sep 14, 2010 9:02:20 AM org.apache.log4j.Category warn
>>> WARNING: Ignoring exception during shutdown input
>>> java.nio.channels.ClosedChannelException
>>>        at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
>>>        at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java):999)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970)
>>> Sep 14, 2010 9:02:20 AM org.apache.log4j.Category warn
>>> WARNING: Ignoring exception during shutdown output
>>> java.nio.channels.ClosedChannelException
>>>        at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
>>>        at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java):1004)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970)
>>> Sep 14, 2010 9:02:22 AM org.apache.log4j.Category info
>>> INFO: Attempting connection to server zook3/192.168.252.78:2181
>>> Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn
>>> WARNING: Exception closing session 0x32b105244a20001 to
>>> sun.nio.ch.SelectionKeyImpl@3ca58cbf
>>> java.net.ConnectException: Connection refused
>>>        at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
>>>        at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
>>>        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):933)
>>> Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn
>>> WARNING: Ignoring exception during shutdown input
>>> java.nio.channels.ClosedChannelException
>>>        at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
>>>        at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java):999)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970)
>>> Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn
>>> WARNING: Ignoring exception during shutdown output
>>> java.nio.channels.ClosedChannelException
>>>        at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
>>>        at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java):1004)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970)
>>> Sep 14, 2010 9:02:22 AM org.apache.log4j.Category info
>>> INFO: Attempting connection to server zook3/192.168.252.78:2181
>>> Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn
>>> WARNING: Exception closing session 0x32b105244a20000 to
>>> sun.nio.ch.SelectionKeyImpl@3960f81b
>>> java.net.ConnectException: Connection refused
>>>        at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
>>>        at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
>>>        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):933)
>>> Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn
>>> WARNING: Ignoring exception during shutdown input
>>> java.nio.channels.ClosedChannelException
>>>        at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
>>>        at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java):999)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970)
>>> Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn
>>> WARNING: Ignoring exception during shutdown output
>>> java.nio.channels.ClosedChannelException
>>>        at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
>>>        at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java):1004)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970)
>>> 
>>> 
>> 
>