Posted to dev@zookeeper.apache.org by "Adam Whitney (JIRA)" <ji...@apache.org> on 2016/09/16 04:52:20 UTC

[jira] [Commented] (ZOOKEEPER-2251) Add Client side packet response timeout to avoid infinite wait.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495374#comment-15495374 ] 

Adam Whitney commented on ZOOKEEPER-2251:
-----------------------------------------

I think I'm seeing this problem using ActiveMQ with replicated LevelDB. Every once in a while our network drops some packets while the ZooKeeper server is acking back to the ZooKeeper client in ActiveMQ, and this intermittently seems to lead to a hang on the ActiveMQ side. Unfortunately, I can't get a thread dump yet, so I can't tell for sure, but the cause and symptoms of my issue match this bug very well. It looks like the patch for this bug has passed all the reviews. Is there any reason why it is not yet merged?

> Add Client side packet response timeout to avoid infinite wait.
> ---------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2251
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2251
>             Project: ZooKeeper
>          Issue Type: Bug
>            Reporter: nijel
>            Assignee: Arshad Mohammad
>         Attachments: ZOOKEEPER-2251-01.patch, ZOOKEEPER-2251-02.patch, ZOOKEEPER-2251-03.patch, ZOOKEEPER-2251-04.patch
>
>
> I came across an issue related to client-side packet response timeouts. In my cluster, many packets were dropped for some time.
> One observation is that the ZooKeeper client hung. According to the thread dump, it is waiting for the response/ACK for the operation performed (the synchronous API is used here).
> I am using zookeeper.serverCnxnFactory=org.apache.zookeeper.server.NIOServerCnxnFactory
> Since only a few packets were lost, no DISCONNECTED event occurred.
> We need to add a "response timeout" for operations or packets.
> *Comments from [~rakeshr]*
> My observations about the problem:
> * Tools like 'Wireshark' can be used to simulate artificial packet loss.
> * Assume there is only one packet in the 'outgoingQueue' and, unfortunately, the server's response packet is lost. The client will now wait forever. https://github.com/apache/zookeeper/blob/trunk/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1515
> * We can probably discuss this problem and possible solutions (adding a packet ACK timeout or another, better approach) further in the JIRA.
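
To make the "packet ACK timeout" idea concrete, here is a minimal Java sketch of a bounded wait on a pending reply. It is not taken from the attached patches or from ClientCnxn; the class PendingReply and its methods markFinished and awaitReply are hypothetical names used only for illustration. The only assumption about the real client is the one stated in the description: the calling thread blocks until a reader thread marks the request finished.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a client request packet; not the real ClientCnxn.Packet. */
final class PendingReply {
    private boolean finished; // guarded by "this"

    /** Called by the (hypothetical) reader thread when the server reply arrives. */
    synchronized void markFinished() {
        finished = true;
        notifyAll();
    }

    /**
     * Waits until the reply arrives or timeoutMs elapses.
     * Returns true if the reply arrived; false on timeout or if the
     * waiting thread was interrupted (the interrupt flag is restored).
     */
    synchronized boolean awaitReply(long timeoutMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!finished) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false; // timed out: the caller can fail the request
            }
            // Clamp to at least 1 ms so we never call wait(0), which blocks forever.
            long remainingMs = Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
            try {
                wait(remainingMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

With an unbounded wait(), a single lost reply leaves the caller blocked indefinitely. With the deadline loop, the caller gets control back and can report an error or close the connection. What the client should do after a timeout is a design decision for the actual fix and is not addressed by this sketch.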


