Posted to dev@zookeeper.apache.org by "Michael Han (JIRA)" <ji...@apache.org> on 2016/12/02 00:01:05 UTC

[jira] [Updated] (ZOOKEEPER-2251) Add Client side packet response timeout to avoid infinite wait.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Han updated ZOOKEEPER-2251:
-----------------------------------
    Affects Version/s: 3.4.9
                       3.5.2

> Add Client side packet response timeout to avoid infinite wait.
> ---------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2251
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2251
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.4.9, 3.5.2
>            Reporter: nijel
>            Assignee: Arshad Mohammad
>         Attachments: ZOOKEEPER-2251-01.patch, ZOOKEEPER-2251-02.patch, ZOOKEEPER-2251-03.patch, ZOOKEEPER-2251-04.patch
>
>
> I came across an issue related to client-side packet response timeouts. In my cluster, many packets were dropped for some time.
> One observation is that the ZooKeeper client hung. According to the thread dump, it was waiting for the response/ACK for the operation performed (the synchronous API was used here).
> I am using zookeeper.serverCnxnFactory=org.apache.zookeeper.server.NIOServerCnxnFactory
> Since only a few packets were lost, no DISCONNECTED event occurred.
> We need to add a "response timeout" for operations or packets.
> *Comments from [~rakeshr]*
> My observations about the problem:
> * Tools like 'Wireshark' can be used to simulate artificial packet loss.
> * Assume there is only one packet in the 'outgoingQueue' and, unfortunately, the server's response packet is lost. The client will then wait indefinitely. https://github.com/apache/zookeeper/blob/trunk/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1515
> * We can probably discuss this problem and possible solutions (adding a packet ACK timeout or another, better approach) in the JIRA.
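
As a rough illustration of the "packet ACK timeout" idea mentioned in the comments above, the sketch below replaces an unbounded wait on a request with a deadline-based wait. It is not the code from ClientCnxn or from the attached patches: the Packet class and the awaitFinished and markFinished methods are hypothetical names invented for this example, and the timeout values are arbitrary.

```java
import java.util.concurrent.TimeoutException;

public class ResponseTimeoutSketch {

    /** Hypothetical stand-in for a client request packet; NOT the real ClientCnxn.Packet. */
    public static final class Packet {
        private boolean finished;

        /** Called by the (simulated) receive path when the server response arrives. */
        public synchronized void markFinished() {
            finished = true;
            notifyAll();
        }

        /** Waits for the response, but gives up once timeoutMs has elapsed. */
        public synchronized void awaitFinished(long timeoutMs)
                throws InterruptedException, TimeoutException {
            long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
            while (!finished) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    throw new TimeoutException("no response within " + timeoutMs + " ms");
                }
                // wait(0) would block forever, so always wait at least 1 ms.
                wait(Math.max(1L, remainingNanos / 1_000_000L));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Case 1: the response is lost, so the caller gets a timeout instead of hanging.
        Packet lost = new Packet();
        try {
            lost.awaitFinished(100);
            System.out.println("unexpected: lost packet completed");
        } catch (TimeoutException e) {
            System.out.println("lost packet: " + e.getMessage());
        }

        // Case 2: the response arrives before the deadline, so the wait returns normally.
        Packet answered = new Packet();
        Thread responder = new Thread(() -> {
            try {
                Thread.sleep(20);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            }
            answered.markFinished();
        });
        responder.start();
        answered.awaitFinished(2000);
        responder.join();
        System.out.println("answered packet completed");
    }
}
```

The loop re-checks the deadline after every wakeup, so a spurious wakeup cannot extend or shorten the overall timeout. How a real client should react to such a timeout (for example, treating the connection as lost) is a separate design question for the JIRA discussion.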


