Posted to dev@zookeeper.apache.org by "Michael Han (JIRA)" <ji...@apache.org> on 2016/12/02 00:09:59 UTC

[jira] [Commented] (ZOOKEEPER-2251) Add Client side packet response timeout to avoid infinite wait.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15713491#comment-15713491 ] 

Michael Han commented on ZOOKEEPER-2251:
----------------------------------------

bq. Is the problem simply that the patch needs to be updated to match the latest code?

[~m.martinez] Thanks for bumping this up. The community has been working on a couple of high-priority issues to prepare for the upcoming 3.4.10 and 3.5.3 releases. This issue was not labelled with version info, so it did not get much visibility in the queue. I have just updated the JIRA and am reviewing the patch.

bq. We are definitely running into what looks to be this problem
[~m.martinez] Which version of ZooKeeper are you running?

On a side note, the patch does look outdated (e.g. the doc changes refer to 3.5.2, which has already been released) and needs to be rebased. [~arshad.mohammad] Would you mind updating the patch and sending a pull request on git? We can start iterating from there.

> Add Client side packet response timeout to avoid infinite wait.
> ---------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2251
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2251
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: java client
>    Affects Versions: 3.4.9, 3.5.2
>            Reporter: nijel
>            Assignee: Arshad Mohammad
>            Priority: Critical
>              Labels: fault
>             Fix For: 3.5.3, 3.6.0
>
>         Attachments: ZOOKEEPER-2251-01.patch, ZOOKEEPER-2251-02.patch, ZOOKEEPER-2251-03.patch, ZOOKEEPER-2251-04.patch
>
>
> I came across an issue related to client-side packet response timeout. In my cluster, many packet drops happened for some time.
> One observation is that the ZooKeeper client hung. According to the thread dump, it was waiting for the response/ACK for the operation performed (the synchronous API is used here).
> I am using zookeeper.serverCnxnFactory=org.apache.zookeeper.server.NIOServerCnxnFactory
> Since only a few packets were missed, no DISCONNECTED event occurred.
> We need to add a "response timeout" for the operations or packets.
> *Comments from [~rakeshr]*
> My observations about the problem:
> * Tools like 'Wireshark' can be used to simulate artificial packet loss.
> * Assume there is only one packet in the 'outgoingQueue' and, unfortunately, the server response packet is lost. The client will then wait indefinitely. https://github.com/apache/zookeeper/blob/trunk/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1515
> * We can probably discuss this problem and possible solutions (add a packet ACK timeout or another, better approach) further in the jira.
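
To make the scenario in the description concrete, here is a minimal, self-contained sketch of a bounded wait on a response. It is not the attached patch and not ZooKeeper's code: the class `ResponseTimeoutSketch`, the stand-in `Packet` type, and the methods `markFinished`, `awaitResponse`, and `timesOut` are all hypothetical names chosen for illustration. The idea is only that a synchronous caller waits against a deadline instead of waiting without limit, so a lost response surfaces as a timeout rather than a hang.

```java
import java.util.concurrent.TimeoutException;

public class ResponseTimeoutSketch {

    /** Hypothetical stand-in for a client request packet; not ZooKeeper's class. */
    static final class Packet {
        private boolean finished;

        /** Called when the response for this packet arrives. */
        synchronized void markFinished() {
            finished = true;
            notifyAll();
        }

        /** Waits for the response, but gives up once timeoutMs has elapsed. */
        synchronized void awaitResponse(long timeoutMs)
                throws InterruptedException, TimeoutException {
            long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
            while (!finished) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    throw new TimeoutException(
                            "no response within " + timeoutMs + " ms");
                }
                // Round up so we never call wait(0), which would wait forever.
                wait((remainingNanos + 999_999L) / 1_000_000L);
            }
        }
    }

    /** Returns true if the wait ended because of the timeout. */
    static boolean timesOut(Packet packet, long timeoutMs) {
        try {
            packet.awaitResponse(timeoutMs);
            return false;
        } catch (TimeoutException e) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Response is lost: nobody ever calls markFinished().
        Packet lost = new Packet();
        System.out.println("lost response timed out: " + timesOut(lost, 100));

        // Response arrives from another thread before the deadline.
        Packet answered = new Packet();
        Thread responder = new Thread(answered::markFinished);
        responder.start();
        System.out.println("answered request timed out: " + timesOut(answered, 1000));
        responder.join();
    }
}
```

In a real client the timeout path would presumably also need to decide what to report to the caller and whether to treat the connection as broken; that is left open here, as it is in the discussion above.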


