Posted to dev@zookeeper.apache.org by "Travis Crawford (JIRA)" <ji...@apache.org> on 2010/07/02 22:53:50 UTC

[jira] Created: (ZOOKEEPER-803) Improve defenses against misbehaving clients

Improve defenses against misbehaving clients
--------------------------------------------

                 Key: ZOOKEEPER-803
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-803
             Project: Zookeeper
          Issue Type: Bug
    Affects Versions: 3.3.0
            Reporter: Travis Crawford


This issue is a follow-up to ZOOKEEPER-801. In short, a small number of buggy clients opened thousands of connections and caused ZooKeeper to fail.

The misbehaving client did not correctly handle expired sessions, creating a new connection each time. The huge number of connections exacerbated the issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-803) Improve defenses against misbehaving clients

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12884828#action_12884828 ] 

Patrick Hunt commented on ZOOKEEPER-803:
----------------------------------------

Thanks for this. Approximately how many clients are we talking about?



[jira] Commented: (ZOOKEEPER-803) Improve defenses against misbehaving clients

Posted by "Travis Crawford (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12884829#action_12884829 ] 

Travis Crawford commented on ZOOKEEPER-803:
-------------------------------------------

Maybe 8-10 clients were running the buggy code. Not too many.



[jira] Updated: (ZOOKEEPER-803) Improve defenses against misbehaving clients

Posted by "Travis Crawford (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Travis Crawford updated ZOOKEEPER-803:
--------------------------------------

    Attachment: connection-bugfix-diff.png

This diff shows a bug where the client developer confused disconnections with expired sessions. In the ZooKeeper programming model, the client library reconnects automatically after a disconnection. If the session expires, however, the application is responsible for creating a new connection.

In this case the developer attempted to throttle reconnects, but due to a bug the application created a new connection each time.

A small number of clients running the buggy code took down a 3-node ZooKeeper cluster by exhausting the 65k file descriptor limit. The cluster recovered only after the clients were shut down, the ZooKeeper servers were restarted, and the well-behaved clients were then restarted.
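
To make the distinction concrete, here is a minimal sketch of the handling described above. It does not use the real ZooKeeper client: SessionHandling, Handle, State, onStateChange and openHandles are hypothetical stand-ins invented for illustration, and the attached diff may look quite different. The real Java client reports comparable states to a watcher (KeeperState.Disconnected and KeeperState.Expired). The point is only that a disconnection needs no action from the application, while an expired session should close the old handle before exactly one replacement is created.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SessionHandling {
    // Hypothetical stand-ins for the connection states a client can observe.
    enum State { CONNECTED, DISCONNECTED, EXPIRED }

    // Hypothetical stand-in for a client handle that owns one server connection.
    static final class Handle implements AutoCloseable {
        static final AtomicInteger open = new AtomicInteger();
        private boolean closed;

        Handle() { open.incrementAndGet(); }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                open.decrementAndGet();
            }
        }
    }

    private Handle handle = new Handle();

    // Called once for every state change reported to the application.
    synchronized void onStateChange(State state) {
        switch (state) {
            case DISCONNECTED:
                // The client library reconnects by itself; creating a new
                // handle here would leak a connection on every disconnect.
                break;
            case EXPIRED:
                // The session is gone: release the old handle first, then
                // create exactly one replacement.
                handle.close();
                handle = new Handle();
                break;
            case CONNECTED:
                break;
        }
    }

    // Number of handles (and therefore connections) currently open.
    static int openHandles() { return Handle.open.get(); }

    public static void main(String[] args) {
        SessionHandling app = new SessionHandling();
        for (int i = 0; i < 1000; i++) {
            app.onStateChange(State.DISCONNECTED);
            app.onStateChange(State.EXPIRED);
        }
        System.out.println(openHandles()); // prints 1
    }
}
```

With this structure the number of open connections per application stays at one no matter how often the session expires. A version that skipped the close() call, or that created a handle on DISCONNECTED as well, would grow without bound, which is the failure mode reported in this issue.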
