Posted to dev@zookeeper.apache.org by "Christian Wiedmann (JIRA)" <ji...@apache.org> on 2009/10/06 03:55:31 UTC

[jira] Created: (ZOOKEEPER-542) c-client can spin when server unresponsive

c-client can spin when server unresponsive
------------------------------------------

                 Key: ZOOKEEPER-542
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-542
             Project: Zookeeper
          Issue Type: Bug
          Components: c client
    Affects Versions: 3.2.0
            Reporter: Christian Wiedmann


Due to a mismatch between zookeeper_interest() and zookeeper_process(), the client can spin (busy-loop) while reconnecting to an unresponsive ZooKeeper server.

In particular, zookeeper_interest() adds ZOOKEEPER_WRITE whenever there is data to be sent, but flush_send_queue() only writes the data if the state is ZOO_CONNECTED_STATE. In ZOO_ASSOCIATING_STATE the client therefore keeps asking for writability without ever writing anything, which results in spinning.
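
The mismatch can be sketched as a small model. This is a simplified illustration under stated assumptions, not the client source and not the attached patch: the model_* names, the enum values, and the "after" condition are all hypothetical stand-ins chosen to show one way the interest side and the flush side could be made to agree.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the client's connection states and interest
 * flags; the names and values are illustrative, not the real definitions. */
enum model_state { MODEL_CONNECTING, MODEL_ASSOCIATING, MODEL_CONNECTED };
enum { MODEL_READ = 1, MODEL_WRITE = 2 };

/* Flush side: queued data is only written once the session is connected. */
bool model_can_flush(enum model_state state)
{
    return state == MODEL_CONNECTED;
}

/* Interest side as described in the report: ask for writability whenever
 * anything is queued, regardless of state. */
int model_interest_before(enum model_state state, bool has_queued_data)
{
    (void)state; /* the state is ignored, which is the mismatch */
    int interest = MODEL_READ;
    if (has_queued_data)
        interest |= MODEL_WRITE;
    return interest;
}

/* One consistent alternative (an assumption): ask for writability only when
 * a write can make progress, or while a non-blocking connect is pending,
 * since connect completion is reported as writability. */
int model_interest_after(enum model_state state, bool has_queued_data)
{
    int interest = MODEL_READ;
    if ((has_queued_data && model_can_flush(state)) || state == MODEL_CONNECTING)
        interest |= MODEL_WRITE;
    return interest;
}

/* True when the loop would wait for writability on an established socket
 * but then write nothing, which is the condition that produces the spin. */
bool model_would_spin(int interest, enum model_state state)
{
    return (interest & MODEL_WRITE) && !model_can_flush(state)
        && state != MODEL_CONNECTING;
}
```

In this model, model_would_spin(model_interest_before(MODEL_ASSOCIATING, true), MODEL_ASSOCIATING) is true, while the same check with model_interest_after is false.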

This probably doesn't affect production, but I had a runaway process in a development deployment that caused performance issues on the node. It is easy to reproduce in a single-node environment by sending SIGSTOP (kill -STOP) to the server process and waiting for the session timeout.

Patch to be added.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-542) c-client can spin when server unresponsive

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762724#action_12762724 ] 

Mahadev konar commented on ZOOKEEPER-542:
-----------------------------------------

+1 for the patch. It would be really hard to write a test for this, since we would need a server on which a connect neither completes nor errors out quickly. That can be done via SIGSTOP, but it would be rather hard to do in an automated test.


[jira] Updated: (ZOOKEEPER-542) c-client can spin when server unresponsive

Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Reed updated ZOOKEEPER-542:
------------------------------------

    Attachment: ZOOKEEPER-542.patch

Added comments.


[jira] Updated: (ZOOKEEPER-542) c-client can spin when server unresponsive

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-542:
------------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Christian!


[jira] Updated: (ZOOKEEPER-542) c-client can spin when server unresponsive

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-542:
-----------------------------------

    Affects Version/s: 3.2.1


[jira] Commented: (ZOOKEEPER-542) c-client can spin when server unresponsive

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763444#action_12763444 ] 

Hudson commented on ZOOKEEPER-542:
----------------------------------

Integrated in ZooKeeper-trunk #491 (See [http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/491/])
    . c-client can spin when server unresponsive (Christian Wiedmann via mahadev)


[jira] Updated: (ZOOKEEPER-542) c-client can spin when server unresponsive

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-542:
------------------------------------

    Status: Patch Available  (was: Open)

Making it Patch Available for now. Pat agrees that it's a hard-to-test JIRA.


[jira] Updated: (ZOOKEEPER-542) c-client can spin when server unresponsive

Posted by "Christian Wiedmann (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Wiedmann updated ZOOKEEPER-542:
-----------------------------------------

    Attachment: ZOOKEEPER-542.patch


[jira] Commented: (ZOOKEEPER-542) c-client can spin when server unresponsive

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762696#action_12762696 ] 

Hadoop QA commented on ZOOKEEPER-542:
-------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12421444/ZOOKEEPER-542.patch
  against trunk revision 822065.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/17/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/17/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/17/console

This message is automatically generated.


[jira] Updated: (ZOOKEEPER-542) c-client can spin when server unresponsive

Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Reed updated ZOOKEEPER-542:
------------------------------------

    Fix Version/s: 3.3.0
           Status: Patch Available  (was: Open)


[jira] Commented: (ZOOKEEPER-542) c-client can spin when server unresponsive

Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762683#action_12762683 ] 

Benjamin Reed commented on ZOOKEEPER-542:
-----------------------------------------

+1, good catch and good fix. I'm going to extend the patch slightly by adding a comment that documents how we handle the non-blocking connect. (Somehow that got deleted long ago.)


[jira] Updated: (ZOOKEEPER-542) c-client can spin when server unresponsive

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-542:
-----------------------------------

    Status: Open  (was: Patch Available)

Is a test possible here? It would be great to have one to verify the fix.


[jira] Commented: (ZOOKEEPER-542) c-client can spin when server unresponsive

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762736#action_12762736 ] 

Hadoop QA commented on ZOOKEEPER-542:
-------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12421444/ZOOKEEPER-542.patch
  against trunk revision 822065.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/18/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/18/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/18/console

This message is automatically generated.


[jira] Commented: (ZOOKEEPER-542) c-client can spin when server unresponsive

Posted by "Christian Wiedmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762713#action_12762713 ] 

Christian Wiedmann commented on ZOOKEEPER-542:
----------------------------------------------

I don't really know how to write an automated test for this, since the spinning is not visible outside of the API. The manual test I used was to kill -STOP the server, wait until the client tried to reconnect, and run strace on the I/O thread (I'm using the Python bindings, btw). Pre-patch, the strace shows repeated calls to poll with POLLOUT set on the server fd. Post-patch, POLLOUT is not set and there is no spinning.
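
The poll behaviour behind that strace output can be shown in isolation. The sketch below makes several assumptions: it uses a local socketpair as a stand-in for the connection to the stopped server, it does not touch the ZooKeeper client at all, and the helper name count_ready_polls is hypothetical. It only shows that polling an established socket for POLLOUT without ever writing returns immediately every time, whereas polling for POLLIN on an idle socket waits for the timeout.

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Poll one end of a fresh socketpair `iterations` times for `events`,
 * never reading or writing, and return how many polls reported the fd
 * ready. Returns -1 if the socketpair cannot be created. */
int count_ready_polls(short events, int iterations, int timeout_ms)
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;

    int ready = 0;
    for (int i = 0; i < iterations; i++) {
        struct pollfd pfd = { .fd = fds[0], .events = events, .revents = 0 };
        /* With POLLOUT and an empty send buffer this returns at once, so a
         * loop that never writes wakes up again on every iteration. */
        if (poll(&pfd, 1, timeout_ms) > 0 && (pfd.revents & events))
            ready++;
    }

    close(fds[0]);
    close(fds[1]);
    return ready;
}
```

Calling count_ready_polls(POLLOUT, 100, 10) should report the socket ready on every iteration without waiting, which corresponds to the pre-patch pattern of back-to-back poll calls; count_ready_polls(POLLIN, 3, 10) should instead sleep for each timeout and report no ready polls.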


[jira] Assigned: (ZOOKEEPER-542) c-client can spin when server unresponsive

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt reassigned ZOOKEEPER-542:
--------------------------------------

    Assignee: Christian Wiedmann
