Posted to dev@zookeeper.apache.org by "Jacky007 (JIRA)" <ji...@apache.org> on 2012/10/11 09:45:07 UTC

[jira] [Created] (ZOOKEEPER-1561) Zookeeper client may hang on a server restart

Jacky007 created ZOOKEEPER-1561:
-----------------------------------

             Summary: Zookeeper client may hang on a server restart
                 Key: ZOOKEEPER-1561
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1561
             Project: ZooKeeper
          Issue Type: Bug
          Components: java client
    Affects Versions: 3.5.0
            Reporter: Jacky007
             Fix For: 3.5.0


In the doIO method of ClientCnxnSocketNIO:
{noformat}
if (p != null) {
    outgoingQueue.removeFirstOccurrence(p);
    updateLastSend();
    if ((p.requestHeader != null) &&
            (p.requestHeader.getType() != OpCode.ping) &&
            (p.requestHeader.getType() != OpCode.auth)) {
        p.requestHeader.setXid(cnxn.getXid());
    }
    p.createBB();
    ByteBuffer pbb = p.bb;
    sock.write(pbb);
    if (!pbb.hasRemaining()) {
        sentCount++;
        if (p.requestHeader != null
                && p.requestHeader.getType() != OpCode.ping
                && p.requestHeader.getType() != OpCode.auth) {
            pending.add(p);
        }
    }
{noformat}
When sock.write(pbb) throws an exception, the packet is never cleaned up: it has already been removed from outgoingQueue but has not yet been added to pendingQueue, so it is in neither queue. If the client is waiting on that packet, it will wait forever.
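
To make the failure mode concrete, here is a minimal, self-contained sketch of the queue bookkeeping. It is not the ZooKeeper code: Packet, Channel, LostPacketSketch and every method name below are hypothetical stand-ins, partial writes are not modelled, and sendWriteFirst is only one possible reordering, not necessarily how this issue should be fixed.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class LostPacketSketch {
    /** Hypothetical stand-in for a client request packet. */
    static final class Packet {
        final String name;
        Packet(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for the socket; write() may fail when the server is down. */
    interface Channel {
        void write(Packet p) throws IOException;
    }

    final Deque<Packet> outgoingQueue = new ArrayDeque<>();
    final List<Packet> pendingQueue = new ArrayList<>();

    /** Same order as the snippet above: dequeue first, then write. */
    void sendDequeueFirst(Channel sock) throws IOException {
        Packet p = outgoingQueue.peekFirst();
        if (p != null) {
            outgoingQueue.removeFirstOccurrence(p);
            sock.write(p);       // if this throws, p is in neither queue
            pendingQueue.add(p);
        }
    }

    /** One possible alternative (an assumption): dequeue only after the write succeeded. */
    void sendWriteFirst(Channel sock) throws IOException {
        Packet p = outgoingQueue.peekFirst();
        if (p != null) {
            sock.write(p);       // if this throws, p is still in outgoingQueue
            outgoingQueue.removeFirstOccurrence(p);
            pendingQueue.add(p);
        }
    }

    /** True if a cleanup pass that scans both queues would still find the packet. */
    boolean isTracked(Packet p) {
        return outgoingQueue.contains(p) || pendingQueue.contains(p);
    }

    /** Queues one packet, sends it over a channel that always fails, and reports whether it is still tracked. */
    static boolean trackedAfterFailedWrite(boolean dequeueFirst) {
        LostPacketSketch s = new LostPacketSketch();
        Packet p = new Packet("getData /");
        s.outgoingQueue.add(p);
        Channel broken = pkt -> { throw new IOException("connection reset"); };
        try {
            if (dequeueFirst) {
                s.sendDequeueFirst(broken);
            } else {
                s.sendWriteFirst(broken);
            }
        } catch (IOException expected) {
            // A connection-loss handler would run here and scan both queues.
        }
        return s.isTracked(p);
    }

    public static void main(String[] args) {
        System.out.println("dequeue-first, still tracked: " + trackedAfterFailedWrite(true));
        System.out.println("write-first, still tracked:   " + trackedAfterFailedWrite(false));
    }
}
```

With the dequeue-first order the packet is no longer reachable from either queue after the failed write, so nothing can later complete it with a connection-loss error. With the write-first order it stays in outgoingQueue and a cleanup pass can still find it.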

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1561) Zookeeper client may hang on a server restart

Posted by "Eugene Koontz (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476586#comment-13476586 ] 

Eugene Koontz commented on ZOOKEEPER-1561:
------------------------------------------

Marshall McMullen (@marshall) mentions in ZOOKEEPER-107 (https://issues.apache.org/jira/browse/ZOOKEEPER-107?focusedCommentId=13476346&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13476346) that this test failure:

https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1230//testReport/

might be due to ZOOKEEPER-1561. The console output might be useful for creating a test case.
                

[jira] [Commented] (ZOOKEEPER-1561) Zookeeper client may hang on a server restart

Posted by "Eugene Koontz (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475903#comment-13475903 ] 

Eugene Koontz commented on ZOOKEEPER-1561:
------------------------------------------

Correction to the last sentence of my previous comment: it should read "client *will* hang, if the bug exists". The client should clearly not hang if the client code is functioning correctly.
                

[jira] [Commented] (ZOOKEEPER-1561) Zookeeper client may hang on a server restart

Posted by "Eugene Koontz (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490117#comment-13490117 ] 

Eugene Koontz commented on ZOOKEEPER-1561:
------------------------------------------

It's a good time to revisit this now that ZOOKEEPER-1560 is fixed. 
                

[jira] [Commented] (ZOOKEEPER-1561) Zookeeper client may hang on a server restart

Posted by "Eugene Koontz (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475895#comment-13475895 ] 

Eugene Koontz commented on ZOOKEEPER-1561:
------------------------------------------

I'd like to help with this if I can - first step would be a unit test that exposes it. If I understand from @Jacky007's description, I think that the test would be:

1. Start a client and server
2. Client waits till server comes up.
3. Stop the server.
4. Client sends a packet to the server (e.g. "get /").
5. Restart the server.

Client should hang at step 4. Test should detect the hang somehow and fail the test.
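
For the "detect the hang somehow" part, one option is to run the blocking client call on a separate thread and give it a deadline. The sketch below uses only java.util.concurrent and no ZooKeeper classes; HangDetectorSketch and completesWithin are hypothetical names, and the Callable stands in for whatever client call step 4 ends up using.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class HangDetectorSketch {
    /**
     * Returns true if the call finished within the timeout, either normally
     * or by throwing; returns false if it is presumed hung.
     */
    static boolean completesWithin(Callable<?> call, long timeoutMs) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "hang-detector");
            t.setDaemon(true); // a truly hung call must not keep the JVM alive
            return t;
        });
        Future<?> result = executor.submit(call);
        try {
            result.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (ExecutionException e) {
            return true;  // failed fast (e.g. with a connection-loss error): not a hang
        } catch (TimeoutException e) {
            return false; // no answer before the deadline: treat as a hang
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false; // the waiting thread was interrupted: treat as not completed
        } finally {
            result.cancel(true);
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        CountDownLatch neverReleased = new CountDownLatch(1);
        // Stand-in for a client call that never gets a response.
        boolean hung = !completesWithin(() -> { neverReleased.await(); return null; }, 200);
        // Stand-in for a client call that returns promptly.
        boolean ok = completesWithin(() -> "data", 2000);
        System.out.println("hung call detected: " + hung);
        System.out.println("normal call completed: " + ok);
    }
}
```

In a real test the Callable would wrap the client request issued while the server is down, and the test would fail when completesWithin returns false. A call that fails quickly with an exception counts as completed here, since an error is acceptable behaviour and only the indefinite wait is the bug.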
                