You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@zookeeper.apache.org by "Camille Fournier (JIRA)" <ji...@apache.org> on 2011/09/21 16:39:08 UTC

[jira] [Created] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Incorrect socket handling of 4 letter words for NIO
---------------------------------------------------

                 Key: ZOOKEEPER-1197
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
             Project: ZooKeeper
          Issue Type: Bug
          Components: server
    Affects Versions: 3.3.3, 3.4.0
            Reporter: Camille Fournier
            Assignee: Camille Fournier
            Priority: Blocker
             Fix For: 3.3.4, 3.4.0
         Attachments: ZOOKEEPER-1197.patch

When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 

Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 

See ZOOKEEPER-737 for more details


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Flavio Junqueira (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121451#comment-13121451 ] 

Flavio Junqueira commented on ZOOKEEPER-1197:
---------------------------------------------

Camille, is setting the socket to linger an option here?
                
> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Blocker
>             Fix For: 3.3.4, 3.4.0, 3.5.0
>
>         Attachments: ZOOKEEPER-1197.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464317#comment-13464317 ] 

Patrick Hunt commented on ZOOKEEPER-1197:
-----------------------------------------

Interesting web page I found: http://ia600609.us.archive.org/22/items/TheUltimateSo_lingerPageOrWhyIsMyTcpNotReliable/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable.html
                
> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Critical
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1197.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Camille Fournier (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121452#comment-13121452 ] 

Camille Fournier commented on ZOOKEEPER-1197:
---------------------------------------------

socket linger has no effect on this bug.
                
> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Blocker
>             Fix For: 3.3.4, 3.4.0, 3.5.0
>
>         Attachments: ZOOKEEPER-1197.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Camille Fournier (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Camille Fournier updated ZOOKEEPER-1197:
----------------------------------------

    Attachment: ZOOKEEPER-1197.patch

> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Blocker
>             Fix For: 3.3.4, 3.4.0
>
>         Attachments: ZOOKEEPER-1197.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Skye Wanderman-Milne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Skye Wanderman-Milne updated ZOOKEEPER-1197:
--------------------------------------------

    Attachment: long_4lw.patch
                ZOOKEEPER-1197_shutdown.patch
    
> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Critical
>             Fix For: 3.5.0
>
>         Attachments: long_4lw.patch, ZOOKEEPER-1197.patch, ZOOKEEPER-1197_shutdown.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Camille Fournier (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118116#comment-13118116 ] 

Camille Fournier commented on ZOOKEEPER-1197:
---------------------------------------------

After some manual testing I've verified that this greatly improves the problem, but does not completely solve it, so I'm still stuck on this. Any help would be appreciated, this is a major annoyance when monitoring over any sort of WAN
                
> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Blocker
>             Fix For: 3.3.4, 3.4.0, 3.5.0
>
>         Attachments: ZOOKEEPER-1197.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Camille Fournier (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13119557#comment-13119557 ] 

Camille Fournier commented on ZOOKEEPER-1197:
---------------------------------------------

One more update for this. After talking to some folks here that are more networking experts than I am, it appears that the only way to truly fix this bug is to allow the client to initiate the close of the socket, not the server. Because a socket close event (especially in recent Linux kernels) may cause the kernel to give up on sending data even when the server has not finished/received acks, the only way to guarantee that a client has actually received all the data is to wait for the client to close the connection. We can add a timeout in the server to clean up 4lw clients that have not closed their connection, but we cannot rely on all of the data getting to the client without an explicit client close.
                
> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Blocker
>             Fix For: 3.3.4, 3.4.0, 3.5.0
>
>         Attachments: ZOOKEEPER-1197.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Mahadev konar (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-1197:
-------------------------------------

    Fix Version/s:     (was: 3.4.0)
                   3.4.1
    
> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Critical
>             Fix For: 3.3.4, 3.5.0, 3.4.1
>
>         Attachments: ZOOKEEPER-1197.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Camille Fournier (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Camille Fournier updated ZOOKEEPER-1197:
----------------------------------------

    Priority: Critical  (was: Blocker)
    
> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Critical
>             Fix For: 3.3.4, 3.4.0, 3.5.0
>
>         Attachments: ZOOKEEPER-1197.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Mahadev konar (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125524#comment-13125524 ] 

Mahadev konar commented on ZOOKEEPER-1197:
------------------------------------------

Camille,
 What do we want to do then? Closing the connection from client is probably not feasible. Should we just checkin what we have? I am not a big fan of letting the connections linger on the server and then close them later. 
                
> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Blocker
>             Fix For: 3.3.4, 3.4.0, 3.5.0
>
>         Attachments: ZOOKEEPER-1197.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Patrick Hunt (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-1197:
------------------------------------

    Fix Version/s:     (was: 3.3.4)
                   3.3.5
    
> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Critical
>             Fix For: 3.5.0, 3.4.1, 3.3.5
>
>         Attachments: ZOOKEEPER-1197.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Camille Fournier (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126184#comment-13126184 ] 

Camille Fournier commented on ZOOKEEPER-1197:
---------------------------------------------

I'm not entirely sure why closing the connection from the client isn't feasible. Right now, if we were to leave the client connection as it is, it would still close itself in nc non-interactive (and possibly truncate the data), and for other clients it would close when the client closed, or we could put a timeout around connections created with a 4lw since we can identify them. I think the trickier thing is that anything else that comes in off a 4lw connection causes an IO length exception which is nasty. 

I'll play with SO_LINGER again. It doesn't fix this problem in my testing, but my testing was under windows. If it makes it better in linux, it's still worth doing for that bit.
                
> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Blocker
>             Fix For: 3.3.4, 3.4.0, 3.5.0
>
>         Attachments: ZOOKEEPER-1197.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Camille Fournier (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109538#comment-13109538 ] 

Camille Fournier commented on ZOOKEEPER-1197:
---------------------------------------------

I'm not sure how to write a test for this. On a local connection, especially within the same VM, I really doubt that I can reproduce the issue (no luck so far). I have tested this manually.

> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Blocker
>             Fix For: 3.3.4, 3.4.0
>
>         Attachments: ZOOKEEPER-1197.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-1197:
------------------------------------

    Fix Version/s: 3.5.0

> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Blocker
>             Fix For: 3.3.4, 3.4.0, 3.5.0
>
>         Attachments: ZOOKEEPER-1197.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Flavio Junqueira (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121488#comment-13121488 ] 

Flavio Junqueira commented on ZOOKEEPER-1197:
---------------------------------------------

Could you elaborate please? My understanding is that the linger option is there exactly to allow unsent data to be transmitted, so I'm not sure why you say that it has no effect.
                
> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Blocker
>             Fix For: 3.3.4, 3.4.0, 3.5.0
>
>         Attachments: ZOOKEEPER-1197.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Thomas Koch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109723#comment-13109723 ] 

Thomas Koch commented on ZOOKEEPER-1197:
----------------------------------------

Could you please be so kind to add the Code for review to https://reviews.apache.org/r/new/ ? Thank you!

> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Blocker
>             Fix For: 3.3.4, 3.4.0
>
>         Attachments: ZOOKEEPER-1197.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Camille Fournier (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121495#comment-13121495 ] 

Camille Fournier commented on ZOOKEEPER-1197:
---------------------------------------------

I'm saying, in my testing, setting this value has never had any effect. It seem that the OS is ignoring this setting.
                
> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Blocker
>             Fix For: 3.3.4, 3.4.0, 3.5.0
>
>         Attachments: ZOOKEEPER-1197.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Skye Wanderman-Milne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467265#comment-13467265 ] 

Skye Wanderman-Milne commented on ZOOKEEPER-1197:
-------------------------------------------------

Based on the second code snippet in the link Patrick found, I've written an alternate patch (ZOOKEEPER-1197_shutdown.patch) that waits for the client to shutdown the input connection before closing the socket.

I've managed to reproduce the problem locally for testing:
0. My base setup: trunk zk checkout, Ubuntu 12.04, Oracle Java 6, no fancy zoo.cfg 
1. I wrote a new 4lw, "long", that prints 100k '.'s followed by a '!' (so you can see if it's truncated). See long_4lw.patch.
2. Slow down your localhost connection: sudo tc qdisc add dev lo root netem delay 1000ms 
(To undo this command: sudo tc qdisc del dev lo root) 
3. Start a zk server with ./bin/zkServer.sh start 
4. Send a long command: echo long | nc localhost 2181 (I've also been using FourLetterWordMain) 
This should fill your terminal with dots, but shouldn't end with a !. If you see the !, the output wasn't truncated (i.e., you haven't hit the bug).

The new patch fixes the bug with this setup. Camille's patch also fixes it, although I do not fully understand how :) Since Camille reports having problems even after applying her patch (which I could not reproduce), this patch probably needs further testing as well. It would also be useful to test with a variety of clients (it sounds like this is the standard way to close a TCP connection, but I'm not positive every client will shut down the connection correctly).
                
> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Critical
>             Fix For: 3.5.0
>
>         Attachments: long_4lw.patch, ZOOKEEPER-1197.patch, ZOOKEEPER-1197_shutdown.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109555#comment-13109555 ] 

Hadoop QA commented on ZOOKEEPER-1197:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12495373/ZOOKEEPER-1197.patch
  against trunk revision 1172406.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/574//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/574//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/574//console

This message is automatically generated.

> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Blocker
>             Fix For: 3.3.4, 3.4.0
>
>         Attachments: ZOOKEEPER-1197.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO

Posted by "Patrick Hunt (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131775#comment-13131775 ] 

Patrick Hunt commented on ZOOKEEPER-1197:
-----------------------------------------

Can you take a look at ZOOKEEPER-1237 ? I believe we're seeing the same issue for regular sessions. This shows up much more frequently once linger is turned off (at least in my testing of ZOOKEEPER-1049 that seems to be the case).

Seems we need to do a better job coordinating session/4lw teardown across the board?
                
> Incorrect socket handling of 4 letter words for NIO
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1197
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3, 3.4.0
>            Reporter: Camille Fournier
>            Assignee: Camille Fournier
>            Priority: Critical
>             Fix For: 3.3.4, 3.5.0, 3.4.1
>
>         Attachments: ZOOKEEPER-1197.patch
>
>
> When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. 
> Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. 
> See ZOOKEEPER-737 for more details

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira