You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@thrift.apache.org by "Ning Liang (JIRA)" <ji...@apache.org> on 2010/08/02 22:41:16 UTC

[jira] Created: (THRIFT-836) Race condition causes CancelledKeyException in TAsyncClientManager

Race condition causes CancelledKeyException in TAsyncClientManager
------------------------------------------------------------------

                 Key: THRIFT-836
                 URL: https://issues.apache.org/jira/browse/THRIFT-836
             Project: Thrift
          Issue Type: Bug
          Components: Compiler (Java)
            Reporter: Ning Liang


Currently, TAsyncClientMethod cancels its selection key on a successful method call. The current model assumes the key goes away and the client gets a new key on the next Selector registration, which is incorrect - the cancelled key can hang around, with the implication that the developer needs to check key.isValid() on the next selector action, which TAsyncClientMethod currently does. However, if the developer re-uses the client for another call, you can get a CancelledKeyException from the Selector#register call, which crashes the SelectorThread. This occurs once you have sufficient concurrency on the Selector - cancelled key cleanup takes longer, so we hit this condition.

Attached is a patch with fix and regression test. Summary of fix:

1) Don't cancel() the key after a successful method call.
2) Catch CancelledKeyException in the SelectorThread, so the manager can't die.
3) Add an onError method to TAsyncMethodCall, so that the SelectorThread can notify the client of an error on selector registration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-836) Race condition causes CancelledKeyException in TAsyncClientManager

Posted by "Ning Liang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Liang updated THRIFT-836:
------------------------------

    Attachment: async_client_catch_all.diff

One more catch statement... need to catch the ClosedSelectorException from the key iteration block

> Race condition causes CancelledKeyException in TAsyncClientManager
> ------------------------------------------------------------------
>
>                 Key: THRIFT-836
>                 URL: https://issues.apache.org/jira/browse/THRIFT-836
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java)
>            Reporter: Ning Liang
>            Assignee: Bryan Duxbury
>         Attachments: async_client_catch_all.diff, async_thrift.diff, async_thrift_catch_except.diff
>
>
> Currently, TAsyncClientMethod cancels its selection key on a successful method call. The current model assumes the key goes away and the client gets a new key on the next Selector registration, which is incorrect - the cancelled key can hang around, with the implication that the developer needs to check key.isValid() on the next selector action, which TAsyncClientMethod currently does. However, if the developer re-uses the client for another call, you can get a CancelledKeyException from the Selector#register call, which crashes the SelectorThread. This occurs once you have sufficient concurrency on the Selector - cancelled key cleanup takes longer, so we hit this condition.
> Attached is a patch with fix and regression test. Summary of fix:
> 1) Don't cancel() the key after a successful method call.
> 2) Catch CancelledKeyException in the SelectorThread, so the manager can't die.
> 3) Add an onError method to TAsyncMethodCall, so that the SelectorThread can notify the client of an error on selector registration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-836) Race condition causes CancelledKeyException in TAsyncClientManager

Posted by "Ning Liang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Liang updated THRIFT-836:
------------------------------

    Component/s: Library (Java)
                     (was: Compiler (Java))

> Race condition causes CancelledKeyException in TAsyncClientManager
> ------------------------------------------------------------------
>
>                 Key: THRIFT-836
>                 URL: https://issues.apache.org/jira/browse/THRIFT-836
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java)
>            Reporter: Ning Liang
>            Assignee: Bryan Duxbury
>         Attachments: async_thrift.diff, async_thrift_catch_except.diff
>
>
> Currently, TAsyncClientMethod cancels its selection key on a successful method call. The current model assumes the key goes away and the client gets a new key on the next Selector registration, which is incorrect - the cancelled key can hang around, with the implication that the developer needs to check key.isValid() on the next selector action, which TAsyncClientMethod currently does. However, if the developer re-uses the client for another call, you can get a CancelledKeyException from the Selector#register call, which crashes the SelectorThread. This occurs once you have sufficient concurrency on the Selector - cancelled key cleanup takes longer, so we hit this condition.
> Attached is a patch with fix and regression test. Summary of fix:
> 1) Don't cancel() the key after a successful method call.
> 2) Catch CancelledKeyException in the SelectorThread, so the manager can't die.
> 3) Add an onError method to TAsyncMethodCall, so that the SelectorThread can notify the client of an error on selector registration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (THRIFT-836) Race condition causes CancelledKeyException in TAsyncClientManager

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury reassigned THRIFT-836:
------------------------------------

    Assignee: Bryan Duxbury

> Race condition causes CancelledKeyException in TAsyncClientManager
> ------------------------------------------------------------------
>
>                 Key: THRIFT-836
>                 URL: https://issues.apache.org/jira/browse/THRIFT-836
>             Project: Thrift
>          Issue Type: Bug
>          Components: Compiler (Java)
>            Reporter: Ning Liang
>            Assignee: Bryan Duxbury
>
> Currently, TAsyncClientMethod cancels its selection key on a successful method call. The current model assumes the key goes away and the client gets a new key on the next Selector registration, which is incorrect - the cancelled key can hang around, with the implication that the developer needs to check key.isValid() on the next selector action, which TAsyncClientMethod currently does. However, if the developer re-uses the client for another call, you can get a CancelledKeyException from the Selector#register call, which crashes the SelectorThread. This occurs once you have sufficient concurrency on the Selector - cancelled key cleanup takes longer, so we hit this condition.
> Attached is a patch with fix and regression test. Summary of fix:
> 1) Don't cancel() the key after a successful method call.
> 2) Catch CancelledKeyException in the SelectorThread, so the manager can't die.
> 3) Add an onError method to TAsyncMethodCall, so that the SelectorThread can notify the client of an error on selector registration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-836) Race condition causes CancelledKeyException in TAsyncClientManager

Posted by "Ning Liang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Liang updated THRIFT-836:
------------------------------

    Attachment: async_thrift_catch_except.diff

Handle ClosedChannelException explicitly, and catch all further runtime errors. These are:

    IllegalBlockingModeException - If this channel is in blocking mode 
    IllegalSelectorException - If this channel was not created by the same provider as the given selector
    IllegalArgumentException - If a bit in ops does not correspond to an operation that is supported by this channel, that is, if set & ~validOps() != 0

> Race condition causes CancelledKeyException in TAsyncClientManager
> ------------------------------------------------------------------
>
>                 Key: THRIFT-836
>                 URL: https://issues.apache.org/jira/browse/THRIFT-836
>             Project: Thrift
>          Issue Type: Bug
>          Components: Compiler (Java)
>            Reporter: Ning Liang
>            Assignee: Bryan Duxbury
>         Attachments: async_thrift.diff, async_thrift_catch_except.diff
>
>
> Currently, TAsyncClientMethod cancels its selection key on a successful method call. The current model assumes the key goes away and the client gets a new key on the next Selector registration, which is incorrect - the cancelled key can hang around, with the implication that the developer needs to check key.isValid() on the next selector action, which TAsyncClientMethod currently does. However, if the developer re-uses the client for another call, you can get a CancelledKeyException from the Selector#register call, which crashes the SelectorThread. This occurs once you have sufficient concurrency on the Selector - cancelled key cleanup takes longer, so we hit this condition.
> Attached is a patch with fix and regression test. Summary of fix:
> 1) Don't cancel() the key after a successful method call.
> 2) Catch CancelledKeyException in the SelectorThread, so the manager can't die.
> 3) Add an onError method to TAsyncMethodCall, so that the SelectorThread can notify the client of an error on selector registration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-836) Race condition causes CancelledKeyException in TAsyncClientManager

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury updated THRIFT-836:
---------------------------------

    Fix Version/s: 0.4

> Race condition causes CancelledKeyException in TAsyncClientManager
> ------------------------------------------------------------------
>
>                 Key: THRIFT-836
>                 URL: https://issues.apache.org/jira/browse/THRIFT-836
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java)
>            Reporter: Ning Liang
>            Assignee: Bryan Duxbury
>            Priority: Blocker
>             Fix For: 0.4
>
>         Attachments: async_client_catch_all.diff, async_thrift.diff, async_thrift_catch_except.diff
>
>
> Currently, TAsyncClientMethod cancels its selection key on a successful method call. The current model assumes the key goes away and the client gets a new key on the next Selector registration, which is incorrect - the cancelled key can hang around, with the implication that the developer needs to check key.isValid() on the next selector action, which TAsyncClientMethod currently does. However, if the developer re-uses the client for another call, you can get a CancelledKeyException from the Selector#register call, which crashes the SelectorThread. This occurs once you have sufficient concurrency on the Selector - cancelled key cleanup takes longer, so we hit this condition.
> Attached is a patch with fix and regression test. Summary of fix:
> 1) Don't cancel() the key after a successful method call.
> 2) Catch CancelledKeyException in the SelectorThread, so the manager can't die.
> 3) Add an onError method to TAsyncMethodCall, so that the SelectorThread can notify the client of an error on selector registration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-836) Race condition causes CancelledKeyException in TAsyncClientManager

Posted by "Ning Liang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Liang updated THRIFT-836:
------------------------------

    Attachment: async_thrift.diff

Summary:

1) Only cancel key on error
2) Catch runtime exceptions in TAsyncClientManager's SelectorThread
3) Notify TAsyncMethodCall of error on bad registration, and propagate to client and user (via error callback)
4) Regression test

> Race condition causes CancelledKeyException in TAsyncClientManager
> ------------------------------------------------------------------
>
>                 Key: THRIFT-836
>                 URL: https://issues.apache.org/jira/browse/THRIFT-836
>             Project: Thrift
>          Issue Type: Bug
>          Components: Compiler (Java)
>            Reporter: Ning Liang
>            Assignee: Bryan Duxbury
>         Attachments: async_thrift.diff
>
>
> Currently, TAsyncClientMethod cancels its selection key on a successful method call. The current model assumes the key goes away and the client gets a new key on the next Selector registration, which is incorrect - the cancelled key can hang around, with the implication that the developer needs to check key.isValid() on the next selector action, which TAsyncClientMethod currently does. However, if the developer re-uses the client for another call, you can get a CancelledKeyException from the Selector#register call, which crashes the SelectorThread. This occurs once you have sufficient concurrency on the Selector - cancelled key cleanup takes longer, so we hit this condition.
> Attached is a patch with fix and regression test. Summary of fix:
> 1) Don't cancel() the key after a successful method call.
> 2) Catch CancelledKeyException in the SelectorThread, so the manager can't die.
> 3) Add an onError method to TAsyncMethodCall, so that the SelectorThread can notify the client of an error on selector registration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-836) Race condition causes CancelledKeyException in TAsyncClientManager

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury updated THRIFT-836:
---------------------------------

    Priority: Blocker  (was: Major)

> Race condition causes CancelledKeyException in TAsyncClientManager
> ------------------------------------------------------------------
>
>                 Key: THRIFT-836
>                 URL: https://issues.apache.org/jira/browse/THRIFT-836
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java)
>            Reporter: Ning Liang
>            Assignee: Bryan Duxbury
>            Priority: Blocker
>             Fix For: 0.4
>
>         Attachments: async_client_catch_all.diff, async_thrift.diff, async_thrift_catch_except.diff
>
>
> Currently, TAsyncClientMethod cancels its selection key on a successful method call. The current model assumes the key goes away and the client gets a new key on the next Selector registration, which is incorrect - the cancelled key can hang around, with the implication that the developer needs to check key.isValid() on the next selector action, which TAsyncClientMethod currently does. However, if the developer re-uses the client for another call, you can get a CancelledKeyException from the Selector#register call, which crashes the SelectorThread. This occurs once you have sufficient concurrency on the Selector - cancelled key cleanup takes longer, so we hit this condition.
> Attached is a patch with fix and regression test. Summary of fix:
> 1) Don't cancel() the key after a successful method call.
> 2) Catch CancelledKeyException in the SelectorThread, so the manager can't die.
> 3) Add an onError method to TAsyncMethodCall, so that the SelectorThread can notify the client of an error on selector registration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Closed: (THRIFT-836) Race condition causes CancelledKeyException in TAsyncClientManager

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury closed THRIFT-836.
--------------------------------

    Resolution: Fixed

I just committed this. Thanks for the patch, Ning!

> Race condition causes CancelledKeyException in TAsyncClientManager
> ------------------------------------------------------------------
>
>                 Key: THRIFT-836
>                 URL: https://issues.apache.org/jira/browse/THRIFT-836
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (Java)
>            Reporter: Ning Liang
>            Assignee: Bryan Duxbury
>            Priority: Blocker
>             Fix For: 0.4
>
>         Attachments: async_client_catch_all.diff, async_thrift.diff, async_thrift_catch_except.diff
>
>
> Currently, TAsyncClientMethod cancels its selection key on a successful method call. The current model assumes the key goes away and the client gets a new key on the next Selector registration, which is incorrect - the cancelled key can hang around, with the implication that the developer needs to check key.isValid() on the next selector action, which TAsyncClientMethod currently does. However, if the developer re-uses the client for another call, you can get a CancelledKeyException from the Selector#register call, which crashes the SelectorThread. This occurs once you have sufficient concurrency on the Selector - cancelled key cleanup takes longer, so we hit this condition.
> Attached is a patch with fix and regression test. Summary of fix:
> 1) Don't cancel() the key after a successful method call.
> 2) Catch CancelledKeyException in the SelectorThread, so the manager can't die.
> 3) Add an onError method to TAsyncMethodCall, so that the SelectorThread can notify the client of an error on selector registration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.