Posted to dev@avro.apache.org by "Bruno Dumon (Created) (JIRA)" <ji...@apache.org> on 2011/12/15 15:24:32 UTC

[jira] [Created] (AVRO-982) NettyTransceiver: can hang on connection interruption

NettyTransceiver: can hang on connection interruption
-----------------------------------------------------

                 Key: AVRO-982
                 URL: https://issues.apache.org/jira/browse/AVRO-982
             Project: Avro
          Issue Type: Improvement
          Components: java
            Reporter: Bruno Dumon
            Priority: Minor


When I stopped my Avro server, I noticed that my Avro client hung. The client thread blocks inside the Avro code, so the client cannot retry the operation:

{noformat}
"pool-2-thread-1" prio=10 tid=0x00007fc66840e800 nid=0x75fc waiting on condition [0x00007fc674176000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000007d7471bd0> (a java.util.concurrent.CountDownLatch$Sync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
        at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:207)
        at org.apache.avro.ipc.CallFuture.get(CallFuture.java:116)
        at org.apache.avro.ipc.Requestor.request(Requestor.java:106)
        at org.apache.avro.ipc.specific.SpecificRequestor.invoke(SpecificRequestor.java:72)
{noformat}

NettyTransceiver already cancels the pending requests in a similar situation (in the exceptionCaught method). It seems appropriate to do the same when the connection is closed. I'll attach a patch.
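
The idea can be illustrated with a minimal, self-contained sketch. This is not the attached patch and does not use Avro's or Netty's classes: PendingRequests, send, onResponse and onDisconnect are hypothetical names, and CompletableFuture (from a modern JDK) stands in for Avro's CallFuture. The point is only that a disconnect handler should complete every outstanding request exceptionally, so that no caller stays parked on a latch that will never be released.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a transceiver; not Avro's NettyTransceiver. */
class PendingRequests {
    // Requests that were sent but not yet answered, keyed by serial number.
    private final Map<Integer, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    /** Registers a request and returns the future the caller will block on. */
    CompletableFuture<String> send(int serial) {
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(serial, future);
        return future;
    }

    /** Called when a response arrives for the given serial number. */
    void onResponse(int serial, String response) {
        CompletableFuture<String> future = pending.remove(serial);
        if (future != null) {
            future.complete(response);
        }
    }

    /** Called when the connection closes: fail every outstanding request. */
    void onDisconnect() {
        IOException cause = new IOException("connection closed");
        // ConcurrentHashMap iterators are weakly consistent, so removing
        // entries while iterating over the key set is safe.
        for (Integer serial : pending.keySet()) {
            CompletableFuture<String> future = pending.remove(serial);
            if (future != null) {
                future.completeExceptionally(cause);
            }
        }
    }

    /** Number of requests still waiting for a response. */
    int pendingCount() {
        return pending.size();
    }
}
```

With this shape, a caller blocked in future.get() wakes up with an ExecutionException as soon as onDisconnect() runs, and can then decide whether to retry.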

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (AVRO-982) NettyTransceiver: can hang on connection interruption

Posted by "Bruno Dumon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170943#comment-13170943 ] 

Bruno Dumon commented on AVRO-982:
----------------------------------

I've attached a test (the patch contains a few whitespace-only line changes that I wasn't able to avoid).

Without the patch applied, the test will hang for 10 seconds and then fail with the message "Client request should not be blocked on server shutdown".
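
For readers without the attachment, the general shape of such a check can be sketched as follows. This is an assumption about the technique, not the contents of AVRO-982-testcase.patch: BlockedCallCheck and finishesWithin are hypothetical names, and the blocking client call is represented by a plain Runnable. The call runs on a separate thread, and the check reports whether it finished (by returning or by throwing) within a timeout.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper; not part of Avro or of the attached test. */
class BlockedCallCheck {
    /**
     * Runs the call on a daemon thread and returns true if it finished
     * within the timeout, whether it returned normally or threw.
     */
    static boolean finishesWithin(Runnable call, long timeout, TimeUnit unit) {
        CountDownLatch done = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                call.run();
            } catch (RuntimeException e) {
                // A failure still means the caller was not left blocked.
            } finally {
                done.countDown();
            }
        });
        worker.setDaemon(true); // a stuck call must not keep the JVM alive
        worker.start();
        try {
            return done.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A test could then stop the server while a request is in flight and fail with a message such as "Client request should not be blocked on server shutdown" when finishesWithin returns false.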
                

        

[jira] [Resolved] (AVRO-982) NettyTransceiver: can hang on connection interruption

Posted by "Doug Cutting (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting resolved AVRO-982.
-------------------------------

       Resolution: Fixed
    Fix Version/s: 1.6.2
         Assignee: Bruno Dumon
     Hadoop Flags: Reviewed

The test looks great.  I used diff's -w flag to remove the spurious whitespace changes.

I committed this.

Thanks, Bruno!
                

        

[jira] [Commented] (AVRO-982) NettyTransceiver: can hang on connection interruption

Posted by "Doug Cutting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170626#comment-13170626 ] 

Doug Cutting commented on AVRO-982:
-----------------------------------

It would be great to have a test that this fixes.  I tried some simple changes to TestNettyServerWithCallbacks to reproduce the problem and could not.  Can you devise a test?
                

        

[jira] [Updated] (AVRO-982) NettyTransceiver: can hang on connection interruption

Posted by "Bruno Dumon (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruno Dumon updated AVRO-982:
-----------------------------

    Attachment: AVRO-982.patch
    

        

[jira] [Commented] (AVRO-982) NettyTransceiver: can hang on connection interruption

Posted by "Scott Carey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171308#comment-13171308 ] 

Scott Carey commented on AVRO-982:
----------------------------------

+1  Simple fix, great test.
                

        

[jira] [Updated] (AVRO-982) NettyTransceiver: can hang on connection interruption

Posted by "Bruno Dumon (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruno Dumon updated AVRO-982:
-----------------------------

    Attachment: AVRO-982-testcase.patch
    