Posted to dev@activemq.apache.org by "Kevin Yaussy (JIRA)" <ji...@apache.org> on 2006/06/22 22:03:51 UTC

[jira] Created: (AMQ-771) org.apache.activemq.broker.TransportConnection::stop should not attempt to send a message over the connection.

org.apache.activemq.broker.TransportConnection::stop should not attempt to send a message over the connection.
--------------------------------------------------------------------------------------------------------------

         Key: AMQ-771
         URL: https://issues.apache.org/activemq/browse/AMQ-771
     Project: ActiveMQ
        Type: Bug

  Components: Connector  
    Versions: 4.0.1, 4.0    
    Reporter: Kevin Yaussy


Especially when using "failover", there can be a problem with TransportConnection::stop attempting to send a "shutdown" message over the connection.  If another thread is sending messages to the connection and it gets stuck (for example because of a network freeze, a panic on the target machine, or a frozen target process), TransportConnection::dispatch will eventually block while holding the lock on the MutexTransport object.  When the InactivityMonitor wakes up and detects that the connection is dead, it goes through the process of stopping the connection.  This goes back into TransportConnection and calls stop, which attempts to lock the MutexTransport so it can send the "shutdown" command.  Now both threads are stuck, potentially for a long time, because a box panic will not cleanly close the TCP connection.

I'm not sure of the rationale for sending a shutdown command to the other side of the connection, since the target has to handle the connection going down hard anyway.  It seems to me that if you intend to close the connection, you should just close it - don't try to be nice to the other side.  That applies especially in this code path, since there is already something wrong with the other side.
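
The hang described above can be reproduced in miniature without any ActiveMQ classes. The following is only a sketch under stated assumptions: a plain ReentrantLock stands in for whatever lock MutexTransport takes around a send, a CountDownLatch stands in for the socket, and the names StopHangSketch, startStuckDispatch, politeStop and hardStop are hypothetical. It shows that a stop which must take the write lock cannot proceed while a writer is stuck, whereas a stop that simply closes the socket releases the stuck writer.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class StopHangSketch {
    // Stand-in for the lock taken around each send (hypothetical, not ActiveMQ code).
    static final ReentrantLock writeLock = new ReentrantLock();

    /** Simulates dispatch(): takes the write lock, then blocks as if the socket buffer were full. */
    static Thread startStuckDispatch(CountDownLatch lockHeld, CountDownLatch socketClosed) {
        Thread t = new Thread(() -> {
            writeLock.lock();
            try {
                lockHeld.countDown();
                socketClosed.await(); // stands in for a write() that never returns
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                writeLock.unlock();
            }
        }, "dispatch");
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Polite stop: needs the write lock before it can send "shutdown". Returns false on timeout. */
    static boolean politeStop(long timeoutMillis) throws InterruptedException {
        if (!writeLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return false; // without a timeout, the caller would wait here indefinitely
        }
        try {
            return true; // the shutdown command would be sent here
        } finally {
            writeLock.unlock();
        }
    }

    /** Hard stop: closes the simulated socket without touching the write lock. */
    static void hardStop(CountDownLatch socketClosed) {
        socketClosed.countDown();
    }

    public static void main(String[] args) throws Exception {
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch socketClosed = new CountDownLatch(1);
        Thread dispatch = startStuckDispatch(lockHeld, socketClosed);
        lockHeld.await(); // make sure the writer owns the lock before trying to stop

        System.out.println("polite stop succeeded: " + politeStop(200));
        hardStop(socketClosed);
        dispatch.join(1000);
        System.out.println("dispatch still alive: " + dispatch.isAlive());
    }
}
```

The timeout in politeStop exists only so the sketch terminates; the point is that closing the underlying socket is the one action that does not depend on the lock held by the stuck writer.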


-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   https://issues.apache.org/activemq/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Assigned: (AMQ-771) org.apache.activemq.broker.TransportConnection::stop should not attempt to send a message over the connection.

Posted by "james strachan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-771?page=all ]

james strachan reassigned AMQ-771:
----------------------------------

    Assignee: Rob Davies



[jira] Commented: (AMQ-771) org.apache.activemq.broker.TransportConnection::stop should not attempt to send a message over the connection.

Posted by "Kevin Yaussy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-771?page=comments#action_36777 ] 
            
Kevin Yaussy commented on AMQ-771:
----------------------------------

Rob,

This issue is a bit more complex than I've noted above.  I've been out for a week, but prior to leaving I got a version (based upon 4.0.1) working.  I will submit comments and patches sometime today, hopefully.




[jira] Commented: (AMQ-771) org.apache.activemq.broker.TransportConnection::stop should not attempt to send a message over the connection.

Posted by "Kevin Yaussy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-771?page=comments#action_36794 ] 
            
Kevin Yaussy commented on AMQ-771:
----------------------------------

Rob,

The fixes I have for the 4.0.1 release for this issue rely upon the fix I made to DemandForwardingBridgeSupport for issue [ https://issues.apache.org/activemq/browse/AMQ-776?page=all ].

However, as I reported in AMQ-776, the 4.0.1 fix does not work on 4.0.2.  I will have to track down what is wrong there and get AMQ-776 fixed for 4.0.2 first.

At any rate, I should describe this issue better:

Not only is there an issue with TransportConnection::stop, wherein it attempts to send something on the socket before closing, there are also problems in general with the fact that FailoverTransport is decorated by MutexTransport.  When publishing to a consumer across two brokers, if the consumer-side broker is frozen and the socket fills up, the FailoverTransport (InactivityMonitor) attempts to close down the connection.  This fails, because everything is blocked around the MutexTransport.  See the scenario list below for how to recreate the problem.

The changes I made are rather surgical, in order to make it work.  I wasn't particularly happy with them, but maybe they are acceptable.  I will attach patches as soon as I get AMQ-776 working.  But, the changed source files include:
org.apache.activemq.transport.MutexTransport
org.apache.activemq.transport.failover.FailoverTransport
org.apache.activemq.transport.tcp.TcpTransport
org.apache.activemq.broker.TransportConnection

The changes were not tested against all unit tests, so similar changes may be required in other files (e.g. for transports other than TcpTransport).


Scenario (using ConsumerTool and ProducerTool from examples):
-Start broker A
-Start broker B
-Start consumer, on FOO, attaching to broker B (failover transport, only broker B)
-Start publisher, on FOO, publishing large messages, such as 10K bytes, attaching to broker A (failover transport, only broker A)
-On Solaris, pstop broker B

Wait for the socket to fill up, and then when broker A reports the dead connection, notice that it does not close off the connection properly.  Do a kill -3 on broker A and note in the thread dump that it is waiting on the MutexTransport lock and that FailoverTransport can't close off the connection.
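
For anyone who cannot send kill -3 to the broker process, roughly the same information can be pulled from inside the JVM with the standard java.lang.management API. The sketch below is not part of any patch for this issue; the class name BlockedThreadReport and the thread names "dispatch" and "stop" are made up for illustration, and a plain Object monitor stands in for the transport lock. It assumes a JDK recent enough for lambdas and ThreadMXBean.dumpAllThreads.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class BlockedThreadReport {
    /** Returns one line per thread that is BLOCKED on a monitor, naming the lock and its owner. */
    static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                report.add(info.getThreadName() + " waits for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return report;
    }

    public static void main(String[] args) throws Exception {
        final Object mutex = new Object(); // stand-in for the transport lock (assumption)
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        // "dispatch" takes the monitor and then stalls, like a write on a full socket.
        Thread holder = new Thread(() -> {
            synchronized (mutex) {
                held.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "dispatch");

        // "stop" needs the same monitor before it could send anything.
        Thread waiter = new Thread(() -> {
            synchronized (mutex) {
                // the shutdown command would be sent here
            }
        }, "stop");

        holder.setDaemon(true);
        waiter.setDaemon(true);
        holder.start();
        held.await();
        waiter.start();
        while (waiter.getState() != Thread.State.BLOCKED) {
            Thread.sleep(10); // wait until "stop" is really parked on the monitor
        }

        blockedThreads().forEach(System.out::println);
        release.countDown(); // let both threads finish
    }
}
```

In a real broker the interesting lines would be the ones where the lock owner is itself sitting in a socket write, which is the situation the scenario above produces.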


