Posted to dev@activemq.apache.org by "Daniel Ellis (JIRA)" <ji...@apache.org> on 2010/11/04 10:13:00 UTC
[jira] Commented: (AMQNET-289) Deadlock while sending a message after failover within a consumer
[ https://issues.apache.org/activemq/browse/AMQNET-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=63014#action_63014 ]
Daniel Ellis commented on AMQNET-289:
-------------------------------------
We have also experienced this issue. Here is the deadlock that occurred on our system:
Thread 1 - lock(reconnectMutex) (FailoverTransport.cs:line 366)
Thread 1 - wait on lock(this.consumers.SyncRoot) (Session.cs:line 830)
Thread 2 - lock(this.consumers.SyncRoot) (SessionExecutor.cs:line 147)
Thread 2 - wait on lock(transmissionLock) (MutexTransport.cs:line 35)
Thread 3 - lock(transmissionLock) (MutexTransport.cs:line 35)
Thread 3 - wait on lock(reconnectMutex) (FailoverTransport.cs:line 531)
T1 waiting on T2 waiting on T3 waiting on T1.
The change Morgan made has resolved it for us also. Thanks Morgan!
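
For anyone who wants to sanity-check the trace above, here is a minimal sketch that models it as a wait-for graph and walks it until it returns to a thread it has already visited. The thread labels and lock names are copied from the trace; the class and method names (WaitForGraph, FindCycle) are made up for illustration and are not part of NMS.

```csharp
using System;
using System.Collections.Generic;

public static class WaitForGraph
{
    // thread -> (lock it holds, lock it is blocked on), copied from the trace above.
    static readonly Dictionary<string, (string Holds, string WaitsFor)> Threads =
        new Dictionary<string, (string Holds, string WaitsFor)>
        {
            ["T1"] = ("reconnectMutex", "consumers.SyncRoot"),
            ["T2"] = ("consumers.SyncRoot", "transmissionLock"),
            ["T3"] = ("transmissionLock", "reconnectMutex"),
        };

    // Follows "is blocked on a lock held by" edges starting at 'start'.
    // Returns the cycle as a list of threads, or null if the chain ends.
    public static List<string> FindCycle(string start)
    {
        // Map each lock to the thread that currently holds it.
        var owner = new Dictionary<string, string>();
        foreach (var kv in Threads)
        {
            owner[kv.Value.Holds] = kv.Key;
        }

        var path = new List<string>();
        string current = start;
        while (current != null && !path.Contains(current))
        {
            path.Add(current);
            // current becomes null when nobody holds the lock being waited on.
            owner.TryGetValue(Threads[current].WaitsFor, out current);
        }

        if (current == null)
        {
            return null;
        }

        // Keep only the part of the path that forms the loop, then close it.
        int first = path.IndexOf(current);
        var cycle = path.GetRange(first, path.Count - first);
        cycle.Add(current);
        return cycle;
    }

    public static void Main()
    {
        var cycle = FindCycle("T1");
        Console.WriteLine(cycle == null ? "no cycle" : string.Join(" -> ", cycle));
        // prints "T1 -> T2 -> T3 -> T1"
    }
}
```

Removing any one edge breaks the loop, which is consistent with the fix working: once the consumer dispatch no longer runs under this.consumers.SyncRoot, Thread 2 does not hold that lock while waiting on transmissionLock.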
> Deadlock while sending a message after failover within a consumer
> -----------------------------------------------------------------
>
> Key: AMQNET-289
> URL: https://issues.apache.org/activemq/browse/AMQNET-289
> Project: ActiveMQ .Net
> Issue Type: Bug
> Components: ActiveMQ
> Affects Versions: 1.4.1
> Environment: Windows 7 64 bits
> Reporter: Morgan Martinet
> Assignee: Jim Gomes
> Priority: Critical
> Fix For: 1.5.0
>
> Attachments: deadlock.jpg, SessionExecutor.cs
>
>
> Scenario:
> - I have one producer that sends a request (with a temporary queue specified in the Reply-To attribute) to a consumer in a separate process.
> - both the producer and the consumer use the following connection string: failover:(tcp://localhost:61616)?timeout=3000
> - the consumer, when processing the request, waits 10 seconds then sends a response back, using the Reply-To attribute.
> - immediately after the message has been sent, while the consumer is waiting for 10 secs, I restart the ActiveMQ broker.
> - once the consumer wakes up and tries to send its reply, it will deadlock because of the failover.
> We have managed to identify the resources that deadlock:
> Thread1 - lock(reconnectMutex) (c:\Temp\Apache\NMS.ActiveMQ\1.4.1\src\main\csharp\Transport\Failover\FailoverTransport.cs: line 366)
> Thread1 - wait on lock(this.consumers.SyncRoot) (c:\Temp\Apache\NMS.ActiveMQ\1.4.1\src\main\csharp\Session.cs: line 830)
> Thread2 - lock(this.consumers.SyncRoot) (c:\Temp\Apache\NMS.ActiveMQ\1.4.1\src\main\csharp\SessionExecutor.cs: line 147)
> Thread2 - wait on lock(reconnectMutex) (c:\Temp\Apache\NMS.ActiveMQ\1.4.1\src\main\csharp\Transport\Failover\FailoverTransport.cs: line 531)
> Patch:
> I managed to find a simple fix for this, by moving the consumer dispatch out of the this.consumers.SyncRoot lock in SessionExecutor.cs:
> {{
> public void Dispatch(MessageDispatch dispatch)
> {
>     try
>     {
>         MessageConsumer consumer = null;
>         lock(this.consumers.SyncRoot)
>         {
>             if(this.consumers.Contains(dispatch.ConsumerId))
>             {
>                 consumer = this.consumers[dispatch.ConsumerId] as MessageConsumer;
>             }
>             // Note that consumer.Dispatch(...) was moved below, outside of the lock.
>         }
>         // If the consumer is not available, just ignore the message.
>         // Otherwise, dispatch the message to the consumer.
>         if(consumer != null)
>         {
>             consumer.Dispatch(dispatch);
>         }
>     }
>     catch(Exception ex)
>     {
>         Tracer.DebugFormat("Caught Exception While Dispatching: {0}", ex.Message);
>     }
> }
> }}
> Note that I ran the unit tests before my patch and I got 3 failures. Then I got the same failures with my patch. So, I hope it didn't break anything but I'll let you find the best solution...
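
For readers without the NMS source to hand, the idea behind the patch can be sketched in isolation: hold the lock only for the lookup, then invoke the callback after the lock is released. Everything below (HandlerRegistry, Register, IsLockedByCurrentThread, Demo) is a hypothetical stand-in written for this illustration, not the NMS API, and it assumes a runtime where Monitor.IsEntered is available (.NET 4.5 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for a session's consumer table.
public class HandlerRegistry
{
    private readonly object syncRoot = new object();
    private readonly Dictionary<string, Action<string>> handlers =
        new Dictionary<string, Action<string>>();

    public void Register(string id, Action<string> handler)
    {
        lock (syncRoot)
        {
            handlers[id] = handler;
        }
    }

    // True if the calling thread currently holds the registry lock.
    public bool IsLockedByCurrentThread
    {
        get { return Monitor.IsEntered(syncRoot); }
    }

    // Returns false if no handler is registered for the id.
    public bool Dispatch(string id, string message)
    {
        Action<string> handler;
        lock (syncRoot)
        {
            // Only the lookup happens under the lock.
            handlers.TryGetValue(id, out handler);
        }

        if (handler == null)
        {
            return false; // unknown id: ignore the message
        }

        // The callback runs with the lock released, so it can take other
        // locks without this lock being part of a lock-ordering cycle.
        handler(message);
        return true;
    }
}

public static class Demo
{
    public static void Main()
    {
        var registry = new HandlerRegistry();
        registry.Register("c1", msg =>
            Console.WriteLine(msg + ": lock held during callback = "
                + registry.IsLockedByCurrentThread));

        registry.Dispatch("c1", "reply");        // prints "reply: lock held during callback = False"
        registry.Dispatch("missing", "ignored"); // no handler, nothing printed
    }
}
```

The trade-off is that a handler can be removed from the table between the lookup and the call, so the callee has to tolerate being invoked shortly after it was unregistered.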
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.