You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@activemq.apache.org by rarciere <ro...@peersoftware.com> on 2016/05/10 17:38:06 UTC

Connection sometimes hang dispatching message when interrupted failover transport resumes

We are using ActiveMQ version 5.9.0 and using the Failover Transport with a
single host and suspect that a JMS connection failure
(transportInterrupted)/re-connection (transportResumed) is sometimes causing
a thread which is dispatching a message to the Broker to become stuck/hung. 
Once this occurs no messages can be consumed or dispatched from that
connection.  This is a relatively rare occurrence in production but it does
happen in this one environment from time to time.  The client host where
this occurs is in Hong Kong and the connection is a bit unreliable and does
cause broker disconnects a few times a day.   At the end of this message are
two examples of thread dumps for two different scenarios, but with the same
end result that we are no longer able to dispatch messages to the broker on
a particular connection. 
 
 In “Example 1”, we have a case where the ActiveMQ Task-25 thread appears to
be stuck and blocking our EventDispatcher thread with the dispatcher thread
marked as being in a BLOCKED state.  Note how the task thread appears to be
inside the FailoverTransport.doReconnect function in both examples: 

In “Example 2” we have a case where the ActiveMQ Task-22 thread appears to
be stuck and blocking other threads from sending, though in this example the
other threads are not reported as being in a BLOCKED state.  

 
“org.apache.activemq.transport.tcp.TcpTransport.oneway(TcpTransport.java:176)
org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:304)
org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:286)
org.apache.activemq.transport.WireFormatNegotiator.sendWireFormat(WireFormatNegotiator.java:168)
org.apache.activemq.transport.WireFormatNegotiator.sendWireFormat(WireFormatNegotiator.java:84)
org.apache.activemq.transport.WireFormatNegotiator.start(WireFormatNegotiator.java:74)
org.apache.activemq.transport.failover.FailoverTransport.doReconnect(FailoverTransport.java:1002)”
 
From example 1:
“org.apache.activemq.transport.tcp.TcpTransport.oneway(TcpTransport.java:176)
org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:304)
org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:286)
org.apache.activemq.transport.WireFormatNegotiator.sendWireFormat(WireFormatNegotiator.java:168)
org.apache.activemq.transport.WireFormatNegotiator.sendWireFormat(WireFormatNegotiator.java:84)
org.apache.activemq.transport.WireFormatNegotiator.start(WireFormatNegotiator.java:74)
org.apache.activemq.transport.failover.FailoverTransport.doReconnect(FailoverTransport.java:1002)”
 
Example #2 looks to be very similar to the post from 02/14 08:46am in the
following thread
http://activemq.2283324.n4.nabble.com/Hung-Producer-td4678060.html but I did
not find any follow ups to this post in JIRA.
 
In both cases we saw numerous JMS Transport interruptions/reconnections
messages throughout the day (via TransportListener) logged though we do not
know the exact relation of timing for these errors occurring vs. when the
threads become stuck.  We had to restart the client service to resume
operation for both cases. 
 
When this issue occurs it appears to only affect one of the several
connections that we maintain to the broker from that client.  Our other
connections from the same client are still able to send and receive messages
as normal.   Also noteworthy is that in this environment we are using SSL
transport as the connector using synchronous sends.   The connection URL
looks like:  

failover:(ssl://hostName:61617?socketBufferSize=131072)?jms.useCompression=true&jms.alwaysSyncSend=true&jms.prefetchPolicy.all=30

We plan to retest in production with the latest 5.13.3 Broker some in the
future.

Example 1
====================================================================
3030 	EventDispatcher-PL 	BLOCKED on java.lang.Object@149f2b3 owned by:
ActiveMQ Task-25
Total blocked: 8 Total waited: 280178
Stack trace:
org.apache.activemq.transport.failover.FailoverTransport.oneway(FailoverTransport.java:556)
org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:68)
org.apache.activemq.transport.ResponseCorrelator.asyncRequest(ResponseCorrelator.java:81)
org.apache.activemq.transport.ResponseCorrelator.request(ResponseCorrelator.java:86)
org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1394)
org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1333)
org.apache.activemq.ActiveMQSession.send(ActiveMQSession.java:1811)
org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:289)
org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:224)
org.apache.activemq.ActiveMQMessageProducerSupport.send(ActiveMQMessageProducerSupport.java:300)
com.ci.pl.agent.core.EventDispatcher$BrokerDispatcher.dispatch(EventDispatcher.java:550)
com.ci.pl.agent.core.EventDispatcher$DispatcherContext.dispatch(EventDispatcher.java:337)
com.ci.pl.agent.core.EventDispatcher$DispatchWorker.run(EventDispatcher.java:279)
java.lang.Thread.run(Unknown Source)


139576139576 	ActiveMQ Task-25 	RUNNABLE
Total blocked: 0 Total waited: 1
Stack trace:
java.net.SocketInputStream.socketRead0(Native Method)
java.net.SocketInputStream.socketRead(Unknown Source)
java.net.SocketInputStream.read(Unknown Source)
java.net.SocketInputStream.read(Unknown Source)
sun.security.ssl.InputRecord.readFully(Unknown Source)
sun.security.ssl.InputRecord.read(Unknown Source)
sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source)
sun.security.ssl.SSLSocketImpl.performInitialHandshake(Unknown Source)
sun.security.ssl.SSLSocketImpl.writeRecord(Unknown Source)
sun.security.ssl.AppOutputStream.write(Unknown Source)
org.apache.activemq.transport.tcp.TcpBufferedOutputStream.flush(TcpBufferedOutputStream.java:115)
java.io.DataOutputStream.flush(Unknown Source)
org.apache.activemq.transport.tcp.TcpTransport.oneway(TcpTransport.java:176)
org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:304)
org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:286)
org.apache.activemq.transport.WireFormatNegotiator.sendWireFormat(WireFormatNegotiator.java:168)
org.apache.activemq.transport.WireFormatNegotiator.sendWireFormat(WireFormatNegotiator.java:84)
org.apache.activemq.transport.WireFormatNegotiator.start(WireFormatNegotiator.java:74)
org.apache.activemq.transport.failover.FailoverTransport.doReconnect(FailoverTransport.java:1002)
org.apache.activemq.transport.failover.FailoverTransport$2.iterate(FailoverTransport.java:136)
org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129)
org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47)
java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
java.lang.Thread.run(Unknown Source)


Example 2
=====================================================================

3030 	EventDispatcher-PL 	WAITING on
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@cdb4ee
Total blocked: 33 Total waited: 110108
Stack trace:
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(Unknown Source)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown
Source)
java.util.concurrent.ArrayBlockingQueue.take(Unknown Source)
org.apache.activemq.transport.FutureResponse.getResult(FutureResponse.java:40)
org.apache.activemq.transport.ResponseCorrelator.request(ResponseCorrelator.java:87)
org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1394)
org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1333)
org.apache.activemq.ActiveMQSession.send(ActiveMQSession.java:1811)
org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:289)
org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:224)
org.apache.activemq.ActiveMQMessageProducerSupport.send(ActiveMQMessageProducerSupport.java:300)
com.ci.pl.agent.core.EventDispatcher$BrokerDispatcher.dispatch(EventDispatcher.java:550)
com.ci.pl.agent.core.EventDispatcher$DispatcherContext.dispatch(EventDispatcher.java:337)
com.ci.pl.agent.core.EventDispatcher$DispatchWorker.run(EventDispatcher.java:279)
java.lang.Thread.run(Unknown Source)



5252 	CloseAccessedFile-107-PL 	WAITING on
java.util.concurrent.SynchronousQueue$TransferQueue@6e76e5
Total blocked: 311 Total waited: 78110
Stack trace:
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(Unknown Source)
java.util.concurrent.SynchronousQueue$TransferQueue.awaitFulfill(Unknown
Source)
java.util.concurrent.SynchronousQueue$TransferQueue.transfer(Unknown Source)
java.util.concurrent.SynchronousQueue.put(Unknown Source)
com.ci.pl.agent.core.EventDispatcher.dispatch(EventDispatcher.java:148)
com.ci.pl.agent.AgentServices.dispatch(AgentServices.java:327)
com.ci.pl.instruction.CloseAccessedFile.perform(CloseAccessedFile.java:372)
com.ci.pl.instruction.Command.run(Command.java:137)
com.ci.pl.agent.core.CommandInvoker$InvocationWorker.run(CommandInvoker.java:698)
java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
java.util.concurrent.FutureTask.run(Unknown Source)
java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
java.lang.Thread.run(Unknown Source)


130130 	RebuildFile-107-PL 	WAITING on
java.util.concurrent.SynchronousQueue$TransferQueue@6e76e5
Total blocked: 272 Total waited: 20511
Stack trace:
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(Unknown Source)
java.util.concurrent.SynchronousQueue$TransferQueue.awaitFulfill(Unknown
Source)
java.util.concurrent.SynchronousQueue$TransferQueue.transfer(Unknown Source)
java.util.concurrent.SynchronousQueue.put(Unknown Source)
com.ci.pl.agent.core.EventDispatcher.dispatch(EventDispatcher.java:148)
com.ci.pl.agent.AgentServices.dispatch(AgentServices.java:327)
com.ci.pl.instruction.RebuildFile.perform(RebuildFile.java:441)
com.ci.pl.instruction.Command.run(Command.java:137)
com.ci.pl.agent.core.CommandInvoker$InvocationWorker.run(CommandInvoker.java:698)
java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
java.util.concurrent.FutureTask.run(Unknown Source)
java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
java.lang.Thread.run(Unknown Source)

143143 	TransferAccessedFile-107-PL 	WAITING on
java.util.concurrent.SynchronousQueue$TransferQueue@6e76e5
Total blocked: 208 Total waited: 20462
Stack trace:
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(Unknown Source)
java.util.concurrent.SynchronousQueue$TransferQueue.awaitFulfill(Unknown
Source)
java.util.concurrent.SynchronousQueue$TransferQueue.transfer(Unknown Source)
java.util.concurrent.SynchronousQueue.put(Unknown Source)
com.ci.pl.agent.core.EventDispatcher.dispatch(EventDispatcher.java:148)
com.ci.pl.agent.AgentServices.dispatch(AgentServices.java:327)
com.ci.pl.instruction.TransferAccessedFile.perform(TransferAccessedFile.java:652)
com.ci.pl.instruction.Command.run(Command.java:137)
com.ci.pl.agent.core.CommandInvoker$InvocationWorker.run(CommandInvoker.java:698)
java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
java.util.concurrent.FutureTask.run(Unknown Source)
java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
java.lang.Thread.run(Unknown Source)

7313373133 	ActiveMQ Task-22 	RUNNABLE
Total blocked: 0 Total waited: 1
Stack trace:
java.net.SocketInputStream.socketRead0(Native Method)
java.net.SocketInputStream.socketRead(Unknown Source)
java.net.SocketInputStream.read(Unknown Source)
java.net.SocketInputStream.read(Unknown Source)
sun.security.ssl.InputRecord.readFully(Unknown Source)
sun.security.ssl.InputRecord.read(Unknown Source)
sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source)
sun.security.ssl.SSLSocketImpl.performInitialHandshake(Unknown Source)
sun.security.ssl.SSLSocketImpl.writeRecord(Unknown Source)
sun.security.ssl.AppOutputStream.write(Unknown Source)
org.apache.activemq.transport.tcp.TcpBufferedOutputStream.flush(TcpBufferedOutputStream.java:115)
java.io.DataOutputStream.flush(Unknown Source)
org.apache.activemq.transport.tcp.TcpTransport.oneway(TcpTransport.java:176)
org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:304)
org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:286)
org.apache.activemq.transport.WireFormatNegotiator.sendWireFormat(WireFormatNegotiator.java:168)
org.apache.activemq.transport.WireFormatNegotiator.sendWireFormat(WireFormatNegotiator.java:84)
org.apache.activemq.transport.WireFormatNegotiator.start(WireFormatNegotiator.java:74)
org.apache.activemq.transport.failover.FailoverTransport.doReconnect(FailoverTransport.java:1002)
org.apache.activemq.transport.failover.FailoverTransport$2.iterate(FailoverTransport.java:136)
org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129)
org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47)
java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
java.lang.Thread.run(Unknown Source)


Any help or suggestions as to what may be happening would be appreciated?

Thanks,

Rob 




--
View this message in context: http://activemq.2283324.n4.nabble.com/Connection-sometimes-hang-dispatching-message-when-interrupted-failover-transport-resumes-tp4711802.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.