Posted to issues@activemq.apache.org by "Rico Neubauer (Jira)" <ji...@apache.org> on 2020/01/03 13:48:00 UTC

[jira] [Created] (ARTEMIS-2586) Infinite Block in AMQ212054 after transient DB-error

Rico Neubauer created ARTEMIS-2586:
--------------------------------------

             Summary: Infinite Block in AMQ212054 after transient DB-error
                 Key: ARTEMIS-2586
                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2586
             Project: ActiveMQ Artemis
          Issue Type: Bug
          Components: AMQP
    Affects Versions: 2.10.1
         Environment: This is Ubuntu 18.04 and Oracle DB, but I don't think it's that relevant for the issue.
            Reporter: Rico Neubauer
         Attachments: initial-error.txt

Hi,

I would like to describe a quite severe situation that was experienced in a long-running test on 2 out of 3 instances/machines.

We are running Karaf with Artemis 2.10.1.

After some time (see screenshot), first one instance and then, after a while, a second one came to a complete stop.

Looking into the logs and thread dumps revealed the following (same for both instances):
 # There was a temporary problem connecting to the DB ({{connection reset by peer}} and {{Closed Connection}})
 # This resulted (due to handling on our side) in an {{IllegalStateException}}/{{Error during two phase commit}} being thrown back to Artemis.
 # After this, no messaging is possible at all anymore, and the following log message repeats:
{noformat}
AMQ212054: Destination address=DLQ is blocked. If the system is configured to block make sure you consume messages on this configuration.{noformat}
which comes from threads like these, trying to obtain credits for sending:

 
{noformat}
"Thread-93 (ActiveMQ-client-global-threads)" Id=2001 in TIMED_WAITING on lock=java.util.concurrent.Semaphore$NonfairSync@1f9a57e0
 at sun.misc.Unsafe.park(Native Method)
 at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
 at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1039)
 at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1332)
 at java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:582)
 at org.apache.activemq.artemis.core.client.impl.ClientProducerCreditsImpl.actualAcquire(ClientProducerCreditsImpl.java:73)
 at org.apache.activemq.artemis.core.client.impl.AbstractProducerCreditsImpl.acquireCredits(AbstractProducerCreditsImpl.java:77)
 at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.sendRegularMessage(ClientProducerImpl.java:301)
 at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.doSend(ClientProducerImpl.java:275)
 at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.send(ClientProducerImpl.java:128)
 at org.apache.activemq.artemis.jms.client.ActiveMQMessageProducer.doSendx(ActiveMQMessageProducer.java:485)
 at org.apache.activemq.artemis.jms.client.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:195)
 at com.seeburger.engine.jms.MessageReceiverBase.sendToDLQ(MessageReceiverBase.java:571)
 at com.seeburger.engine.jms.MessageReceiverBase.handleException(MessageReceiverBase.java:493)
 at com.seeburger.engine.jms.MessageReceiverBase.onMessage(MessageReceiverBase.java:387)
 at org.apache.activemq.artemis.jms.client.JMSMessageListenerWrapper.onMessage(JMSMessageListenerWrapper.java:110)
 at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:1031)
 at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.access$400(ClientConsumerImpl.java:50)
 at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:1154)
 at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42)
 at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31)
 at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:66)
 at org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$431/1769898766.run(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
Locked synchronizers: count = 1
 - java.util.concurrent.ThreadPoolExecutor$Worker@bc49fcf
{noformat}
which will never succeed, since the credits seem not to suffice (see heap-dump screenshot).
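
As a rough illustration of the wait pattern visible in the stack trace above, here is a minimal sketch of a semaphore-backed credit pool. It is not the Artemis implementation: the class {{CreditGate}}, its methods, and the {{maxRounds}} bound are hypothetical and exist only to keep the example finite. The only thing taken from the trace is that the sender waits in a timed {{Semaphore.tryAcquire}}; the sketch assumes that each timed-out wait produces the "is blocked" warning and that nothing returns credits to the pool after the failure.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a producer credit pool; NOT the Artemis code. */
public class CreditGate {
    private final Semaphore credits;
    private final String address;
    private int blockedWarnings = 0;

    public CreditGate(String address, int initialCredits) {
        this.address = address;
        this.credits = new Semaphore(initialCredits);
    }

    /** Simulates the broker granting more credits to the producer. */
    public void receiveCredits(int n) {
        credits.release(n);
    }

    /**
     * Waits for the needed credits in rounds of waitMillis and logs a warning
     * after every round that times out. maxRounds is an assumption made for
     * this demo; with an unbounded loop and no replenishment the caller
     * would stay in TIMED_WAITING indefinitely.
     */
    public boolean acquire(int needed, long waitMillis, int maxRounds)
            throws InterruptedException {
        for (int round = 1; round <= maxRounds; round++) {
            if (credits.tryAcquire(needed, waitMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
            blockedWarnings++;
            System.out.println("Destination address=" + address
                    + " is blocked (round " + round + ")");
        }
        return false;
    }

    public int blockedWarnings() {
        return blockedWarnings;
    }

    public static void main(String[] args) throws InterruptedException {
        CreditGate dlq = new CreditGate("DLQ", 0);
        // No credits and nobody replenishing: every round times out.
        boolean sent = dlq.acquire(100, 50, 3);
        System.out.println("acquired=" + sent + ", warnings=" + dlq.blockedWarnings());
        // Once credits arrive, the same call goes through.
        dlq.receiveCredits(100);
        System.out.println("after replenish acquired=" + dlq.acquire(100, 50, 3));
    }
}
```

In this sketch the sender only recovers once {{receiveCredits}} is called. If that is also what happens in the broken instances, the interesting question is why credits for the DLQ address are never granted again after the failed commit.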

From my point of view, the thrown IllegalStateException should not put the system into this non-recoverable state. What do you think: is there something that can be enhanced?

 

[Fastthread-Link|https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMjAvMDEvMy8tLTIwMTktMTItMDRfdGhyZWFkZHVtcF8wMS50eHQtLTEzLTM4LTE1OzstLTIwMTktMTEtMjhfdGhyZWFkZHVtcF8wMS50eHQtLTEzLTM4LTE1]

In case it helps: The 2 instances are still in this state (since September) and I can fetch additional information or debug them on request.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)