Posted to issues@activemq.apache.org by "Clebert Suconic (Jira)" <ji...@apache.org> on 2021/02/09 21:25:07 UTC
[jira] [Closed] (ARTEMIS-3037) JournalImpl#checkKnownRecordID()
implementation can leave a thread hanging in WAITING state
[ https://issues.apache.org/jira/browse/ARTEMIS-3037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Clebert Suconic closed ARTEMIS-3037.
------------------------------------
> JournalImpl#checkKnownRecordID() implementation can leave a thread hanging in WAITING state
> -------------------------------------------------------------------------------------------
>
> Key: ARTEMIS-3037
> URL: https://issues.apache.org/jira/browse/ARTEMIS-3037
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Components: Broker
> Affects Versions: 2.9.0, 2.16.0
> Reporter: Tomas Hofman
> Priority: Major
> Fix For: 2.17.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> The {{JournalImpl#checkKnownRecordID()}} implementation contains the following code:
> {code}
> final SimpleFuture<Boolean> known = new SimpleFutureImpl<>();
> // retry on the append thread. maybe the appender thread is not keeping up.
> appendExecutor.execute(new Runnable() {
>    @Override
>    public void run() {
>       journalLock.readLock().lock();
>       try {
>          known.set(records.containsKey(id)
>             || pendingRecords.contains(id)
>             || (compactor != null && compactor.containsRecord(id)));
>       } finally {
>          journalLock.readLock().unlock();
>       }
>    }
> });
> if (!known.get()) {
>    ...
> }
> {code}
> If the code in the Runnable throws an exception before the {{known}} future value is set, the calling thread is left in the WAITING state forever, because {{known.get()}} never returns. Exception handling should be added that cancels the future if an exception occurs.
> We've observed cases where the following threads were left hanging while no other threads were operating inside JournalImpl. I believe the {{JournalImpl#checkKnownRecordID()}} implementation may be responsible:
> {code}
> "Thread-16 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@423fe5c3)" #1078 prio=5 os_prio=64 tid=0x000000011c34a000 nid=0x4eb waiting on condition [0xfffffffabe9ad000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0xfffffffbe73c29e8> (a java.util.concurrent.CountDownLatch$Sync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
> at org.apache.activemq.artemis.utils.SimpleFutureImpl.get(SimpleFutureImpl.java:62)
> at org.apache.activemq.artemis.core.journal.impl.JournalImpl.checkKnownRecordID(JournalImpl.java:1080)
> at org.apache.activemq.artemis.core.journal.impl.JournalImpl.appendDeleteRecord(JournalImpl.java:950)
> at org.apache.activemq.artemis.core.persistence.impl.journal.AbstractJournalStorageManager.confirmPendingLargeMessage(AbstractJournalStorageManager.java:361)
> at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.confirmLargeMessageSend(PostOfficeImpl.java:1390)
> - locked <0xfffffffbe73aa1b0> (a org.apache.activemq.artemis.core.persistence.impl.journal.LargeServerMessageImpl)
> at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.processRoute(PostOfficeImpl.java:1336)
> at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.route(PostOfficeImpl.java:980)
> at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.route(PostOfficeImpl.java:871)
> at org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.doSend(ServerSessionImpl.java:2045)
> - locked <0xfffffffb19447fb8> (a org.apache.activemq.artemis.core.server.impl.ServerSessionImpl)
> at org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.doSend(ServerSessionImpl.java:1989)
> - locked <0xfffffffb19447fb8> (a org.apache.activemq.artemis.core.server.impl.ServerSessionImpl)
> at org.apache.activemq.artemis.core.protocol.core.ServerSessionPacketHandler.sendContinuations(ServerSessionPacketHandler.java:1034)
> - locked <0xfffffffb1962b900> (a java.lang.Object)
> at org.apache.activemq.artemis.core.protocol.core.ServerSessionPacketHandler.slowPacketHandler(ServerSessionPacketHandler.java:312)
> at org.apache.activemq.artemis.core.protocol.core.ServerSessionPacketHandler.onMessagePacket(ServerSessionPacketHandler.java:285)
> at org.apache.activemq.artemis.core.protocol.core.ServerSessionPacketHandler$$Lambda$651/2097400985.onMessage(Unknown Source)
> at org.apache.activemq.artemis.utils.actors.Actor.doTask(Actor.java:33)
> at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:66)
> at org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$413/494003142.run(Unknown Source)
> at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42)
> at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31)
> at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:66)
> at org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$413/494003142.run(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java)
> at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
> Locked ownable synchronizers:
> - <0xfffffffba1800ca0> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> {code}
> {code}
> "Thread-82 (ActiveMQ-IO-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$7@3bde9e44)" #2130 prio=5 os_prio=64 tid=0x000000017b6df800 nid=0x907 waiting for monitor entry [0xffffffff045de000]
> java.lang.Thread.State: BLOCKED (on object monitor)
> at org.apache.activemq.artemis.core.persistence.impl.journal.LargeServerMessageImpl.getEncodeSize(LargeServerMessageImpl.java:178)
> - waiting to lock <0xfffffffbe73aa1b0> (a org.apache.activemq.artemis.core.persistence.impl.journal.LargeServerMessageImpl)
> at org.apache.activemq.artemis.core.persistence.impl.journal.codec.LargeMessagePersister.getEncodeSize(LargeMessagePersister.java:59)
> at org.apache.activemq.artemis.core.persistence.impl.journal.codec.LargeMessagePersister.getEncodeSize(LargeMessagePersister.java:25)
> at org.apache.activemq.artemis.core.journal.impl.dataformat.JournalAddRecord.getEncodeSize(JournalAddRecord.java:79)
> at org.apache.activemq.artemis.core.journal.impl.JournalImpl.appendRecord(JournalImpl.java:2792)
> at org.apache.activemq.artemis.core.journal.impl.JournalImpl.access$100(JournalImpl.java:91)
> at org.apache.activemq.artemis.core.journal.impl.JournalImpl$1.run(JournalImpl.java:850)
> {code}
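For illustration, the exception handling proposed in the description could look roughly like the sketch below. It is not the change that was committed for this issue. It uses java.util.concurrent.CompletableFuture as a stand-in for Artemis' SimpleFuture so that it compiles without the broker on the classpath, and the class name KnownRecordCheckSketch, the isKnown() method, the single records set and the timeout on get() are all assumptions made for the example.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the journal; only the parts needed for the sketch.
class KnownRecordCheckSketch {
    private final ExecutorService appendExecutor = Executors.newSingleThreadExecutor();
    private final ReadWriteLock journalLock = new ReentrantReadWriteLock();
    private final Set<Long> records; // passing null simulates a failing lookup

    KnownRecordCheckSketch(Set<Long> records) {
        this.records = records;
    }

    boolean isKnown(long id, long timeoutMillis) {
        // CompletableFuture replaces SimpleFuture here (assumption for the sketch).
        final CompletableFuture<Boolean> known = new CompletableFuture<>();
        appendExecutor.execute(() -> {
            journalLock.readLock().lock();
            try {
                known.complete(records.contains(id));
            } catch (Throwable t) {
                // Always complete the future, so the waiting thread is released.
                known.completeExceptionally(t);
            } finally {
                journalLock.readLock().unlock();
            }
        });
        try {
            // A bounded wait is an extra safety net, not part of the original report.
            return known.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while checking record " + id, e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("could not check record " + id, e);
        }
    }

    void shutdown() {
        appendExecutor.shutdown();
    }

    public static void main(String[] args) {
        KnownRecordCheckSketch journal = new KnownRecordCheckSketch(Set.of(1L, 2L));
        System.out.println(journal.isKnown(1L, 1000));
        journal.shutdown();

        KnownRecordCheckSketch broken = new KnownRecordCheckSketch(null);
        try {
            broken.isKnown(1L, 1000);
        } catch (IllegalStateException e) {
            // The caller sees the failure instead of parking forever.
            System.out.println("lookup failed: " + e.getCause());
        } finally {
            broken.shutdown();
        }
    }
}
```

With the catch block in place, a failure inside the task reaches the caller as an exception rather than leaving it parked in get(). The timeout additionally covers the case where the task never gets to run at all.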
--
This message was sent by Atlassian Jira
(v8.3.4#803005)