You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@activemq.apache.org by "Przemek Bruski (JIRA)" <ji...@apache.org> on 2011/05/12 23:58:47 UTC

[jira] [Created] (AMQ-3319) A possible browsing concurrency issue in org.apache.activemq.broker.region.Queue

A possible browsing concurrency issue in org.apache.activemq.broker.region.Queue
--------------------------------------------------------------------------------

                 Key: AMQ-3319
                 URL: https://issues.apache.org/jira/browse/AMQ-3319
             Project: ActiveMQ
          Issue Type: Bug
            Reporter: Przemek Bruski


Some time ago clients of our ActiveMQ instance locked up while browsing. Analysis of the log files showed a large amount of:
{code}
 java.util.NoSuchElementException
	at java.util.LinkedList.remove(LinkedList.java:788)
	at java.util.LinkedList.removeFirst(LinkedList.java:134)
	at org.apache.activemq.broker.region.Queue.getNextBrowserDispatch(Queue.java:1341)
	at org.apache.activemq.broker.region.Queue.iterate(Queue.java:1463)
	at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:122)
	at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:43)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:637)
{code}
Happening before lockup.

It seems there is a problem in Queue class, which uses non thread safe LinkedList collection. Additions and removals to/from this collection are wrapped by a shared readLock, which means there is no guard against concurrent modification and there is also a possibility of a race condition between isEmpty and removeFirst call during concurrent usages of getNextBrowserDispatch (if they are possible).

I think the easiest fix is to switch from LinkedList to ConcurrentLinkedQueue and make use of Queue methods to access the collection (because they allow single step isEmpty/remove call). I am attaching a patch that does it. I've left the readLocks in case they are used to block writes someplace else, but they are not needed anymore for concurrency control over the new collection.


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (AMQ-3319) A possible browsing concurrency issue in org.apache.activemq.broker.region.Queue

Posted by "Timothy Bish (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AMQ-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timothy Bish resolved AMQ-3319.
-------------------------------

       Resolution: Fixed
    Fix Version/s: 5.6.0
         Assignee: Timothy Bish

Verified that the queue is not protected and the read lock around that code is not needed once the code is switched to a ConcurrentLinkQueue.

Nice catch, thanks!

> A possible browsing concurrency issue in org.apache.activemq.broker.region.Queue
> --------------------------------------------------------------------------------
>
>                 Key: AMQ-3319
>                 URL: https://issues.apache.org/jira/browse/AMQ-3319
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 5.4.2
>         Environment: OS X, Linux
>            Reporter: Przemek Bruski
>            Assignee: Timothy Bish
>             Fix For: 5.6.0
>
>         Attachments: patch.diff
>
>
> Some time ago clients of our ActiveMQ instance locked up while browsing. Analysis of the log files showed a large amount of:
> {code}
>  java.util.NoSuchElementException
> 	at java.util.LinkedList.remove(LinkedList.java:788)
> 	at java.util.LinkedList.removeFirst(LinkedList.java:134)
> 	at org.apache.activemq.broker.region.Queue.getNextBrowserDispatch(Queue.java:1341)
> 	at org.apache.activemq.broker.region.Queue.iterate(Queue.java:1463)
> 	at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:122)
> 	at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:43)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 	at java.lang.Thread.run(Thread.java:637)
> {code}
> Happening before lockup.
> It seems there is a problem in Queue class, which uses non thread safe LinkedList collection. Additions and removals to/from this collection are wrapped by a shared readLock, which means there is no guard against concurrent modification and there is also a possibility of a race condition between isEmpty and removeFirst call during concurrent usages of getNextBrowserDispatch (if they are possible).
> I think the easiest fix is to switch from LinkedList to ConcurrentLinkedQueue and make use of Queue methods to access the collection (because they allow single step isEmpty/remove call). I am attaching a patch that does it. I've left the readLocks in case they are used to block writes someplace else, but they are not needed anymore for concurrency control over the new collection.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (AMQ-3319) A possible browsing concurrency issue in org.apache.activemq.broker.region.Queue

Posted by "Przemek Bruski (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AMQ-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Przemek Bruski updated AMQ-3319:
--------------------------------

          Component/s: Broker
          Environment: OS X, Linux
    Affects Version/s: 5.4.2

I've encountered the problem on OS X and AMQ 5.4.2, but it looks like the issue is still present in trunk.

> A possible browsing concurrency issue in org.apache.activemq.broker.region.Queue
> --------------------------------------------------------------------------------
>
>                 Key: AMQ-3319
>                 URL: https://issues.apache.org/jira/browse/AMQ-3319
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 5.4.2
>         Environment: OS X, Linux
>            Reporter: Przemek Bruski
>         Attachments: patch.diff
>
>
> Some time ago clients of our ActiveMQ instance locked up while browsing. Analysis of the log files showed a large amount of:
> {code}
>  java.util.NoSuchElementException
> 	at java.util.LinkedList.remove(LinkedList.java:788)
> 	at java.util.LinkedList.removeFirst(LinkedList.java:134)
> 	at org.apache.activemq.broker.region.Queue.getNextBrowserDispatch(Queue.java:1341)
> 	at org.apache.activemq.broker.region.Queue.iterate(Queue.java:1463)
> 	at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:122)
> 	at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:43)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 	at java.lang.Thread.run(Thread.java:637)
> {code}
> Happening before lockup.
> It seems there is a problem in Queue class, which uses non thread safe LinkedList collection. Additions and removals to/from this collection are wrapped by a shared readLock, which means there is no guard against concurrent modification and there is also a possibility of a race condition between isEmpty and removeFirst call during concurrent usages of getNextBrowserDispatch (if they are possible).
> I think the easiest fix is to switch from LinkedList to ConcurrentLinkedQueue and make use of Queue methods to access the collection (because they allow single step isEmpty/remove call). I am attaching a patch that does it. I've left the readLocks in case they are used to block writes someplace else, but they are not needed anymore for concurrency control over the new collection.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (AMQ-3319) A possible browsing concurrency issue in org.apache.activemq.broker.region.Queue

Posted by "Przemek Bruski (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AMQ-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Przemek Bruski updated AMQ-3319:
--------------------------------

    Attachment: patch.diff

> A possible browsing concurrency issue in org.apache.activemq.broker.region.Queue
> --------------------------------------------------------------------------------
>
>                 Key: AMQ-3319
>                 URL: https://issues.apache.org/jira/browse/AMQ-3319
>             Project: ActiveMQ
>          Issue Type: Bug
>            Reporter: Przemek Bruski
>         Attachments: patch.diff
>
>
> Some time ago clients of our ActiveMQ instance locked up while browsing. Analysis of the log files showed a large amount of:
> {code}
>  java.util.NoSuchElementException
> 	at java.util.LinkedList.remove(LinkedList.java:788)
> 	at java.util.LinkedList.removeFirst(LinkedList.java:134)
> 	at org.apache.activemq.broker.region.Queue.getNextBrowserDispatch(Queue.java:1341)
> 	at org.apache.activemq.broker.region.Queue.iterate(Queue.java:1463)
> 	at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:122)
> 	at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:43)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 	at java.lang.Thread.run(Thread.java:637)
> {code}
> Happening before lockup.
> It seems there is a problem in Queue class, which uses non thread safe LinkedList collection. Additions and removals to/from this collection are wrapped by a shared readLock, which means there is no guard against concurrent modification and there is also a possibility of a race condition between isEmpty and removeFirst call during concurrent usages of getNextBrowserDispatch (if they are possible).
> I think the easiest fix is to switch from LinkedList to ConcurrentLinkedQueue and make use of Queue methods to access the collection (because they allow single step isEmpty/remove call). I am attaching a patch that does it. I've left the readLocks in case they are used to block writes someplace else, but they are not needed anymore for concurrency control over the new collection.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira