Posted to dev@activemq.apache.org by "Martin Murphy (JIRA)" <ji...@apache.org> on 2009/11/03 11:03:52 UTC

[jira] Created: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
-----------------------------------------------------------------------------------------------------------------------

                 Key: AMQ-2475
                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
             Project: ActiveMQ
          Issue Type: Bug
          Components: Broker, Message Store, Transport
    Affects Versions: 5.3.0
         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
            Reporter: Martin Murphy


I will attach a simple project that shows this. In the test, the tmp space is set to 32 MB and two threads are created. One thread constantly produces 1 KB messages; the other consumes them but sleeps for 100 ms. Note that producer flow control is turned off as well. The goal is for the producer to block while the consumer reads the remaining messages from the broker and catches up; this in turn frees disk space and allows the producer to send more messages. With this configuration you can bound the broker by disk space rather than by memory usage.
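
The intended behaviour can be sketched in plain Java. This is only an illustration, not ActiveMQ code and not the attached test: `SpaceBudget`, `acquire`, `release` and `run` are hypothetical names, the sizes are scaled down from 32 MB, and the consumer sleeps 1 ms instead of 100 ms so the demo finishes quickly. It shows a producer that blocks when the space limit is reached and resumes as the slower consumer frees space.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

public class TempSpaceSketch {
    /** Hypothetical stand-in for the broker's temp-store accounting. */
    static final class SpaceBudget {
        private final long limit;
        private long used;

        SpaceBudget(long limit) { this.limit = limit; }

        /** Blocks until 'bytes' fit under the limit, then reserves them. */
        synchronized void acquire(long bytes) throws InterruptedException {
            while (used + bytes > limit) {
                wait(); // releases this monitor while waiting
            }
            used += bytes;
        }

        /** Returns space to the budget and wakes any blocked producer. */
        synchronized void release(long bytes) {
            used -= bytes;
            notifyAll();
        }

        synchronized long used() { return used; }
    }

    /** Runs one producer and one slower consumer; returns the peak bytes observed in use. */
    static long run(int messages, int messageSize, long limit) throws InterruptedException {
        SpaceBudget budget = new SpaceBudget(limit);
        BlockingQueue<byte[]> store = new LinkedBlockingQueue<>();
        AtomicLong peak = new AtomicLong();

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < messages; i++) {
                    budget.acquire(messageSize); // blocks while the store is full
                    peak.accumulateAndGet(budget.used(), Math::max);
                    store.put(new byte[messageSize]);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < messages; i++) {
                    byte[] message = store.take();
                    Thread.sleep(1);                 // slow consumer
                    budget.release(message.length);  // consuming frees space
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        long limit = 32 * 1024;
        long peak = run(200, 1024, limit);
        System.out.println("peak bytes in use: " + peak + " (limit " + limit + ")");
    }
}
```

The key property is that the waiting producer holds no lock that the consumer needs: `wait()` gives up the budget's monitor, so `release` can always run.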

Unfortunately, in this test using topics, while the broker is reading in the message from the producer it has to lock the matched list it is adding the message to. The list is an abstraction from the Topic's point of view, so the Topic doesn't realize that adding to it may block on the file system.
{code}
    public void add(MessageReference node) throws Exception { //... snip ...
            if (maximumPendingMessages != 0) {
                synchronized (matchedListMutex) {   // We have this mutex
                    matched.addMessageLast(node); // ends up waiting for space
                    // NOTE - be careful about the slaveBroker!
                    if (maximumPendingMessages > 0) {
{code}
Meanwhile, the consumer is sending acknowledgements for the 10 messages it just read in (the configured prefetch) from the same topic. Since the acknowledgements also modify the same list in the topic, the consumer's thread waits on the mutex that is held to service the producer:
{code}
    private void dispatchMatched() throws IOException {       
        synchronized (matchedListMutex) {  // never gets past here.
            if (!matched.isEmpty() && !isFull()) {
{code}
This is a fairly classic deadlock. The trick now is how to resolve it, given that the topic isn't aware that its list may need to wait for the file system to clean up.
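
The shape of the problem can be reproduced without ActiveMQ. The sketch below is an illustration under assumptions: only the name `matchedListMutex` is borrowed from the snippets above, the `Semaphore` is a stand-in for tmp store space, and the producer's wait is given a timeout purely so the demo terminates (in the broker the wait is unbounded).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class HeldMutexSketch {
    /** Returns true if the consumer was seen blocked on the mutex while the producer waited for space. */
    static boolean consumerStarved() throws InterruptedException {
        final Object matchedListMutex = new Object();
        final Semaphore freeSlots = new Semaphore(0); // 0 permits: the "tmp store" is full
        final CountDownLatch producerHoldsMutex = new CountDownLatch(1);

        Thread producer = new Thread(() -> {
            synchronized (matchedListMutex) {
                producerHoldsMutex.countDown();
                try {
                    // Stand-in for matched.addMessageLast(node) waiting for space.
                    // The timeout exists only so that this demo ends.
                    freeSlots.tryAcquire(1000, TimeUnit.MILLISECONDS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "producer");

        Thread consumer = new Thread(() -> {
            synchronized (matchedListMutex) { // stand-in for dispatchMatched()
                freeSlots.release();          // an ack would free space, but only after getting in
            }
        }, "consumer");

        producer.start();
        producerHoldsMutex.await();           // make sure the producer owns the mutex first
        consumer.start();

        // Poll for a short while until the consumer is stuck entering the monitor.
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(400);
        boolean blocked = false;
        while (!blocked && System.nanoTime() < deadline) {
            blocked = consumer.getState() == Thread.State.BLOCKED;
            Thread.sleep(5);
        }

        producer.join();
        consumer.join();
        return blocked;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("consumer blocked behind producer: " + consumerStarved());
    }
}
```

The producer waits for space that only the consumer can free, and the consumer waits for a mutex that only the producer can release. With an unbounded wait, neither thread ever makes progress.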


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Dominic Tootell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dominic Tootell updated AMQ-2475:
---------------------------------

    Attachment: TopicSubscription.java
                Queue.java
                Topic.java

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>         Attachments: activemq.xml, hangtest.zip, Queue.java, Queue.patchfile.txt, Topic.java, Topic.patchfile.txt, TopicSubscription.java, TopicSubscription.patchfile.txt


[jira] Resolved: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Rob Davies (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Davies resolved AMQ-2475.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 5.4.0

Deadlock fixed by SVN revisions 881313 and 881340.

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>             Fix For: 5.4.0
>
>         Attachments: activemq.xml, hangtest.zip, Queue.java, Queue.patchfile.txt, Topic.java, Topic.patchfile.txt, TopicSubscription.java, TopicSubscription.patchfile.txt


[jira] Updated: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Dominic Tootell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dominic Tootell updated AMQ-2475:
---------------------------------

    Attachment: TopicSubscription.java
                Topic.java
                Queue.java

The patched java files:

Queue.java
Topic.java
TopicSubscription.java



> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>         Attachments: hangtest.zip, Queue.java, Queue.patchfile.txt, Topic.java, Topic.patchfile.txt, TopicSubscription.java, TopicSubscription.patchfile.txt


[jira] Updated: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Dominic Tootell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dominic Tootell updated AMQ-2475:
---------------------------------

    Attachment:     (was: Queue.patchfile.txt)

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>         Attachments: activemq.xml, hangtest.zip, Queue.patchfile.txt, Topic.patchfile.txt, TopicSubscription.patchfile.txt


[jira] Updated: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Gary Tully (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Tully updated AMQ-2475:
----------------------------

    Fix Version/s: 5.3.1

Also applied on the 5.3.1 branch.

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>             Fix For: 5.3.1, 5.4.0
>
>         Attachments: activemq.xml, hangtest.zip, Queue.java, Queue.patchfile.txt, Topic.java, Topic.patchfile.txt, TopicSubscription.java, TopicSubscription.patchfile.txt


[jira] Resolved: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Rob Davies (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Davies resolved AMQ-2475.
-----------------------------

    Resolution: Fixed

Fixed by SVN revision 955504

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>             Fix For: 5.4.0, 5.3.1
>
>         Attachments: activemq.xml, hangtest.zip, Queue.java, Queue.patchfile.txt, Topic.java, Topic.patchfile.txt, TopicSubscription.java, TopicSubscription.patchfile.txt


[jira] Reopened: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Michael Cooper (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Cooper reopened AMQ-2475:
---------------------------------


I am still able to consistently see this deadlock behavior on our system with both version 5.3.0 and 5.3.2. The patch loops in the TopicSubscription add method, while holding the matchedListMutex, until the PendingMessageCursor "matched" is not full. The code then calls addMessageLast on the "matched" instance, assuming that the matchedListMutex will prevent any other thread from taking that space. This assumption is wrong: I have found that with a FilePendingMessageCursor, addMessageLast ends up calling systemUsage.getTempUsage().waitForSpace(), and the temp usage can, for whatever reason, be full when it is called, with no way to shrink because of monitors already held earlier in the stack. The code therefore loops indefinitely and the system is deadlocked.

To work around this issue, I switched to using vm cursors, which don't rely on this shared pool of temp file storage, and I haven't seen the deadlock since.

I am new to this project and still trying to understand the code completely, but this is what I have found. I think the loop that waits for space runs too early in the stack. The matchedListMutex does not seem to lock out other threads that use temp storage. I'm not sure what the correct fix is, but short of a significant reworking of the code, the best I can think of is to have addMessageLast throw some kind of exception or return a value when space is not available, so that the calling method can release the matchedListMutex by calling wait and then try again. addMessageLast would also not call waitForSpace with an infinite timeout, but would instead specify a small timeout.
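
A minimal sketch of that idea, in plain Java rather than against the ActiveMQ sources: `tryAddMessageLast`, `acknowledge`, `freeSlots` and the timeouts are hypothetical, and the `Semaphore` stands in for temp storage. The add path waits for space only briefly, and on failure calls `wait` on the mutex so the acknowledging thread can get in and free space.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class RetryWithReleaseSketch {
    private final Object matchedListMutex = new Object();
    private final Semaphore freeSlots; // hypothetical stand-in for temp-store space
    private int matchedSize;           // guarded by matchedListMutex

    RetryWithReleaseSketch(int slots) {
        freeSlots = new Semaphore(slots);
    }

    /** Hypothetical addMessageLast variant: bounded wait, reports failure instead of blocking forever. */
    private boolean tryAddMessageLast() throws InterruptedException {
        if (!freeSlots.tryAcquire(10, TimeUnit.MILLISECONDS)) {
            return false;
        }
        matchedSize++;
        return true;
    }

    /** Producer path: retries, giving up the monitor between attempts. */
    void add() throws InterruptedException {
        synchronized (matchedListMutex) {
            while (!tryAddMessageLast()) {
                matchedListMutex.wait(20); // releases the monitor so acks can run
            }
        }
    }

    /** Consumer path: removing a message frees one slot. */
    void acknowledge() {
        synchronized (matchedListMutex) {
            if (matchedSize > 0) {
                matchedSize--;
                freeSlots.release();
                matchedListMutex.notifyAll();
            }
        }
    }

    /** Returns true if a producer blocked on a full store completes once an ack arrives. */
    static boolean demo() throws InterruptedException {
        RetryWithReleaseSketch sub = new RetryWithReleaseSketch(1);
        sub.add(); // fills the single slot

        Thread producer = new Thread(() -> {
            try {
                sub.add(); // store is full, so this retries
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "producer");
        producer.start();

        Thread.sleep(100);   // let the producer start retrying
        sub.acknowledge();   // gets the mutex while the producer is in wait()
        producer.join(2000);
        return !producer.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("producer completed after ack: " + demo());
    }
}
```

The short timed wait inside `tryAddMessageLast` still happens with the mutex held, so acknowledgements can be delayed by up to that timeout; the sketch trades a bounded delay for the unbounded one.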

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>             Fix For: 5.3.1, 5.4.0
>
>         Attachments: activemq.xml, hangtest.zip, Queue.java, Queue.patchfile.txt, Topic.java, Topic.patchfile.txt, TopicSubscription.java, TopicSubscription.patchfile.txt


[jira] Updated: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Dominic Tootell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dominic Tootell updated AMQ-2475:
---------------------------------

    Attachment:     (was: TopicSubscription.patchfile.txt)

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>         Attachments: activemq.xml, hangtest.zip, Queue.patchfile.txt, Topic.patchfile.txt, TopicSubscription.patchfile.txt


[jira] Updated: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Dominic Tootell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dominic Tootell updated AMQ-2475:
---------------------------------

    Attachment: activemq.xml

activemq.xml test broker configuration I was using to test the scenario

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>         Attachments: activemq.xml, hangtest.zip, Queue.java, Queue.patchfile.txt, Topic.java, Topic.patchfile.txt, TopicSubscription.java, TopicSubscription.patchfile.txt
>
>


[jira] Reopened: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Michael Cooper (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Cooper reopened AMQ-2475:
---------------------------------


Just reopening the issue to make sure the last code review doesn't get missed.  Ignore what I said about the "throws Exception" part, as I realize it was part of another change.  My biggest concern is over the "return false" which could introduce an infinite loop for expired messages.

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>             Fix For: 5.3.1, 5.4.0
>
>         Attachments: activemq.xml, hangtest.zip, Queue.java, Queue.patchfile.txt, Topic.java, Topic.patchfile.txt, TopicSubscription.java, TopicSubscription.patchfile.txt
>
>


[jira] Commented: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Dominic Tootell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=55493#action_55493 ] 

Dominic Tootell commented on AMQ-2475:
--------------------------------------

I wasn't experiencing the same issue with queues, just topics (non-persistent, when they overflowed to tmp_storage).

Out of interest, which test was failing in org.apache.activemq.broker.BrokerTest?
/dom



[jira] Commented: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Rob Davies (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=55494#action_55494 ] 

Rob Davies commented on AMQ-2475:
---------------------------------

The failing tests were org.apache.activemq.broker.BrokerTest (a few tests are derived from it, so it causes multiple failures) and org.apache.activemq.bugs.AMQ2314Test.



[jira] Commented: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Dominic Tootell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=55210#action_55210 ] 

Dominic Tootell commented on AMQ-2475:
--------------------------------------

I investigated this and attempted to patch it locally in activemq-core, on a fusesource 5.3.0.4 (MacOSX 10.6.1).

I've run the following tests on the patch, which I'll attach (the patch diffs, the patched .java files and the broker xml I used in testing).

The test cases I've run overnight and this morning/afternoon are:

- Virtual Topic (  VirtualTopic.iplayer  -> Consumer.A.VirtualTopic.iplayer)
- 3 x Producer, 4,000,000 messages each onto Virtual Topic (12million in total)
- 1 x Consumer
- 100mb tmp_store limit

- Virtual Topic (  VirtualTopic.iplayer  -> Consumer.A.VirtualTopic.iplayer)
- 6 x Producer, 2,000,000 messages each onto Virtual Topic (12million in total)
- 1 x Consumer
- 512mb tmp_store limit

The tmp_storage was definitely limiting ok, and neither the broker, producer nor consumer blocked:

du -sh of the tmp_storage area:
{code}
dominic-tootells-macbook-pro:data dominict$ du -sh *
 96M	journal
 48K	kr-store
  0B	lock
512M	tmp-test-broker

dominic-tootells-macbook-pro:data dominict$ du -sh *
 96M	journal
 48K	kr-store
  0B	lock
483M	tmp-test-broker

dominic-tootells-macbook-pro:data dominict$ du -sh *
 96M	journal
 48K	kr-store
  0B	lock
490M	tmp-test-broker

dominic-tootells-macbook-pro:data dominict$ du -sh *
 64M	journal
 48K	kr-store
  0B	lock
 38M	tmp-test-broker

dominic-tootells-macbook-pro:data dominict$ 

{code}


I've also run the junit provided by Martin; this ran ok too; with no blockage.

I shall attach the potential patches.  I haven't yet run any other tests against them to see if they cause any unforeseen issues (e.g. with a normal persistent queue); I will do this later on.

cheers
/dom





> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>         Attachments: hangtest.zip
>
>


[jira] Commented: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Dominic Tootell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=55222#action_55222 ] 

Dominic Tootell commented on AMQ-2475:
--------------------------------------

Test for a normal persistent queue ran ok:

- 1 consumer
- 5 * producer
- 10,000,000 * 1k messages

hope this helps,
/dom



[jira] Commented: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Rob Davies (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=60122#action_60122 ] 

Rob Davies commented on AMQ-2475:
---------------------------------

Hi Michael - thanks for the analysis. Is this something you can reproduce on your system easily?



[jira] Commented: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Dominic Tootell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=55496#action_55496 ] 

Dominic Tootell commented on AMQ-2475:
--------------------------------------

Thanks for the info Rob,

I'll give it a run through here and see what I come up with.

I was running on a fusesource 5.3.0.4 broker.  I've recently updated this to the fusesource 5.3.0.5 broker, which includes the AMQ2314Test (http://fusesource.com/wiki/display/ProdInfo/FUSE+Message+Broker+v5.3+Release+Notes).  I'll run the test suite on a base source distro and on a 5.3.0.5 with the above patches and see what I get out of it.

If I get a chance I'll grab a trunk and test on that too (most likely on an evening this week) and see if I spot anything.

Thanks for commenting back.
/dom



[jira] Updated: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Dominic Tootell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dominic Tootell updated AMQ-2475:
---------------------------------

    Attachment: TopicSubscription.patchfile.txt
                Topic.patchfile.txt
                Queue.patchfile.txt

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>         Attachments: activemq.xml, hangtest.zip, Queue.patchfile.txt, Topic.patchfile.txt, TopicSubscription.patchfile.txt
>
>
> I will attach a simple project that shows this. In the test, the tmp space is set to 32 MB and two threads are created. One thread constantly produces 1 KB messages and the other consumes them, but sleeps for 100 ms. Note that producer flow control is turned off as well. The goal is for the producers to block while the consumers read the rest of the messages from the broker and catch up; this in turn frees up disk space and allows the producer to send more messages. This config means that you can bound the broker by disk space rather than memory usage.
> Unfortunately, in this test using topics, while the broker is reading in the message from the producer it has to lock the matched list it is adding the message to. The list is an abstraction from the Topic's point of view, so the Topic doesn't realize that the underlying file may block on the file system.
> {code}
>     public void add(MessageReference node) throws Exception { //... snip ...
>             if (maximumPendingMessages != 0) {
>                 synchronized (matchedListMutex) {   // We have this mutex
>                     matched.addMessageLast(node); // ends up waiting for space
>                     // NOTE - be careful about the slaveBroker!
>                     if (maximumPendingMessages > 0) {
> {code}
> Meanwhile, the consumer is sending acknowledgements for the 10 messages it just read in (the configured prefetch) from the same topic, but since the acknowledgements also modify the same list in the topic, they wait on the mutex that is held to service the producer:
> {code}
>     private void dispatchMatched() throws IOException {
>         synchronized (matchedListMutex) {  // never gets past here.
>             if (!matched.isEmpty() && !isFull()) {
> {code}
> This is a fairly classic deadlock. The trick now is how to resolve it, given that the topic isn't aware that its list may need to wait for the file system to free up space.
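To make the lock interaction concrete, here is a minimal sketch in plain Java. It is not ActiveMQ code and not the fix that was committed: SpaceBoundedList, the Semaphore standing in for tmp store space, and runDemo are all hypothetical names. It shows one general way to avoid this kind of deadlock, which is to wait for space before taking the list mutex, so the acknowledgement path can always obtain the mutex and free space.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a pending-message list bounded by store space. */
class SpaceBoundedList {
    private final Object mutex = new Object();            // plays the role of matchedListMutex
    private final Deque<String> pending = new ArrayDeque<>();
    private final Semaphore space;                         // one permit per free slot in the "store"

    SpaceBoundedList(int capacity) {
        this.space = new Semaphore(capacity);
    }

    /** Producer path: wait for space first, then hold the mutex only for the append. */
    boolean add(String message, long timeoutMs) throws InterruptedException {
        if (!space.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS)) {
            return false; // still no space; the caller decides whether to retry or fail
        }
        synchronized (mutex) {
            pending.addLast(message);
        }
        return true;
    }

    /** Ack path: takes the mutex briefly, so a producer waiting for space cannot block it. */
    String ack() {
        String head;
        synchronized (mutex) {
            head = pending.pollFirst();
        }
        if (head != null) {
            space.release(); // frees one slot for a waiting producer
        }
        return head;
    }

    /** Runs one producer against one consumer; returns how many messages were acked. */
    static int runDemo(int messages) {
        SpaceBoundedList list = new SpaceBoundedList(4);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < messages; i++) {
                    while (!list.add("msg-" + i, 100)) {
                        // keep waiting for space
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int acked = 0;
        long deadline = System.currentTimeMillis() + 10_000;
        while (acked < messages && System.currentTimeMillis() < deadline) {
            if (list.ack() != null) {
                acked++;
            } else {
                Thread.yield();
            }
        }
        producer.interrupt(); // stops the producer if the deadline was hit; harmless otherwise
        return acked;
    }

    public static void main(String[] args) {
        System.out.println("acked " + runDemo(100) + " of 100");
    }
}
```

If add() instead waited for a permit while holding the mutex, ack() could never run once the store was full, which is the situation described in this issue.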

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Dominic Tootell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=55218#action_55218 ] 

Dominic Tootell commented on AMQ-2475:
--------------------------------------

I've run some more tests on the patch I uploaded yesterday and have come across a small issue with sendFailIfNoSpace="true". The ResourceAllocationException would only be thrown if the producer noticed the out-of-space condition before the message was added to the cursor. However, there was a slight chance that space would be available when producer 1 checked, but was then consumed by producer 2. Producer 1 would then be inside the waiting-for-space loop and would never send a ResourceAllocationException.

I have added checks inside the waiting-for-space loop so that a ResourceAllocationException is thrown there as well when sendFailIfNoSpace="true".
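The kind of check described above might look roughly like the following sketch. This is not the attached patch: TempStoreGate and NoSpaceException (a local stand-in for ResourceAllocationException) are hypothetical, and the real broker tracks usage differently. The point is only that the fail-fast test sits inside the wait loop, so it is evaluated on every pass rather than once before waiting.

```java
/** Hypothetical stand-in for javax.jms.ResourceAllocationException. */
class NoSpaceException extends RuntimeException {
    NoSpaceException(String message) {
        super(message);
    }
}

/** Hypothetical gate that tracks temp-store usage in bytes. */
class TempStoreGate {
    private final long limit;
    private final boolean sendFailIfNoSpace;
    private long used;

    TempStoreGate(long limit, boolean sendFailIfNoSpace) {
        this.limit = limit;
        this.sendFailIfNoSpace = sendFailIfNoSpace;
    }

    /** Reserves space, either blocking or failing depending on sendFailIfNoSpace. */
    synchronized void reserve(long bytes) {
        while (used + bytes > limit) {
            // Checked on every pass of the loop, so a producer that lost the race
            // for the last free space still fails instead of waiting silently.
            if (sendFailIfNoSpace) {
                throw new NoSpaceException(
                        "temp store full: " + used + " of " + limit + " bytes used");
            }
            try {
                wait(100); // releases the monitor; woken by release() or after 100 ms
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for space", e);
            }
        }
        used += bytes; // check and reservation happen under the same monitor
    }

    /** Returns space to the store and wakes any waiting producers. */
    synchronized void release(long bytes) {
        used -= bytes;
        notifyAll();
    }

    synchronized long used() {
        return used;
    }
}
```

Because the space check and the reservation happen under one monitor, a second producer cannot take the space between the check and the add, and wait() gives up the monitor so release() is never locked out.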

I shall update the patches attached yesterday, to reflect this.

apologies,
/dom

I'm also currently running the test against a normal persistent queue to make sure all is OK with that; I'll comment back once the run has finished.

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>         Attachments: activemq.xml, hangtest.zip, Queue.java, Queue.patchfile.txt, Topic.java, Topic.patchfile.txt, TopicSubscription.java, TopicSubscription.patchfile.txt


[jira] Updated: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Dominic Tootell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dominic Tootell updated AMQ-2475:
---------------------------------

    Attachment:     (was: Topic.java)

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>         Attachments: activemq.xml, hangtest.zip, Queue.patchfile.txt, Topic.patchfile.txt, TopicSubscription.patchfile.txt


[jira] Updated: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Martin Murphy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Murphy updated AMQ-2475:
-------------------------------

    Attachment: hangtest.zip

Oops, didn't realize that the test never attached properly last time

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>         Attachments: hangtest.zip


[jira] Commented: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Rob Davies (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=55413#action_55413 ] 

Rob Davies commented on AMQ-2475:
---------------------------------

I applied the patch to trunk - but got some JUnit test failures - going to resolve the deadlock differently.
I'm not clear if you are experiencing the same problem with Queues?

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>         Attachments: activemq.xml, hangtest.zip, Queue.java, Queue.patchfile.txt, Topic.java, Topic.patchfile.txt, TopicSubscription.java, TopicSubscription.patchfile.txt


[jira] Issue Comment Edited: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Rob Davies (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=55413#action_55413 ] 

Rob Davies edited comment on AMQ-2475 at 11/13/09 11:50 AM:
------------------------------------------------------------

I applied the patch to trunk - but got some JUnit test failures - org.apache.activemq.broker.BrokerTest is the main one (a few JUnit tests are derived from this one).
Going to try to resolve the deadlock differently.
I'm not clear if you are experiencing the same problem with Queues?

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>         Attachments: activemq.xml, hangtest.zip, Queue.java, Queue.patchfile.txt, Topic.java, Topic.patchfile.txt, TopicSubscription.java, TopicSubscription.patchfile.txt


[jira] Updated: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Dominic Tootell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dominic Tootell updated AMQ-2475:
---------------------------------

    Attachment:     (was: Queue.java)

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>         Attachments: activemq.xml, hangtest.zip, Queue.patchfile.txt, Topic.patchfile.txt, TopicSubscription.patchfile.txt


[jira] Updated: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Dominic Tootell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dominic Tootell updated AMQ-2475:
---------------------------------

    Attachment:     (was: Topic.patchfile.txt)

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>         Attachments: activemq.xml, hangtest.zip, Queue.patchfile.txt, Topic.patchfile.txt, TopicSubscription.patchfile.txt


[jira] Assigned: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Rob Davies (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Davies reassigned AMQ-2475:
-------------------------------

    Assignee: Rob Davies

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>         Attachments: hangtest.zip


[jira] Updated: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Dominic Tootell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dominic Tootell updated AMQ-2475:
---------------------------------

    Attachment:     (was: TopicSubscription.java)

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>         Attachments: activemq.xml, hangtest.zip, Queue.patchfile.txt, Topic.patchfile.txt, TopicSubscription.patchfile.txt
>
>



[jira] Resolved: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Rob Davies (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Davies resolved AMQ-2475.
-----------------------------

    Resolution: Fixed

Thanks, Michael - resolved by SVN revision 963118

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>             Fix For: 5.4.0, 5.3.1
>
>         Attachments: activemq.xml, hangtest.zip, Queue.java, Queue.patchfile.txt, Topic.java, Topic.patchfile.txt, TopicSubscription.java, TopicSubscription.patchfile.txt
>
>



[jira] Updated: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Dominic Tootell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dominic Tootell updated AMQ-2475:
---------------------------------

    Attachment: Queue.patchfile.txt
                Topic.patchfile.txt
                TopicSubscription.patchfile.txt

All in package org.apache.activemq.broker.region:

TopicSubscription.patchfile.txt  (changes to TopicSubscription.java)
Topic.patchfile.txt              (changes to Topic.java)
Queue.patchfile.txt              (changes to Queue.java)

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>         Attachments: hangtest.zip, Queue.patchfile.txt, Topic.patchfile.txt, TopicSubscription.patchfile.txt
>
>



[jira] Commented: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks

Posted by "Michael Cooper (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=60133#action_60133 ] 

Michael Cooper commented on AMQ-2475:
-------------------------------------

Rob, wow, quick response! I reviewed the change carefully, and it looks good for the most part. A couple of potential issues, though:

In FilePendingMessageCursor:

- Why did you add a "throws Exception" clause to both the original and the try version of the method? It doesn't seem to be used.
- The tryAddMessageLast method ends by returning false. I think it should probably return true instead, because if the caller passes an expired message, the caller will now loop forever retrying the add.
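
To illustrate the second point, here is a minimal standalone model of the retry problem. It is not the ActiveMQ source: the Msg and BoundedCursor classes, the expiredCountsAsHandled flag, and the bounded addWithRetry loop are all hypothetical names made up for this sketch. It only shows why a "false" return for an expired message never lets a retrying caller make progress, while a "true" return lets it move on.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TryAddSketch {
    /** Hypothetical stand-in for a message reference (not the ActiveMQ type). */
    static final class Msg {
        final String id;
        final boolean expired;
        Msg(String id, boolean expired) { this.id = id; this.expired = expired; }
    }

    /** Hypothetical bounded cursor with a try-style add. */
    static final class BoundedCursor {
        private final Deque<Msg> pending = new ArrayDeque<>();
        private final int capacity;
        private final boolean expiredCountsAsHandled;

        BoundedCursor(int capacity, boolean expiredCountsAsHandled) {
            this.capacity = capacity;
            this.expiredCountsAsHandled = expiredCountsAsHandled;
        }

        boolean tryAddMessageLast(Msg m) {
            if (m.expired) {
                // Expired messages are dropped, never stored. Returning false here
                // tells a retrying caller to try again, which can never succeed.
                return expiredCountsAsHandled;
            }
            if (pending.size() >= capacity) {
                return false; // genuinely full: a later retry can succeed
            }
            pending.addLast(m);
            return true;
        }

        int size() { return pending.size(); }
    }

    /** Returns the attempt number that succeeded, or -1 if every attempt failed. */
    static int addWithRetry(BoundedCursor cursor, Msg m, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (cursor.tryAddMessageLast(m)) {
                return attempt;
            }
        }
        return -1; // an unbounded loop would spin forever at this point
    }

    public static void main(String[] args) {
        Msg expired = new Msg("m1", true);
        // "false" for expired: all 1000 attempts fail, prints -1
        System.out.println(addWithRetry(new BoundedCursor(10, false), expired, 1000));
        // "true" for expired: the caller moves on after one attempt, prints 1
        System.out.println(addWithRetry(new BoundedCursor(10, true), expired, 1000));
    }
}
```

The retry loop is capped here only so the sketch terminates; a caller that loops until the add succeeds would hang on the first variant.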

In TopicSubscription:

While the fix you made is probably the safest one, I think the ideal fix would not need to check matched.isFull() at all, since tryAddMessageLast should return false if the cursor is full and the message cannot be added. However, that requires implementing tryAddMessageLast in every implementation of PendingMessageCursor.
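
A rough sketch of what I mean, written as a standalone model rather than against the real classes. The Cursor interface, BoundedCursor, the timeout parameter, and dispatchOne are hypothetical names for illustration only; this is not the committed change. The assumption is that every cursor offers a non-blocking add, so the producer path never blocks inside the cursor while holding the mutex and needs no separate isFull() check.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class NonBlockingAddSketch {
    /** Hypothetical cursor contract: every implementation has a non-blocking add. */
    interface Cursor {
        boolean tryAddMessageLast(String msg); // false when full
        String removeFirst();                  // null when empty
    }

    /** Hypothetical bounded implementation; only accessed under the mutex below. */
    static final class BoundedCursor implements Cursor {
        private final Deque<String> pending = new ArrayDeque<>();
        private final int capacity;
        BoundedCursor(int capacity) { this.capacity = capacity; }
        public boolean tryAddMessageLast(String msg) {
            if (pending.size() >= capacity) return false;
            pending.addLast(msg);
            return true;
        }
        public String removeFirst() { return pending.pollFirst(); }
    }

    private final Object matchedListMutex = new Object();
    private final Cursor matched;

    NonBlockingAddSketch(Cursor matched) { this.matched = matched; }

    /** Producer path: returns false if no space appeared within timeoutMillis. */
    boolean add(String msg, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        synchronized (matchedListMutex) {
            while (!matched.tryAddMessageLast(msg)) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) return false;
                try {
                    // wait() releases the mutex, so the ack/dispatch path can run
                    matchedListMutex.wait(Math.min(remaining, 10));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }

    /** Consumer/ack path: frees one slot and wakes any waiting producer. */
    String dispatchOne() {
        synchronized (matchedListMutex) {
            String m = matched.removeFirst();
            matchedListMutex.notifyAll();
            return m;
        }
    }
}
```

The point of the design is that the only place the producer can stall is in wait(), which gives up the mutex, so the consumer side can always get in to free space.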

Also, the fix version of the bug may need to be updated.

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.60_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>             Fix For: 5.3.1, 5.4.0
>
>         Attachments: activemq.xml, hangtest.zip, Queue.java, Queue.patchfile.txt, Topic.java, Topic.patchfile.txt, TopicSubscription.java, TopicSubscription.patchfile.txt
>
>
