Posted to dev@activemq.apache.org by "Rob Lugt (JIRA)" <ji...@apache.org> on 2006/11/01 18:47:02 UTC

[jira] Created: (AMQ-1020) Slow consumer terminally blocks both client and broker

Slow consumer terminally blocks both client and broker
------------------------------------------------------

                 Key: AMQ-1020
                 URL: https://issues.apache.org/activemq/browse/AMQ-1020
             Project: ActiveMQ
          Issue Type: Bug
          Components: Broker
    Affects Versions: 4.0.2
         Environment: Broker: Windows XP, Sun JDK1.5  Client: activemq-dotnet (Trunk)
            Reporter: Rob Lugt


I have a multi-threaded client (client1) which is acting as both a publisher (Topic1) and subscriber (Topic2) using a single session.  There is another client process (client2) which publishes on Topic2.

I have witnessed the following repeatable scenario where both clients get stuck, which can only be rectified by restarting the broker:

Client1 publishes messages to Topic1 (rate = about 30 msgs/sec).
Client2 publishes bursts of messages to Topic2 (rate = 500 msgs/sec).
Client1 is a slow subscriber on Topic2.

After running in this scenario for a couple of seconds, Client1 and Client2 become stuck.  Looking at a stack trace for Client1 I can see that its read_loop is stuck waiting for input, and its publisher thread is stuck waiting for an acknowledgement to the synchronous message send (the acknowledgement never arrives because the broker won't send any more messages).

Client2 is also stuck waiting for an acknowledgement to a synchronous send.

It appears to me that the broker is throttling the connection because the consumer is running slowly, but for some reason it gets into a state where all message flow stops (even though the consumer is automatically acknowledging messages, albeit slowly).  Furthermore, if I kill Client1 the broker doesn't recover (in a JMX console the connection remains visible).

The broker uses a vanilla configuration (i.e. no policies are set for the topics in question).
 

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/activemq/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Assigned: (AMQ-1020) Slow consumer terminally blocks both client and broker

Posted by "Rob Davies (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Davies reassigned AMQ-1020:
-------------------------------

    Assignee: Rob Davies


[jira] Commented: (AMQ-1020) Slow consumer terminally blocks both client and broker

Posted by "paul normington (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-1020?page=comments#action_37631 ] 
            
paul normington commented on AMQ-1020:
--------------------------------------


I think I have run into this same issue.

We have a scenario with a DurableSubscriber that has retrieved some messages and then disconnects.
The publisher continues publishing at 1000 messages per second.
After 60000 messages are sent, the publisher hangs, and the broker quiesces.

I have seen this behavior with 4.0.1 and 4.1.0

I took a stack trace which showed the server was stuck in UsageManager.waitForSpace().

I can delay the hanging problem by configuring more memory in the MemoryManager.
I have tried setting timeToLives and pendingMessageLimitStrategies but I still get the hanging.

If I use a non durable subscriber the problem goes away, but this is not an ideal solution for us.
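
To illustrate why configuring more memory only delays the hang, here is a minimal sketch of a fixed memory budget that fills up while a disconnected durable subscriber acknowledges nothing. It is a toy model in Python, not the broker's UsageManager code: MemoryBudget, try_reserve, messages_until_blocked and the byte figures are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class MemoryBudget:
    """Toy stand-in for a broker memory limit (hypothetical, not ActiveMQ code)."""
    limit_bytes: int
    used_bytes: int = 0

    def try_reserve(self, size: int) -> bool:
        # A real broker would block the sender here until space is freed;
        # this sketch returns False instead so that the demo terminates.
        if self.used_bytes + size > self.limit_bytes:
            return False
        self.used_bytes += size
        return True

    def release(self, size: int) -> None:
        # Called when a consumer acknowledges a message of the given size.
        self.used_bytes = max(0, self.used_bytes - size)


def messages_until_blocked(limit_bytes: int, message_bytes: int) -> int:
    """Count the sends accepted while the offline subscriber acks nothing."""
    budget = MemoryBudget(limit_bytes)
    sent = 0
    while budget.try_reserve(message_bytes):
        sent += 1
    return sent


if __name__ == "__main__":
    # Arbitrary illustrative sizes: doubling the limit doubles the delay,
    # but the publisher still stops once the budget is exhausted.
    print(messages_until_blocked(1000, 100))
    print(messages_until_blocked(2000, 100))
```

In this model nothing ever calls release() while the subscriber is offline, so any finite limit is eventually reached; a larger limit only moves the point at which the publisher stops.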


[jira] Commented: (AMQ-1020) Slow consumer terminally blocks both client and broker

Posted by "james strachan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-1020?page=comments#action_37340 ] 
            
james strachan commented on AMQ-1020:
-------------------------------------

Are you using explicit acknowledgements, auto-ack, or transactions? I think the default prefetch for NMS is only about 1000, which means that after 1000 messages have been sent, no more messages will be dispatched to a consumer until the broker receives acks. So I can see why Client1 becomes stuck pretty quickly and why it can no longer publish more messages.

So 2 things to try...

Use dispatchAsync=true (on consumer info) on the consumers, so that dispatching to consumers is asynchronous in the broker. That way a producer won't get blocked waiting to dispatch to slow consumers.

Also try upping the prefetch value to something large, e.g. on Java for non-persistent topics it's about 32000 I think.
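
As a rough sketch of the arithmetic behind the prefetch suggestion, the Python below estimates how long it takes for un-acked messages to reach the prefetch limit, and builds a destination name carrying the two settings. The consume rate of 100 msgs/sec is purely an assumption (the report only says the consumer is slow), and both function names are hypothetical. The option names consumer.dispatchAsync and consumer.prefetchSize are the destination options as I know them from the Java client; whether the NMS client honours them in the same form is an assumption that would need checking.

```python
from typing import Optional
from urllib.parse import urlencode


def seconds_until_window_full(prefetch: int,
                              publish_rate: float,
                              consume_rate: float) -> Optional[float]:
    """Time until un-acked messages reach the prefetch limit, or None if never."""
    backlog_growth = publish_rate - consume_rate  # messages per second
    if backlog_growth <= 0:
        return None  # the consumer keeps up, so the window never fills
    return prefetch / backlog_growth


def with_consumer_options(destination: str,
                          dispatch_async: bool,
                          prefetch_size: int) -> str:
    """Append consumer options to a destination name as a query string."""
    options = {
        "consumer.dispatchAsync": str(dispatch_async).lower(),
        "consumer.prefetchSize": prefetch_size,
    }
    return destination + "?" + urlencode(options)


if __name__ == "__main__":
    # 500 msgs/sec is the burst rate from the report; 100 msgs/sec is assumed.
    print(seconds_until_window_full(1000, 500, 100))
    print(seconds_until_window_full(32000, 500, 100))
    print(with_consumer_options("Topic2", True, 32000))
```

Under those assumed rates a larger prefetch only postpones the point at which dispatch stops; it does not remove it unless the consumer catches up between bursts.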


[jira] Commented: (AMQ-1020) Slow consumer terminally blocks both client and broker

Posted by "Rob Lugt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/activemq/browse/AMQ-1020?page=comments#action_37345 ] 
            
Rob Lugt commented on AMQ-1020:
-------------------------------

Hi James

I'm using auto-acknowledgements (no transactions).  You are correct that the NMS prefetch default is 1000 messages, and this threshold appears to have a bearing on when the consumer (and hence the publisher) gets stuck.  Changing the prefetch size may well remove the symptoms from my test case, but that's not really what I'm looking for.  I believe the test case exposes a critical bug in the broker, and hence gives us an opportunity to fix the bug, which is preferable to changing the configuration to avoid the condition (sod's law dictates that the condition will re-emerge as soon as my application goes into production).

I think there are two crucial points here that need investigating
1) even though [auto] acknowledgements are being sent to the broker, the consumer is getting stuck dead (i.e. no message activity is occurring once the broker becomes stuck).
2) killing the slow consumer does not rectify the situation.  This implies that the broker is stuck in some state where it doesn't recognise the client socket has been closed.

It's probably worth noting that this problem does not occur when I disable Client1 from publishing (even though it's still a slow consumer).  It's only when Client1 is a slow consumer and a [fast] publisher that it falls into the dead-locked condition.


[jira] Resolved: (AMQ-1020) Slow consumer terminally blocks both client and broker

Posted by "Rob Davies (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/activemq/browse/AMQ-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Davies resolved AMQ-1020.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 4.2.0

This is fixed by the default use of cursors and spooling to disk for non-durable slow topic consumers.
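
For readers unfamiliar with the idea, here is a minimal sketch of a pending-message cursor that keeps a bounded number of messages in memory and spools the overflow to a temporary file, so that adding a message never has to wait for the slow consumer. It is a toy illustration in Python under stated assumptions, not the broker's cursor implementation: SpoolingCursor, add, poll and memory_limit are hypothetical names, and messages are assumed to be single-line strings.

```python
import os
import tempfile
from collections import deque
from typing import Optional


class SpoolingCursor:
    """Toy pending-message cursor: memory first, then a temp file."""

    def __init__(self, memory_limit: int) -> None:
        self.memory_limit = memory_limit
        self._memory = deque()
        self._spool = tempfile.TemporaryFile()  # binary mode, removed on close
        self._read_pos = 0   # byte offset of the next unread spooled message
        self._spooled = 0    # number of unread messages on disk

    def add(self, message: str) -> None:
        # To keep FIFO order, once anything is waiting on disk every new
        # message must be written behind it rather than placed in memory.
        if self._spooled == 0 and len(self._memory) < self.memory_limit:
            self._memory.append(message)
            return
        self._spool.seek(0, os.SEEK_END)
        self._spool.write(message.encode("utf-8") + b"\n")
        self._spooled += 1

    def poll(self) -> Optional[str]:
        # In-memory messages are always older than spooled ones.
        if self._memory:
            return self._memory.popleft()
        if self._spooled:
            self._spool.seek(self._read_pos)
            line = self._spool.readline()
            self._read_pos = self._spool.tell()
            self._spooled -= 1
            return line.rstrip(b"\n").decode("utf-8")
        return None

    def close(self) -> None:
        self._spool.close()


if __name__ == "__main__":
    cursor = SpoolingCursor(memory_limit=2)
    for i in range(5):
        cursor.add(f"m{i}")      # never blocks, whatever the consumer does
    print([cursor.poll() for _ in range(5)])
    cursor.close()
```

The trade-off in this sketch is that the producer is bounded by disk space rather than by the memory limit, which is why a slow consumer no longer stalls it.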
