Posted to users@qpid.apache.org by Matt Broadstone <mb...@gmail.com> on 2015/04/29 18:46:44 UTC

acknowledge not releasing message with C++ messaging api

Hi,

I have a service using the C++ Messaging API which connects to a single
instance of qpidd (currently on the same machine), which seems to crash out
with this exception every couple of days under moderate load:

qpidd[68257]: 2015-04-28 11:56:38 [Broker] error
qpid.192.168.2.225:5672-192.168.2.148:60492: resource-limit-exceeded:
Maximum depth exceeded on
b1386bee-a36c-449d-953f-c25f4842e76d_hive.guest.metadata_7bf9355b-524b-4853-89bd-1848366cd21f:
current=[count: 389438, size: 104857546], max=[size: 104857600]
(/build/buildd/qpid-cpp-0.28/src/qpid/broker/Queue.cpp:1575)

Using qpid-stat I don't see the queue depth ever increase from 0 (which I
gather is why the exception is thrown, from reading the code), however I
-do- notice that the "acquired" count is increasing with every message with
no corresponding "release" (release count is always 0).

My service currently looks a lot like the "Receiving Messages from Multiple
Sources" on this page:
https://qpid.apache.org/releases/qpid-0.28/messaging-api/cpp/api/index.html,
and I am definitely calling the message-agnostic "session.acknowledge()" in
my main event loop, so I'm not sure why the messages are never released
(presuming that messages would be released upon settling/acknowledgement).

I've created a ticket for what I think is a bug in the broker here:
https://issues.apache.org/jira/browse/QPID-6517, but figured I would post
here as well just in case there is operator error happening on my part.

Regards,
Matt

Re: acknowledge not releasing message with C++ messaging api

Posted by Gordon Sim <gs...@redhat.com>.

On 04/29/2015 08:33 PM, Gordon Sim wrote:
> On 04/29/2015 07:59 PM, Gordon Sim wrote:
>> On 04/29/2015 07:46 PM, Matt Broadstone wrote:
>>> The process that's taking up memory is the receiver (mqget.cc in the
>>> posted
>>> gist). Proton is version 0.9.
>>
>> I see some sort of build up in receivers using AMQP 1.0 with
>> 0.32+proton-0.9 also. Seems to be only when receiving from topics. I'll
>> investigate further.
>
> It's a bug in the qpid::messaging client for 1.0 I'm afraid.
> Specifically it is not locally settling deliveries that are sent
> pre-settled. These then build up within Proton's delivery map.
>
> I'll get a fix in shortly. However as a workaround you can turn on
> acknowledgements for the receiver, e.g. using 'my-topic;
> {link:{reliability:at-least-once, timeout:1}}'.

And just for information, this is a regression introduced by changes 
made for https://issues.apache.org/jira/browse/QPID-6252, so affects 
0.32 only.


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
For additional commands, e-mail: users-help@qpid.apache.org

Re: acknowledge not releasing message with C++ messaging api

Posted by Gordon Sim <gs...@redhat.com>.

On 04/30/2015 04:58 PM, Matt Broadstone wrote:
> On Thu, Apr 30, 2015 at 11:46 AM, Gordon Sim <gs...@redhat.com> wrote:
>
>> On 04/29/2015 08:33 PM, Gordon Sim wrote:
>>
>>> On 04/29/2015 07:59 PM, Gordon Sim wrote:
>>>
>>>> On 04/29/2015 07:46 PM, Matt Broadstone wrote:
>>>>
>>>>> The process that's taking up memory is the receiver (mqget.cc in the
>>>>> posted
>>>>> gist). Proton is version 0.9.
>>>>>
>>>>
>>>> I see some sort of build up in receivers using AMQP 1.0 with
>>>> 0.32+proton-0.9 also. Seems to be only when receiving from topics. I'll
>>>> investigate further.
>>>>
>>>
>>> It's a bug in the qpid::messaging client for 1.0 I'm afraid.
>>> Specifically it is not locally settling deliveries that are sent
>>> pre-settled. These then build up within Proton's delivery map.
>>>
>>> I'll get a fix in shortly. However as a workaround you can turn on
>>> acknowledgements for the receiver, e.g. using 'my-topic;
>>> {link:{reliability:at-least-once, timeout:1}}'.
>>>
>>
>> This is now fixed and tracked by
>> https://issues.apache.org/jira/browse/QPID-6521
>>
>>
> Great! Thanks for the quick turnaround, Gordon.

Actually it is largely Alan Conway we have to thank.

> As for my original bug,
> I've been having considerable trouble replicating it locally. I left my
> test programs running for an hour or so and got up to around 8M messages
> with no similar error.
>
> I know this is fairly vague, but could you help describe what conditions
> might lead to that kind of growth in queue depth?

Usually it is simply the consumer stopping processing messages in 
some way, or slowing down significantly. As you pointed out in an earlier 
mail, the relatively slow rates you were seeing make it hard to 
understand how such a large depth could build up when the steady state 
did not seem to involve any depth at all.

We have in the (distant) past seen scenarios where the depth as assumed 
by the queue and the depth as reported by management were out of sync, 
but those were all around fairly 'exotic' combinations of features 
(transactions and lvqs etc) and the known issues were fixed long before 
0.28.

I recently hit an issue in proton whereby it lost the ability to handle 
acknowledgements of messages where there were many links on the same 
session and deliveries were settled significantly out of order. It's 
possible there could be something like that, but it would show up as 
growing queue depth I believe.

> In my particular scenario
> I am doing nothing more than the two provided sample apps do:
> non-persistent, default constructed Messages being published to a topic
> exchange and consumed by a single consumer.
>
> Obviously the vanilla case here can be cleaned up a bit (maybe adding
> message TTLs, or perhaps the acknowledgement workaround you provided), but
> still I'm unclear as to why it worked for so many days and then failed on
> that assertion (effectively rendering the client useless, as well as
> anything posted to that topic). It's true this only happened with 0.28, but
> I just don't have enough data points yet to rule out the possibility of it
> happening again with 0.32/0.33.
>
> I'm locally testing a number of scenarios, and we have the original
> environment up so hopefully the behavior will be triggered again in a state
> when I can collect data. In the meantime, any speculation on your side as
> to what might cause this would be appreciated. Please let me know if I can
> provide any more information.

Running qpid-stat -q periodically against the original environment (if 
that is possible) might be useful, say every 30 mins or so. That way if 
it happens again we have a reasonable amount of historical data leading 
up to the problem.

Re: acknowledge not releasing message with C++ messaging api

Posted by Matt Broadstone <mb...@gmail.com>.

On Thu, Apr 30, 2015 at 11:46 AM, Gordon Sim <gs...@redhat.com> wrote:

> On 04/29/2015 08:33 PM, Gordon Sim wrote:
>
>> On 04/29/2015 07:59 PM, Gordon Sim wrote:
>>
>>> On 04/29/2015 07:46 PM, Matt Broadstone wrote:
>>>
>>>> The process that's taking up memory is the receiver (mqget.cc in the
>>>> posted
>>>> gist). Proton is version 0.9.
>>>>
>>>
>>> I see some sort of build up in receivers using AMQP 1.0 with
>>> 0.32+proton-0.9 also. Seems to be only when receiving from topics. I'll
>>> investigate further.
>>>
>>
>> It's a bug in the qpid::messaging client for 1.0 I'm afraid.
>> Specifically it is not locally settling deliveries that are sent
>> pre-settled. These then build up within Proton's delivery map.
>>
>> I'll get a fix in shortly. However as a workaround you can turn on
>> acknowledgements for the receiver, e.g. using 'my-topic;
>> {link:{reliability:at-least-once, timeout:1}}'.
>>
>
> This is now fixed and tracked by
> https://issues.apache.org/jira/browse/QPID-6521
>
>
Great! Thanks for the quick turnaround, Gordon. As for my original bug,
I've been having considerable trouble replicating it locally. I left my
test programs running for an hour or so and got up to around 8M messages
with no similar error.

I know this is fairly vague, but could you help describe what conditions
might lead to that kind of growth in queue depth? In my particular scenario
I am doing nothing more than the two provided sample apps do:
non-persistent, default constructed Messages being published to a topic
exchange and consumed by a single consumer.

Obviously the vanilla case here can be cleaned up a bit (maybe adding
message TTLs, or perhaps the acknowledgement workaround you provided), but
still I'm unclear as to why it worked for so many days and then failed on
that assertion (effectively rendering the client useless, as well as
anything posted to that topic). It's true this only happened with 0.28, but
I just don't have enough data points yet to rule out the possibility of it
happening again with 0.32/0.33.

I'm locally testing a number of scenarios, and we have the original
environment up so hopefully the behavior will be triggered again in a state
when I can collect data. In the meantime, any speculation on your side as
to what might cause this would be appreciated. Please let me know if I can
provide any more information.

Cheers,
Matt

Re: acknowledge not releasing message with C++ messaging api

Posted by Gordon Sim <gs...@redhat.com>.

On 04/29/2015 08:33 PM, Gordon Sim wrote:
> On 04/29/2015 07:59 PM, Gordon Sim wrote:
>> On 04/29/2015 07:46 PM, Matt Broadstone wrote:
>>> The process that's taking up memory is the receiver (mqget.cc in the
>>> posted
>>> gist). Proton is version 0.9.
>>
>> I see some sort of build up in receivers using AMQP 1.0 with
>> 0.32+proton-0.9 also. Seems to be only when receiving from topics. I'll
>> investigate further.
>
> It's a bug in the qpid::messaging client for 1.0 I'm afraid.
> Specifically it is not locally settling deliveries that are sent
> pre-settled. These then build up within Proton's delivery map.
>
> I'll get a fix in shortly. However as a workaround you can turn on
> acknowledgements for the receiver, e.g. using 'my-topic;
> {link:{reliability:at-least-once, timeout:1}}'.

This is now fixed and tracked by 
https://issues.apache.org/jira/browse/QPID-6521


Re: acknowledge not releasing message with C++ messaging api

Posted by Gordon Sim <gs...@redhat.com>.

On 04/29/2015 07:59 PM, Gordon Sim wrote:
> On 04/29/2015 07:46 PM, Matt Broadstone wrote:
>> The process that's taking up memory is the receiver (mqget.cc in the
>> posted
>> gist). Proton is version 0.9.
>
> I see some sort of build up in receivers using AMQP 1.0 with
> 0.32+proton-0.9 also. Seems to be only when receiving from topics. I'll
> investigate further.

It's a bug in the qpid::messaging client for 1.0 I'm afraid. 
Specifically it is not locally settling deliveries that are sent 
pre-settled. These then build up within Proton's delivery map.

I'll get a fix in shortly. However as a workaround you can turn on 
acknowledgements for the receiver, e.g. using 'my-topic; 
{link:{reliability:at-least-once, timeout:1}}'.

Re: acknowledge not releasing message with C++ messaging api

Posted by Gordon Sim <gs...@redhat.com>.

On 04/29/2015 07:46 PM, Matt Broadstone wrote:
> The process that's taking up memory is the receiver (mqget.cc in the posted
> gist). Proton is version 0.9.

I see some sort of build up in receivers using AMQP 1.0 with 
0.32+proton-0.9 also. Seems to be only when receiving from topics. I'll 
investigate further.

Re: acknowledge not releasing message with C++ messaging api

Posted by Matt Broadstone <mb...@gmail.com>.

On Wed, Apr 29, 2015 at 2:23 PM, Gordon Sim <gs...@redhat.com> wrote:

> On 04/29/2015 07:08 PM, Matt Broadstone wrote:
>
>> I am using AMQP 1.0, and this is with two services communicating with
>> each other over qpidd, both written in C++ and using the QPID C++
>> Messaging
>> API on the 0.28 release. I just updated to 0.32, and began trying to
>> stress
>> test the system in a VM and am now running into memory consumption errors
>> with my test programs. They can be found here:
>> https://gist.github.com/mbroadst/0adcd6cb1b2c8617a473
>>
>> Running this for about 1-2min results in something like 80% memory usage
>> on
>> my VM, although valgrind isn't able to spot any memory leaks.
>>
>
> What process is taking up the memory? The sender, the receiver or the
> broker? What address are you using and what does qpid-stat -q show? You
> could also try running qpid-queue-stats while the test is running to give
> some insight into the enqueue and dequeue rates.
>
> Also, what version of proton are you using?
>
>
Using all packages from here:
https://launchpad.net/~mcpierce/+archive/ubuntu/qpid-testing

The process that's taking up memory is the receiver (mqget.cc in the posted
gist). Proton is version 0.9.

Here is my test process currently:

1) qpid-config -a localhost add exchange topic my.topic --durable
2) mqget localhost my.topic
3) mqsend localhost my.topic

After waiting 2 min, these were the statistics gathered (all counters are 0
at the start of the test):

mqget memory consumption is at 30.6%, and will hold there indefinitely
until I quit the program.

mbroadst@simulated-cell:~/Development/test/mqsend/build$ qpid-stat -g
Broker Summary:
  uptime  cluster       connections  sessions  exchanges  queues
  ================================================================
  6m 28s  <standalone>  2            2         9          2

Aggregate Broker Statistics:
  Statistic                   Messages  Bytes
  =================================================
  queue-depth                 0         0
  total-enqueues              77,373    2,255,741
  total-dequeues              77,373    2,255,741
  persistent-enqueues         0         0
  persistent-dequeues         0         0
  transactional-enqueues      0         0
  transactional-dequeues      0         0
  flow-to-disk-depth          0         0
  flow-to-disk-enqueues       0         0
  flow-to-disk-dequeues       0         0
  acquires                    77,373
  releases                    0
  discards-no-route           282
  discards-ttl-expired        0
  discards-limit-overflow     0
  discards-ring-overflow      0
  discards-lvq-replace        0
  discards-subscriber-reject  0
  discards-purged             0
  reroutes                    0
  abandoned                   0
  abandoned-via-alt           0


mbroadst@simulated-cell:~/Development/test/mqsend/build$ qpid-stat -q
Queues

queue
    dur  autoDel  excl  msg    msgIn  msgOut  bytes  bytesIn  bytesOut  cons  bind
==================================================================================
43b625de-1047-4c14-a295-7da4c5c0ac52:0.0
         Y        Y     0      0      0       0      0        0         1     2
a6725d7b-1bd6-4d36-b88a-e9f017473c80_my.topic_26f4a0ac-af6e-4343-b8a3-af34544d8140
         Y        Y     0      77.4k  77.4k   0      2.24m    2.24m     1     2


Any other info I can provide?

Cheers,
Matt

Re: acknowledge not releasing message with C++ messaging api

Posted by Gordon Sim <gs...@redhat.com>.

On 04/29/2015 07:08 PM, Matt Broadstone wrote:
> I am using AMQP 1.0, and this is with two services communicating with
> each other over qpidd, both written in C++ and using the QPID C++ Messaging
> API on the 0.28 release. I just updated to 0.32, and began trying to stress
> test the system in a VM and am now running into memory consumption errors
> with my test programs. They can be found here:
> https://gist.github.com/mbroadst/0adcd6cb1b2c8617a473
>
> Running this for about 1-2min results in something like 80% memory usage on
> my VM, although valgrind isn't able to spot any memory leaks.

What process is taking up the memory? The sender, the receiver or the 
broker? What address are you using and what does qpid-stat -q show? You 
could also try running qpid-queue-stats while the test is running to 
give some insight into the enqueue and dequeue rates.

Also, what version of proton are you using?

Re: acknowledge not releasing message with C++ messaging api

Posted by Matt Broadstone <mb...@gmail.com>.

On Wed, Apr 29, 2015 at 1:57 PM, Robbie Gemmell <ro...@gmail.com>
wrote:

> On 29 April 2015 at 17:46, Matt Broadstone <mb...@gmail.com> wrote:
> > Hi,
> >
> > I have a service using the C++ Messaging API which connects to a single
> > instance of qpidd (currently on the same machine), which seems to crash
> out
> > with this exception every couple of days under moderate load:
> >
> > qpidd[68257]: 2015-04-28 11:56:38 [Broker] error
> > qpid.192.168.2.225:5672-192.168.2.148:60492: resource-limit-exceeded:
> > Maximum depth exceeded on
> >
> b1386bee-a36c-449d-953f-c25f4842e76d_hive.guest.metadata_7bf9355b-524b-4853-89bd-1848366cd21f:
> > current=[count: 389438, size: 104857546], max=[size: 104857600]
> > (/build/buildd/qpid-cpp-0.28/src/qpid/broker/Queue.cpp:1575)
> >
> > Using qpid-stat I don't see the queue depth ever increase from 0 (which I
> > gather is why the exception is thrown, from reading the code), however I
> > -do- notice that the "acquired" count is increasing with every message
> with
> > no corresponding "release" (release count is always 0).
> >
> > My service currently looks a lot like the "Receiving Messages from
> Multiple
> > Sources" on this page:
> >
> https://qpid.apache.org/releases/qpid-0.28/messaging-api/cpp/api/index.html
> ,
> > and I am definitely calling the message agnostic "session.acknowledge()"
> in
> > my main event loop, so I'm not sure why the messages are never released
> > (presuming that messages would be released about
> settling/acknowledgement).
> >
> > I've created a ticket for what I think is a bug in the broker here:
> > https://issues.apache.org/jira/browse/QPID-6517, but figured I would
> post
> > here as well just in case there is operator error happening on my part:
> >
> > Regards,
> > Matt
>
> I know very little of the C++ code in question I'm afraid, but the
> title of the email did make me note something. I wouldn't expect the
> released count to increase if you do call acknowledge, since releasing
> is essentially something that tends to happen (implicitly or
> explicitly) after you *don't* acknowledge.
>
>
Yeah, I had assumed that "release" meant explicitly releasing the message
(although the release count didn't seem to rise when I switched to using
that instead of acknowledge). I really have no idea what "acquire" is
supposed to mean in this context, and can't find much documentation on it.

> To help those who might be able to answer...some questions...
>
> From IRC (and the above) I believe you are using 0.28 of both client
> and server, is that right? I expect you are using AMQP 1.0 at least
> some of the time from your mentions of the NodeJS client you have
> worked on, but are you definitely using AMQP 1.0 with the C++ client?
> I saw you mention 0.32 earlier on IRC, have you tried with that yet
> (now that it is available on your platform)?
>
>
Yes, I am using AMQP 1.0, and this is with two services communicating with
each other over qpidd, both written in C++ and using the QPID C++ Messaging
API on the 0.28 release. I just updated to 0.32, and began trying to stress
test the system in a VM and am now running into memory consumption errors
with my test programs. They can be found here:
https://gist.github.com/mbroadst/0adcd6cb1b2c8617a473

Running this for about 1-2min results in something like 80% memory usage on
my VM, although valgrind isn't able to spot any memory leaks.
Unfortunately, given that outcome I'm not really comfortable just updating
to 0.32 on the development box for the product to test if the release bump
fixes the problem.

Matt

> Robbie

Re: acknowledge not releasing message with C++ messaging api

Posted by Robbie Gemmell <ro...@gmail.com>.

On 29 April 2015 at 17:46, Matt Broadstone <mb...@gmail.com> wrote:
> Hi,
>
> I have a service using the C++ Messaging API which connects to a single
> instance of qpidd (currently on the same machine), which seems to crash out
> with this exception every couple of days under moderate load:
>
> qpidd[68257]: 2015-04-28 11:56:38 [Broker] error
> qpid.192.168.2.225:5672-192.168.2.148:60492: resource-limit-exceeded:
> Maximum depth exceeded on
> b1386bee-a36c-449d-953f-c25f4842e76d_hive.guest.metadata_7bf9355b-524b-4853-89bd-1848366cd21f:
> current=[count: 389438, size: 104857546], max=[size: 104857600]
> (/build/buildd/qpid-cpp-0.28/src/qpid/broker/Queue.cpp:1575)
>
> Using qpid-stat I don't see the queue depth ever increase from 0 (which I
> gather is why the exception is thrown, from reading the code), however I
> -do- notice that the "acquired" count is increasing with every message with
> no corresponding "release" (release count is always 0).
>
> My service currently looks a lot like the "Receiving Messages from Multiple
> Sources" on this page:
> https://qpid.apache.org/releases/qpid-0.28/messaging-api/cpp/api/index.html,
> and I am definitely calling the message agnostic "session.acknowledge()" in
> my main event loop, so I'm not sure why the messages are never released
> (presuming that messages would be released about settling/acknowledgement).
>
> I've created a ticket for what I think is a bug in the broker here:
> https://issues.apache.org/jira/browse/QPID-6517, but figured I would post
> here as well just in case there is operator error happening on my part:
>
> Regards,
> Matt

I know very little of the C++ code in question I'm afraid, but the
title of the email did make me note something. I wouldn't expect the
released count to increase if you do call acknowledge, since releasing
is essentially something that tends to happen (implicitly or
explicitly) after you *don't* acknowledge.

To help those who might be able to answer...some questions...

From IRC (and the above) I believe you are using 0.28 of both client
and server, is that right? I expect you are using AMQP 1.0 at least
some of the time from your mentions of the NodeJS client you have
worked on, but are you definitely using AMQP 1.0 with the C++ client?
I saw you mention 0.32 earlier on IRC, have you tried with that yet
(now that it is available on your platform)?

Robbie

Re: acknowledge not releasing message with C++ messaging api

Posted by Matt Broadstone <mb...@gmail.com>.

On Wed, Apr 29, 2015 at 3:51 PM, Gordon Sim <gs...@redhat.com> wrote:

> On 04/29/2015 08:03 PM, Matt Broadstone wrote:
>
>> On Wed, Apr 29, 2015 at 3:01 PM, Matt Broadstone <mb...@gmail.com>
>> wrote:
>>
>>  On Wed, Apr 29, 2015 at 2:55 PM, Gordon Sim <gs...@redhat.com> wrote:
>>>
>>>  On 04/29/2015 05:46 PM, Matt Broadstone wrote:
>>>>
>>>>  Hi,
>>>>>
>>>>> I have a service using the C++ Messaging API which connects to a single
>>>>> instance of qpidd (currently on the same machine), which seems to crash
>>>>> out
>>>>> with this exception every couple of days under moderate load:
>>>>>
>>>>> qpidd[68257]: 2015-04-28 11:56:38 [Broker] error
>>>>> qpid.192.168.2.225:5672-192.168.2.148:60492: resource-limit-exceeded:
>>>>> Maximum depth exceeded on
>>>>>
>>>>>
>>>>> b1386bee-a36c-449d-953f-c25f4842e76d_hive.guest.metadata_7bf9355b-524b-4853-89bd-1848366cd21f:
>>>>> current=[count: 389438, size: 104857546], max=[size: 104857600]
>>>>> (/build/buildd/qpid-cpp-0.28/src/qpid/broker/Queue.cpp:1575)
>>>>>
>>>>> Using qpid-stat I don't see the queue depth ever increase from 0
>>>>> (which I
>>>>> gather is why the exception is thrown, from reading the code), however
>>>>> I
>>>>> -do- notice that the "acquired" count is increasing with every message
>>>>> with
>>>>> no corresponding "release" (release count is always 0).
>>>>>
>>>>>
>>>> That's actually 'expected', in terms of the code. It only increments the
>>>> released count when a message is released back to the queue, rather
>>>> than
>>>> being acknowledged and dequeued. Also there is nothing at present that
>>>> decrements the acquired count, so it would be expected to keep going up.
>>>>
>>>>
>>>>  Okay, good to know, just making sure I wasn't seeing a huge problem
>>> with
>>> improperly handled messages here.
>>>
>>>
>>>  The exception above is indeed a result of the queue backing up,
>>>> apparently reaching a depth of 389438 messages. What address options if
>>>> any
>>>> are used for the receiver consuming from that queue? Is there anything
>>>> to
>>>> indicate whether that receiver was behaving normally just before the
>>>> point
>>>> at which the error occurred?
>>>>
>>>>
>>>>  I'm using no address options at all. The two programs I submitted
>>> earlier
>>> (mqget/mqsend) are reduced examples of what we're using (except the
>>> receiver in my case uses the "multiple receivers"
>>> Session.getNext().fetch()
>>> etc). Aside from that it's very "vanilla" right now. AFAICT everything
>>> was
>>> fine, until it wasn't. The original bug occurred with version 0.28, so
>>> maybe there's an issue with the fact that it was still using the legacy
>>> store? However, everything I see here indicates nothing ever touched the
>>> disk (these are just messages being published to a topic). As for the
>>> receiving side, each receiver (and this one in particular) is set to a
>>> prefetch(capacity) of 10.
>>>
>>> What seems particularly strange to me is that the backup is for hundreds
>>> of thousands of messages, how could that even be possible? Right now we
>>> have about 10 producers publishing every ~6 seconds.
>>>
>>> Matt
>>>
>>>
>>>
>> Also, what is the recommended failover scenario for this situation?
>> Basically what happened for us is that this "situation" occurred, and then
>> we were no longer receiving ANY messages on that receiver and it took our
>> whole system down. The "workaround" was to simply restart the qpidd
>> process.
>>
>
> Did you restart the clients (do you use auto reconnect)? Did you run
> qpid-stat at the time the incident occurred? Did you try restarting the
> receiver before restarting qpidd?
>
>
The clients do use auto reconnect; however, when this happened there was no
error on the client side, IIRC. I was actually not available when the issue
occurred, and the guys working on the system at the time were in QA and
needed to fix it immediately, so unfortunately I have limited knowledge of
the statistics/state at the time of failure.

I'm going to try to locally reproduce the problem with the 0.32 fix you
mentioned above (thanks for that). Do you have any ideas why this issue
might occur (like what would cause it to start increasing queue depth), so
I can try to speed up my testing?

Matt





Re: acknowledge not releasing message with C++ messaging api

Posted by Gordon Sim <gs...@redhat.com>.

On 04/29/2015 08:03 PM, Matt Broadstone wrote:
> On Wed, Apr 29, 2015 at 3:01 PM, Matt Broadstone <mb...@gmail.com> wrote:
>
>> On Wed, Apr 29, 2015 at 2:55 PM, Gordon Sim <gs...@redhat.com> wrote:
>>
>>> On 04/29/2015 05:46 PM, Matt Broadstone wrote:
>>>
>>>> Hi,
>>>>
>>>> I have a service using the C++ Messaging API which connects to a single
>>>> instance of qpidd (currently on the same machine), which seems to crash
>>>> out
>>>> with this exception every couple of days under moderate load:
>>>>
>>>> qpidd[68257]: 2015-04-28 11:56:38 [Broker] error
>>>> qpid.192.168.2.225:5672-192.168.2.148:60492: resource-limit-exceeded:
>>>> Maximum depth exceeded on
>>>>
>>>> b1386bee-a36c-449d-953f-c25f4842e76d_hive.guest.metadata_7bf9355b-524b-4853-89bd-1848366cd21f:
>>>> current=[count: 389438, size: 104857546], max=[size: 104857600]
>>>> (/build/buildd/qpid-cpp-0.28/src/qpid/broker/Queue.cpp:1575)
>>>>
>>>> Using qpid-stat I don't see the queue depth ever increase from 0 (which I
>>>> gather is why the exception is thrown, from reading the code), however I
>>>> -do- notice that the "acquired" count is increasing with every message
>>>> with
>>>> no corresponding "release" (release count is always 0).
>>>>
>>>
>>> That's actually 'expected', in terms of the code. It only increments the
>>> released count when a message is released back to the queue, rather than
>>> being acknowledged and dequeued. Also there is nothing at present that
>>> decrements the acquired count, so it would be expected to keep going up.
>>>
>>>
>> Okay, good to know, just making sure I wasn't seeing a huge problem with
>> improperly handled messages here.
>>
>>
>>> The exception above is indeed a result of the queue backing up,
>>> apparently reaching a depth of 389438 messages. What address options if any
>>> are used for the receiver consuming from that queue? Is there anything to
>>> indicate whether that receiver was behaving normally just before the point
>>> at which the error occurred?
>>>
>>>
>> I'm using no address options at all. The two programs I submitted earlier
>> (mqget/mqsend) are reduced examples of what we're using (except the
>> receiver in my case uses the "multiple receivers" Session.getNext().fetch()
>> etc). Aside from that it's very "vanilla" right now. AFAICT everything was
>> fine, until it wasn't. The original bug occurred with version 0.28, so
>> maybe there's an issue with the fact that it was still using the legacy
>> store? However, everything I see here indicates nothing ever touched the
>> disk (these are just messages being published to a topic). As for the
>> receiving side, each receiver (and this one in particular) is set to a
>> prefetch(capacity) of 10.
>>
>> What seems particularly strange to me is that the backup is for hundreds
>> of thousands of messages, how could that even be possible? Right now we
>> have about 10 producers publishing every ~6 seconds.
>>
>> Matt
>>
>>
>
> Also, what is the recommended failover scenario for this situation?
> Basically what happened for us is that this "situation" occurred, and then
> we were no longer receiving ANY messages on that receiver and it took our
> whole system down. The "workaround" was to simply restart the qpidd process.

Did you restart the clients (do you use auto reconnect)? Did you run 
qpid-stat at the time the incident occurred? Did you try restarting the 
receiver before restarting qpidd?


Re: acknowledge not releasing message with C++ messaging api

Posted by Matt Broadstone <mb...@gmail.com>.

On Wed, Apr 29, 2015 at 3:01 PM, Matt Broadstone <mb...@gmail.com> wrote:

> On Wed, Apr 29, 2015 at 2:55 PM, Gordon Sim <gs...@redhat.com> wrote:
>
>> On 04/29/2015 05:46 PM, Matt Broadstone wrote:
>>
>>> Hi,
>>>
>>> I have a service using the C++ Messaging API which connects to a single
>>> instance of qpidd (currently on the same machine), which seems to crash
>>> out with this exception every couple of days under moderate load:
>>>
>>> qpidd[68257]: 2015-04-28 11:56:38 [Broker] error
>>> qpid.192.168.2.225:5672-192.168.2.148:60492: resource-limit-exceeded:
>>> Maximum depth exceeded on
>>> b1386bee-a36c-449d-953f-c25f4842e76d_hive.guest.metadata_7bf9355b-524b-4853-89bd-1848366cd21f:
>>> current=[count: 389438, size: 104857546], max=[size: 104857600]
>>> (/build/buildd/qpid-cpp-0.28/src/qpid/broker/Queue.cpp:1575)
>>>
>>> Using qpid-stat I don't see the queue depth ever increase from 0 (which I
>>> gather is why the exception is thrown, from reading the code), however I
>>> -do- notice that the "acquired" count is increasing with every message
>>> with no corresponding "release" (release count is always 0).
>>>
>>
>> That's actually 'expected', in terms of the code. It only increments the
>> released count when a message is released back to the queue, rather than
>> being acknowledged and dequeued. Also there is nothing at present that
>> decrements the acquired count, so it would be expected to keep going up.
>>
>>
> Okay, good to know, just making sure I wasn't seeing a huge problem with
> improperly handled messages here.
>
>
>> The exception above is indeed a result of the queue backing up,
>> apparently reaching a depth of 389438 messages. What address options if any
>> are used for the receiver consuming from that queue? Is there anything to
>> indicate whether that receiver was behaving normally just before the point
>> at which the error occurred?
>>
>>
> I'm using no address options at all. The two programs I submitted earlier
> (mqget/mqsend) are reduced examples of what we're using (except the
> receiver in my case uses the "multiple receivers" Session.getNext().fetch()
> etc). Aside from that it's very "vanilla" right now. AFAICT everything was
> fine, until it wasn't. The original bug occurred with version 0.28, so
> maybe there's an issue with the fact that it was still using the legacy
> store? However, everything I see here indicates nothing ever touched the
> disk (these are just messages being published to a topic). As for the
> receiving side, each receiver (and this one in particular) is set to a
> prefetch(capacity) of 10.
>
> What seems particularly strange to me is that the backup is for hundreds
> of thousands of messages. How could that even be possible? Right now we
> have about 10 producers publishing every ~6 seconds.
>
> Matt
>
>

Also, what is the recommended failover scenario for this situation?
Basically what happened for us is that this "situation" occurred, and then
we were no longer receiving ANY messages on that receiver and it took our
whole system down. The "workaround" was to simply restart the qpidd process.

Matt



>
>>> My service currently looks a lot like the "Receiving Messages from
>>> Multiple Sources" on this page:
>>> https://qpid.apache.org/releases/qpid-0.28/messaging-api/cpp/api/index.html,
>>> and I am definitely calling the message agnostic "session.acknowledge()"
>>> in my main event loop, so I'm not sure why the messages are never released
>>> (presuming that messages would be released about
>>> settling/acknowledgement).
>>>
>>> I've created a ticket for what I think is a bug in the broker here:
>>> https://issues.apache.org/jira/browse/QPID-6517, but figured I would
>>> post here as well
>>>
>>
>> Always a good idea, just to make sure it gets noticed!
>>
>>
>>
>>
>>
>

Re: acknowledge not releasing message with C++ messaging api

Posted by Matt Broadstone <mb...@gmail.com>.

On Wed, Apr 29, 2015 at 2:55 PM, Gordon Sim <gs...@redhat.com> wrote:

> On 04/29/2015 05:46 PM, Matt Broadstone wrote:
>
>> Hi,
>>
>> I have a service using the C++ Messaging API which connects to a single
>> instance of qpidd (currently on the same machine), which seems to crash
>> out with this exception every couple of days under moderate load:
>>
>> qpidd[68257]: 2015-04-28 11:56:38 [Broker] error
>> qpid.192.168.2.225:5672-192.168.2.148:60492: resource-limit-exceeded:
>> Maximum depth exceeded on
>> b1386bee-a36c-449d-953f-c25f4842e76d_hive.guest.metadata_7bf9355b-524b-4853-89bd-1848366cd21f:
>> current=[count: 389438, size: 104857546], max=[size: 104857600]
>> (/build/buildd/qpid-cpp-0.28/src/qpid/broker/Queue.cpp:1575)
>>
>> Using qpid-stat I don't see the queue depth ever increase from 0 (which I
>> gather is why the exception is thrown, from reading the code), however I
>> -do- notice that the "acquired" count is increasing with every message
>> with no corresponding "release" (release count is always 0).
>>
>
> That's actually 'expected', in terms of the code. It only increments the
> released count when a message is released back to the queue, rather than
> being acknowledged and dequeued. Also there is nothing at present that
> decrements the acquired count, so it would be expected to keep going up.
>
>
Okay, good to know, just making sure I wasn't seeing a huge problem with
improperly handled messages here.


> The exception above is indeed a result of the queue backing up, apparently
> reaching a depth of 389438 messages. What address options if any are used
> for the receiver consuming from that queue? Is there anything to indicate
> whether that receiver was behaving normally just before the point at which
> the error occurred?
>
>
I'm using no address options at all. The two programs I submitted earlier
(mqget/mqsend) are reduced examples of what we're using (except the
receiver in my case uses the "multiple receivers" Session.getNext().fetch()
etc). Aside from that it's very "vanilla" right now. AFAICT everything was
fine, until it wasn't. The original bug occurred with version 0.28, so
maybe there's an issue with the fact that it was still using the legacy
store? However, everything I see here indicates nothing ever touched the
disk (these are just messages being published to a topic). As for the
receiving side, each receiver (and this one in particular) is set to a
prefetch(capacity) of 10.

What seems particularly strange to me is that the backup is for hundreds of
thousands of messages. How could that even be possible? Right now we have
about 10 producers publishing every ~6 seconds.

Matt


>
>> My service currently looks a lot like the "Receiving Messages from
>> Multiple Sources" on this page:
>> https://qpid.apache.org/releases/qpid-0.28/messaging-api/cpp/api/index.html,
>> and I am definitely calling the message agnostic "session.acknowledge()"
>> in my main event loop, so I'm not sure why the messages are never released
>> (presuming that messages would be released about
>> settling/acknowledgement).
>>
>> I've created a ticket for what I think is a bug in the broker here:
>> https://issues.apache.org/jira/browse/QPID-6517, but figured I would post
>> here as well
>>
>
> Always a good idea, just to make sure it gets noticed!
>
>
>
>
>

Re: acknowledge not releasing message with C++ messaging api

Posted by Gordon Sim <gs...@redhat.com>.

On 04/29/2015 05:46 PM, Matt Broadstone wrote:
> Hi,
>
> I have a service using the C++ Messaging API which connects to a single
> instance of qpidd (currently on the same machine), which seems to crash out
> with this exception every couple of days under moderate load:
>
> qpidd[68257]: 2015-04-28 11:56:38 [Broker] error
> qpid.192.168.2.225:5672-192.168.2.148:60492: resource-limit-exceeded:
> Maximum depth exceeded on
> b1386bee-a36c-449d-953f-c25f4842e76d_hive.guest.metadata_7bf9355b-524b-4853-89bd-1848366cd21f:
> current=[count: 389438, size: 104857546], max=[size: 104857600]
> (/build/buildd/qpid-cpp-0.28/src/qpid/broker/Queue.cpp:1575)
>
> Using qpid-stat I don't see the queue depth ever increase from 0 (which I
> gather is why the exception is thrown, from reading the code), however I
> -do- notice that the "acquired" count is increasing with every message with
> no corresponding "release" (release count is always 0).

That's actually 'expected', in terms of the code. It only increments the 
released count when a message is released back to the queue, rather 
than being acknowledged and dequeued. Also there is nothing at present 
that decrements the acquired count, so it would be expected to keep 
going up.
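
The counter behaviour described above can be illustrated with a toy model.
This is only a sketch: the struct, its field names, and the "dequeued"
counter are hypothetical and are not taken from the broker source. It shows
why a consumer that acknowledges every message sees "acquired" climb while
"released" stays at 0.

```cpp
#include <cstdint>

// Toy model of per-queue counters (hypothetical names, not broker code).
struct ToyQueueCounters {
    std::uint64_t acquired = 0;  // incremented on delivery, never decremented
    std::uint64_t released = 0;  // incremented only when a message goes back to the queue
    std::uint64_t dequeued = 0;  // incremented when a message is acknowledged

    void onAcquire()     { ++acquired; }
    void onAcknowledge() { ++dequeued; }  // acknowledged messages are dequeued, not released
    void onRelease()     { ++released; }  // message handed back to the queue undelivered
};

// Simulate a well-behaved consumer that acknowledges everything it receives.
inline ToyQueueCounters consumeAndAcknowledge(std::uint64_t messages) {
    ToyQueueCounters counters;
    for (std::uint64_t i = 0; i < messages; ++i) {
        counters.onAcquire();
        counters.onAcknowledge();
    }
    return counters;
}
```

In this model, consuming and acknowledging 1000 messages leaves acquired at
1000 and released at 0, which matches what qpid-stat showed in the report.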

The exception above is indeed a result of the queue backing up, 
apparently reaching a depth of 389438 messages. What address options if 
any are used for the receiver consuming from that queue? Is there 
anything to indicate whether that receiver was behaving normally just 
before the point at which the error occurred?

>
> My service currently looks a lot like the "Receiving Messages from Multiple
> Sources" on this page:
> https://qpid.apache.org/releases/qpid-0.28/messaging-api/cpp/api/index.html,
> and I am definitely calling the message agnostic "session.acknowledge()" in
> my main event loop, so I'm not sure why the messages are never released
> (presuming that messages would be released about settling/acknowledgement).
>
> I've created a ticket for what I think is a bug in the broker here:
> https://issues.apache.org/jira/browse/QPID-6517, but figured I would post
> here as well

Always a good idea, just to make sure it gets noticed!

