You are viewing a plain text version of this content. The canonical link for it is here.

Posted to users@qpid.apache.org by Helen Kwong <he...@gmail.com> on 2014/01/08 03:43:04 UTC

Java broker - message grouping in C++ compatibility mode

Hi,

I use the Java broker and client, version 0.16, and am considering using
the message grouping feature (
http://qpid.apache.org/releases/qpid-0.24/java-broker/book/Java-Broker-Queues.html#Java-Broker-Queues-OtherTypes-Message-Grouping).
>From testing I've done, there seems to be a bug with the C++ compatibility
model, and I'm wondering if this is a known issue. Specifically, in my test
I have a queue configured to use a group header field with
"qpid.group_header_key" and C++ mode with "qpid.shared_msg_group", and have
multiple listeners to the queue. Each listener will sleep for a short
amount of time when it receives a message before returning. I then enqueue
10 messages that do not have a value in the group header field to the
queue. Since the doc says that messages without a value in the grouping
header will be delivered to any available consumer, the behavior I expect
is that the messages will be processed in parallel, i.e., when listener 1
is holding on to a message and sleeping, listener 2 can receive another
message from the queue. But what I see is that the messages are processed
serially -- message 2 won't be received by some thread until message 1 is
done. When I use the default mode instead of C++ mode, then I get the
parallel processing behavior.

Is this is a known bug, and is there a fix for it already?

Thanks,
Helen

Re: Java broker - message grouping in C++ compatibility mode

Posted by Robbie Gemmell <ro...@gmail.com>.

Jira created at https://issues.apache.org/jira/browse/QPID-5496

I created a temporary branch and committed the change to it, and set up a
temporary job on Jenkins to build it, the latest successful output will be
at:
https://builds.apache.org/view/M-R/view/Qpid/job/Qpid-RG-Temp/lastSuccessfulBuild/artifact/QPID-5496_default_groups/qpid/java/broker/release/qpid-broker-0.27.tar.gz

Details of the new behaviour and config are listed on the JIRA. Since I was
mostly just toying around when I did it, it obviously isn't finished; its
untested beyond running the previously existing tests, and needs other
things like updating the docs and adding support for setting the value via
web management UI before going on trunk. You should be able to create the
queue either via a message client, or via the JMX interface or the
underlying REST interface for the web ui.

Robbie

On 20 January 2014 18:46, Robbie Gemmell <ro...@gmail.com> wrote:

> Hi Helen,
>
> I had a quick stab at implementing the changes I mentioned below, namely:
>
> - Allow for disabling the 'default group' when using C++ style 'shared
> groups', removing or minimising the synchronization as much as possible
> when dealing with non-grouped messages.
> - Allow enabling a 'default group' when using 'regular groups'
>
> I will raise a JIRA and possibly make a branch with the update applied
> later tonight so you can give it a try.
>
> I also discussed with Rob how we might improve things further and we did
> come up with one suggestion, but its a much bigger change and so definitely
> one that would need to be undertaken separately if done (i.e, I only have
> so much free time :P).
>
> Robbie
>
> On 14 January 2014 21:19, Robbie Gemmell <ro...@gmail.com> wrote:
>
>> Hi Helen,
>>
>> I expected that you would see increased performance in some of the tests,
>> as Rob did some more work on improving performance after the 0.16 release.
>> I did expect performance to go down somewhat for case #1 as the defect fix
>> was putting in place some missing synchronization between delivery and
>> acknowlegement steps, though I admit it dropped a bit more than I expected.
>> Giving that some thought, its possible that a change to the threading from
>> some time ago (in 0.16, I think) which generally increased performance all
>> round at the time may in fact hurt it in that scenario.
>>
>> In terms of introducing a new mode of grouping, we would actually be
>> quite open to this as its something we have commented on separately
>> ourselves but never quite got round to implementing; so much to do, so
>> little time, etc. In particular, I have never liked the 'default group'
>> disparity between the two modes and inability to really change it on top of
>> that, so something we talked about was allowing the default group to be
>> controlled for both grouping types such that you could specifically turn
>> one on for the 'standard grouping' which doesnt normally have one, or
>> specifically turn it off for the 'shared groups' that normally do. This
>> would allow them to act a lot more similar, with the distinction betwen
>> them then becoming more about whether the groups can float between
>> consumers or not.
>>
>> Doing the above would allow you to do away with setting the unique
>> messageid on the messages you dont wish to group and enable any consumer to
>> pick them up, which should significantly change the performance seen
>> consuming from a 'shared groups' queue which had a mixture of messages in
>> groups and message not in groups (since a particular subscription becomes
>> much more likely to find a message earlier in the queue if/when the group
>> reset occurs). I think making additional changes outwith those to try and
>> combat a queue full of unique groups in the shared groups case may require
>> larger changes to the queue itself, such as a custom delivery loop to allow
>> altering when the messages can be assigned to the group or even moving away
>> from the current 'group manager' addition to the base queue and more
>> towards a custom queue such as is done currently for the priority and
>> sorted queue implementations.
>>
>> Robbie
>>
>> On 14 January 2014 07:15, Helen Kwong <he...@gmail.com> wrote:
>>
>>> Hi Robbie,
>>>
>>> I ran the tests again with a broker built from the latest trunk code and
>>> with a 0.16 client. The performance results are a lot better in almost
>>> every test setup (3-4 times better), except for Test #1 where I build up
>>> 100000 messages of unique grouping values and then see how many messages
>>> we
>>> can process in 5 minutes, using C++ mode (I got ~7800 messages before,
>>> and
>>> now ~5400). The overall comparisons are about the same, with C++ mode
>>> still
>>> performing significantly worse:
>>>
>>> 1. For the first test, for a regular queue and a default mode queue (with
>>> all messages having no grouping values, all having the same value, and
>>> all
>>> having different values), I am able to process 100000 messages in around
>>> 3
>>> and a half minutes. In C++ mode, with all 100000 messages having
>>> different
>>> values, I can process only ~5400 in 5 minutes; with all having no value /
>>> the default group, ~44000 messages were processed in 5 minutes (after
>>> which
>>> I stop waiting and just clear the queue).
>>>
>>> 2. For the second test, for a default mode queue, the performance is
>>> again
>>> not affected by how many unprocessable messages we have at the front of
>>> the
>>> queue due to another consumer holding on to those messages' group for a
>>> long time. The time it takes to process 1000 messages of group B after N
>>> messages of group A (which is assigned to another consumer) is about the
>>> same as processing 1000 messages on a regular queue. With a C++ mode
>>> queue,
>>> the performance still gets worse the more messages of group A we have at
>>> the front of the queue.
>>>
>>>
>>> > Can I enquire what it is about your use case that would preclude use
>>> of a
>>> > higher prefetch?
>>>
>>>
>>> We want to avoid starving messages that can be processed by another
>>> consumer. We have multiple dequeue sessions, each listening to multiple
>>> queues. If we have a higher prefetch, then if there is a long-running
>>> message on a queue, the messages that were prefetched along with it will
>>> have to wait for a long time, even if another listener is available and
>>> can
>>> process those messages. And if we decide to add the message grouping
>>> configuration, we want to avoid starving a message of a particular group
>>> if
>>> there's an available listener -- if a listener can have 2 messages and 2
>>> different groups assigned at a time, and processing of the first message
>>> takes a long time, then we might be unfairly starving the second message,
>>> even though other listeners can process it.
>>>
>>> We might be able to tolerate this to some extent if the prefetch is only
>>> 2,
>>> if this is the only way to improve overall throughput. I'll have to
>>> discuss
>>> this more with my team. Though this still wouldn't solve the problem with
>>> unique keys in C++ mode.
>>>
>>>
>>> > Also, you originally seemed to prefer the idea that
>>> > messages without a group header value would be not be grouped, so is
>>> there
>>> > a particular reason you are leaning towards using the shared groups
>>> > functionality which can't do that?
>>> >
>>>
>>> Basically, the ideal behavior we want is a third mode that combines the
>>> two. If no grouping value is specified, then treat it as though it has no
>>> group; for messages with a grouping value, ensure that only 1 listener
>>> can
>>> be processing messages from a particular group at a time, but don't tie a
>>> group to a particular consumer for the lifetime of that consumer. The
>>> reason we want the first part is that we'll have many messages that don't
>>> belong to any group, and we want them to be processed by different
>>> listeners in parallel. The reason we don't want a consumer to be
>>> associated
>>> with a group as long as the consumer lives is again to avoid starvation
>>> --
>>> we don't want a consumer processing a message of group A for a long time
>>> result in starving group B's messages, because it happens so that the
>>> first
>>> group B message was processed by that consumer and so group B is assigned
>>> to the consumer, when there's another consumer of the same queue that is
>>> doing nothing and can process the B message.
>>>
>>> So we were thinking of asking you guys if you'd be open to introducing a
>>> third mode, mostly the same as the C++ mode but where no grouping value
>>> means no group, instead of the default group. Another workaround we
>>> thought
>>> of was for any message that doesn't belong to a group, we'll put its
>>> message ID in the grouping key property, so that essentially any listener
>>> can pick it up. That's one reason why we were testing the unique keys
>>> case,
>>> though it's also possible for us to get many different grouping values
>>> in a
>>> queue (though not at quite as high a number, e.g., 100000). But in C++
>>> mode, with both unique values and same-group values at a high depth,
>>> we're
>>> seeing decreased performance anyway, so we might not be able to use it.
>>>
>>> Do you have any suggestions for what we should do? Are there any ideas
>>> you
>>> have for solving the performance issue with unique keys in C++ mode, so
>>> perhaps we could look into it more?
>>>
>>> Thanks,
>>> Helen
>>>
>>>
>>> > On 13 January 2014 20:34, Helen Kwong <he...@gmail.com> wrote:
>>> >
>>> > > Hi Robbie,
>>> > >
>>> > > I am actually still running version 0.16 of the broker. It will take
>>> me a
>>> > > little time to set up the trunk code and rebuild and rerun the
>>> > experiments.
>>> > > Do you think your fix will likely make a difference?
>>> > >
>>> > > For the second case with a long-lived consumer being assigned the
>>> group
>>> > of
>>> > > many messages at the head of the queue, I was indeed using a
>>> prefetch of
>>> > 1.
>>> > > I ran it again (with version 0.16 still) with a prefetch of 2 as you
>>> > > suggested, and the dequeue time of the messages at the end was then
>>> not
>>> > > affected by the number of unprocessable messages at the beginning of
>>> the
>>> > > queue, about the same as the other test setups I ran. However, I
>>> think
>>> > > increasing prefetch to 2 might not work for our use case.
>>> > >
>>> > > For the first case with unique message groups, your explanation makes
>>> > sense
>>> > > and I think I understand it now. Do you think there is still a way to
>>> > > optimize this behavior, so that we don't need to possibly traverse
>>> > through
>>> > > the whole queue whenever a subscription is unassigned from a group?
>>> Since
>>> > > you mentioned that maintaining a fixed pointer to a queue entry would
>>> > > likely lead to memory retention issues, would having a weak
>>> reference be
>>> > a
>>> > > possible option?
>>> > >
>>> > > Thanks,
>>> > > Helen
>>> > >
>>> > >
>>> > >
>>> > > On Sat, Jan 11, 2014 at 7:58 AM, Robbie Gemmell <
>>> > robbie.gemmell@gmail.com
>>> > > >wrote:
>>> > >
>>> > > > Hi Helen,
>>> > > >
>>> > > > Can I check what version of the code you were using? I ask as the
>>> > latest
>>> > > > trunk or 0.26 release branch code is going to be necessary for
>>> > > correctness
>>> > > > and representative testing of the shared groups functionality, due
>>> to
>>> > the
>>> > > > defect fix I previously mentioned making recently.
>>> > > > You can find a nightly build of the trunk broker at:
>>> > > >
>>> > > >
>>> > >
>>> >
>>> https://builds.apache.org/view/M-R/view/Qpid/job/Qpid-Java-Artefact-Release/lastSuccessfulBuild/artifact/trunk/qpid/java/broker/release/and
>>> > > > I would need to build the 0.26 branch as the fix was introduced
>>> after
>>> > > > the latest RC.
>>> > > >
>>> > > > In your first case, I think the reason for the difference between
>>> the
>>> > > > default group and the unique group is also likely to be tied to the
>>> > > > 'findEarliestAssignedAvailableEntry' behaviour you mention later in
>>> > your
>>> > > > mail. For the default group case, that next message always going
>>> to be
>>> > a
>>> > > > message near the front of the queue. For the unique group, there
>>> isnt
>>> > > > actually going to be a message which matches, but it looks like it
>>> will
>>> > > > currently be checking every message to determine that and doing so
>>> > under
>>> > > > the synchronization, and thus probably preventing other deliveries
>>> > > > occurring at the time. That isnt a problem in the non-shared case
>>> > because
>>> > > > there isnt a need to synchronise the GroupManager as a whole, and
>>> even
>>> > > > going beyond that its also highly unlikely it would need to check
>>> as
>>> > many
>>> > > > messages before finding a match due to the signifcant difference
>>> in how
>>> > > > groups become associated with a particular subscription in the
>>> > non-shared
>>> > > > case.
>>> > > >
>>> > > > In your second case, your explanation seems likely and I think this
>>> > case
>>> > > > really reduces to just being a variant of the above behaviour. The
>>> > > > particular issue is that one could argue it shouldnt need to be
>>> doing
>>> > the
>>> > > > 'findEarliestAssignedAvailableEntry' task all that often in this
>>> case
>>> > if
>>> > > > you have a long-lived consumer, and so your mention of this makes
>>> me
>>> > > think
>>> > > > you are using a prefetch of 1. Using a prefetch of 1 currently
>>> means
>>> > that
>>> > > > the delivery state associated with the shared group effectively
>>> becomes
>>> > > > empty after each message, because messages are only fully added to
>>> the
>>> > > > group when they become acquired by a particular subscription, and
>>> they
>>> > > cant
>>> > > > be acquired until the previous message is consumed (or perhaps
>>> slightly
>>> > > > confusingly, explicitly not-consumed). If so, I expect it could be
>>> very
>>> > > > interesting to run this case again with a prefetch of 2 or more.
>>> The
>>> > > > obvious tradeoff with increasing prefetch is that a particular
>>> consumer
>>> > > > could then be assigned up to <prefetch> groups at a given point,
>>> though
>>> > > > likely not in your test case due to the large contiguous blocks of
>>> > > messages
>>> > > > for each group.
>>> > > >
>>> > > > I'm not sure that the suggestion to track the first message in the
>>> > group
>>> > > > would really work currently, due to the way the underlying queue
>>> data
>>> > > > structure works. Maintaining a fixed pointer into it like that is
>>> > likely
>>> > > to
>>> > > > lead to some undesirable memory retention issues, based on a
>>> related
>>> > but
>>> > > > far simpler case I fixed previously in a similar structure
>>> elsewhere in
>>> > > the
>>> > > > broker. Looking at the way messages become assigned to a group in
>>> the
>>> > > > shared group case may be a more viable path to handling your second
>>> > case
>>> > > > more gracefully. The unique groups from you first case would still
>>> need
>>> > > > something different though, as neither of these routes would really
>>> > help
>>> > > > there.
>>> > > >
>>> > > > Robbie
>>> > > >
>>> > > > On 11 January 2014 01:03, Helen Kwong <he...@gmail.com>
>>> wrote:
>>> > > >
>>> > > > > Hi Robbie,
>>> > > > >
>>> > > > > I did some more testing to see whether message grouping will
>>> work for
>>> > > us,
>>> > > > > and compared the dequeue performance of a queue using message
>>> > grouping
>>> > > in
>>> > > > > default Java mode, a queue using C++ mode, and a queue not using
>>> > > message
>>> > > > > grouping. I found that when I use C++ mode, the performance can
>>> be
>>> > much
>>> > > > > worse than in other comparable setups, and was wondering if you
>>> could
>>> > > > help
>>> > > > > me understand why.
>>> > > > >
>>> > > > > 1.  In one test, I have multiple listeners to a queue, enqueue
>>> 100000
>>> > > > > messages to it, and see how many messages are processed in 5
>>> > minutes. I
>>> > > > > have these different setups:
>>> > > > >
>>> > > > > - C++ mode queue with each message having a unique identifier
>>> > > > >
>>> > > > > - C++ mode queue with all messages having no grouping identifier
>>> (so
>>> > > all
>>> > > > > belong to the default group)
>>> > > > >
>>> > > > > - default mode queue with each message having a unique grouping
>>> > > > identifier
>>> > > > >
>>> > > > > - default mode queue with all messages having no grouping
>>> identifier
>>> > > > >
>>> > > > > - default mode queue with all messages having the same grouping
>>> > > > identifier
>>> > > > >
>>> > > > > - regular queue without a group header key configured
>>> > > > >
>>> > > > > All setups except for the first had about 35K - 39K messages
>>> > processed,
>>> > > > but
>>> > > > > for the first setup, there were under 8000 messages processed.
>>> What
>>> > > could
>>> > > > > explain this big difference? I’ve looked at the two grouping
>>> modes’
>>> > > > > implementations of MessageGroupManager and see that C++ mode uses
>>> > > > > synchronized methods rather than a ConcurrentHashMap as in
>>> default
>>> > > mode,
>>> > > > so
>>> > > > > I’d guess there might be more contention because of that, but at
>>> the
>>> > > same
>>> > > > > time I can’t see why, in C++ mode, having a unique identifier for
>>> > each
>>> > > > > message results in throughput that is so much worse than having
>>> all
>>> > > > > messages in the default group.
>>> > > > >
>>> > > > >
>>> > > > > 2. I also wanted to see the impact of having many messages at the
>>> > head
>>> > > of
>>> > > > > the queue that a listener can’t process because the messages
>>> belong
>>> > to
>>> > > a
>>> > > > > group assigned to another consumer. E.g., have 10000 messages of
>>> > group
>>> > > A,
>>> > > > > followed by 1000 messages of group B, and listener 1 is holding
>>> on to
>>> > > the
>>> > > > > first A message for a long time -- see how long it will take
>>> > listener 2
>>> > > > to
>>> > > > > process all the B messages. In this case C++ mode has performance
>>> > that
>>> > > > > degrades as the number of unprocessable group A messages at the
>>> front
>>> > > of
>>> > > > > the queue increases, whereas default mode's performance is
>>> > unaffected,
>>> > > > > about the same as processing 1000 messages on a regular queue.
>>> > > > >
>>> > > > > My rough guess from looking at DefinedGroupMessageGroupManager is
>>> > that
>>> > > > > whenever listener 2 is done with a group B message, the state
>>> change
>>> > > > > listener triggers the Group.subtract() to reset pointers for
>>> other
>>> > > > > subscriptions and consequently
>>> findEarliestAssignedAvailableEntry().
>>> > > This
>>> > > > > then has to iterate through all the group A messages before it
>>> finds
>>> > > the
>>> > > > B
>>> > > > > message. Do you think this is the reason for the results I see?
>>> > > > >
>>> > > > > If so, is the idea here that other subscriptions of the queue
>>> could
>>> > > have
>>> > > > > skipped over the messages of a group while the group was
>>> assigned to
>>> > > some
>>> > > > > subscription S, so we need to tell them to set their pointers
>>> back?
>>> > If
>>> > > > that
>>> > > > > is indeed the idea, would it be possible to optimize it such that
>>> > when
>>> > > a
>>> > > > > group A is assigned to S and S gets its first message of the
>>> group,
>>> > we
>>> > > > > store what that first A message / queue entry is. Then when S is
>>> done
>>> > > > with
>>> > > > > the last A message, we can tell other subscriptions to go back to
>>> > that
>>> > > > > first entry, without having to iterate through the queue?
>>> > > > >
>>> > > > > Thanks a lot for your help!
>>> > > > >
>>> > > > > Helen
>>> > > > >
>>> > > > >
>>> > > > > On Tue, Jan 7, 2014 at 8:46 PM, Robbie Gemmell <
>>> > > robbie.gemmell@gmail.com
>>> > > > > >wrote:
>>> > > > >
>>> > > > > > ...and just to be super clear, though I think it it is
>>> mentioned
>>> > > > > correctly
>>> > > > > > in the docs this time, the 'default group' concept does not
>>> apply
>>> > in
>>> > > > the
>>> > > > > > regular / 'non shared' grouping mode. Messages that dont
>>> specify a
>>> > > > group
>>> > > > > > key value in that mode are simply not grouped in any way.
>>> > > > > >
>>> > > > > > On 8 January 2014 04:41, Robbie Gemmell <
>>> robbie.gemmell@gmail.com>
>>> > > > > wrote:
>>> > > > > >
>>> > > > > > >
>>> > > > > > > On 8 January 2014 04:33, Helen Kwong <he...@gmail.com>
>>> > wrote:
>>> > > > > > >
>>> > > > > > >> Oh I see, I thought what you meant was that I could only
>>> alter
>>> > the
>>> > > > > > default
>>> > > > > > >> group in shared-groups mode starting with 0.24.
>>> > > > > > >
>>> > > > > > >
>>> > > > > > > No, I just missed that you said 0.16 and assumed 0.24 was the
>>> > > version
>>> > > > > you
>>> > > > > > > were using . You could always change it, just in more limited
>>> > ways
>>> > > in
>>> > > > > > > earlier releases.
>>> > > > > > >
>>> > > > > > > To make sure I'm
>>> > > > > > >> understanding this correctly -- changing the the default
>>> message
>>> > > > group
>>> > > > > > >> name
>>> > > > > > >> to something else in C++ mode won't change the serial
>>> processing
>>> > > > > > behavior
>>> > > > > > >> I
>>> > > > > > >> saw, right?
>>> > > > > > >
>>> > > > > > >
>>> > > > > > > Correct
>>> > > > > > >
>>> > > > > > >
>>> > > > > > >> Messages without a group identifier will still be
>>> considered to
>>> > > > > > >> be in a group -- rather than no group -- and they cannot be
>>> > > > processed
>>> > > > > by
>>> > > > > > >> multiple consumers concurrently?
>>> > > > > > >>
>>> > > > > > >>
>>> > > > > > > Yes. In the C++/shared-groups mode every message is
>>> considered to
>>> > > be
>>> > > > > in a
>>> > > > > > > group, it is just a case of whether the message specifies
>>> that
>>> > > group
>>> > > > or
>>> > > > > > > instead gets put into the default group.
>>> > > > > > >
>>> > > > > > >
>>> > > > > > >
>>> > > > > > >> Thanks,
>>> > > > > > >> Helen
>>> > > > > > >>
>>> > > > > > >>
>>> > > > > > >> On Tue, Jan 7, 2014 at 8:22 PM, Robbie Gemmell <
>>> > > > > > robbie.gemmell@gmail.com
>>> > > > > > >> >wrote:
>>> > > > > > >>
>>> > > > > > >> > I just noticed you said you were using 0.16, somehow
>>> glossed
>>> > > over
>>> > > > it
>>> > > > > > >> > originally and only noticed the 0.24 in the doc URL (its
>>> many
>>> > > > hours
>>> > > > > > past
>>> > > > > > >> > time I was asleep, I might be getting tired).
>>> > > > > > >> >
>>> > > > > > >> > Realising that, I should add that prior to 0.22 the only
>>> way
>>> > to
>>> > > > > alter
>>> > > > > > >> the
>>> > > > > > >> > default group in the shared-groups mode from
>>> 'qpid.no-group'
>>> > to
>>> > > > > > >> something
>>> > > > > > >> > else would have been via the 'qpid.default-message-group'
>>> > queue
>>> > > > > > declare
>>> > > > > > >> > argument when using an AMQP client to create the queue
>>> > > originally,
>>> > > > > and
>>> > > > > > >> for
>>> > > > > > >> > 0.22 itself only that and the system property approach I
>>> > > mentioned
>>> > > > > > would
>>> > > > > > >> > work.
>>> > > > > > >> >
>>> > > > > > >> > Robbie
>>> > > > > > >> >
>>> > > > > > >> > On 8 January 2014 04:03, Helen Kwong <
>>> helenkwong@gmail.com>
>>> > > > wrote:
>>> > > > > > >> >
>>> > > > > > >> > > Hi Robbie,
>>> > > > > > >> > >
>>> > > > > > >> > > I see. Thanks for the quick response and explanation!
>>> > > > > > >> > >
>>> > > > > > >> > > Helen
>>> > > > > > >> > >
>>> > > > > > >> > >
>>> > > > > > >> > > On Tue, Jan 7, 2014 at 7:43 PM, Robbie Gemmell <
>>> > > > > > >> robbie.gemmell@gmail.com
>>> > > > > > >> > > >wrote:
>>> > > > > > >> > >
>>> > > > > > >> > > > Hi Helen,
>>> > > > > > >> > > >
>>> > > > > > >> > > > The short answer to your question is that it is the
>>> > > > > documentation
>>> > > > > > >> which
>>> > > > > > >> > > is
>>> > > > > > >> > > > incorrect, and the behaviour you are seeing is
>>> expected.
>>> > > > > > >> > > >
>>> > > > > > >> > > > The long answer is, when that documentation was
>>> composed a
>>> > > > > segment
>>> > > > > > >> was
>>> > > > > > >> > > > missed out indicating this, and needs to be added to
>>> the
>>> > > docs.
>>> > > > > The
>>> > > > > > >> > > > behaviour listed for when no group is specified is
>>> only
>>> > true
>>> > > > of
>>> > > > > > the
>>> > > > > > >> > > > 'non-shared' groups supported by the Java broker, in
>>> the
>>> > > > > > C++/shared
>>> > > > > > >> > group
>>> > > > > > >> > > > mode any messages recieved without an explicit group
>>> value
>>> > > are
>>> > > > > all
>>> > > > > > >> > > assigned
>>> > > > > > >> > > > to a default group of 'qpid.no-group'. This is as per
>>> the
>>> > > > > > behaviour
>>> > > > > > >> of
>>> > > > > > >> > > the
>>> > > > > > >> > > > C++ broker itself, which is explained in the C++
>>> broker
>>> > docs
>>> > > > at
>>> > > > > > the
>>> > > > > > >> end
>>> > > > > > >> > > of
>>> > > > > > >> > > > the following page
>>> > > > > > >> > > >
>>> > > > > > >> > > >
>>> > > > > > >> > >
>>> > > > > > >> >
>>> > > > > > >>
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>> http://qpid.apache.org/releases/qpid-0.24/cpp-broker/book/Using-message-groups.html
>>> > > > > > >> > > > .
>>> > > > > > >> > > > For the 0.24 Java broker, this default shared group
>>> can be
>>> > > > > changed
>>> > > > > > >> > > > broker-wide using the Java system property
>>> > > > > > >> > > > 'qpid.broker_default-shared-message-group', or can be
>>> > > > overriden
>>> > > > > > for
>>> > > > > > >> an
>>> > > > > > >> > > > individual queue during creation programatically via
>>> AMQP
>>> > > > > clients
>>> > > > > > or
>>> > > > > > >> > the
>>> > > > > > >> > > > management interfaces through use of the argument
>>> > > > > > >> > > > 'qpid.default-message-group' or
>>> > 'messageGroupDefaultGroup'.
>>> > > > > > >> > > >
>>> > > > > > >> > > > I coincidentally happened to have fixed a defect with
>>> the
>>> > > > shared
>>> > > > > > >> groups
>>> > > > > > >> > > > functionality last night on trunk. Its not yet
>>> included in
>>> > > the
>>> > > > > > >> imminent
>>> > > > > > >> > > > 0.26 release, though I am about to request whether
>>> that is
>>> > > > > > possible.
>>> > > > > > >> > > > https://issues.apache.org/jira/browse/QPID-5450
>>> > > > > > >> > > >
>>> > > > > > >> > > > Robbie
>>> > > > > > >> > > >
>>> > > > > > >> > > > On 8 January 2014 02:43, Helen Kwong <
>>> > helenkwong@gmail.com>
>>> > > > > > wrote:
>>> > > > > > >> > > >
>>> > > > > > >> > > > > Hi,
>>> > > > > > >> > > > >
>>> > > > > > >> > > > > I use the Java broker and client, version 0.16, and
>>> am
>>> > > > > > considering
>>> > > > > > >> > > using
>>> > > > > > >> > > > > the message grouping feature (
>>> > > > > > >> > > > >
>>> > > > > > >> > > > >
>>> > > > > > >> > > >
>>> > > > > > >> > >
>>> > > > > > >> >
>>> > > > > > >>
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>> http://qpid.apache.org/releases/qpid-0.24/java-broker/book/Java-Broker-Queues.html#Java-Broker-Queues-OtherTypes-Message-Grouping
>>> > > > > > >> > > > > ).
>>> > > > > > >> > > > > From testing I've done, there seems to be a bug
>>> with the
>>> > > C++
>>> > > > > > >> > > > compatibility
>>> > > > > > >> > > > > model, and I'm wondering if this is a known issue.
>>> > > > > Specifically,
>>> > > > > > >> in
>>> > > > > > >> > my
>>> > > > > > >> > > > test
>>> > > > > > >> > > > > I have a queue configured to use a group header
>>> field
>>> > with
>>> > > > > > >> > > > > "qpid.group_header_key" and C++ mode with
>>> > > > > > "qpid.shared_msg_group",
>>> > > > > > >> > and
>>> > > > > > >> > > > have
>>> > > > > > >> > > > > multiple listeners to the queue. Each listener will
>>> > sleep
>>> > > > for
>>> > > > > a
>>> > > > > > >> short
>>> > > > > > >> > > > > amount of time when it receives a message before
>>> > > returning.
>>> > > > I
>>> > > > > > then
>>> > > > > > >> > > > enqueue
>>> > > > > > >> > > > > 10 messages that do not have a value in the group
>>> header
>>> > > > field
>>> > > > > > to
>>> > > > > > >> the
>>> > > > > > >> > > > > queue. Since the doc says that messages without a
>>> value
>>> > in
>>> > > > the
>>> > > > > > >> > grouping
>>> > > > > > >> > > > > header will be delivered to any available consumer,
>>> the
>>> > > > > > behavior I
>>> > > > > > >> > > expect
>>> > > > > > >> > > > > is that the messages will be processed in parallel,
>>> > i.e.,
>>> > > > when
>>> > > > > > >> > > listener 1
>>> > > > > > >> > > > > is holding on to a message and sleeping, listener 2
>>> can
>>> > > > > receive
>>> > > > > > >> > another
>>> > > > > > >> > > > > message from the queue. But what I see is that the
>>> > > messages
>>> > > > > are
>>> > > > > > >> > > processed
>>> > > > > > >> > > > > serially -- message 2 won't be received by some
>>> thread
>>> > > until
>>> > > > > > >> message
>>> > > > > > >> > 1
>>> > > > > > >> > > is
>>> > > > > > >> > > > > done. When I use the default mode instead of C++
>>> mode,
>>> > > then
>>> > > > I
>>> > > > > > get
>>> > > > > > >> the
>>> > > > > > >> > > > > parallel processing behavior.
>>> > > > > > >> > > > >
>>> > > > > > >> > > > > Is this is a known bug, and is there a fix for it
>>> > already?
>>> > > > > > >> > > > >
>>> > > > > > >> > > > > Thanks,
>>> > > > > > >> > > > > Helen
>>> > > > > > >> > > > >
>>> > > > > > >> > > >
>>> > > > > > >> > >
>>> > > > > > >> >
>>> > > > > > >>
>>> > > > > > >
>>> > > > > > >
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>>
>>
>>
>

Re: Java broker - message grouping in C++ compatibility mode

Posted by Robbie Gemmell <ro...@gmail.com>.

Hi Helen,

I had a quick stab at implementing the changes I mentioned below, namely:

- Allow for disabling the 'default group' when using C++ style 'shared
groups', removing or minimising the synchronization as much as possible
when dealing with non-grouped messages.
- Allow enabling a 'default group' when using 'regular groups'

I will raise a JIRA and possibly make a branch with the update applied
later tonight so you can give it a try.

I also discussed with Rob how we might improve things further and we did
come up with one suggestion, but its a much bigger change and so definitely
one that would need to be undertaken separately if done (i.e, I only have
so much free time :P).

Robbie

On 14 January 2014 21:19, Robbie Gemmell <ro...@gmail.com> wrote:

> Hi Helen,
>
> I expected that you would see increased performance in some of the tests,
> as Rob did some more work on improving performance after the 0.16 release.
> I did expect performance to go down somewhat for case #1 as the defect fix
> was putting in place some missing synchronization between delivery and
> acknowlegement steps, though I admit it dropped a bit more than I expected.
> Giving that some thought, its possible that a change to the threading from
> some time ago (in 0.16, I think) which generally increased performance all
> round at the time may in fact hurt it in that scenario.
>
> In terms of introducing a new mode of grouping, we would actually be quite
> open to this as its something we have commented on separately ourselves but
> never quite got round to implementing; so much to do, so little time, etc.
> In particular, I have never liked the 'default group' disparity between the
> two modes and inability to really change it on top of that, so something we
> talked about was allowing the default group to be controlled for both
> grouping types such that you could specifically turn one on for the
> 'standard grouping' which doesnt normally have one, or specifically turn it
> off for the 'shared groups' that normally do. This would allow them to act
> a lot more similar, with the distinction betwen them then becoming more
> about whether the groups can float between consumers or not.
>
> Doing the above would allow you to do away with setting the unique
> messageid on the messages you dont wish to group and enable any consumer to
> pick them up, which should significantly change the performance seen
> consuming from a 'shared groups' queue which had a mixture of messages in
> groups and message not in groups (since a particular subscription becomes
> much more likely to find a message earlier in the queue if/when the group
> reset occurs). I think making additional changes outwith those to try and
> combat a queue full of unique groups in the shared groups case may require
> larger changes to the queue itself, such as a custom delivery loop to allow
> altering when the messages can be assigned to the group or even moving away
> from the current 'group manager' addition to the base queue and more
> towards a custom queue such as is done currently for the priority and
> sorted queue implementations.
>
> Robbie
>
> On 14 January 2014 07:15, Helen Kwong <he...@gmail.com> wrote:
>
>> Hi Robbie,
>>
>> I ran the tests again with a broker built from the latest trunk code and
>> with a 0.16 client. The performance results are a lot better in almost
>> every test setup (3-4 times better), except for Test #1 where I build up
>> 100000 messages of unique grouping values and then see how many messages
>> we
>> can process in 5 minutes, using C++ mode (I got ~7800 messages before, and
>> now ~5400). The overall comparisons are about the same, with C++ mode
>> still
>> performing significantly worse:
>>
>> 1. For the first test, for a regular queue and a default mode queue (with
>> all messages having no grouping values, all having the same value, and all
>> having different values), I am able to process 100000 messages in around 3
>> and a half minutes. In C++ mode, with all 100000 messages having different
>> values, I can process only ~5400 in 5 minutes; with all having no value /
>> the default group, ~44000 messages were processed in 5 minutes (after
>> which
>> I stop waiting and just clear the queue).
>>
>> 2. For the second test, for a default mode queue, the performance is again
>> not affected by how many unprocessable messages we have at the front of
>> the
>> queue due to another consumer holding on to those messages' group for a
>> long time. The time it takes to process 1000 messages of group B after N
>> messages of group A (which is assigned to another consumer) is about the
>> same as processing 1000 messages on a regular queue. With a C++ mode
>> queue,
>> the performance still gets worse the more messages of group A we have at
>> the front of the queue.
>>
>>
>> > Can I enquire what it is about your use case that would preclude use of
>> a
>> > higher prefetch?
>>
>>
>> We want to avoid starving messages that can be processed by another
>> consumer. We have multiple dequeue sessions, each listening to multiple
>> queues. If we have a higher prefetch, then if there is a long-running
>> message on a queue, the messages that were prefetched along with it will
>> have to wait for a long time, even if another listener is available and
>> can
>> process those messages. And if we decide to add the message grouping
>> configuration, we want to avoid starving a message of a particular group
>> if
>> there's an available listener -- if a listener can have 2 messages and 2
>> different groups assigned at a time, and processing of the first message
>> takes a long time, then we might be unfairly starving the second message,
>> even though other listeners can process it.
>>
>> We might be able to tolerate this to some extent if the prefetch is only
>> 2,
>> if this is the only way to improve overall throughput. I'll have to
>> discuss
>> this more with my team. Though this still wouldn't solve the problem with
>> unique keys in C++ mode.
>>
>>
>> > Also, you originally seemed to prefer the idea that
>> > messages without a group header value would be not be grouped, so is
>> there
>> > a particular reason you are leaning towards using the shared groups
>> > functionality which can't do that?
>> >
>>
>> Basically, the ideal behavior we want is a third mode that combines the
>> two. If no grouping value is specified, then treat it as though it has no
>> group; for messages with a grouping value, ensure that only 1 listener can
>> be processing messages from a particular group at a time, but don't tie a
>> group to a particular consumer for the lifetime of that consumer. The
>> reason we want the first part is that we'll have many messages that don't
>> belong to any group, and we want them to be processed by different
>> listeners in parallel. The reason we don't want a consumer to be
>> associated
>> with a group as long as the consumer lives is again to avoid starvation --
>> we don't want a consumer processing a message of group A for a long time
>> result in starving group B's messages, because it happens so that the
>> first
>> group B message was processed by that consumer and so group B is assigned
>> to the consumer, when there's another consumer of the same queue that is
>> doing nothing and can process the B message.
>>
>> So we were thinking of asking you guys if you'd be open to introducing a
>> third mode, mostly the same as the C++ mode but where no grouping value
>> means no group, instead of the default group. Another workaround we
>> thought
>> of was for any message that doesn't belong to a group, we'll put its
>> message ID in the grouping key property, so that essentially any listener
>> can pick it up. That's one reason why we were testing the unique keys
>> case,
>> though it's also possible for us to get many different grouping values in
>> a
>> queue (though not at quite as high a number, e.g., 100000). But in C++
>> mode, with both unique values and same-group values at a high depth, we're
>> seeing decreased performance anyway, so we might not be able to use it.
>>
>> Do you have any suggestions for what we should do? Are there any ideas you
>> have for solving the performance issue with unique keys in C++ mode, so
>> perhaps we could look into it more?
>>
>> Thanks,
>> Helen
>>
>>
>> > On 13 January 2014 20:34, Helen Kwong <he...@gmail.com> wrote:
>> >
>> > > Hi Robbie,
>> > >
>> > > I am actually still running version 0.16 of the broker. It will take
>> me a
>> > > little time to set up the trunk code and rebuild and rerun the
>> > experiments.
>> > > Do you think your fix will likely make a difference?
>> > >
>> > > For the second case with a long-lived consumer being assigned the
>> group
>> > of
>> > > many messages at the head of the queue, I was indeed using a prefetch
>> of
>> > 1.
>> > > I ran it again (with version 0.16 still) with a prefetch of 2 as you
>> > > suggested, and the dequeue time of the messages at the end was then
>> not
>> > > affected by the number of unprocessable messages at the beginning of
>> the
>> > > queue, about the same as the other test setups I ran. However, I think
>> > > increasing prefetch to 2 might not work for our use case.
>> > >
>> > > For the first case with unique message groups, your explanation makes
>> > sense
>> > > and I think I understand it now. Do you think there is still a way to
>> > > optimize this behavior, so that we don't need to possibly traverse
>> > through
>> > > the whole queue whenever a subscription is unassigned from a group?
>> Since
>> > > you mentioned that maintaining a fixed pointer to a queue entry would
>> > > likely lead to memory retention issues, would having a weak reference
>> be
>> > a
>> > > possible option?
>> > >
>> > > Thanks,
>> > > Helen
>> > >
>> > >
>> > >
>> > > On Sat, Jan 11, 2014 at 7:58 AM, Robbie Gemmell <
>> > robbie.gemmell@gmail.com
>> > > >wrote:
>> > >
>> > > > Hi Helen,
>> > > >
>> > > > Can I check what version of the code you were using? I ask as the
>> > latest
>> > > > trunk or 0.26 release branch code is going to be necessary for
>> > > correctness
>> > > > and representative testing of the shared groups functionality, due
>> to
>> > the
>> > > > defect fix I previously mentioned making recently.
>> > > > You can find a nightly build of the trunk broker at:
>> > > >
>> > > >
>> > >
>> >
>> https://builds.apache.org/view/M-R/view/Qpid/job/Qpid-Java-Artefact-Release/lastSuccessfulBuild/artifact/trunk/qpid/java/broker/release/and
>> > > > I would need to build the 0.26 branch as the fix was introduced
>> after
>> > > > the latest RC.
>> > > >
>> > > > In your first case, I think the reason for the difference between
>> the
>> > > > default group and the unique group is also likely to be tied to the
>> > > > 'findEarliestAssignedAvailableEntry' behaviour you mention later in
>> > your
>> > > > mail. For the default group case, that next message always going to
>> be
>> > a
>> > > > message near the front of the queue. For the unique group, there
>> isnt
>> > > > actually going to be a message which matches, but it looks like it
>> will
>> > > > currently be checking every message to determine that and doing so
>> > under
>> > > > the synchronization, and thus probably preventing other deliveries
>> > > > occurring at the time. That isnt a problem in the non-shared case
>> > because
>> > > > there isnt a need to synchronise the GroupManager as a whole, and
>> even
>> > > > going beyond that its also highly unlikely it would need to check as
>> > many
>> > > > messages before finding a match due to the signifcant difference in
>> how
>> > > > groups become associated with a particular subscription in the
>> > non-shared
>> > > > case.
>> > > >
>> > > > In your second case, your explanation seems likely and I think this
>> > case
>> > > > really reduces to just being a variant of the above behaviour. The
>> > > > particular issue is that one could argue it shouldnt need to be
>> doing
>> > the
>> > > > 'findEarliestAssignedAvailableEntry' task all that often in this
>> case
>> > if
>> > > > you have a long-lived consumer, and so your mention of this makes me
>> > > think
>> > > > you are using a prefetch of 1. Using a prefetch of 1 currently means
>> > that
>> > > > the delivery state associated with the shared group effectively
>> becomes
>> > > > empty after each message, because messages are only fully added to
>> the
>> > > > group when they become acquired by a particular subscription, and
>> they
>> > > cant
>> > > > be acquired until the previous message is consumed (or perhaps
>> slightly
>> > > > confusingly, explicitly not-consumed). If so, I expect it could be
>> very
>> > > > interesting to run this case again with a prefetch of 2 or more. The
>> > > > obvious tradeoff with increasing prefetch is that a particular
>> consumer
>> > > > could then be assigned up to <prefetch> groups at a given point,
>> though
>> > > > likely not in your test case due to the large contiguous blocks of
>> > > messages
>> > > > for each group.
>> > > >
>> > > > I'm not sure that the suggestion to track the first message in the
>> > group
>> > > > would really work currently, due to the way the underlying queue
>> data
>> > > > structure works. Maintaining a fixed pointer into it like that is
>> > likely
>> > > to
>> > > > lead to some undesirable memory retention issues, based on a related
>> > but
>> > > > far simpler case I fixed previously in a similar structure
>> elsewhere in
>> > > the
>> > > > broker. Looking at the way messages become assigned to a group in
>> the
>> > > > shared group case may be a more viable path to handling your second
>> > case
>> > > > more gracefully. The unique groups from you first case would still
>> need
>> > > > something different though, as neither of these routes would really
>> > help
>> > > > there.
>> > > >
>> > > > Robbie
>> > > >
>> > > > On 11 January 2014 01:03, Helen Kwong <he...@gmail.com> wrote:
>> > > >
>> > > > > Hi Robbie,
>> > > > >
>> > > > > I did some more testing to see whether message grouping will work
>> for
>> > > us,
>> > > > > and compared the dequeue performance of a queue using message
>> > grouping
>> > > in
>> > > > > default Java mode, a queue using C++ mode, and a queue not using
>> > > message
>> > > > > grouping. I found that when I use C++ mode, the performance can be
>> > much
>> > > > > worse than in other comparable setups, and was wondering if you
>> could
>> > > > help
>> > > > > me understand why.
>> > > > >
>> > > > > 1.  In one test, I have multiple listeners to a queue, enqueue
>> 100000
>> > > > > messages to it, and see how many messages are processed in 5
>> > minutes. I
>> > > > > have these different setups:
>> > > > >
>> > > > > - C++ mode queue with each message having a unique identifier
>> > > > >
>> > > > > - C++ mode queue with all messages having no grouping identifier
>> (so
>> > > all
>> > > > > belong to the default group)
>> > > > >
>> > > > > - default mode queue with each message having a unique grouping
>> > > > identifier
>> > > > >
>> > > > > - default mode queue with all messages having no grouping
>> identifier
>> > > > >
>> > > > > - default mode queue with all messages having the same grouping
>> > > > identifier
>> > > > >
>> > > > > - regular queue without a group header key configured
>> > > > >
>> > > > > All setups except for the first had about 35K - 39K messages
>> > processed,
>> > > > but
>> > > > > for the first setup, there were under 8000 messages processed.
>> What
>> > > could
>> > > > > explain this big difference? I’ve looked at the two grouping
>> modes’
>> > > > > implementations of MessageGroupManager and see that C++ mode uses
>> > > > > synchronized methods rather than a ConcurrentHashMap as in default
>> > > mode,
>> > > > so
>> > > > > I’d guess there might be more contention because of that, but at
>> the
>> > > same
>> > > > > time I can’t see why, in C++ mode, having a unique identifier for
>> > each
>> > > > > message results in throughput that is so much worse than having
>> all
>> > > > > messages in the default group.
>> > > > >
>> > > > >
>> > > > > 2. I also wanted to see the impact of having many messages at the
>> > head
>> > > of
>> > > > > the queue that a listener can’t process because the messages
>> belong
>> > to
>> > > a
>> > > > > group assigned to another consumer. E.g., have 10000 messages of
>> > group
>> > > A,
>> > > > > followed by 1000 messages of group B, and listener 1 is holding
>> on to
>> > > the
>> > > > > first A message for a long time -- see how long it will take
>> > listener 2
>> > > > to
>> > > > > process all the B messages. In this case C++ mode has performance
>> > that
>> > > > > degrades as the number of unprocessable group A messages at the
>> front
>> > > of
>> > > > > the queue increases, whereas default mode's performance is
>> > unaffected,
>> > > > > about the same as processing 1000 messages on a regular queue.
>> > > > >
>> > > > > My rough guess from looking at DefinedGroupMessageGroupManager is
>> > that
>> > > > > whenever listener 2 is done with a group B message, the state
>> change
>> > > > > listener triggers the Group.subtract() to reset pointers for other
>> > > > > subscriptions and consequently
>> findEarliestAssignedAvailableEntry().
>> > > This
>> > > > > then has to iterate through all the group A messages before it
>> finds
>> > > the
>> > > > B
>> > > > > message. Do you think this is the reason for the results I see?
>> > > > >
>> > > > > If so, is the idea here that other subscriptions of the queue
>> could
>> > > have
>> > > > > skipped over the messages of a group while the group was assigned
>> to
>> > > some
>> > > > > subscription S, so we need to tell them to set their pointers
>> back?
>> > If
>> > > > that
>> > > > > is indeed the idea, would it be possible to optimize it such that
>> > when
>> > > a
>> > > > > group A is assigned to S and S gets its first message of the
>> group,
>> > we
>> > > > > store what that first A message / queue entry is. Then when S is
>> done
>> > > > with
>> > > > > the last A message, we can tell other subscriptions to go back to
>> > that
>> > > > > first entry, without having to iterate through the queue?
>> > > > >
>> > > > > Thanks a lot for your help!
>> > > > >
>> > > > > Helen
>> > > > >
>> > > > >
>> > > > > On Tue, Jan 7, 2014 at 8:46 PM, Robbie Gemmell <
>> > > robbie.gemmell@gmail.com
>> > > > > >wrote:
>> > > > >
>> > > > > > ...and just to be super clear, though I think it it is mentioned
>> > > > > correctly
>> > > > > > in the docs this time, the 'default group' concept does not
>> apply
>> > in
>> > > > the
>> > > > > > regular / 'non shared' grouping mode. Messages that dont
>> specify a
>> > > > group
>> > > > > > key value in that mode are simply not grouped in any way.
>> > > > > >
>> > > > > > On 8 January 2014 04:41, Robbie Gemmell <
>> robbie.gemmell@gmail.com>
>> > > > > wrote:
>> > > > > >
>> > > > > > >
>> > > > > > > On 8 January 2014 04:33, Helen Kwong <he...@gmail.com>
>> > wrote:
>> > > > > > >
>> > > > > > >> Oh I see, I thought what you meant was that I could only
>> alter
>> > the
>> > > > > > default
>> > > > > > >> group in shared-groups mode starting with 0.24.
>> > > > > > >
>> > > > > > >
>> > > > > > > No, I just missed that you said 0.16 and assumed 0.24 was the
>> > > version
>> > > > > you
>> > > > > > > were using . You could always change it, just in more limited
>> > ways
>> > > in
>> > > > > > > earlier releases.
>> > > > > > >
>> > > > > > > To make sure I'm
>> > > > > > >> understanding this correctly -- changing the the default
>> message
>> > > > group
>> > > > > > >> name
>> > > > > > >> to something else in C++ mode won't change the serial
>> processing
>> > > > > > behavior
>> > > > > > >> I
>> > > > > > >> saw, right?
>> > > > > > >
>> > > > > > >
>> > > > > > > Correct
>> > > > > > >
>> > > > > > >
>> > > > > > >> Messages without a group identifier will still be considered
>> to
>> > > > > > >> be in a group -- rather than no group -- and they cannot be
>> > > > processed
>> > > > > by
>> > > > > > >> multiple consumers concurrently?
>> > > > > > >>
>> > > > > > >>
>> > > > > > > Yes. In the C++/shared-groups mode every message is
>> considered to
>> > > be
>> > > > > in a
>> > > > > > > group, it is just a case of whether the message specifies that
>> > > group
>> > > > or
>> > > > > > > instead gets put into the default group.
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >> Thanks,
>> > > > > > >> Helen
>> > > > > > >>
>> > > > > > >>
>> > > > > > >> On Tue, Jan 7, 2014 at 8:22 PM, Robbie Gemmell <
>> > > > > > robbie.gemmell@gmail.com
>> > > > > > >> >wrote:
>> > > > > > >>
>> > > > > > >> > I just noticed you said you were using 0.16, somehow
>> glossed
>> > > over
>> > > > it
>> > > > > > >> > originally and only noticed the 0.24 in the doc URL (its
>> many
>> > > > hours
>> > > > > > past
>> > > > > > >> > time I was asleep, I might be getting tired).
>> > > > > > >> >
>> > > > > > >> > Realising that, I should add that prior to 0.22 the only
>> way
>> > to
>> > > > > alter
>> > > > > > >> the
>> > > > > > >> > default group in the shared-groups mode from
>> 'qpid.no-group'
>> > to
>> > > > > > >> something
>> > > > > > >> > else would have been via the 'qpid.default-message-group'
>> > queue
>> > > > > > declare
>> > > > > > >> > argument when using an AMQP client to create the queue
>> > > originally,
>> > > > > and
>> > > > > > >> for
>> > > > > > >> > 0.22 itself only that and the system property approach I
>> > > mentioned
>> > > > > > would
>> > > > > > >> > work.
>> > > > > > >> >
>> > > > > > >> > Robbie
>> > > > > > >> >
>> > > > > > >> > On 8 January 2014 04:03, Helen Kwong <helenkwong@gmail.com
>> >
>> > > > wrote:
>> > > > > > >> >
>> > > > > > >> > > Hi Robbie,
>> > > > > > >> > >
>> > > > > > >> > > I see. Thanks for the quick response and explanation!
>> > > > > > >> > >
>> > > > > > >> > > Helen
>> > > > > > >> > >
>> > > > > > >> > >
>> > > > > > >> > > On Tue, Jan 7, 2014 at 7:43 PM, Robbie Gemmell <
>> > > > > > >> robbie.gemmell@gmail.com
>> > > > > > >> > > >wrote:
>> > > > > > >> > >
>> > > > > > >> > > > Hi Helen,
>> > > > > > >> > > >
>> > > > > > >> > > > The short answer to your question is that it is the
>> > > > > documentation
>> > > > > > >> which
>> > > > > > >> > > is
>> > > > > > >> > > > incorrect, and the behaviour you are seeing is
>> expected.
>> > > > > > >> > > >
>> > > > > > >> > > > The long answer is, when that documentation was
>> composed a
>> > > > > segment
>> > > > > > >> was
>> > > > > > >> > > > missed out indicating this, and needs to be added to
>> the
>> > > docs.
>> > > > > The
>> > > > > > >> > > > behaviour listed for when no group is specified is only
>> > true
>> > > > of
>> > > > > > the
>> > > > > > >> > > > 'non-shared' groups supported by the Java broker, in
>> the
>> > > > > > C++/shared
>> > > > > > >> > group
>> > > > > > >> > > > mode any messages recieved without an explicit group
>> value
>> > > are
>> > > > > all
>> > > > > > >> > > assigned
>> > > > > > >> > > > to a default group of 'qpid.no-group'. This is as per
>> the
>> > > > > > behaviour
>> > > > > > >> of
>> > > > > > >> > > the
>> > > > > > >> > > > C++ broker itself, which is explained in the C++ broker
>> > docs
>> > > > at
>> > > > > > the
>> > > > > > >> end
>> > > > > > >> > > of
>> > > > > > >> > > > the following page
>> > > > > > >> > > >
>> > > > > > >> > > >
>> > > > > > >> > >
>> > > > > > >> >
>> > > > > > >>
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> http://qpid.apache.org/releases/qpid-0.24/cpp-broker/book/Using-message-groups.html
>> > > > > > >> > > > .
>> > > > > > >> > > > For the 0.24 Java broker, this default shared group
>> can be
>> > > > > changed
>> > > > > > >> > > > broker-wide using the Java system property
>> > > > > > >> > > > 'qpid.broker_default-shared-message-group', or can be
>> > > > overriden
>> > > > > > for
>> > > > > > >> an
>> > > > > > >> > > > individual queue during creation programatically via
>> AMQP
>> > > > > clients
>> > > > > > or
>> > > > > > >> > the
>> > > > > > >> > > > management interfaces through use of the argument
>> > > > > > >> > > > 'qpid.default-message-group' or
>> > 'messageGroupDefaultGroup'.
>> > > > > > >> > > >
>> > > > > > >> > > > I coincidentally happened to have fixed a defect with
>> the
>> > > > shared
>> > > > > > >> groups
>> > > > > > >> > > > functionality last night on trunk. Its not yet
>> included in
>> > > the
>> > > > > > >> imminent
>> > > > > > >> > > > 0.26 release, though I am about to request whether
>> that is
>> > > > > > possible.
>> > > > > > >> > > > https://issues.apache.org/jira/browse/QPID-5450
>> > > > > > >> > > >
>> > > > > > >> > > > Robbie
>> > > > > > >> > > >
>> > > > > > >> > > > On 8 January 2014 02:43, Helen Kwong <
>> > helenkwong@gmail.com>
>> > > > > > wrote:
>> > > > > > >> > > >
>> > > > > > >> > > > > Hi,
>> > > > > > >> > > > >
>> > > > > > >> > > > > I use the Java broker and client, version 0.16, and
>> am
>> > > > > > considering
>> > > > > > >> > > using
>> > > > > > >> > > > > the message grouping feature (
>> > > > > > >> > > > >
>> > > > > > >> > > > >
>> > > > > > >> > > >
>> > > > > > >> > >
>> > > > > > >> >
>> > > > > > >>
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> http://qpid.apache.org/releases/qpid-0.24/java-broker/book/Java-Broker-Queues.html#Java-Broker-Queues-OtherTypes-Message-Grouping
>> > > > > > >> > > > > ).
>> > > > > > >> > > > > From testing I've done, there seems to be a bug with
>> the
>> > > C++
>> > > > > > >> > > > compatibility
>> > > > > > >> > > > > model, and I'm wondering if this is a known issue.
>> > > > > Specifically,
>> > > > > > >> in
>> > > > > > >> > my
>> > > > > > >> > > > test
>> > > > > > >> > > > > I have a queue configured to use a group header field
>> > with
>> > > > > > >> > > > > "qpid.group_header_key" and C++ mode with
>> > > > > > "qpid.shared_msg_group",
>> > > > > > >> > and
>> > > > > > >> > > > have
>> > > > > > >> > > > > multiple listeners to the queue. Each listener will
>> > sleep
>> > > > for
>> > > > > a
>> > > > > > >> short
>> > > > > > >> > > > > amount of time when it receives a message before
>> > > returning.
>> > > > I
>> > > > > > then
>> > > > > > >> > > > enqueue
>> > > > > > >> > > > > 10 messages that do not have a value in the group
>> header
>> > > > field
>> > > > > > to
>> > > > > > >> the
>> > > > > > >> > > > > queue. Since the doc says that messages without a
>> value
>> > in
>> > > > the
>> > > > > > >> > grouping
>> > > > > > >> > > > > header will be delivered to any available consumer,
>> the
>> > > > > > behavior I
>> > > > > > >> > > expect
>> > > > > > >> > > > > is that the messages will be processed in parallel,
>> > i.e.,
>> > > > when
>> > > > > > >> > > listener 1
>> > > > > > >> > > > > is holding on to a message and sleeping, listener 2
>> can
>> > > > > receive
>> > > > > > >> > another
>> > > > > > >> > > > > message from the queue. But what I see is that the
>> > > messages
>> > > > > are
>> > > > > > >> > > processed
>> > > > > > >> > > > > serially -- message 2 won't be received by some
>> thread
>> > > until
>> > > > > > >> message
>> > > > > > >> > 1
>> > > > > > >> > > is
>> > > > > > >> > > > > done. When I use the default mode instead of C++
>> mode,
>> > > then
>> > > > I
>> > > > > > get
>> > > > > > >> the
>> > > > > > >> > > > > parallel processing behavior.
>> > > > > > >> > > > >
>> > > > > > >> > > > > Is this is a known bug, and is there a fix for it
>> > already?
>> > > > > > >> > > > >
>> > > > > > >> > > > > Thanks,
>> > > > > > >> > > > > Helen
>> > > > > > >> > > > >
>> > > > > > >> > > >
>> > > > > > >> > >
>> > > > > > >> >
>> > > > > > >>
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>

Re: Java broker - message grouping in C++ compatibility mode

Posted by Robbie Gemmell <ro...@gmail.com>.

Hi Helen,

I expected that you would see increased performance in some of the tests,
as Rob did some more work on improving performance after the 0.16 release.
I did expect performance to go down somewhat for case #1 as the defect fix
was putting in place some missing synchronization between delivery and
acknowlegement steps, though I admit it dropped a bit more than I expected.
Giving that some thought, its possible that a change to the threading from
some time ago (in 0.16, I think) which generally increased performance all
round at the time may in fact hurt it in that scenario.

In terms of introducing a new mode of grouping, we would actually be quite
open to this as its something we have commented on separately ourselves but
never quite got round to implementing; so much to do, so little time, etc.
In particular, I have never liked the 'default group' disparity between the
two modes and inability to really change it on top of that, so something we
talked about was allowing the default group to be controlled for both
grouping types such that you could specifically turn one on for the
'standard grouping' which doesnt normally have one, or specifically turn it
off for the 'shared groups' that normally do. This would allow them to act
a lot more similar, with the distinction betwen them then becoming more
about whether the groups can float between consumers or not.

Doing the above would allow you to do away with setting the unique
messageid on the messages you dont wish to group and enable any consumer to
pick them up, which should significantly change the performance seen
consuming from a 'shared groups' queue which had a mixture of messages in
groups and message not in groups (since a particular subscription becomes
much more likely to find a message earlier in the queue if/when the group
reset occurs). I think making additional changes outwith those to try and
combat a queue full of unique groups in the shared groups case may require
larger changes to the queue itself, such as a custom delivery loop to allow
altering when the messages can be assigned to the group or even moving away
from the current 'group manager' addition to the base queue and more
towards a custom queue such as is done currently for the priority and
sorted queue implementations.

Robbie

On 14 January 2014 07:15, Helen Kwong <he...@gmail.com> wrote:

> Hi Robbie,
>
> I ran the tests again with a broker built from the latest trunk code and
> with a 0.16 client. The performance results are a lot better in almost
> every test setup (3-4 times better), except for Test #1 where I build up
> 100000 messages of unique grouping values and then see how many messages we
> can process in 5 minutes, using C++ mode (I got ~7800 messages before, and
> now ~5400). The overall comparisons are about the same, with C++ mode still
> performing significantly worse:
>
> 1. For the first test, for a regular queue and a default mode queue (with
> all messages having no grouping values, all having the same value, and all
> having different values), I am able to process 100000 messages in around 3
> and a half minutes. In C++ mode, with all 100000 messages having different
> values, I can process only ~5400 in 5 minutes; with all having no value /
> the default group, ~44000 messages were processed in 5 minutes (after which
> I stop waiting and just clear the queue).
>
> 2. For the second test, for a default mode queue, the performance is again
> not affected by how many unprocessable messages we have at the front of the
> queue due to another consumer holding on to those messages' group for a
> long time. The time it takes to process 1000 messages of group B after N
> messages of group A (which is assigned to another consumer) is about the
> same as processing 1000 messages on a regular queue. With a C++ mode queue,
> the performance still gets worse the more messages of group A we have at
> the front of the queue.
>
>
> > Can I enquire what it is about your use case that would preclude use of a
> > higher prefetch?
>
>
> We want to avoid starving messages that can be processed by another
> consumer. We have multiple dequeue sessions, each listening to multiple
> queues. If we have a higher prefetch, then if there is a long-running
> message on a queue, the messages that were prefetched along with it will
> have to wait for a long time, even if another listener is available and can
> process those messages. And if we decide to add the message grouping
> configuration, we want to avoid starving a message of a particular group if
> there's an available listener -- if a listener can have 2 messages and 2
> different groups assigned at a time, and processing of the first message
> takes a long time, then we might be unfairly starving the second message,
> even though other listeners can process it.
>
> We might be able to tolerate this to some extent if the prefetch is only 2,
> if this is the only way to improve overall throughput. I'll have to discuss
> this more with my team. Though this still wouldn't solve the problem with
> unique keys in C++ mode.
>
>
> > Also, you originally seemed to prefer the idea that
> > messages without a group header value would be not be grouped, so is
> there
> > a particular reason you are leaning towards using the shared groups
> > functionality which can't do that?
> >
>
> Basically, the ideal behavior we want is a third mode that combines the
> two. If no grouping value is specified, then treat it as though it has no
> group; for messages with a grouping value, ensure that only 1 listener can
> be processing messages from a particular group at a time, but don't tie a
> group to a particular consumer for the lifetime of that consumer. The
> reason we want the first part is that we'll have many messages that don't
> belong to any group, and we want them to be processed by different
> listeners in parallel. The reason we don't want a consumer to be associated
> with a group as long as the consumer lives is again to avoid starvation --
> we don't want a consumer processing a message of group A for a long time
> result in starving group B's messages, because it happens so that the first
> group B message was processed by that consumer and so group B is assigned
> to the consumer, when there's another consumer of the same queue that is
> doing nothing and can process the B message.
>
> So we were thinking of asking you guys if you'd be open to introducing a
> third mode, mostly the same as the C++ mode but where no grouping value
> means no group, instead of the default group. Another workaround we thought
> of was for any message that doesn't belong to a group, we'll put its
> message ID in the grouping key property, so that essentially any listener
> can pick it up. That's one reason why we were testing the unique keys case,
> though it's also possible for us to get many different grouping values in a
> queue (though not at quite as high a number, e.g., 100000). But in C++
> mode, with both unique values and same-group values at a high depth, we're
> seeing decreased performance anyway, so we might not be able to use it.
>
> Do you have any suggestions for what we should do? Are there any ideas you
> have for solving the performance issue with unique keys in C++ mode, so
> perhaps we could look into it more?
>
> Thanks,
> Helen
>
>
> > On 13 January 2014 20:34, Helen Kwong <he...@gmail.com> wrote:
> >
> > > Hi Robbie,
> > >
> > > I am actually still running version 0.16 of the broker. It will take
> me a
> > > little time to set up the trunk code and rebuild and rerun the
> > experiments.
> > > Do you think your fix will likely make a difference?
> > >
> > > For the second case with a long-lived consumer being assigned the group
> > of
> > > many messages at the head of the queue, I was indeed using a prefetch
> of
> > 1.
> > > I ran it again (with version 0.16 still) with a prefetch of 2 as you
> > > suggested, and the dequeue time of the messages at the end was then not
> > > affected by the number of unprocessable messages at the beginning of
> the
> > > queue, about the same as the other test setups I ran. However, I think
> > > increasing prefetch to 2 might not work for our use case.
> > >
> > > For the first case with unique message groups, your explanation makes
> > sense
> > > and I think I understand it now. Do you think there is still a way to
> > > optimize this behavior, so that we don't need to possibly traverse
> > through
> > > the whole queue whenever a subscription is unassigned from a group?
> Since
> > > you mentioned that maintaining a fixed pointer to a queue entry would
> > > likely lead to memory retention issues, would having a weak reference
> be
> > a
> > > possible option?
> > >
> > > Thanks,
> > > Helen
> > >
> > >
> > >
> > > On Sat, Jan 11, 2014 at 7:58 AM, Robbie Gemmell <
> > robbie.gemmell@gmail.com
> > > >wrote:
> > >
> > > > Hi Helen,
> > > >
> > > > Can I check what version of the code you were using? I ask as the
> > latest
> > > > trunk or 0.26 release branch code is going to be necessary for
> > > correctness
> > > > and representative testing of the shared groups functionality, due to
> > the
> > > > defect fix I previously mentioned making recently.
> > > > You can find a nightly build of the trunk broker at:
> > > >
> > > >
> > >
> >
> https://builds.apache.org/view/M-R/view/Qpid/job/Qpid-Java-Artefact-Release/lastSuccessfulBuild/artifact/trunk/qpid/java/broker/release/and
> > > > I would need to build the 0.26 branch as the fix was introduced after
> > > > the latest RC.
> > > >
> > > > In your first case, I think the reason for the difference between the
> > > > default group and the unique group is also likely to be tied to the
> > > > 'findEarliestAssignedAvailableEntry' behaviour you mention later in
> > your
> > > > mail. For the default group case, that next message always going to
> be
> > a
> > > > message near the front of the queue. For the unique group, there isnt
> > > > actually going to be a message which matches, but it looks like it
> will
> > > > currently be checking every message to determine that and doing so
> > under
> > > > the synchronization, and thus probably preventing other deliveries
> > > > occurring at the time. That isnt a problem in the non-shared case
> > because
> > > > there isnt a need to synchronise the GroupManager as a whole, and
> even
> > > > going beyond that its also highly unlikely it would need to check as
> > many
> > > > messages before finding a match due to the signifcant difference in
> how
> > > > groups become associated with a particular subscription in the
> > non-shared
> > > > case.
> > > >
> > > > In your second case, your explanation seems likely and I think this
> > case
> > > > really reduces to just being a variant of the above behaviour. The
> > > > particular issue is that one could argue it shouldnt need to be doing
> > the
> > > > 'findEarliestAssignedAvailableEntry' task all that often in this case
> > if
> > > > you have a long-lived consumer, and so your mention of this makes me
> > > think
> > > > you are using a prefetch of 1. Using a prefetch of 1 currently means
> > that
> > > > the delivery state associated with the shared group effectively
> becomes
> > > > empty after each message, because messages are only fully added to
> the
> > > > group when they become acquired by a particular subscription, and
> they
> > > cant
> > > > be acquired until the previous message is consumed (or perhaps
> slightly
> > > > confusingly, explicitly not-consumed). If so, I expect it could be
> very
> > > > interesting to run this case again with a prefetch of 2 or more. The
> > > > obvious tradeoff with increasing prefetch is that a particular
> consumer
> > > > could then be assigned up to <prefetch> groups at a given point,
> though
> > > > likely not in your test case due to the large contiguous blocks of
> > > messages
> > > > for each group.
> > > >
> > > > I'm not sure that the suggestion to track the first message in the
> > group
> > > > would really work currently, due to the way the underlying queue data
> > > > structure works. Maintaining a fixed pointer into it like that is
> > likely
> > > to
> > > > lead to some undesirable memory retention issues, based on a related
> > but
> > > > far simpler case I fixed previously in a similar structure elsewhere
> in
> > > the
> > > > broker. Looking at the way messages become assigned to a group in the
> > > > shared group case may be a more viable path to handling your second
> > case
> > > > more gracefully. The unique groups from you first case would still
> need
> > > > something different though, as neither of these routes would really
> > help
> > > > there.
> > > >
> > > > Robbie
> > > >
> > > > On 11 January 2014 01:03, Helen Kwong <he...@gmail.com> wrote:
> > > >
> > > > > Hi Robbie,
> > > > >
> > > > > I did some more testing to see whether message grouping will work
> for
> > > us,
> > > > > and compared the dequeue performance of a queue using message
> > grouping
> > > in
> > > > > default Java mode, a queue using C++ mode, and a queue not using
> > > message
> > > > > grouping. I found that when I use C++ mode, the performance can be
> > much
> > > > > worse than in other comparable setups, and was wondering if you
> could
> > > > help
> > > > > me understand why.
> > > > >
> > > > > 1.  In one test, I have multiple listeners to a queue, enqueue
> 100000
> > > > > messages to it, and see how many messages are processed in 5
> > minutes. I
> > > > > have these different setups:
> > > > >
> > > > > - C++ mode queue with each message having a unique identifier
> > > > >
> > > > > - C++ mode queue with all messages having no grouping identifier
> (so
> > > all
> > > > > belong to the default group)
> > > > >
> > > > > - default mode queue with each message having a unique grouping
> > > > identifier
> > > > >
> > > > > - default mode queue with all messages having no grouping
> identifier
> > > > >
> > > > > - default mode queue with all messages having the same grouping
> > > > identifier
> > > > >
> > > > > - regular queue without a group header key configured
> > > > >
> > > > > All setups except for the first had about 35K - 39K messages
> > processed,
> > > > but
> > > > > for the first setup, there were under 8000 messages processed. What
> > > could
> > > > > explain this big difference? I’ve looked at the two grouping modes’
> > > > > implementations of MessageGroupManager and see that C++ mode uses
> > > > > synchronized methods rather than a ConcurrentHashMap as in default
> > > mode,
> > > > so
> > > > > I’d guess there might be more contention because of that, but at
> the
> > > same
> > > > > time I can’t see why, in C++ mode, having a unique identifier for
> > each
> > > > > message results in throughput that is so much worse than having all
> > > > > messages in the default group.
> > > > >
> > > > >
> > > > > 2. I also wanted to see the impact of having many messages at the
> > head
> > > of
> > > > > the queue that a listener can’t process because the messages belong
> > to
> > > a
> > > > > group assigned to another consumer. E.g., have 10000 messages of
> > group
> > > A,
> > > > > followed by 1000 messages of group B, and listener 1 is holding on
> to
> > > the
> > > > > first A message for a long time -- see how long it will take
> > listener 2
> > > > to
> > > > > process all the B messages. In this case C++ mode has performance
> > that
> > > > > degrades as the number of unprocessable group A messages at the
> front
> > > of
> > > > > the queue increases, whereas default mode's performance is
> > unaffected,
> > > > > about the same as processing 1000 messages on a regular queue.
> > > > >
> > > > > My rough guess from looking at DefinedGroupMessageGroupManager is
> > that
> > > > > whenever listener 2 is done with a group B message, the state
> change
> > > > > listener triggers the Group.subtract() to reset pointers for other
> > > > > subscriptions and consequently
> findEarliestAssignedAvailableEntry().
> > > This
> > > > > then has to iterate through all the group A messages before it
> finds
> > > the
> > > > B
> > > > > message. Do you think this is the reason for the results I see?
> > > > >
> > > > > If so, is the idea here that other subscriptions of the queue could
> > > have
> > > > > skipped over the messages of a group while the group was assigned
> to
> > > some
> > > > > subscription S, so we need to tell them to set their pointers back?
> > If
> > > > that
> > > > > is indeed the idea, would it be possible to optimize it such that
> > when
> > > a
> > > > > group A is assigned to S and S gets its first message of the group,
> > we
> > > > > store what that first A message / queue entry is. Then when S is
> done
> > > > with
> > > > > the last A message, we can tell other subscriptions to go back to
> > that
> > > > > first entry, without having to iterate through the queue?
> > > > >
> > > > > Thanks a lot for your help!
> > > > >
> > > > > Helen
> > > > >
> > > > >
> > > > > On Tue, Jan 7, 2014 at 8:46 PM, Robbie Gemmell <
> > > robbie.gemmell@gmail.com
> > > > > >wrote:
> > > > >
> > > > > > ...and just to be super clear, though I think it it is mentioned
> > > > > correctly
> > > > > > in the docs this time, the 'default group' concept does not apply
> > in
> > > > the
> > > > > > regular / 'non shared' grouping mode. Messages that dont specify
> a
> > > > group
> > > > > > key value in that mode are simply not grouped in any way.
> > > > > >
> > > > > > On 8 January 2014 04:41, Robbie Gemmell <
> robbie.gemmell@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > >
> > > > > > > On 8 January 2014 04:33, Helen Kwong <he...@gmail.com>
> > wrote:
> > > > > > >
> > > > > > >> Oh I see, I thought what you meant was that I could only alter
> > the
> > > > > > default
> > > > > > >> group in shared-groups mode starting with 0.24.
> > > > > > >
> > > > > > >
> > > > > > > No, I just missed that you said 0.16 and assumed 0.24 was the
> > > version
> > > > > you
> > > > > > > were using . You could always change it, just in more limited
> > ways
> > > in
> > > > > > > earlier releases.
> > > > > > >
> > > > > > > To make sure I'm
> > > > > > >> understanding this correctly -- changing the the default
> message
> > > > group
> > > > > > >> name
> > > > > > >> to something else in C++ mode won't change the serial
> processing
> > > > > > behavior
> > > > > > >> I
> > > > > > >> saw, right?
> > > > > > >
> > > > > > >
> > > > > > > Correct
> > > > > > >
> > > > > > >
> > > > > > >> Messages without a group identifier will still be considered
> to
> > > > > > >> be in a group -- rather than no group -- and they cannot be
> > > > processed
> > > > > by
> > > > > > >> multiple consumers concurrently?
> > > > > > >>
> > > > > > >>
> > > > > > > Yes. In the C++/shared-groups mode every message is considered
> to
> > > be
> > > > > in a
> > > > > > > group, it is just a case of whether the message specifies that
> > > group
> > > > or
> > > > > > > instead gets put into the default group.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >> Thanks,
> > > > > > >> Helen
> > > > > > >>
> > > > > > >>
> > > > > > >> On Tue, Jan 7, 2014 at 8:22 PM, Robbie Gemmell <
> > > > > > robbie.gemmell@gmail.com
> > > > > > >> >wrote:
> > > > > > >>
> > > > > > >> > I just noticed you said you were using 0.16, somehow glossed
> > > over
> > > > it
> > > > > > >> > originally and only noticed the 0.24 in the doc URL (its
> many
> > > > hours
> > > > > > past
> > > > > > >> > time I was asleep, I might be getting tired).
> > > > > > >> >
> > > > > > >> > Realising that, I should add that prior to 0.22 the only way
> > to
> > > > > alter
> > > > > > >> the
> > > > > > >> > default group in the shared-groups mode from 'qpid.no-group'
> > to
> > > > > > >> something
> > > > > > >> > else would have been via the 'qpid.default-message-group'
> > queue
> > > > > > declare
> > > > > > >> > argument when using an AMQP client to create the queue
> > > originally,
> > > > > and
> > > > > > >> for
> > > > > > >> > 0.22 itself only that and the system property approach I
> > > mentioned
> > > > > > would
> > > > > > >> > work.
> > > > > > >> >
> > > > > > >> > Robbie
> > > > > > >> >
> > > > > > >> > On 8 January 2014 04:03, Helen Kwong <he...@gmail.com>
> > > > wrote:
> > > > > > >> >
> > > > > > >> > > Hi Robbie,
> > > > > > >> > >
> > > > > > >> > > I see. Thanks for the quick response and explanation!
> > > > > > >> > >
> > > > > > >> > > Helen
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > > On Tue, Jan 7, 2014 at 7:43 PM, Robbie Gemmell <
> > > > > > >> robbie.gemmell@gmail.com
> > > > > > >> > > >wrote:
> > > > > > >> > >
> > > > > > >> > > > Hi Helen,
> > > > > > >> > > >
> > > > > > >> > > > The short answer to your question is that it is the
> > > > > documentation
> > > > > > >> which
> > > > > > >> > > is
> > > > > > >> > > > incorrect, and the behaviour you are seeing is expected.
> > > > > > >> > > >
> > > > > > >> > > > The long answer is, when that documentation was
> composed a
> > > > > segment
> > > > > > >> was
> > > > > > >> > > > missed out indicating this, and needs to be added to the
> > > docs.
> > > > > The
> > > > > > >> > > > behaviour listed for when no group is specified is only
> > true
> > > > of
> > > > > > the
> > > > > > >> > > > 'non-shared' groups supported by the Java broker, in the
> > > > > > C++/shared
> > > > > > >> > group
> > > > > > >> > > > mode any messages recieved without an explicit group
> value
> > > are
> > > > > all
> > > > > > >> > > assigned
> > > > > > >> > > > to a default group of 'qpid.no-group'. This is as per
> the
> > > > > > behaviour
> > > > > > >> of
> > > > > > >> > > the
> > > > > > >> > > > C++ broker itself, which is explained in the C++ broker
> > docs
> > > > at
> > > > > > the
> > > > > > >> end
> > > > > > >> > > of
> > > > > > >> > > > the following page
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> http://qpid.apache.org/releases/qpid-0.24/cpp-broker/book/Using-message-groups.html
> > > > > > >> > > > .
> > > > > > >> > > > For the 0.24 Java broker, this default shared group can
> be
> > > > > changed
> > > > > > >> > > > broker-wide using the Java system property
> > > > > > >> > > > 'qpid.broker_default-shared-message-group', or can be
> > > > overriden
> > > > > > for
> > > > > > >> an
> > > > > > >> > > > individual queue during creation programatically via
> AMQP
> > > > > clients
> > > > > > or
> > > > > > >> > the
> > > > > > >> > > > management interfaces through use of the argument
> > > > > > >> > > > 'qpid.default-message-group' or
> > 'messageGroupDefaultGroup'.
> > > > > > >> > > >
> > > > > > >> > > > I coincidentally happened to have fixed a defect with
> the
> > > > shared
> > > > > > >> groups
> > > > > > >> > > > functionality last night on trunk. Its not yet included
> in
> > > the
> > > > > > >> imminent
> > > > > > >> > > > 0.26 release, though I am about to request whether that
> is
> > > > > > possible.
> > > > > > >> > > > https://issues.apache.org/jira/browse/QPID-5450
> > > > > > >> > > >
> > > > > > >> > > > Robbie
> > > > > > >> > > >
> > > > > > >> > > > On 8 January 2014 02:43, Helen Kwong <
> > helenkwong@gmail.com>
> > > > > > wrote:
> > > > > > >> > > >
> > > > > > >> > > > > Hi,
> > > > > > >> > > > >
> > > > > > >> > > > > I use the Java broker and client, version 0.16, and am
> > > > > > considering
> > > > > > >> > > using
> > > > > > >> > > > > the message grouping feature (
> > > > > > >> > > > >
> > > > > > >> > > > >
> > > > > > >> > > >
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> http://qpid.apache.org/releases/qpid-0.24/java-broker/book/Java-Broker-Queues.html#Java-Broker-Queues-OtherTypes-Message-Grouping
> > > > > > >> > > > > ).
> > > > > > >> > > > > From testing I've done, there seems to be a bug with
> the
> > > C++
> > > > > > >> > > > compatibility
> > > > > > >> > > > > model, and I'm wondering if this is a known issue.
> > > > > Specifically,
> > > > > > >> in
> > > > > > >> > my
> > > > > > >> > > > test
> > > > > > >> > > > > I have a queue configured to use a group header field
> > with
> > > > > > >> > > > > "qpid.group_header_key" and C++ mode with
> > > > > > "qpid.shared_msg_group",
> > > > > > >> > and
> > > > > > >> > > > have
> > > > > > >> > > > > multiple listeners to the queue. Each listener will
> > sleep
> > > > for
> > > > > a
> > > > > > >> short
> > > > > > >> > > > > amount of time when it receives a message before
> > > returning.
> > > > I
> > > > > > then
> > > > > > >> > > > enqueue
> > > > > > >> > > > > 10 messages that do not have a value in the group
> header
> > > > field
> > > > > > to
> > > > > > >> the
> > > > > > >> > > > > queue. Since the doc says that messages without a
> value
> > in
> > > > the
> > > > > > >> > grouping
> > > > > > >> > > > > header will be delivered to any available consumer,
> the
> > > > > > behavior I
> > > > > > >> > > expect
> > > > > > >> > > > > is that the messages will be processed in parallel,
> > i.e.,
> > > > when
> > > > > > >> > > listener 1
> > > > > > >> > > > > is holding on to a message and sleeping, listener 2
> can
> > > > > receive
> > > > > > >> > another
> > > > > > >> > > > > message from the queue. But what I see is that the
> > > messages
> > > > > are
> > > > > > >> > > processed
> > > > > > >> > > > > serially -- message 2 won't be received by some thread
> > > until
> > > > > > >> message
> > > > > > >> > 1
> > > > > > >> > > is
> > > > > > >> > > > > done. When I use the default mode instead of C++ mode,
> > > then
> > > > I
> > > > > > get
> > > > > > >> the
> > > > > > >> > > > > parallel processing behavior.
> > > > > > >> > > > >
> > > > > > >> > > > > Is this is a known bug, and is there a fix for it
> > already?
> > > > > > >> > > > >
> > > > > > >> > > > > Thanks,
> > > > > > >> > > > > Helen
> > > > > > >> > > > >
> > > > > > >> > > >
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Java broker - message grouping in C++ compatibility mode

Posted by Helen Kwong <he...@gmail.com>.

Hi Robbie,

I ran the tests again with a broker built from the latest trunk code and
with a 0.16 client. The performance results are a lot better in almost
every test setup (3-4 times better), except for Test #1 where I build up
100000 messages of unique grouping values and then see how many messages we
can process in 5 minutes, using C++ mode (I got ~7800 messages before, and
now ~5400). The overall comparisons are about the same, with C++ mode still
performing significantly worse:

1. For the first test, for a regular queue and a default mode queue (with
all messages having no grouping values, all having the same value, and all
having different values), I am able to process 100000 messages in around 3
and a half minutes. In C++ mode, with all 100000 messages having different
values, I can process only ~5400 in 5 minutes; with all having no value /
the default group, ~44000 messages were processed in 5 minutes (after which
I stop waiting and just clear the queue).

2. For the second test, for a default mode queue, the performance is again
not affected by how many unprocessable messages we have at the front of the
queue due to another consumer holding on to those messages' group for a
long time. The time it takes to process 1000 messages of group B after N
messages of group A (which is assigned to another consumer) is about the
same as processing 1000 messages on a regular queue. With a C++ mode queue,
the performance still gets worse the more messages of group A we have at
the front of the queue.


> Can I enquire what it is about your use case that would preclude use of a
> higher prefetch?


We want to avoid starving messages that can be processed by another
consumer. We have multiple dequeue sessions, each listening to multiple
queues. If we have a higher prefetch, then if there is a long-running
message on a queue, the messages that were prefetched along with it will
have to wait for a long time, even if another listener is available and can
process those messages. And if we decide to add the message grouping
configuration, we want to avoid starving a message of a particular group if
there's an available listener -- if a listener can have 2 messages and 2
different groups assigned at a time, and processing of the first message
takes a long time, then we might be unfairly starving the second message,
even though other listeners can process it.

We might be able to tolerate this to some extent if the prefetch is only 2,
if this is the only way to improve overall throughput. I'll have to discuss
this more with my team. Though this still wouldn't solve the problem with
unique keys in C++ mode.


> Also, you originally seemed to prefer the idea that
> messages without a group header value would be not be grouped, so is there
> a particular reason you are leaning towards using the shared groups
> functionality which can't do that?
>

Basically, the ideal behavior we want is a third mode that combines the
two. If no grouping value is specified, then treat it as though it has no
group; for messages with a grouping value, ensure that only 1 listener can
be processing messages from a particular group at a time, but don't tie a
group to a particular consumer for the lifetime of that consumer. The
reason we want the first part is that we'll have many messages that don't
belong to any group, and we want them to be processed by different
listeners in parallel. The reason we don't want a consumer to be associated
with a group as long as the consumer lives is again to avoid starvation --
we don't want a consumer processing a message of group A for a long time
result in starving group B's messages, because it happens so that the first
group B message was processed by that consumer and so group B is assigned
to the consumer, when there's another consumer of the same queue that is
doing nothing and can process the B message.

So we were thinking of asking you guys if you'd be open to introducing a
third mode, mostly the same as the C++ mode but where no grouping value
means no group, instead of the default group. Another workaround we thought
of was for any message that doesn't belong to a group, we'll put its
message ID in the grouping key property, so that essentially any listener
can pick it up. That's one reason why we were testing the unique keys case,
though it's also possible for us to get many different grouping values in a
queue (though not at quite as high a number, e.g., 100000). But in C++
mode, with both unique values and same-group values at a high depth, we're
seeing decreased performance anyway, so we might not be able to use it.

Do you have any suggestions for what we should do? Are there any ideas you
have for solving the performance issue with unique keys in C++ mode, so
perhaps we could look into it more?

Thanks,
Helen


> On 13 January 2014 20:34, Helen Kwong <he...@gmail.com> wrote:
>
> > Hi Robbie,
> >
> > I am actually still running version 0.16 of the broker. It will take me a
> > little time to set up the trunk code and rebuild and rerun the
> experiments.
> > Do you think your fix will likely make a difference?
> >
> > For the second case with a long-lived consumer being assigned the group
> of
> > many messages at the head of the queue, I was indeed using a prefetch of
> 1.
> > I ran it again (with version 0.16 still) with a prefetch of 2 as you
> > suggested, and the dequeue time of the messages at the end was then not
> > affected by the number of unprocessable messages at the beginning of the
> > queue, about the same as the other test setups I ran. However, I think
> > increasing prefetch to 2 might not work for our use case.
> >
> > For the first case with unique message groups, your explanation makes
> sense
> > and I think I understand it now. Do you think there is still a way to
> > optimize this behavior, so that we don't need to possibly traverse
> through
> > the whole queue whenever a subscription is unassigned from a group? Since
> > you mentioned that maintaining a fixed pointer to a queue entry would
> > likely lead to memory retention issues, would having a weak reference be
> a
> > possible option?
> >
> > Thanks,
> > Helen
> >
> >
> >
> > On Sat, Jan 11, 2014 at 7:58 AM, Robbie Gemmell <
> robbie.gemmell@gmail.com
> > >wrote:
> >
> > > Hi Helen,
> > >
> > > Can I check what version of the code you were using? I ask as the
> latest
> > > trunk or 0.26 release branch code is going to be necessary for
> > correctness
> > > and representative testing of the shared groups functionality, due to
> the
> > > defect fix I previously mentioned making recently.
> > > You can find a nightly build of the trunk broker at:
> > >
> > >
> >
> https://builds.apache.org/view/M-R/view/Qpid/job/Qpid-Java-Artefact-Release/lastSuccessfulBuild/artifact/trunk/qpid/java/broker/release/and
> > > I would need to build the 0.26 branch as the fix was introduced after
> > > the latest RC.
> > >
> > > In your first case, I think the reason for the difference between the
> > > default group and the unique group is also likely to be tied to the
> > > 'findEarliestAssignedAvailableEntry' behaviour you mention later in
> your
> > > mail. For the default group case, that next message always going to be
> a
> > > message near the front of the queue. For the unique group, there isnt
> > > actually going to be a message which matches, but it looks like it will
> > > currently be checking every message to determine that and doing so
> under
> > > the synchronization, and thus probably preventing other deliveries
> > > occurring at the time. That isnt a problem in the non-shared case
> because
> > > there isnt a need to synchronise the GroupManager as a whole, and even
> > > going beyond that its also highly unlikely it would need to check as
> many
> > > messages before finding a match due to the signifcant difference in how
> > > groups become associated with a particular subscription in the
> non-shared
> > > case.
> > >
> > > In your second case, your explanation seems likely and I think this
> case
> > > really reduces to just being a variant of the above behaviour. The
> > > particular issue is that one could argue it shouldnt need to be doing
> the
> > > 'findEarliestAssignedAvailableEntry' task all that often in this case
> if
> > > you have a long-lived consumer, and so your mention of this makes me
> > think
> > > you are using a prefetch of 1. Using a prefetch of 1 currently means
> that
> > > the delivery state associated with the shared group effectively becomes
> > > empty after each message, because messages are only fully added to the
> > > group when they become acquired by a particular subscription, and they
> > cant
> > > be acquired until the previous message is consumed (or perhaps slightly
> > > confusingly, explicitly not-consumed). If so, I expect it could be very
> > > interesting to run this case again with a prefetch of 2 or more. The
> > > obvious tradeoff with increasing prefetch is that a particular consumer
> > > could then be assigned up to <prefetch> groups at a given point, though
> > > likely not in your test case due to the large contiguous blocks of
> > messages
> > > for each group.
> > >
> > > I'm not sure that the suggestion to track the first message in the
> group
> > > would really work currently, due to the way the underlying queue data
> > > structure works. Maintaining a fixed pointer into it like that is
> likely
> > to
> > > lead to some undesirable memory retention issues, based on a related
> but
> > > far simpler case I fixed previously in a similar structure elsewhere in
> > the
> > > broker. Looking at the way messages become assigned to a group in the
> > > shared group case may be a more viable path to handling your second
> case
> > > more gracefully. The unique groups from you first case would still need
> > > something different though, as neither of these routes would really
> help
> > > there.
> > >
> > > Robbie
> > >
> > > On 11 January 2014 01:03, Helen Kwong <he...@gmail.com> wrote:
> > >
> > > > Hi Robbie,
> > > >
> > > > I did some more testing to see whether message grouping will work for
> > us,
> > > > and compared the dequeue performance of a queue using message
> grouping
> > in
> > > > default Java mode, a queue using C++ mode, and a queue not using
> > message
> > > > grouping. I found that when I use C++ mode, the performance can be
> much
> > > > worse than in other comparable setups, and was wondering if you could
> > > help
> > > > me understand why.
> > > >
> > > > 1.  In one test, I have multiple listeners to a queue, enqueue 100000
> > > > messages to it, and see how many messages are processed in 5
> minutes. I
> > > > have these different setups:
> > > >
> > > > - C++ mode queue with each message having a unique identifier
> > > >
> > > > - C++ mode queue with all messages having no grouping identifier (so
> > all
> > > > belong to the default group)
> > > >
> > > > - default mode queue with each message having a unique grouping
> > > identifier
> > > >
> > > > - default mode queue with all messages having no grouping identifier
> > > >
> > > > - default mode queue with all messages having the same grouping
> > > identifier
> > > >
> > > > - regular queue without a group header key configured
> > > >
> > > > All setups except for the first had about 35K - 39K messages
> processed,
> > > but
> > > > for the first setup, there were under 8000 messages processed. What
> > could
> > > > explain this big difference? I’ve looked at the two grouping modes’
> > > > implementations of MessageGroupManager and see that C++ mode uses
> > > > synchronized methods rather than a ConcurrentHashMap as in default
> > mode,
> > > so
> > > > I’d guess there might be more contention because of that, but at the
> > same
> > > > time I can’t see why, in C++ mode, having a unique identifier for
> each
> > > > message results in throughput that is so much worse than having all
> > > > messages in the default group.
> > > >
> > > >
> > > > 2. I also wanted to see the impact of having many messages at the
> head
> > of
> > > > the queue that a listener can’t process because the messages belong
> to
> > a
> > > > group assigned to another consumer. E.g., have 10000 messages of
> group
> > A,
> > > > followed by 1000 messages of group B, and listener 1 is holding on to
> > the
> > > > first A message for a long time -- see how long it will take
> listener 2
> > > to
> > > > process all the B messages. In this case C++ mode has performance
> that
> > > > degrades as the number of unprocessable group A messages at the front
> > of
> > > > the queue increases, whereas default mode's performance is
> unaffected,
> > > > about the same as processing 1000 messages on a regular queue.
> > > >
> > > > My rough guess from looking at DefinedGroupMessageGroupManager is
> that
> > > > whenever listener 2 is done with a group B message, the state change
> > > > listener triggers the Group.subtract() to reset pointers for other
> > > > subscriptions and consequently findEarliestAssignedAvailableEntry().
> > This
> > > > then has to iterate through all the group A messages before it finds
> > the
> > > B
> > > > message. Do you think this is the reason for the results I see?
> > > >
> > > > If so, is the idea here that other subscriptions of the queue could
> > have
> > > > skipped over the messages of a group while the group was assigned to
> > some
> > > > subscription S, so we need to tell them to set their pointers back?
> If
> > > that
> > > > is indeed the idea, would it be possible to optimize it such that
> when
> > a
> > > > group A is assigned to S and S gets its first message of the group,
> we
> > > > store what that first A message / queue entry is. Then when S is done
> > > with
> > > > the last A message, we can tell other subscriptions to go back to
> that
> > > > first entry, without having to iterate through the queue?
> > > >
> > > > Thanks a lot for your help!
> > > >
> > > > Helen
> > > >
> > > >
> > > > On Tue, Jan 7, 2014 at 8:46 PM, Robbie Gemmell <
> > robbie.gemmell@gmail.com
> > > > >wrote:
> > > >
> > > > > ...and just to be super clear, though I think it it is mentioned
> > > > correctly
> > > > > in the docs this time, the 'default group' concept does not apply
> in
> > > the
> > > > > regular / 'non shared' grouping mode. Messages that dont specify a
> > > group
> > > > > key value in that mode are simply not grouped in any way.
> > > > >
> > > > > On 8 January 2014 04:41, Robbie Gemmell <ro...@gmail.com>
> > > > wrote:
> > > > >
> > > > > >
> > > > > > On 8 January 2014 04:33, Helen Kwong <he...@gmail.com>
> wrote:
> > > > > >
> > > > > >> Oh I see, I thought what you meant was that I could only alter
> the
> > > > > default
> > > > > >> group in shared-groups mode starting with 0.24.
> > > > > >
> > > > > >
> > > > > > No, I just missed that you said 0.16 and assumed 0.24 was the
> > version
> > > > you
> > > > > > were using . You could always change it, just in more limited
> ways
> > in
> > > > > > earlier releases.
> > > > > >
> > > > > > To make sure I'm
> > > > > >> understanding this correctly -- changing the the default message
> > > group
> > > > > >> name
> > > > > >> to something else in C++ mode won't change the serial processing
> > > > > behavior
> > > > > >> I
> > > > > >> saw, right?
> > > > > >
> > > > > >
> > > > > > Correct
> > > > > >
> > > > > >
> > > > > >> Messages without a group identifier will still be considered to
> > > > > >> be in a group -- rather than no group -- and they cannot be
> > > processed
> > > > by
> > > > > >> multiple consumers concurrently?
> > > > > >>
> > > > > >>
> > > > > > Yes. In the C++/shared-groups mode every message is considered to
> > be
> > > > in a
> > > > > > group, it is just a case of whether the message specifies that
> > group
> > > or
> > > > > > instead gets put into the default group.
> > > > > >
> > > > > >
> > > > > >
> > > > > >> Thanks,
> > > > > >> Helen
> > > > > >>
> > > > > >>
> > > > > >> On Tue, Jan 7, 2014 at 8:22 PM, Robbie Gemmell <
> > > > > robbie.gemmell@gmail.com
> > > > > >> >wrote:
> > > > > >>
> > > > > >> > I just noticed you said you were using 0.16, somehow glossed
> > over
> > > it
> > > > > >> > originally and only noticed the 0.24 in the doc URL (its many
> > > hours
> > > > > past
> > > > > >> > time I was asleep, I might be getting tired).
> > > > > >> >
> > > > > >> > Realising that, I should add that prior to 0.22 the only way
> to
> > > > alter
> > > > > >> the
> > > > > >> > default group in the shared-groups mode from 'qpid.no-group'
> to
> > > > > >> something
> > > > > >> > else would have been via the 'qpid.default-message-group'
> queue
> > > > > declare
> > > > > >> > argument when using an AMQP client to create the queue
> > originally,
> > > > and
> > > > > >> for
> > > > > >> > 0.22 itself only that and the system property approach I
> > mentioned
> > > > > would
> > > > > >> > work.
> > > > > >> >
> > > > > >> > Robbie
> > > > > >> >
> > > > > >> > On 8 January 2014 04:03, Helen Kwong <he...@gmail.com>
> > > wrote:
> > > > > >> >
> > > > > >> > > Hi Robbie,
> > > > > >> > >
> > > > > >> > > I see. Thanks for the quick response and explanation!
> > > > > >> > >
> > > > > >> > > Helen
> > > > > >> > >
> > > > > >> > >
> > > > > >> > > On Tue, Jan 7, 2014 at 7:43 PM, Robbie Gemmell <
> > > > > >> robbie.gemmell@gmail.com
> > > > > >> > > >wrote:
> > > > > >> > >
> > > > > >> > > > Hi Helen,
> > > > > >> > > >
> > > > > >> > > > The short answer to your question is that it is the
> > > > documentation
> > > > > >> which
> > > > > >> > > is
> > > > > >> > > > incorrect, and the behaviour you are seeing is expected.
> > > > > >> > > >
> > > > > >> > > > The long answer is, when that documentation was composed a
> > > > segment
> > > > > >> was
> > > > > >> > > > missed out indicating this, and needs to be added to the
> > docs.
> > > > The
> > > > > >> > > > behaviour listed for when no group is specified is only
> true
> > > of
> > > > > the
> > > > > >> > > > 'non-shared' groups supported by the Java broker, in the
> > > > > C++/shared
> > > > > >> > group
> > > > > >> > > > mode any messages recieved without an explicit group value
> > are
> > > > all
> > > > > >> > > assigned
> > > > > >> > > > to a default group of 'qpid.no-group'. This is as per the
> > > > > behaviour
> > > > > >> of
> > > > > >> > > the
> > > > > >> > > > C++ broker itself, which is explained in the C++ broker
> docs
> > > at
> > > > > the
> > > > > >> end
> > > > > >> > > of
> > > > > >> > > > the following page
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > >
> > > > > >> >
> > > > > >>
> > > > >
> > > >
> > >
> >
> http://qpid.apache.org/releases/qpid-0.24/cpp-broker/book/Using-message-groups.html
> > > > > >> > > > .
> > > > > >> > > > For the 0.24 Java broker, this default shared group can be
> > > > changed
> > > > > >> > > > broker-wide using the Java system property
> > > > > >> > > > 'qpid.broker_default-shared-message-group', or can be
> > > overriden
> > > > > for
> > > > > >> an
> > > > > >> > > > individual queue during creation programatically via AMQP
> > > > clients
> > > > > or
> > > > > >> > the
> > > > > >> > > > management interfaces through use of the argument
> > > > > >> > > > 'qpid.default-message-group' or
> 'messageGroupDefaultGroup'.
> > > > > >> > > >
> > > > > >> > > > I coincidentally happened to have fixed a defect with the
> > > shared
> > > > > >> groups
> > > > > >> > > > functionality last night on trunk. Its not yet included in
> > the
> > > > > >> imminent
> > > > > >> > > > 0.26 release, though I am about to request whether that is
> > > > > possible.
> > > > > >> > > > https://issues.apache.org/jira/browse/QPID-5450
> > > > > >> > > >
> > > > > >> > > > Robbie
> > > > > >> > > >
> > > > > >> > > > On 8 January 2014 02:43, Helen Kwong <
> helenkwong@gmail.com>
> > > > > wrote:
> > > > > >> > > >
> > > > > >> > > > > Hi,
> > > > > >> > > > >
> > > > > >> > > > > I use the Java broker and client, version 0.16, and am
> > > > > considering
> > > > > >> > > using
> > > > > >> > > > > the message grouping feature (
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > >
> > > > > >> > >
> > > > > >> >
> > > > > >>
> > > > >
> > > >
> > >
> >
> http://qpid.apache.org/releases/qpid-0.24/java-broker/book/Java-Broker-Queues.html#Java-Broker-Queues-OtherTypes-Message-Grouping
> > > > > >> > > > > ).
> > > > > >> > > > > From testing I've done, there seems to be a bug with the
> > C++
> > > > > >> > > > compatibility
> > > > > >> > > > > model, and I'm wondering if this is a known issue.
> > > > Specifically,
> > > > > >> in
> > > > > >> > my
> > > > > >> > > > test
> > > > > >> > > > > I have a queue configured to use a group header field
> with
> > > > > >> > > > > "qpid.group_header_key" and C++ mode with
> > > > > "qpid.shared_msg_group",
> > > > > >> > and
> > > > > >> > > > have
> > > > > >> > > > > multiple listeners to the queue. Each listener will
> sleep
> > > for
> > > > a
> > > > > >> short
> > > > > >> > > > > amount of time when it receives a message before
> > returning.
> > > I
> > > > > then
> > > > > >> > > > enqueue
> > > > > >> > > > > 10 messages that do not have a value in the group header
> > > field
> > > > > to
> > > > > >> the
> > > > > >> > > > > queue. Since the doc says that messages without a value
> in
> > > the
> > > > > >> > grouping
> > > > > >> > > > > header will be delivered to any available consumer, the
> > > > > behavior I
> > > > > >> > > expect
> > > > > >> > > > > is that the messages will be processed in parallel,
> i.e.,
> > > when
> > > > > >> > > listener 1
> > > > > >> > > > > is holding on to a message and sleeping, listener 2 can
> > > > receive
> > > > > >> > another
> > > > > >> > > > > message from the queue. But what I see is that the
> > messages
> > > > are
> > > > > >> > > processed
> > > > > >> > > > > serially -- message 2 won't be received by some thread
> > until
> > > > > >> message
> > > > > >> > 1
> > > > > >> > > is
> > > > > >> > > > > done. When I use the default mode instead of C++ mode,
> > then
> > > I
> > > > > get
> > > > > >> the
> > > > > >> > > > > parallel processing behavior.
> > > > > >> > > > >
> > > > > >> > > > > Is this is a known bug, and is there a fix for it
> already?
> > > > > >> > > > >
> > > > > >> > > > > Thanks,
> > > > > >> > > > > Helen
> > > > > >> > > > >
> > > > > >> > > >
> > > > > >> > >
> > > > > >> >
> > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Java broker - message grouping in C++ compatibility mode

Posted by Robbie Gemmell <ro...@gmail.com>.

Hi Helen,

The change I mentioned is an important defect fix if you are looking to use
the shared groups, the underlying issue allowed messages to be delivered
concurrently that should not have been (in particular when using a prefetch
of 1, it was significantly less likely to ever occur at >1 prefetch) and so
to ensure your results are valid you should look to pick up the newer code
or backport the change to the old release.

Can I enquire what it is about your use case that would preclude use of a
higher prefetch? Also, you originally seemed to prefer the idea that
messages without a group header value would be not be grouped, so is there
a particular reason you are leaning towards using the shared groups
functionality which can't do that?

A WeakReference might be more viable, but could also end up doing little
good due to when its reference would be required (mostly after the target
entry has been unlinked from the queue structure), a SoftReference might be
another option....it would need looked into it in greater depth some time,
which I am somewhat short on at the moment (hence the late night emails).
That still wouldnt help with the entirely-unique-keys shared groups case
though, which might need a different approach in general than is currently
being taken.

Robbie

On 13 January 2014 20:34, Helen Kwong <he...@gmail.com> wrote:

> Hi Robbie,
>
> I am actually still running version 0.16 of the broker. It will take me a
> little time to set up the trunk code and rebuild and rerun the experiments.
> Do you think your fix will likely make a difference?
>
> For the second case with a long-lived consumer being assigned the group of
> many messages at the head of the queue, I was indeed using a prefetch of 1.
> I ran it again (with version 0.16 still) with a prefetch of 2 as you
> suggested, and the dequeue time of the messages at the end was then not
> affected by the number of unprocessable messages at the beginning of the
> queue, about the same as the other test setups I ran. However, I think
> increasing prefetch to 2 might not work for our use case.
>
> For the first case with unique message groups, your explanation makes sense
> and I think I understand it now. Do you think there is still a way to
> optimize this behavior, so that we don't need to possibly traverse through
> the whole queue whenever a subscription is unassigned from a group? Since
> you mentioned that maintaining a fixed pointer to a queue entry would
> likely lead to memory retention issues, would having a weak reference be a
> possible option?
>
> Thanks,
> Helen
>
>
>
> On Sat, Jan 11, 2014 at 7:58 AM, Robbie Gemmell <robbie.gemmell@gmail.com
> >wrote:
>
> > Hi Helen,
> >
> > Can I check what version of the code you were using? I ask as the latest
> > trunk or 0.26 release branch code is going to be necessary for
> correctness
> > and representative testing of the shared groups functionality, due to the
> > defect fix I previously mentioned making recently.
> > You can find a nightly build of the trunk broker at:
> >
> >
> https://builds.apache.org/view/M-R/view/Qpid/job/Qpid-Java-Artefact-Release/lastSuccessfulBuild/artifact/trunk/qpid/java/broker/release/and
> > I would need to build the 0.26 branch as the fix was introduced after
> > the latest RC.
> >
> > In your first case, I think the reason for the difference between the
> > default group and the unique group is also likely to be tied to the
> > 'findEarliestAssignedAvailableEntry' behaviour you mention later in your
> > mail. For the default group case, that next message always going to be a
> > message near the front of the queue. For the unique group, there isnt
> > actually going to be a message which matches, but it looks like it will
> > currently be checking every message to determine that and doing so under
> > the synchronization, and thus probably preventing other deliveries
> > occurring at the time. That isnt a problem in the non-shared case because
> > there isnt a need to synchronise the GroupManager as a whole, and even
> > going beyond that its also highly unlikely it would need to check as many
> > messages before finding a match due to the signifcant difference in how
> > groups become associated with a particular subscription in the non-shared
> > case.
> >
> > In your second case, your explanation seems likely and I think this case
> > really reduces to just being a variant of the above behaviour. The
> > particular issue is that one could argue it shouldnt need to be doing the
> > 'findEarliestAssignedAvailableEntry' task all that often in this case if
> > you have a long-lived consumer, and so your mention of this makes me
> think
> > you are using a prefetch of 1. Using a prefetch of 1 currently means that
> > the delivery state associated with the shared group effectively becomes
> > empty after each message, because messages are only fully added to the
> > group when they become acquired by a particular subscription, and they
> cant
> > be acquired until the previous message is consumed (or perhaps slightly
> > confusingly, explicitly not-consumed). If so, I expect it could be very
> > interesting to run this case again with a prefetch of 2 or more. The
> > obvious tradeoff with increasing prefetch is that a particular consumer
> > could then be assigned up to <prefetch> groups at a given point, though
> > likely not in your test case due to the large contiguous blocks of
> messages
> > for each group.
> >
> > I'm not sure that the suggestion to track the first message in the group
> > would really work currently, due to the way the underlying queue data
> > structure works. Maintaining a fixed pointer into it like that is likely
> to
> > lead to some undesirable memory retention issues, based on a related but
> > far simpler case I fixed previously in a similar structure elsewhere in
> the
> > broker. Looking at the way messages become assigned to a group in the
> > shared group case may be a more viable path to handling your second case
> > more gracefully. The unique groups from you first case would still need
> > something different though, as neither of these routes would really help
> > there.
> >
> > Robbie
> >
> > On 11 January 2014 01:03, Helen Kwong <he...@gmail.com> wrote:
> >
> > > Hi Robbie,
> > >
> > > I did some more testing to see whether message grouping will work for
> us,
> > > and compared the dequeue performance of a queue using message grouping
> in
> > > default Java mode, a queue using C++ mode, and a queue not using
> message
> > > grouping. I found that when I use C++ mode, the performance can be much
> > > worse than in other comparable setups, and was wondering if you could
> > help
> > > me understand why.
> > >
> > > 1.  In one test, I have multiple listeners to a queue, enqueue 100000
> > > messages to it, and see how many messages are processed in 5 minutes. I
> > > have these different setups:
> > >
> > > - C++ mode queue with each message having a unique identifier
> > >
> > > - C++ mode queue with all messages having no grouping identifier (so
> all
> > > belong to the default group)
> > >
> > > - default mode queue with each message having a unique grouping
> > identifier
> > >
> > > - default mode queue with all messages having no grouping identifier
> > >
> > > - default mode queue with all messages having the same grouping
> > identifier
> > >
> > > - regular queue without a group header key configured
> > >
> > > All setups except for the first had about 35K - 39K messages processed,
> > but
> > > for the first setup, there were under 8000 messages processed. What
> could
> > > explain this big difference? I’ve looked at the two grouping modes’
> > > implementations of MessageGroupManager and see that C++ mode uses
> > > synchronized methods rather than a ConcurrentHashMap as in default
> mode,
> > so
> > > I’d guess there might be more contention because of that, but at the
> same
> > > time I can’t see why, in C++ mode, having a unique identifier for each
> > > message results in throughput that is so much worse than having all
> > > messages in the default group.
> > >
> > >
> > > 2. I also wanted to see the impact of having many messages at the head
> of
> > > the queue that a listener can’t process because the messages belong to
> a
> > > group assigned to another consumer. E.g., have 10000 messages of group
> A,
> > > followed by 1000 messages of group B, and listener 1 is holding on to
> the
> > > first A message for a long time -- see how long it will take listener 2
> > to
> > > process all the B messages. In this case C++ mode has performance that
> > > degrades as the number of unprocessable group A messages at the front
> of
> > > the queue increases, whereas default mode's performance is unaffected,
> > > about the same as processing 1000 messages on a regular queue.
> > >
> > > My rough guess from looking at DefinedGroupMessageGroupManager is that
> > > whenever listener 2 is done with a group B message, the state change
> > > listener triggers the Group.subtract() to reset pointers for other
> > > subscriptions and consequently findEarliestAssignedAvailableEntry().
> This
> > > then has to iterate through all the group A messages before it finds
> the
> > B
> > > message. Do you think this is the reason for the results I see?
> > >
> > > If so, is the idea here that other subscriptions of the queue could
> have
> > > skipped over the messages of a group while the group was assigned to
> some
> > > subscription S, so we need to tell them to set their pointers back? If
> > that
> > > is indeed the idea, would it be possible to optimize it such that when
> a
> > > group A is assigned to S and S gets its first message of the group, we
> > > store what that first A message / queue entry is. Then when S is done
> > with
> > > the last A message, we can tell other subscriptions to go back to that
> > > first entry, without having to iterate through the queue?
> > >
> > > Thanks a lot for your help!
> > >
> > > Helen
> > >
> > >
> > > On Tue, Jan 7, 2014 at 8:46 PM, Robbie Gemmell <
> robbie.gemmell@gmail.com
> > > >wrote:
> > >
> > > > ...and just to be super clear, though I think it it is mentioned
> > > correctly
> > > > in the docs this time, the 'default group' concept does not apply in
> > the
> > > > regular / 'non shared' grouping mode. Messages that dont specify a
> > group
> > > > key value in that mode are simply not grouped in any way.
> > > >
> > > > On 8 January 2014 04:41, Robbie Gemmell <ro...@gmail.com>
> > > wrote:
> > > >
> > > > >
> > > > > On 8 January 2014 04:33, Helen Kwong <he...@gmail.com> wrote:
> > > > >
> > > > >> Oh I see, I thought what you meant was that I could only alter the
> > > > default
> > > > >> group in shared-groups mode starting with 0.24.
> > > > >
> > > > >
> > > > > No, I just missed that you said 0.16 and assumed 0.24 was the
> version
> > > you
> > > > > were using . You could always change it, just in more limited ways
> in
> > > > > earlier releases.
> > > > >
> > > > > To make sure I'm
> > > > >> understanding this correctly -- changing the the default message
> > group
> > > > >> name
> > > > >> to something else in C++ mode won't change the serial processing
> > > > behavior
> > > > >> I
> > > > >> saw, right?
> > > > >
> > > > >
> > > > > Correct
> > > > >
> > > > >
> > > > >> Messages without a group identifier will still be considered to
> > > > >> be in a group -- rather than no group -- and they cannot be
> > processed
> > > by
> > > > >> multiple consumers concurrently?
> > > > >>
> > > > >>
> > > > > Yes. In the C++/shared-groups mode every message is considered to
> be
> > > in a
> > > > > group, it is just a case of whether the message specifies that
> group
> > or
> > > > > instead gets put into the default group.
> > > > >
> > > > >
> > > > >
> > > > >> Thanks,
> > > > >> Helen
> > > > >>
> > > > >>
> > > > >> On Tue, Jan 7, 2014 at 8:22 PM, Robbie Gemmell <
> > > > robbie.gemmell@gmail.com
> > > > >> >wrote:
> > > > >>
> > > > >> > I just noticed you said you were using 0.16, somehow glossed
> over
> > it
> > > > >> > originally and only noticed the 0.24 in the doc URL (its many
> > hours
> > > > past
> > > > >> > time I was asleep, I might be getting tired).
> > > > >> >
> > > > >> > Realising that, I should add that prior to 0.22 the only way to
> > > alter
> > > > >> the
> > > > >> > default group in the shared-groups mode from 'qpid.no-group' to
> > > > >> something
> > > > >> > else would have been via the 'qpid.default-message-group' queue
> > > > declare
> > > > >> > argument when using an AMQP client to create the queue
> originally,
> > > and
> > > > >> for
> > > > >> > 0.22 itself only that and the system property approach I
> mentioned
> > > > would
> > > > >> > work.
> > > > >> >
> > > > >> > Robbie
> > > > >> >
> > > > >> > On 8 January 2014 04:03, Helen Kwong <he...@gmail.com>
> > wrote:
> > > > >> >
> > > > >> > > Hi Robbie,
> > > > >> > >
> > > > >> > > I see. Thanks for the quick response and explanation!
> > > > >> > >
> > > > >> > > Helen
> > > > >> > >
> > > > >> > >
> > > > >> > > On Tue, Jan 7, 2014 at 7:43 PM, Robbie Gemmell <
> > > > >> robbie.gemmell@gmail.com
> > > > >> > > >wrote:
> > > > >> > >
> > > > >> > > > Hi Helen,
> > > > >> > > >
> > > > >> > > > The short answer to your question is that it is the
> > > documentation
> > > > >> which
> > > > >> > > is
> > > > >> > > > incorrect, and the behaviour you are seeing is expected.
> > > > >> > > >
> > > > >> > > > The long answer is, when that documentation was composed a
> > > segment
> > > > >> was
> > > > >> > > > missed out indicating this, and needs to be added to the
> docs.
> > > The
> > > > >> > > > behaviour listed for when no group is specified is only true
> > of
> > > > the
> > > > >> > > > 'non-shared' groups supported by the Java broker, in the
> > > > C++/shared
> > > > >> > group
> > > > >> > > > mode any messages recieved without an explicit group value
> are
> > > all
> > > > >> > > assigned
> > > > >> > > > to a default group of 'qpid.no-group'. This is as per the
> > > > behaviour
> > > > >> of
> > > > >> > > the
> > > > >> > > > C++ broker itself, which is explained in the C++ broker docs
> > at
> > > > the
> > > > >> end
> > > > >> > > of
> > > > >> > > > the following page
> > > > >> > > >
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >>
> > > >
> > >
> >
> http://qpid.apache.org/releases/qpid-0.24/cpp-broker/book/Using-message-groups.html
> > > > >> > > > .
> > > > >> > > > For the 0.24 Java broker, this default shared group can be
> > > changed
> > > > >> > > > broker-wide using the Java system property
> > > > >> > > > 'qpid.broker_default-shared-message-group', or can be
> > overriden
> > > > for
> > > > >> an
> > > > >> > > > individual queue during creation programatically via AMQP
> > > clients
> > > > or
> > > > >> > the
> > > > >> > > > management interfaces through use of the argument
> > > > >> > > > 'qpid.default-message-group' or 'messageGroupDefaultGroup'.
> > > > >> > > >
> > > > >> > > > I coincidentally happened to have fixed a defect with the
> > shared
> > > > >> groups
> > > > >> > > > functionality last night on trunk. Its not yet included in
> the
> > > > >> imminent
> > > > >> > > > 0.26 release, though I am about to request whether that is
> > > > possible.
> > > > >> > > > https://issues.apache.org/jira/browse/QPID-5450
> > > > >> > > >
> > > > >> > > > Robbie
> > > > >> > > >
> > > > >> > > > On 8 January 2014 02:43, Helen Kwong <he...@gmail.com>
> > > > wrote:
> > > > >> > > >
> > > > >> > > > > Hi,
> > > > >> > > > >
> > > > >> > > > > I use the Java broker and client, version 0.16, and am
> > > > considering
> > > > >> > > using
> > > > >> > > > > the message grouping feature (
> > > > >> > > > >
> > > > >> > > > >
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >>
> > > >
> > >
> >
> http://qpid.apache.org/releases/qpid-0.24/java-broker/book/Java-Broker-Queues.html#Java-Broker-Queues-OtherTypes-Message-Grouping
> > > > >> > > > > ).
> > > > >> > > > > From testing I've done, there seems to be a bug with the
> C++
> > > > >> > > > compatibility
> > > > >> > > > > model, and I'm wondering if this is a known issue.
> > > Specifically,
> > > > >> in
> > > > >> > my
> > > > >> > > > test
> > > > >> > > > > I have a queue configured to use a group header field with
> > > > >> > > > > "qpid.group_header_key" and C++ mode with
> > > > "qpid.shared_msg_group",
> > > > >> > and
> > > > >> > > > have
> > > > >> > > > > multiple listeners to the queue. Each listener will sleep
> > for
> > > a
> > > > >> short
> > > > >> > > > > amount of time when it receives a message before
> returning.
> > I
> > > > then
> > > > >> > > > enqueue
> > > > >> > > > > 10 messages that do not have a value in the group header
> > field
> > > > to
> > > > >> the
> > > > >> > > > > queue. Since the doc says that messages without a value in
> > the
> > > > >> > grouping
> > > > >> > > > > header will be delivered to any available consumer, the
> > > > behavior I
> > > > >> > > expect
> > > > >> > > > > is that the messages will be processed in parallel, i.e.,
> > when
> > > > >> > > listener 1
> > > > >> > > > > is holding on to a message and sleeping, listener 2 can
> > > receive
> > > > >> > another
> > > > >> > > > > message from the queue. But what I see is that the
> messages
> > > are
> > > > >> > > processed
> > > > >> > > > > serially -- message 2 won't be received by some thread
> until
> > > > >> message
> > > > >> > 1
> > > > >> > > is
> > > > >> > > > > done. When I use the default mode instead of C++ mode,
> then
> > I
> > > > get
> > > > >> the
> > > > >> > > > > parallel processing behavior.
> > > > >> > > > >
> > > > >> > > > > Is this is a known bug, and is there a fix for it already?
> > > > >> > > > >
> > > > >> > > > > Thanks,
> > > > >> > > > > Helen
> > > > >> > > > >
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>

Re: Java broker - message grouping in C++ compatibility mode

Posted by Helen Kwong <he...@gmail.com>.

Hi Robbie,

I am actually still running version 0.16 of the broker. It will take me a
little time to set up the trunk code and rebuild and rerun the experiments.
Do you think your fix will likely make a difference?

For the second case with a long-lived consumer being assigned the group of
many messages at the head of the queue, I was indeed using a prefetch of 1.
I ran it again (with version 0.16 still) with a prefetch of 2 as you
suggested, and the dequeue time of the messages at the end was then not
affected by the number of unprocessable messages at the beginning of the
queue, about the same as the other test setups I ran. However, I think
increasing prefetch to 2 might not work for our use case.

For the first case with unique message groups, your explanation makes sense
and I think I understand it now. Do you think there is still a way to
optimize this behavior, so that we don't need to possibly traverse through
the whole queue whenever a subscription is unassigned from a group? Since
you mentioned that maintaining a fixed pointer to a queue entry would
likely lead to memory retention issues, would having a weak reference be a
possible option?

Thanks,
Helen



On Sat, Jan 11, 2014 at 7:58 AM, Robbie Gemmell <ro...@gmail.com>wrote:

> Hi Helen,
>
> Can I check what version of the code you were using? I ask as the latest
> trunk or 0.26 release branch code is going to be necessary for correctness
> and representative testing of the shared groups functionality, due to the
> defect fix I previously mentioned making recently.
> You can find a nightly build of the trunk broker at:
>
> https://builds.apache.org/view/M-R/view/Qpid/job/Qpid-Java-Artefact-Release/lastSuccessfulBuild/artifact/trunk/qpid/java/broker/release/and
> I would need to build the 0.26 branch as the fix was introduced after
> the latest RC.
>
> In your first case, I think the reason for the difference between the
> default group and the unique group is also likely to be tied to the
> 'findEarliestAssignedAvailableEntry' behaviour you mention later in your
> mail. For the default group case, that next message always going to be a
> message near the front of the queue. For the unique group, there isnt
> actually going to be a message which matches, but it looks like it will
> currently be checking every message to determine that and doing so under
> the synchronization, and thus probably preventing other deliveries
> occurring at the time. That isnt a problem in the non-shared case because
> there isnt a need to synchronise the GroupManager as a whole, and even
> going beyond that its also highly unlikely it would need to check as many
> messages before finding a match due to the signifcant difference in how
> groups become associated with a particular subscription in the non-shared
> case.
>
> In your second case, your explanation seems likely and I think this case
> really reduces to just being a variant of the above behaviour. The
> particular issue is that one could argue it shouldnt need to be doing the
> 'findEarliestAssignedAvailableEntry' task all that often in this case if
> you have a long-lived consumer, and so your mention of this makes me think
> you are using a prefetch of 1. Using a prefetch of 1 currently means that
> the delivery state associated with the shared group effectively becomes
> empty after each message, because messages are only fully added to the
> group when they become acquired by a particular subscription, and they cant
> be acquired until the previous message is consumed (or perhaps slightly
> confusingly, explicitly not-consumed). If so, I expect it could be very
> interesting to run this case again with a prefetch of 2 or more. The
> obvious tradeoff with increasing prefetch is that a particular consumer
> could then be assigned up to <prefetch> groups at a given point, though
> likely not in your test case due to the large contiguous blocks of messages
> for each group.
>
> I'm not sure that the suggestion to track the first message in the group
> would really work currently, due to the way the underlying queue data
> structure works. Maintaining a fixed pointer into it like that is likely to
> lead to some undesirable memory retention issues, based on a related but
> far simpler case I fixed previously in a similar structure elsewhere in the
> broker. Looking at the way messages become assigned to a group in the
> shared group case may be a more viable path to handling your second case
> more gracefully. The unique groups from you first case would still need
> something different though, as neither of these routes would really help
> there.
>
> Robbie
>
> On 11 January 2014 01:03, Helen Kwong <he...@gmail.com> wrote:
>
> > Hi Robbie,
> >
> > I did some more testing to see whether message grouping will work for us,
> > and compared the dequeue performance of a queue using message grouping in
> > default Java mode, a queue using C++ mode, and a queue not using message
> > grouping. I found that when I use C++ mode, the performance can be much
> > worse than in other comparable setups, and was wondering if you could
> help
> > me understand why.
> >
> > 1.  In one test, I have multiple listeners to a queue, enqueue 100000
> > messages to it, and see how many messages are processed in 5 minutes. I
> > have these different setups:
> >
> > - C++ mode queue with each message having a unique identifier
> >
> > - C++ mode queue with all messages having no grouping identifier (so all
> > belong to the default group)
> >
> > - default mode queue with each message having a unique grouping
> identifier
> >
> > - default mode queue with all messages having no grouping identifier
> >
> > - default mode queue with all messages having the same grouping
> identifier
> >
> > - regular queue without a group header key configured
> >
> > All setups except for the first had about 35K - 39K messages processed,
> but
> > for the first setup, there were under 8000 messages processed. What could
> > explain this big difference? I’ve looked at the two grouping modes’
> > implementations of MessageGroupManager and see that C++ mode uses
> > synchronized methods rather than a ConcurrentHashMap as in default mode,
> so
> > I’d guess there might be more contention because of that, but at the same
> > time I can’t see why, in C++ mode, having a unique identifier for each
> > message results in throughput that is so much worse than having all
> > messages in the default group.
> >
> >
> > 2. I also wanted to see the impact of having many messages at the head of
> > the queue that a listener can’t process because the messages belong to a
> > group assigned to another consumer. E.g., have 10000 messages of group A,
> > followed by 1000 messages of group B, and listener 1 is holding on to the
> > first A message for a long time -- see how long it will take listener 2
> to
> > process all the B messages. In this case C++ mode has performance that
> > degrades as the number of unprocessable group A messages at the front of
> > the queue increases, whereas default mode's performance is unaffected,
> > about the same as processing 1000 messages on a regular queue.
> >
> > My rough guess from looking at DefinedGroupMessageGroupManager is that
> > whenever listener 2 is done with a group B message, the state change
> > listener triggers the Group.subtract() to reset pointers for other
> > subscriptions and consequently findEarliestAssignedAvailableEntry(). This
> > then has to iterate through all the group A messages before it finds the
> B
> > message. Do you think this is the reason for the results I see?
> >
> > If so, is the idea here that other subscriptions of the queue could have
> > skipped over the messages of a group while the group was assigned to some
> > subscription S, so we need to tell them to set their pointers back? If
> that
> > is indeed the idea, would it be possible to optimize it such that when a
> > group A is assigned to S and S gets its first message of the group, we
> > store what that first A message / queue entry is. Then when S is done
> with
> > the last A message, we can tell other subscriptions to go back to that
> > first entry, without having to iterate through the queue?
> >
> > Thanks a lot for your help!
> >
> > Helen
> >
> >
> > On Tue, Jan 7, 2014 at 8:46 PM, Robbie Gemmell <robbie.gemmell@gmail.com
> > >wrote:
> >
> > > ...and just to be super clear, though I think it it is mentioned
> > correctly
> > > in the docs this time, the 'default group' concept does not apply in
> the
> > > regular / 'non shared' grouping mode. Messages that dont specify a
> group
> > > key value in that mode are simply not grouped in any way.
> > >
> > > On 8 January 2014 04:41, Robbie Gemmell <ro...@gmail.com>
> > wrote:
> > >
> > > >
> > > > On 8 January 2014 04:33, Helen Kwong <he...@gmail.com> wrote:
> > > >
> > > >> Oh I see, I thought what you meant was that I could only alter the
> > > default
> > > >> group in shared-groups mode starting with 0.24.
> > > >
> > > >
> > > > No, I just missed that you said 0.16 and assumed 0.24 was the version
> > you
> > > > were using . You could always change it, just in more limited ways in
> > > > earlier releases.
> > > >
> > > > To make sure I'm
> > > >> understanding this correctly -- changing the the default message
> group
> > > >> name
> > > >> to something else in C++ mode won't change the serial processing
> > > behavior
> > > >> I
> > > >> saw, right?
> > > >
> > > >
> > > > Correct
> > > >
> > > >
> > > >> Messages without a group identifier will still be considered to
> > > >> be in a group -- rather than no group -- and they cannot be
> processed
> > by
> > > >> multiple consumers concurrently?
> > > >>
> > > >>
> > > > Yes. In the C++/shared-groups mode every message is considered to be
> > in a
> > > > group, it is just a case of whether the message specifies that group
> or
> > > > instead gets put into the default group.
> > > >
> > > >
> > > >
> > > >> Thanks,
> > > >> Helen
> > > >>
> > > >>
> > > >> On Tue, Jan 7, 2014 at 8:22 PM, Robbie Gemmell <
> > > robbie.gemmell@gmail.com
> > > >> >wrote:
> > > >>
> > > >> > I just noticed you said you were using 0.16, somehow glossed over
> it
> > > >> > originally and only noticed the 0.24 in the doc URL (its many
> hours
> > > past
> > > >> > time I was asleep, I might be getting tired).
> > > >> >
> > > >> > Realising that, I should add that prior to 0.22 the only way to
> > alter
> > > >> the
> > > >> > default group in the shared-groups mode from 'qpid.no-group' to
> > > >> something
> > > >> > else would have been via the 'qpid.default-message-group' queue
> > > declare
> > > >> > argument when using an AMQP client to create the queue originally,
> > and
> > > >> for
> > > >> > 0.22 itself only that and the system property approach I mentioned
> > > would
> > > >> > work.
> > > >> >
> > > >> > Robbie
> > > >> >
> > > >> > On 8 January 2014 04:03, Helen Kwong <he...@gmail.com>
> wrote:
> > > >> >
> > > >> > > Hi Robbie,
> > > >> > >
> > > >> > > I see. Thanks for the quick response and explanation!
> > > >> > >
> > > >> > > Helen
> > > >> > >
> > > >> > >
> > > >> > > On Tue, Jan 7, 2014 at 7:43 PM, Robbie Gemmell <
> > > >> robbie.gemmell@gmail.com
> > > >> > > >wrote:
> > > >> > >
> > > >> > > > Hi Helen,
> > > >> > > >
> > > >> > > > The short answer to your question is that it is the
> > documentation
> > > >> which
> > > >> > > is
> > > >> > > > incorrect, and the behaviour you are seeing is expected.
> > > >> > > >
> > > >> > > > The long answer is, when that documentation was composed a
> > segment
> > > >> was
> > > >> > > > missed out indicating this, and needs to be added to the docs.
> > The
> > > >> > > > behaviour listed for when no group is specified is only true
> of
> > > the
> > > >> > > > 'non-shared' groups supported by the Java broker, in the
> > > C++/shared
> > > >> > group
> > > >> > > > mode any messages recieved without an explicit group value are
> > all
> > > >> > > assigned
> > > >> > > > to a default group of 'qpid.no-group'. This is as per the
> > > behaviour
> > > >> of
> > > >> > > the
> > > >> > > > C++ broker itself, which is explained in the C++ broker docs
> at
> > > the
> > > >> end
> > > >> > > of
> > > >> > > > the following page
> > > >> > > >
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> http://qpid.apache.org/releases/qpid-0.24/cpp-broker/book/Using-message-groups.html
> > > >> > > > .
> > > >> > > > For the 0.24 Java broker, this default shared group can be
> > changed
> > > >> > > > broker-wide using the Java system property
> > > >> > > > 'qpid.broker_default-shared-message-group', or can be
> overriden
> > > for
> > > >> an
> > > >> > > > individual queue during creation programatically via AMQP
> > clients
> > > or
> > > >> > the
> > > >> > > > management interfaces through use of the argument
> > > >> > > > 'qpid.default-message-group' or 'messageGroupDefaultGroup'.
> > > >> > > >
> > > >> > > > I coincidentally happened to have fixed a defect with the
> shared
> > > >> groups
> > > >> > > > functionality last night on trunk. Its not yet included in the
> > > >> imminent
> > > >> > > > 0.26 release, though I am about to request whether that is
> > > possible.
> > > >> > > > https://issues.apache.org/jira/browse/QPID-5450
> > > >> > > >
> > > >> > > > Robbie
> > > >> > > >
> > > >> > > > On 8 January 2014 02:43, Helen Kwong <he...@gmail.com>
> > > wrote:
> > > >> > > >
> > > >> > > > > Hi,
> > > >> > > > >
> > > >> > > > > I use the Java broker and client, version 0.16, and am
> > > considering
> > > >> > > using
> > > >> > > > > the message grouping feature (
> > > >> > > > >
> > > >> > > > >
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> http://qpid.apache.org/releases/qpid-0.24/java-broker/book/Java-Broker-Queues.html#Java-Broker-Queues-OtherTypes-Message-Grouping
> > > >> > > > > ).
> > > >> > > > > From testing I've done, there seems to be a bug with the C++
> > > >> > > > compatibility
> > > >> > > > > model, and I'm wondering if this is a known issue.
> > Specifically,
> > > >> in
> > > >> > my
> > > >> > > > test
> > > >> > > > > I have a queue configured to use a group header field with
> > > >> > > > > "qpid.group_header_key" and C++ mode with
> > > "qpid.shared_msg_group",
> > > >> > and
> > > >> > > > have
> > > >> > > > > multiple listeners to the queue. Each listener will sleep
> for
> > a
> > > >> short
> > > >> > > > > amount of time when it receives a message before returning.
> I
> > > then
> > > >> > > > enqueue
> > > >> > > > > 10 messages that do not have a value in the group header
> field
> > > to
> > > >> the
> > > >> > > > > queue. Since the doc says that messages without a value in
> the
> > > >> > grouping
> > > >> > > > > header will be delivered to any available consumer, the
> > > behavior I
> > > >> > > expect
> > > >> > > > > is that the messages will be processed in parallel, i.e.,
> when
> > > >> > > listener 1
> > > >> > > > > is holding on to a message and sleeping, listener 2 can
> > receive
> > > >> > another
> > > >> > > > > message from the queue. But what I see is that the messages
> > are
> > > >> > > processed
> > > >> > > > > serially -- message 2 won't be received by some thread until
> > > >> message
> > > >> > 1
> > > >> > > is
> > > >> > > > > done. When I use the default mode instead of C++ mode, then
> I
> > > get
> > > >> the
> > > >> > > > > parallel processing behavior.
> > > >> > > > >
> > > >> > > > > Is this is a known bug, and is there a fix for it already?
> > > >> > > > >
> > > >> > > > > Thanks,
> > > >> > > > > Helen
> > > >> > > > >
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>

Re: Java broker - message grouping in C++ compatibility mode

Posted by Robbie Gemmell <ro...@gmail.com>.

Hi Helen,

Can I check what version of the code you were using? I ask as the latest
trunk or 0.26 release branch code is going to be necessary for correctness
and representative testing of the shared groups functionality, due to the
defect fix I previously mentioned making recently.
You can find a nightly build of the trunk broker at:
https://builds.apache.org/view/M-R/view/Qpid/job/Qpid-Java-Artefact-Release/lastSuccessfulBuild/artifact/trunk/qpid/java/broker/release/and
I would need to build the 0.26 branch as the fix was introduced after
the latest RC.

In your first case, I think the reason for the difference between the
default group and the unique group is also likely to be tied to the
'findEarliestAssignedAvailableEntry' behaviour you mention later in your
mail. For the default group case, that next message always going to be a
message near the front of the queue. For the unique group, there isnt
actually going to be a message which matches, but it looks like it will
currently be checking every message to determine that and doing so under
the synchronization, and thus probably preventing other deliveries
occurring at the time. That isnt a problem in the non-shared case because
there isnt a need to synchronise the GroupManager as a whole, and even
going beyond that its also highly unlikely it would need to check as many
messages before finding a match due to the signifcant difference in how
groups become associated with a particular subscription in the non-shared
case.

In your second case, your explanation seems likely and I think this case
really reduces to just being a variant of the above behaviour. The
particular issue is that one could argue it shouldnt need to be doing the
'findEarliestAssignedAvailableEntry' task all that often in this case if
you have a long-lived consumer, and so your mention of this makes me think
you are using a prefetch of 1. Using a prefetch of 1 currently means that
the delivery state associated with the shared group effectively becomes
empty after each message, because messages are only fully added to the
group when they become acquired by a particular subscription, and they cant
be acquired until the previous message is consumed (or perhaps slightly
confusingly, explicitly not-consumed). If so, I expect it could be very
interesting to run this case again with a prefetch of 2 or more. The
obvious tradeoff with increasing prefetch is that a particular consumer
could then be assigned up to <prefetch> groups at a given point, though
likely not in your test case due to the large contiguous blocks of messages
for each group.

I'm not sure that the suggestion to track the first message in the group
would really work currently, due to the way the underlying queue data
structure works. Maintaining a fixed pointer into it like that is likely to
lead to some undesirable memory retention issues, based on a related but
far simpler case I fixed previously in a similar structure elsewhere in the
broker. Looking at the way messages become assigned to a group in the
shared group case may be a more viable path to handling your second case
more gracefully. The unique groups from you first case would still need
something different though, as neither of these routes would really help
there.

Robbie

On 11 January 2014 01:03, Helen Kwong <he...@gmail.com> wrote:

> Hi Robbie,
>
> I did some more testing to see whether message grouping will work for us,
> and compared the dequeue performance of a queue using message grouping in
> default Java mode, a queue using C++ mode, and a queue not using message
> grouping. I found that when I use C++ mode, the performance can be much
> worse than in other comparable setups, and was wondering if you could help
> me understand why.
>
> 1.  In one test, I have multiple listeners to a queue, enqueue 100000
> messages to it, and see how many messages are processed in 5 minutes. I
> have these different setups:
>
> - C++ mode queue with each message having a unique identifier
>
> - C++ mode queue with all messages having no grouping identifier (so all
> belong to the default group)
>
> - default mode queue with each message having a unique grouping identifier
>
> - default mode queue with all messages having no grouping identifier
>
> - default mode queue with all messages having the same grouping identifier
>
> - regular queue without a group header key configured
>
> All setups except for the first had about 35K - 39K messages processed, but
> for the first setup, there were under 8000 messages processed. What could
> explain this big difference? I’ve looked at the two grouping modes’
> implementations of MessageGroupManager and see that C++ mode uses
> synchronized methods rather than a ConcurrentHashMap as in default mode, so
> I’d guess there might be more contention because of that, but at the same
> time I can’t see why, in C++ mode, having a unique identifier for each
> message results in throughput that is so much worse than having all
> messages in the default group.
>
>
> 2. I also wanted to see the impact of having many messages at the head of
> the queue that a listener can’t process because the messages belong to a
> group assigned to another consumer. E.g., have 10000 messages of group A,
> followed by 1000 messages of group B, and listener 1 is holding on to the
> first A message for a long time -- see how long it will take listener 2 to
> process all the B messages. In this case C++ mode has performance that
> degrades as the number of unprocessable group A messages at the front of
> the queue increases, whereas default mode's performance is unaffected,
> about the same as processing 1000 messages on a regular queue.
>
> My rough guess from looking at DefinedGroupMessageGroupManager is that
> whenever listener 2 is done with a group B message, the state change
> listener triggers the Group.subtract() to reset pointers for other
> subscriptions and consequently findEarliestAssignedAvailableEntry(). This
> then has to iterate through all the group A messages before it finds the B
> message. Do you think this is the reason for the results I see?
>
> If so, is the idea here that other subscriptions of the queue could have
> skipped over the messages of a group while the group was assigned to some
> subscription S, so we need to tell them to set their pointers back? If that
> is indeed the idea, would it be possible to optimize it such that when a
> group A is assigned to S and S gets its first message of the group, we
> store what that first A message / queue entry is. Then when S is done with
> the last A message, we can tell other subscriptions to go back to that
> first entry, without having to iterate through the queue?
>
> Thanks a lot for your help!
>
> Helen
>
>
> On Tue, Jan 7, 2014 at 8:46 PM, Robbie Gemmell <robbie.gemmell@gmail.com
> >wrote:
>
> > ...and just to be super clear, though I think it it is mentioned
> correctly
> > in the docs this time, the 'default group' concept does not apply in the
> > regular / 'non shared' grouping mode. Messages that dont specify a group
> > key value in that mode are simply not grouped in any way.
> >
> > On 8 January 2014 04:41, Robbie Gemmell <ro...@gmail.com>
> wrote:
> >
> > >
> > > On 8 January 2014 04:33, Helen Kwong <he...@gmail.com> wrote:
> > >
> > >> Oh I see, I thought what you meant was that I could only alter the
> > default
> > >> group in shared-groups mode starting with 0.24.
> > >
> > >
> > > No, I just missed that you said 0.16 and assumed 0.24 was the version
> you
> > > were using . You could always change it, just in more limited ways in
> > > earlier releases.
> > >
> > > To make sure I'm
> > >> understanding this correctly -- changing the the default message group
> > >> name
> > >> to something else in C++ mode won't change the serial processing
> > behavior
> > >> I
> > >> saw, right?
> > >
> > >
> > > Correct
> > >
> > >
> > >> Messages without a group identifier will still be considered to
> > >> be in a group -- rather than no group -- and they cannot be processed
> by
> > >> multiple consumers concurrently?
> > >>
> > >>
> > > Yes. In the C++/shared-groups mode every message is considered to be
> in a
> > > group, it is just a case of whether the message specifies that group or
> > > instead gets put into the default group.
> > >
> > >
> > >
> > >> Thanks,
> > >> Helen
> > >>
> > >>
> > >> On Tue, Jan 7, 2014 at 8:22 PM, Robbie Gemmell <
> > robbie.gemmell@gmail.com
> > >> >wrote:
> > >>
> > >> > I just noticed you said you were using 0.16, somehow glossed over it
> > >> > originally and only noticed the 0.24 in the doc URL (its many hours
> > past
> > >> > time I was asleep, I might be getting tired).
> > >> >
> > >> > Realising that, I should add that prior to 0.22 the only way to
> alter
> > >> the
> > >> > default group in the shared-groups mode from 'qpid.no-group' to
> > >> something
> > >> > else would have been via the 'qpid.default-message-group' queue
> > declare
> > >> > argument when using an AMQP client to create the queue originally,
> and
> > >> for
> > >> > 0.22 itself only that and the system property approach I mentioned
> > would
> > >> > work.
> > >> >
> > >> > Robbie
> > >> >
> > >> > On 8 January 2014 04:03, Helen Kwong <he...@gmail.com> wrote:
> > >> >
> > >> > > Hi Robbie,
> > >> > >
> > >> > > I see. Thanks for the quick response and explanation!
> > >> > >
> > >> > > Helen
> > >> > >
> > >> > >
> > >> > > On Tue, Jan 7, 2014 at 7:43 PM, Robbie Gemmell <
> > >> robbie.gemmell@gmail.com
> > >> > > >wrote:
> > >> > >
> > >> > > > Hi Helen,
> > >> > > >
> > >> > > > The short answer to your question is that it is the
> documentation
> > >> which
> > >> > > is
> > >> > > > incorrect, and the behaviour you are seeing is expected.
> > >> > > >
> > >> > > > The long answer is, when that documentation was composed a
> segment
> > >> was
> > >> > > > missed out indicating this, and needs to be added to the docs.
> The
> > >> > > > behaviour listed for when no group is specified is only true of
> > the
> > >> > > > 'non-shared' groups supported by the Java broker, in the
> > C++/shared
> > >> > group
> > >> > > > mode any messages recieved without an explicit group value are
> all
> > >> > > assigned
> > >> > > > to a default group of 'qpid.no-group'. This is as per the
> > behaviour
> > >> of
> > >> > > the
> > >> > > > C++ broker itself, which is explained in the C++ broker docs at
> > the
> > >> end
> > >> > > of
> > >> > > > the following page
> > >> > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> http://qpid.apache.org/releases/qpid-0.24/cpp-broker/book/Using-message-groups.html
> > >> > > > .
> > >> > > > For the 0.24 Java broker, this default shared group can be
> changed
> > >> > > > broker-wide using the Java system property
> > >> > > > 'qpid.broker_default-shared-message-group', or can be overriden
> > for
> > >> an
> > >> > > > individual queue during creation programatically via AMQP
> clients
> > or
> > >> > the
> > >> > > > management interfaces through use of the argument
> > >> > > > 'qpid.default-message-group' or 'messageGroupDefaultGroup'.
> > >> > > >
> > >> > > > I coincidentally happened to have fixed a defect with the shared
> > >> groups
> > >> > > > functionality last night on trunk. Its not yet included in the
> > >> imminent
> > >> > > > 0.26 release, though I am about to request whether that is
> > possible.
> > >> > > > https://issues.apache.org/jira/browse/QPID-5450
> > >> > > >
> > >> > > > Robbie
> > >> > > >
> > >> > > > On 8 January 2014 02:43, Helen Kwong <he...@gmail.com>
> > wrote:
> > >> > > >
> > >> > > > > Hi,
> > >> > > > >
> > >> > > > > I use the Java broker and client, version 0.16, and am
> > considering
> > >> > > using
> > >> > > > > the message grouping feature (
> > >> > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> http://qpid.apache.org/releases/qpid-0.24/java-broker/book/Java-Broker-Queues.html#Java-Broker-Queues-OtherTypes-Message-Grouping
> > >> > > > > ).
> > >> > > > > From testing I've done, there seems to be a bug with the C++
> > >> > > > compatibility
> > >> > > > > model, and I'm wondering if this is a known issue.
> Specifically,
> > >> in
> > >> > my
> > >> > > > test
> > >> > > > > I have a queue configured to use a group header field with
> > >> > > > > "qpid.group_header_key" and C++ mode with
> > "qpid.shared_msg_group",
> > >> > and
> > >> > > > have
> > >> > > > > multiple listeners to the queue. Each listener will sleep for
> a
> > >> short
> > >> > > > > amount of time when it receives a message before returning. I
> > then
> > >> > > > enqueue
> > >> > > > > 10 messages that do not have a value in the group header field
> > to
> > >> the
> > >> > > > > queue. Since the doc says that messages without a value in the
> > >> > grouping
> > >> > > > > header will be delivered to any available consumer, the
> > behavior I
> > >> > > expect
> > >> > > > > is that the messages will be processed in parallel, i.e., when
> > >> > > listener 1
> > >> > > > > is holding on to a message and sleeping, listener 2 can
> receive
> > >> > another
> > >> > > > > message from the queue. But what I see is that the messages
> are
> > >> > > processed
> > >> > > > > serially -- message 2 won't be received by some thread until
> > >> message
> > >> > 1
> > >> > > is
> > >> > > > > done. When I use the default mode instead of C++ mode, then I
> > get
> > >> the
> > >> > > > > parallel processing behavior.
> > >> > > > >
> > >> > > > > Is this is a known bug, and is there a fix for it already?
> > >> > > > >
> > >> > > > > Thanks,
> > >> > > > > Helen
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> > >
> >
>

Re: Java broker - message grouping in C++ compatibility mode

Posted by Helen Kwong <he...@gmail.com>.

Hi Robbie,

I did some more testing to see whether message grouping will work for us,
and compared the dequeue performance of a queue using message grouping in
default Java mode, a queue using C++ mode, and a queue not using message
grouping. I found that when I use C++ mode, the performance can be much
worse than in other comparable setups, and was wondering if you could help
me understand why.

1.  In one test, I have multiple listeners to a queue, enqueue 100000
messages to it, and see how many messages are processed in 5 minutes. I
have these different setups:

- C++ mode queue with each message having a unique identifier

- C++ mode queue with all messages having no grouping identifier (so all
belong to the default group)

- default mode queue with each message having a unique grouping identifier

- default mode queue with all messages having no grouping identifier

- default mode queue with all messages having the same grouping identifier

- regular queue without a group header key configured

All setups except for the first had about 35K - 39K messages processed, but
for the first setup, there were under 8000 messages processed. What could
explain this big difference? I’ve looked at the two grouping modes’
implementations of MessageGroupManager and see that C++ mode uses
synchronized methods rather than a ConcurrentHashMap as in default mode, so
I’d guess there might be more contention because of that, but at the same
time I can’t see why, in C++ mode, having a unique identifier for each
message results in throughput that is so much worse than having all
messages in the default group.


2. I also wanted to see the impact of having many messages at the head of
the queue that a listener can’t process because the messages belong to a
group assigned to another consumer. E.g., have 10000 messages of group A,
followed by 1000 messages of group B, and listener 1 is holding on to the
first A message for a long time -- see how long it will take listener 2 to
process all the B messages. In this case C++ mode has performance that
degrades as the number of unprocessable group A messages at the front of
the queue increases, whereas default mode's performance is unaffected,
about the same as processing 1000 messages on a regular queue.

My rough guess from looking at DefinedGroupMessageGroupManager is that
whenever listener 2 is done with a group B message, the state change
listener triggers the Group.subtract() to reset pointers for other
subscriptions and consequently findEarliestAssignedAvailableEntry(). This
then has to iterate through all the group A messages before it finds the B
message. Do you think this is the reason for the results I see?

If so, is the idea here that other subscriptions of the queue could have
skipped over the messages of a group while the group was assigned to some
subscription S, so we need to tell them to set their pointers back? If that
is indeed the idea, would it be possible to optimize it such that when a
group A is assigned to S and S gets its first message of the group, we
store what that first A message / queue entry is. Then when S is done with
the last A message, we can tell other subscriptions to go back to that
first entry, without having to iterate through the queue?

Thanks a lot for your help!

Helen


On Tue, Jan 7, 2014 at 8:46 PM, Robbie Gemmell <ro...@gmail.com>wrote:

> ...and just to be super clear, though I think it it is mentioned correctly
> in the docs this time, the 'default group' concept does not apply in the
> regular / 'non shared' grouping mode. Messages that dont specify a group
> key value in that mode are simply not grouped in any way.
>
> On 8 January 2014 04:41, Robbie Gemmell <ro...@gmail.com> wrote:
>
> >
> > On 8 January 2014 04:33, Helen Kwong <he...@gmail.com> wrote:
> >
> >> Oh I see, I thought what you meant was that I could only alter the
> default
> >> group in shared-groups mode starting with 0.24.
> >
> >
> > No, I just missed that you said 0.16 and assumed 0.24 was the version you
> > were using . You could always change it, just in more limited ways in
> > earlier releases.
> >
> > To make sure I'm
> >> understanding this correctly -- changing the the default message group
> >> name
> >> to something else in C++ mode won't change the serial processing
> behavior
> >> I
> >> saw, right?
> >
> >
> > Correct
> >
> >
> >> Messages without a group identifier will still be considered to
> >> be in a group -- rather than no group -- and they cannot be processed by
> >> multiple consumers concurrently?
> >>
> >>
> > Yes. In the C++/shared-groups mode every message is considered to be in a
> > group, it is just a case of whether the message specifies that group or
> > instead gets put into the default group.
> >
> >
> >
> >> Thanks,
> >> Helen
> >>
> >>
> >> On Tue, Jan 7, 2014 at 8:22 PM, Robbie Gemmell <
> robbie.gemmell@gmail.com
> >> >wrote:
> >>
> >> > I just noticed you said you were using 0.16, somehow glossed over it
> >> > originally and only noticed the 0.24 in the doc URL (its many hours
> past
> >> > time I was asleep, I might be getting tired).
> >> >
> >> > Realising that, I should add that prior to 0.22 the only way to alter
> >> the
> >> > default group in the shared-groups mode from 'qpid.no-group' to
> >> something
> >> > else would have been via the 'qpid.default-message-group' queue
> declare
> >> > argument when using an AMQP client to create the queue originally, and
> >> for
> >> > 0.22 itself only that and the system property approach I mentioned
> would
> >> > work.
> >> >
> >> > Robbie
> >> >
> >> > On 8 January 2014 04:03, Helen Kwong <he...@gmail.com> wrote:
> >> >
> >> > > Hi Robbie,
> >> > >
> >> > > I see. Thanks for the quick response and explanation!
> >> > >
> >> > > Helen
> >> > >
> >> > >
> >> > > On Tue, Jan 7, 2014 at 7:43 PM, Robbie Gemmell <
> >> robbie.gemmell@gmail.com
> >> > > >wrote:
> >> > >
> >> > > > Hi Helen,
> >> > > >
> >> > > > The short answer to your question is that it is the documentation
> >> which
> >> > > is
> >> > > > incorrect, and the behaviour you are seeing is expected.
> >> > > >
> >> > > > The long answer is, when that documentation was composed a segment
> >> was
> >> > > > missed out indicating this, and needs to be added to the docs. The
> >> > > > behaviour listed for when no group is specified is only true of
> the
> >> > > > 'non-shared' groups supported by the Java broker, in the
> C++/shared
> >> > group
> >> > > > mode any messages recieved without an explicit group value are all
> >> > > assigned
> >> > > > to a default group of 'qpid.no-group'. This is as per the
> behaviour
> >> of
> >> > > the
> >> > > > C++ broker itself, which is explained in the C++ broker docs at
> the
> >> end
> >> > > of
> >> > > > the following page
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> http://qpid.apache.org/releases/qpid-0.24/cpp-broker/book/Using-message-groups.html
> >> > > > .
> >> > > > For the 0.24 Java broker, this default shared group can be changed
> >> > > > broker-wide using the Java system property
> >> > > > 'qpid.broker_default-shared-message-group', or can be overriden
> for
> >> an
> >> > > > individual queue during creation programatically via AMQP clients
> or
> >> > the
> >> > > > management interfaces through use of the argument
> >> > > > 'qpid.default-message-group' or 'messageGroupDefaultGroup'.
> >> > > >
> >> > > > I coincidentally happened to have fixed a defect with the shared
> >> groups
> >> > > > functionality last night on trunk. Its not yet included in the
> >> imminent
> >> > > > 0.26 release, though I am about to request whether that is
> possible.
> >> > > > https://issues.apache.org/jira/browse/QPID-5450
> >> > > >
> >> > > > Robbie
> >> > > >
> >> > > > On 8 January 2014 02:43, Helen Kwong <he...@gmail.com>
> wrote:
> >> > > >
> >> > > > > Hi,
> >> > > > >
> >> > > > > I use the Java broker and client, version 0.16, and am
> considering
> >> > > using
> >> > > > > the message grouping feature (
> >> > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> http://qpid.apache.org/releases/qpid-0.24/java-broker/book/Java-Broker-Queues.html#Java-Broker-Queues-OtherTypes-Message-Grouping
> >> > > > > ).
> >> > > > > From testing I've done, there seems to be a bug with the C++
> >> > > > compatibility
> >> > > > > model, and I'm wondering if this is a known issue. Specifically,
> >> in
> >> > my
> >> > > > test
> >> > > > > I have a queue configured to use a group header field with
> >> > > > > "qpid.group_header_key" and C++ mode with
> "qpid.shared_msg_group",
> >> > and
> >> > > > have
> >> > > > > multiple listeners to the queue. Each listener will sleep for a
> >> short
> >> > > > > amount of time when it receives a message before returning. I
> then
> >> > > > enqueue
> >> > > > > 10 messages that do not have a value in the group header field
> to
> >> the
> >> > > > > queue. Since the doc says that messages without a value in the
> >> > grouping
> >> > > > > header will be delivered to any available consumer, the
> behavior I
> >> > > expect
> >> > > > > is that the messages will be processed in parallel, i.e., when
> >> > > listener 1
> >> > > > > is holding on to a message and sleeping, listener 2 can receive
> >> > another
> >> > > > > message from the queue. But what I see is that the messages are
> >> > > processed
> >> > > > > serially -- message 2 won't be received by some thread until
> >> message
> >> > 1
> >> > > is
> >> > > > > done. When I use the default mode instead of C++ mode, then I
> get
> >> the
> >> > > > > parallel processing behavior.
> >> > > > >
> >> > > > > Is this is a known bug, and is there a fix for it already?
> >> > > > >
> >> > > > > Thanks,
> >> > > > > Helen
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>

Re: Java broker - message grouping in C++ compatibility mode

Posted by Robbie Gemmell <ro...@gmail.com>.

...and just to be super clear, though I think it it is mentioned correctly
in the docs this time, the 'default group' concept does not apply in the
regular / 'non shared' grouping mode. Messages that dont specify a group
key value in that mode are simply not grouped in any way.

On 8 January 2014 04:41, Robbie Gemmell <ro...@gmail.com> wrote:

>
> On 8 January 2014 04:33, Helen Kwong <he...@gmail.com> wrote:
>
>> Oh I see, I thought what you meant was that I could only alter the default
>> group in shared-groups mode starting with 0.24.
>
>
> No, I just missed that you said 0.16 and assumed 0.24 was the version you
> were using . You could always change it, just in more limited ways in
> earlier releases.
>
> To make sure I'm
>> understanding this correctly -- changing the the default message group
>> name
>> to something else in C++ mode won't change the serial processing behavior
>> I
>> saw, right?
>
>
> Correct
>
>
>> Messages without a group identifier will still be considered to
>> be in a group -- rather than no group -- and they cannot be processed by
>> multiple consumers concurrently?
>>
>>
> Yes. In the C++/shared-groups mode every message is considered to be in a
> group, it is just a case of whether the message specifies that group or
> instead gets put into the default group.
>
>
>
>> Thanks,
>> Helen
>>
>>
>> On Tue, Jan 7, 2014 at 8:22 PM, Robbie Gemmell <robbie.gemmell@gmail.com
>> >wrote:
>>
>> > I just noticed you said you were using 0.16, somehow glossed over it
>> > originally and only noticed the 0.24 in the doc URL (its many hours past
>> > time I was asleep, I might be getting tired).
>> >
>> > Realising that, I should add that prior to 0.22 the only way to alter
>> the
>> > default group in the shared-groups mode from 'qpid.no-group' to
>> something
>> > else would have been via the 'qpid.default-message-group' queue declare
>> > argument when using an AMQP client to create the queue originally, and
>> for
>> > 0.22 itself only that and the system property approach I mentioned would
>> > work.
>> >
>> > Robbie
>> >
>> > On 8 January 2014 04:03, Helen Kwong <he...@gmail.com> wrote:
>> >
>> > > Hi Robbie,
>> > >
>> > > I see. Thanks for the quick response and explanation!
>> > >
>> > > Helen
>> > >
>> > >
>> > > On Tue, Jan 7, 2014 at 7:43 PM, Robbie Gemmell <
>> robbie.gemmell@gmail.com
>> > > >wrote:
>> > >
>> > > > Hi Helen,
>> > > >
>> > > > The short answer to your question is that it is the documentation
>> which
>> > > is
>> > > > incorrect, and the behaviour you are seeing is expected.
>> > > >
>> > > > The long answer is, when that documentation was composed a segment
>> was
>> > > > missed out indicating this, and needs to be added to the docs. The
>> > > > behaviour listed for when no group is specified is only true of the
>> > > > 'non-shared' groups supported by the Java broker, in the C++/shared
>> > group
>> > > > mode any messages recieved without an explicit group value are all
>> > > assigned
>> > > > to a default group of 'qpid.no-group'. This is as per the behaviour
>> of
>> > > the
>> > > > C++ broker itself, which is explained in the C++ broker docs at the
>> end
>> > > of
>> > > > the following page
>> > > >
>> > > >
>> > >
>> >
>> http://qpid.apache.org/releases/qpid-0.24/cpp-broker/book/Using-message-groups.html
>> > > > .
>> > > > For the 0.24 Java broker, this default shared group can be changed
>> > > > broker-wide using the Java system property
>> > > > 'qpid.broker_default-shared-message-group', or can be overriden for
>> an
>> > > > individual queue during creation programatically via AMQP clients or
>> > the
>> > > > management interfaces through use of the argument
>> > > > 'qpid.default-message-group' or 'messageGroupDefaultGroup'.
>> > > >
>> > > > I coincidentally happened to have fixed a defect with the shared
>> groups
>> > > > functionality last night on trunk. Its not yet included in the
>> imminent
>> > > > 0.26 release, though I am about to request whether that is possible.
>> > > > https://issues.apache.org/jira/browse/QPID-5450
>> > > >
>> > > > Robbie
>> > > >
>> > > > On 8 January 2014 02:43, Helen Kwong <he...@gmail.com> wrote:
>> > > >
>> > > > > Hi,
>> > > > >
>> > > > > I use the Java broker and client, version 0.16, and am considering
>> > > using
>> > > > > the message grouping feature (
>> > > > >
>> > > > >
>> > > >
>> > >
>> >
>> http://qpid.apache.org/releases/qpid-0.24/java-broker/book/Java-Broker-Queues.html#Java-Broker-Queues-OtherTypes-Message-Grouping
>> > > > > ).
>> > > > > From testing I've done, there seems to be a bug with the C++
>> > > > compatibility
>> > > > > model, and I'm wondering if this is a known issue. Specifically,
>> in
>> > my
>> > > > test
>> > > > > I have a queue configured to use a group header field with
>> > > > > "qpid.group_header_key" and C++ mode with "qpid.shared_msg_group",
>> > and
>> > > > have
>> > > > > multiple listeners to the queue. Each listener will sleep for a
>> short
>> > > > > amount of time when it receives a message before returning. I then
>> > > > enqueue
>> > > > > 10 messages that do not have a value in the group header field to
>> the
>> > > > > queue. Since the doc says that messages without a value in the
>> > grouping
>> > > > > header will be delivered to any available consumer, the behavior I
>> > > expect
>> > > > > is that the messages will be processed in parallel, i.e., when
>> > > listener 1
>> > > > > is holding on to a message and sleeping, listener 2 can receive
>> > another
>> > > > > message from the queue. But what I see is that the messages are
>> > > processed
>> > > > > serially -- message 2 won't be received by some thread until
>> message
>> > 1
>> > > is
>> > > > > done. When I use the default mode instead of C++ mode, then I get
>> the
>> > > > > parallel processing behavior.
>> > > > >
>> > > > > Is this is a known bug, and is there a fix for it already?
>> > > > >
>> > > > > Thanks,
>> > > > > Helen
>> > > > >
>> > > >
>> > >
>> >
>>
>
>

Re: Java broker - message grouping in C++ compatibility mode

Posted by Robbie Gemmell <ro...@gmail.com>.

On 8 January 2014 04:33, Helen Kwong <he...@gmail.com> wrote:

> Oh I see, I thought what you meant was that I could only alter the default
> group in shared-groups mode starting with 0.24.


No, I just missed that you said 0.16 and assumed 0.24 was the version you
were using . You could always change it, just in more limited ways in
earlier releases.

To make sure I'm
> understanding this correctly -- changing the the default message group name
> to something else in C++ mode won't change the serial processing behavior I
> saw, right?


Correct


> Messages without a group identifier will still be considered to
> be in a group -- rather than no group -- and they cannot be processed by
> multiple consumers concurrently?
>
>
Yes. In the C++/shared-groups mode every message is considered to be in a
group, it is just a case of whether the message specifies that group or
instead gets put into the default group.



> Thanks,
> Helen
>
>
> On Tue, Jan 7, 2014 at 8:22 PM, Robbie Gemmell <robbie.gemmell@gmail.com
> >wrote:
>
> > I just noticed you said you were using 0.16, somehow glossed over it
> > originally and only noticed the 0.24 in the doc URL (its many hours past
> > time I was asleep, I might be getting tired).
> >
> > Realising that, I should add that prior to 0.22 the only way to alter the
> > default group in the shared-groups mode from 'qpid.no-group' to something
> > else would have been via the 'qpid.default-message-group' queue declare
> > argument when using an AMQP client to create the queue originally, and
> for
> > 0.22 itself only that and the system property approach I mentioned would
> > work.
> >
> > Robbie
> >
> > On 8 January 2014 04:03, Helen Kwong <he...@gmail.com> wrote:
> >
> > > Hi Robbie,
> > >
> > > I see. Thanks for the quick response and explanation!
> > >
> > > Helen
> > >
> > >
> > > On Tue, Jan 7, 2014 at 7:43 PM, Robbie Gemmell <
> robbie.gemmell@gmail.com
> > > >wrote:
> > >
> > > > Hi Helen,
> > > >
> > > > The short answer to your question is that it is the documentation
> which
> > > is
> > > > incorrect, and the behaviour you are seeing is expected.
> > > >
> > > > The long answer is, when that documentation was composed a segment
> was
> > > > missed out indicating this, and needs to be added to the docs. The
> > > > behaviour listed for when no group is specified is only true of the
> > > > 'non-shared' groups supported by the Java broker, in the C++/shared
> > group
> > > > mode any messages recieved without an explicit group value are all
> > > assigned
> > > > to a default group of 'qpid.no-group'. This is as per the behaviour
> of
> > > the
> > > > C++ broker itself, which is explained in the C++ broker docs at the
> end
> > > of
> > > > the following page
> > > >
> > > >
> > >
> >
> http://qpid.apache.org/releases/qpid-0.24/cpp-broker/book/Using-message-groups.html
> > > > .
> > > > For the 0.24 Java broker, this default shared group can be changed
> > > > broker-wide using the Java system property
> > > > 'qpid.broker_default-shared-message-group', or can be overriden for
> an
> > > > individual queue during creation programatically via AMQP clients or
> > the
> > > > management interfaces through use of the argument
> > > > 'qpid.default-message-group' or 'messageGroupDefaultGroup'.
> > > >
> > > > I coincidentally happened to have fixed a defect with the shared
> groups
> > > > functionality last night on trunk. Its not yet included in the
> imminent
> > > > 0.26 release, though I am about to request whether that is possible.
> > > > https://issues.apache.org/jira/browse/QPID-5450
> > > >
> > > > Robbie
> > > >
> > > > On 8 January 2014 02:43, Helen Kwong <he...@gmail.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I use the Java broker and client, version 0.16, and am considering
> > > using
> > > > > the message grouping feature (
> > > > >
> > > > >
> > > >
> > >
> >
> http://qpid.apache.org/releases/qpid-0.24/java-broker/book/Java-Broker-Queues.html#Java-Broker-Queues-OtherTypes-Message-Grouping
> > > > > ).
> > > > > From testing I've done, there seems to be a bug with the C++
> > > > compatibility
> > > > > model, and I'm wondering if this is a known issue. Specifically, in
> > my
> > > > test
> > > > > I have a queue configured to use a group header field with
> > > > > "qpid.group_header_key" and C++ mode with "qpid.shared_msg_group",
> > and
> > > > have
> > > > > multiple listeners to the queue. Each listener will sleep for a
> short
> > > > > amount of time when it receives a message before returning. I then
> > > > enqueue
> > > > > 10 messages that do not have a value in the group header field to
> the
> > > > > queue. Since the doc says that messages without a value in the
> > grouping
> > > > > header will be delivered to any available consumer, the behavior I
> > > expect
> > > > > is that the messages will be processed in parallel, i.e., when
> > > listener 1
> > > > > is holding on to a message and sleeping, listener 2 can receive
> > another
> > > > > message from the queue. But what I see is that the messages are
> > > processed
> > > > > serially -- message 2 won't be received by some thread until
> message
> > 1
> > > is
> > > > > done. When I use the default mode instead of C++ mode, then I get
> the
> > > > > parallel processing behavior.
> > > > >
> > > > > Is this is a known bug, and is there a fix for it already?
> > > > >
> > > > > Thanks,
> > > > > Helen
> > > > >
> > > >
> > >
> >
>

Re: Java broker - message grouping in C++ compatibility mode

Posted by Helen Kwong <he...@gmail.com>.

Oh I see, I thought what you meant was that I could only alter the default
group in shared-groups mode starting with 0.24. To make sure I'm
understanding this correctly -- changing the the default message group name
to something else in C++ mode won't change the serial processing behavior I
saw, right? Messages without a group identifier will still be considered to
be in a group -- rather than no group -- and they cannot be processed by
multiple consumers concurrently?

Thanks,
Helen


On Tue, Jan 7, 2014 at 8:22 PM, Robbie Gemmell <ro...@gmail.com>wrote:

> I just noticed you said you were using 0.16, somehow glossed over it
> originally and only noticed the 0.24 in the doc URL (its many hours past
> time I was asleep, I might be getting tired).
>
> Realising that, I should add that prior to 0.22 the only way to alter the
> default group in the shared-groups mode from 'qpid.no-group' to something
> else would have been via the 'qpid.default-message-group' queue declare
> argument when using an AMQP client to create the queue originally, and for
> 0.22 itself only that and the system property approach I mentioned would
> work.
>
> Robbie
>
> On 8 January 2014 04:03, Helen Kwong <he...@gmail.com> wrote:
>
> > Hi Robbie,
> >
> > I see. Thanks for the quick response and explanation!
> >
> > Helen
> >
> >
> > On Tue, Jan 7, 2014 at 7:43 PM, Robbie Gemmell <robbie.gemmell@gmail.com
> > >wrote:
> >
> > > Hi Helen,
> > >
> > > The short answer to your question is that it is the documentation which
> > is
> > > incorrect, and the behaviour you are seeing is expected.
> > >
> > > The long answer is, when that documentation was composed a segment was
> > > missed out indicating this, and needs to be added to the docs. The
> > > behaviour listed for when no group is specified is only true of the
> > > 'non-shared' groups supported by the Java broker, in the C++/shared
> group
> > > mode any messages recieved without an explicit group value are all
> > assigned
> > > to a default group of 'qpid.no-group'. This is as per the behaviour of
> > the
> > > C++ broker itself, which is explained in the C++ broker docs at the end
> > of
> > > the following page
> > >
> > >
> >
> http://qpid.apache.org/releases/qpid-0.24/cpp-broker/book/Using-message-groups.html
> > > .
> > > For the 0.24 Java broker, this default shared group can be changed
> > > broker-wide using the Java system property
> > > 'qpid.broker_default-shared-message-group', or can be overriden for an
> > > individual queue during creation programatically via AMQP clients or
> the
> > > management interfaces through use of the argument
> > > 'qpid.default-message-group' or 'messageGroupDefaultGroup'.
> > >
> > > I coincidentally happened to have fixed a defect with the shared groups
> > > functionality last night on trunk. Its not yet included in the imminent
> > > 0.26 release, though I am about to request whether that is possible.
> > > https://issues.apache.org/jira/browse/QPID-5450
> > >
> > > Robbie
> > >
> > > On 8 January 2014 02:43, Helen Kwong <he...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I use the Java broker and client, version 0.16, and am considering
> > using
> > > > the message grouping feature (
> > > >
> > > >
> > >
> >
> http://qpid.apache.org/releases/qpid-0.24/java-broker/book/Java-Broker-Queues.html#Java-Broker-Queues-OtherTypes-Message-Grouping
> > > > ).
> > > > From testing I've done, there seems to be a bug with the C++
> > > compatibility
> > > > model, and I'm wondering if this is a known issue. Specifically, in
> my
> > > test
> > > > I have a queue configured to use a group header field with
> > > > "qpid.group_header_key" and C++ mode with "qpid.shared_msg_group",
> and
> > > have
> > > > multiple listeners to the queue. Each listener will sleep for a short
> > > > amount of time when it receives a message before returning. I then
> > > enqueue
> > > > 10 messages that do not have a value in the group header field to the
> > > > queue. Since the doc says that messages without a value in the
> grouping
> > > > header will be delivered to any available consumer, the behavior I
> > expect
> > > > is that the messages will be processed in parallel, i.e., when
> > listener 1
> > > > is holding on to a message and sleeping, listener 2 can receive
> another
> > > > message from the queue. But what I see is that the messages are
> > processed
> > > > serially -- message 2 won't be received by some thread until message
> 1
> > is
> > > > done. When I use the default mode instead of C++ mode, then I get the
> > > > parallel processing behavior.
> > > >
> > > > Is this is a known bug, and is there a fix for it already?
> > > >
> > > > Thanks,
> > > > Helen
> > > >
> > >
> >
>

Re: Java broker - message grouping in C++ compatibility mode

Posted by Robbie Gemmell <ro...@gmail.com>.

I just noticed you said you were using 0.16, somehow glossed over it
originally and only noticed the 0.24 in the doc URL (its many hours past
time I was asleep, I might be getting tired).

Realising that, I should add that prior to 0.22 the only way to alter the
default group in the shared-groups mode from 'qpid.no-group' to something
else would have been via the 'qpid.default-message-group' queue declare
argument when using an AMQP client to create the queue originally, and for
0.22 itself only that and the system property approach I mentioned would
work.

Robbie

On 8 January 2014 04:03, Helen Kwong <he...@gmail.com> wrote:

> Hi Robbie,
>
> I see. Thanks for the quick response and explanation!
>
> Helen
>
>
> On Tue, Jan 7, 2014 at 7:43 PM, Robbie Gemmell <robbie.gemmell@gmail.com
> >wrote:
>
> > Hi Helen,
> >
> > The short answer to your question is that it is the documentation which
> is
> > incorrect, and the behaviour you are seeing is expected.
> >
> > The long answer is, when that documentation was composed a segment was
> > missed out indicating this, and needs to be added to the docs. The
> > behaviour listed for when no group is specified is only true of the
> > 'non-shared' groups supported by the Java broker, in the C++/shared group
> > mode any messages recieved without an explicit group value are all
> assigned
> > to a default group of 'qpid.no-group'. This is as per the behaviour of
> the
> > C++ broker itself, which is explained in the C++ broker docs at the end
> of
> > the following page
> >
> >
> http://qpid.apache.org/releases/qpid-0.24/cpp-broker/book/Using-message-groups.html
> > .
> > For the 0.24 Java broker, this default shared group can be changed
> > broker-wide using the Java system property
> > 'qpid.broker_default-shared-message-group', or can be overriden for an
> > individual queue during creation programatically via AMQP clients or the
> > management interfaces through use of the argument
> > 'qpid.default-message-group' or 'messageGroupDefaultGroup'.
> >
> > I coincidentally happened to have fixed a defect with the shared groups
> > functionality last night on trunk. Its not yet included in the imminent
> > 0.26 release, though I am about to request whether that is possible.
> > https://issues.apache.org/jira/browse/QPID-5450
> >
> > Robbie
> >
> > On 8 January 2014 02:43, Helen Kwong <he...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I use the Java broker and client, version 0.16, and am considering
> using
> > > the message grouping feature (
> > >
> > >
> >
> http://qpid.apache.org/releases/qpid-0.24/java-broker/book/Java-Broker-Queues.html#Java-Broker-Queues-OtherTypes-Message-Grouping
> > > ).
> > > From testing I've done, there seems to be a bug with the C++
> > compatibility
> > > model, and I'm wondering if this is a known issue. Specifically, in my
> > test
> > > I have a queue configured to use a group header field with
> > > "qpid.group_header_key" and C++ mode with "qpid.shared_msg_group", and
> > have
> > > multiple listeners to the queue. Each listener will sleep for a short
> > > amount of time when it receives a message before returning. I then
> > enqueue
> > > 10 messages that do not have a value in the group header field to the
> > > queue. Since the doc says that messages without a value in the grouping
> > > header will be delivered to any available consumer, the behavior I
> expect
> > > is that the messages will be processed in parallel, i.e., when
> listener 1
> > > is holding on to a message and sleeping, listener 2 can receive another
> > > message from the queue. But what I see is that the messages are
> processed
> > > serially -- message 2 won't be received by some thread until message 1
> is
> > > done. When I use the default mode instead of C++ mode, then I get the
> > > parallel processing behavior.
> > >
> > > Is this is a known bug, and is there a fix for it already?
> > >
> > > Thanks,
> > > Helen
> > >
> >
>

Re: Java broker - message grouping in C++ compatibility mode

Posted by Helen Kwong <he...@gmail.com>.

Hi Robbie,

I see. Thanks for the quick response and explanation!

Helen


On Tue, Jan 7, 2014 at 7:43 PM, Robbie Gemmell <ro...@gmail.com>wrote:

> Hi Helen,
>
> The short answer to your question is that it is the documentation which is
> incorrect, and the behaviour you are seeing is expected.
>
> The long answer is, when that documentation was composed a segment was
> missed out indicating this, and needs to be added to the docs. The
> behaviour listed for when no group is specified is only true of the
> 'non-shared' groups supported by the Java broker, in the C++/shared group
> mode any messages recieved without an explicit group value are all assigned
> to a default group of 'qpid.no-group'. This is as per the behaviour of the
> C++ broker itself, which is explained in the C++ broker docs at the end of
> the following page
>
> http://qpid.apache.org/releases/qpid-0.24/cpp-broker/book/Using-message-groups.html
> .
> For the 0.24 Java broker, this default shared group can be changed
> broker-wide using the Java system property
> 'qpid.broker_default-shared-message-group', or can be overriden for an
> individual queue during creation programatically via AMQP clients or the
> management interfaces through use of the argument
> 'qpid.default-message-group' or 'messageGroupDefaultGroup'.
>
> I coincidentally happened to have fixed a defect with the shared groups
> functionality last night on trunk. Its not yet included in the imminent
> 0.26 release, though I am about to request whether that is possible.
> https://issues.apache.org/jira/browse/QPID-5450
>
> Robbie
>
> On 8 January 2014 02:43, Helen Kwong <he...@gmail.com> wrote:
>
> > Hi,
> >
> > I use the Java broker and client, version 0.16, and am considering using
> > the message grouping feature (
> >
> >
> http://qpid.apache.org/releases/qpid-0.24/java-broker/book/Java-Broker-Queues.html#Java-Broker-Queues-OtherTypes-Message-Grouping
> > ).
> > From testing I've done, there seems to be a bug with the C++
> compatibility
> > model, and I'm wondering if this is a known issue. Specifically, in my
> test
> > I have a queue configured to use a group header field with
> > "qpid.group_header_key" and C++ mode with "qpid.shared_msg_group", and
> have
> > multiple listeners to the queue. Each listener will sleep for a short
> > amount of time when it receives a message before returning. I then
> enqueue
> > 10 messages that do not have a value in the group header field to the
> > queue. Since the doc says that messages without a value in the grouping
> > header will be delivered to any available consumer, the behavior I expect
> > is that the messages will be processed in parallel, i.e., when listener 1
> > is holding on to a message and sleeping, listener 2 can receive another
> > message from the queue. But what I see is that the messages are processed
> > serially -- message 2 won't be received by some thread until message 1 is
> > done. When I use the default mode instead of C++ mode, then I get the
> > parallel processing behavior.
> >
> > Is this is a known bug, and is there a fix for it already?
> >
> > Thanks,
> > Helen
> >
>

Re: Java broker - message grouping in C++ compatibility mode

Posted by Robbie Gemmell <ro...@gmail.com>.

Hi Helen,

The short answer to your question is that it is the documentation which is
incorrect, and the behaviour you are seeing is expected.

The long answer is, when that documentation was composed a segment was
missed out indicating this, and needs to be added to the docs. The
behaviour listed for when no group is specified is only true of the
'non-shared' groups supported by the Java broker, in the C++/shared group
mode any messages recieved without an explicit group value are all assigned
to a default group of 'qpid.no-group'. This is as per the behaviour of the
C++ broker itself, which is explained in the C++ broker docs at the end of
the following page
http://qpid.apache.org/releases/qpid-0.24/cpp-broker/book/Using-message-groups.html.
For the 0.24 Java broker, this default shared group can be changed
broker-wide using the Java system property
'qpid.broker_default-shared-message-group', or can be overriden for an
individual queue during creation programatically via AMQP clients or the
management interfaces through use of the argument
'qpid.default-message-group' or 'messageGroupDefaultGroup'.

I coincidentally happened to have fixed a defect with the shared groups
functionality last night on trunk. Its not yet included in the imminent
0.26 release, though I am about to request whether that is possible.
https://issues.apache.org/jira/browse/QPID-5450

Robbie

On 8 January 2014 02:43, Helen Kwong <he...@gmail.com> wrote:

> Hi,
>
> I use the Java broker and client, version 0.16, and am considering using
> the message grouping feature (
>
> http://qpid.apache.org/releases/qpid-0.24/java-broker/book/Java-Broker-Queues.html#Java-Broker-Queues-OtherTypes-Message-Grouping
> ).
> From testing I've done, there seems to be a bug with the C++ compatibility
> model, and I'm wondering if this is a known issue. Specifically, in my test
> I have a queue configured to use a group header field with
> "qpid.group_header_key" and C++ mode with "qpid.shared_msg_group", and have
> multiple listeners to the queue. Each listener will sleep for a short
> amount of time when it receives a message before returning. I then enqueue
> 10 messages that do not have a value in the group header field to the
> queue. Since the doc says that messages without a value in the grouping
> header will be delivered to any available consumer, the behavior I expect
> is that the messages will be processed in parallel, i.e., when listener 1
> is holding on to a message and sleeping, listener 2 can receive another
> message from the queue. But what I see is that the messages are processed
> serially -- message 2 won't be received by some thread until message 1 is
> done. When I use the default mode instead of C++ mode, then I get the
> parallel processing behavior.
>
> Is this is a known bug, and is there a fix for it already?
>
> Thanks,
> Helen
>