You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@activemq.apache.org by sburczymucha <sb...@gmail.com> on 2015/12/17 13:28:09 UTC

NMSXGroupID problem - two different consumers received message for one group

Hello,

I have a strange problem with message groups and multiple consumers - I've
noticed several cases when message with the same NMSXGroupID was received by
two different consumers, I'm not sure if this is my fault or some other edge
case. How can I debug this?

My setup:

1. Each consumer has own connection and session
2. Every 255 messages NMSXGroupSeq is reset to -1
3. Messages are processed in transaction
4. Prefetch is set to 1
5. ActiveMQ + NMS + listener

Here is a sample output from multiple consumers and one group (with consumer
ID, NMSMessageId, NMSXGroupID, NMSXGroupSeq and JMSXGroupFirstForConsumer),
logged just after message was passed to consumer (and not yet commited):

http://pastebin.com/raw/V1ZYUNWF

And here are problematic assignments:


Consumer 2 receives first message for 1 group

CONSUMER#2 | 2015-12-17 12:33:38.9183 | NMSMessageId
ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:1, NMSXGroupID 1,
NMSXGroupSeq 1, JMSXGroupFirstForConsumer True
CONSUMER#2 | 2015-12-17 12:33:39.9083 | NMSMessageId
ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:2, NMSXGroupID 1,
NMSXGroupSeq 2, JMSXGroupFirstForConsumer False

Consumer 2 continues his job...

Consumer 3 receives message for group 1, but Consumer 2 already has this
group. NMSMessageId is from future?

CONSUMER#2 | 2015-12-17 12:33:52.7433 | NMSMessageId
ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:74, NMSXGroupID 1,
NMSXGroupSeq 74, JMSXGroupFirstForConsumer False
CONSUMER#2 | 2015-12-17 12:33:52.7573 | NMSMessageId
ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:75, NMSXGroupID 1,
NMSXGroupSeq 75, JMSXGroupFirstForConsumer False
CONSUMER#3 | 2015-12-17 12:33:52.9413 | NMSMessageId
ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:257, NMSXGroupID 1,
NMSXGroupSeq 1, JMSXGroupFirstForConsumer True
CONSUMER#2 | 2015-12-17 12:33:53.3623 | NMSMessageId
ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:76, NMSXGroupID 1,
NMSXGroupSeq 76, JMSXGroupFirstForConsumer False

And from now on only Consumer 3 is processing messages from group 1

CONSUMER#3 | 2015-12-17 12:33:53.4303 | NMSMessageId
ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:77, NMSXGroupID 1,
NMSXGroupSeq 77, JMSXGroupFirstForConsumer False
CONSUMER#3 | 2015-12-17 12:33:53.6113 | NMSMessageId
ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:78, NMSXGroupID 1,
NMSXGroupSeq 78, JMSXGroupFirstForConsumer False

And similar situation:

CONSUMER#3 | 2015-12-17 12:34:13.8848 | NMSMessageId
ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:233, NMSXGroupID 1,
NMSXGroupSeq 233, JMSXGroupFirstForConsumer False
CONSUMER#3 | 2015-12-17 12:34:14.0219 | NMSMessageId
ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:234, NMSXGroupID 1,
NMSXGroupSeq 234, JMSXGroupFirstForConsumer False
CONSUMER#4 | 2015-12-17 12:34:14.3928 | NMSMessageId
ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:513, NMSXGroupID 1,
NMSXGroupSeq 1, JMSXGroupFirstForConsumer True
CONSUMER#3 | 2015-12-17 12:34:14.6008 | NMSMessageId
ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:235, NMSXGroupID 1,
NMSXGroupSeq 235, JMSXGroupFirstForConsumer False
CONSUMER#4 | 2015-12-17 12:34:14.7238 | NMSMessageId
ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:236, NMSXGroupID 1,
NMSXGroupSeq 236, JMSXGroupFirstForConsumer False
CONSUMER#4 | 2015-12-17 12:34:14.9708 | NMSMessageId
ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:237, NMSXGroupID 1,
NMSXGroupSeq 237, JMSXGroupFirstForConsumer False

Again, NMSMessageId is from future, Consumer 3 was processing messages from
group 1 and in the same time Consumer 4 started to process messages

There are more situations like this in output file


Does any one have an idea, what might be wrong with this?

Thank you,
Best Regards,



--
View this message in context: http://activemq.2283324.n4.nabble.com/NMSXGroupID-problem-two-different-consumers-received-message-for-one-group-tp4705096.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.

Re: NMSXGroupID problem - two different consumers received message for one group

Posted by sburczymucha <sb...@gmail.com>.
Yeah, it seems that https://issues.apache.org/jira/browse/AMQ-3027 is
describing exactly the same problem I'm having. 

Thank you for clarification!

Best Regards,



--
View this message in context: http://activemq.2283324.n4.nabble.com/NMSXGroupID-problem-two-different-consumers-received-message-for-one-group-tp4705096p4705217.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.

Re: NMSXGroupID problem - two different consumers received message for one group

Posted by Tim Bain <tb...@alumni.duke.edu>.
I don't have any insider knowledge about whether a message group is closed
when the broker receives the message or when it becomes a candidate for
being dispatched, though other people on this list may and may be able to
speak more authoritatively to how it's done.

With that said, the behavior you're seeing seems to indicate that the
message group is closed when the message is received, which seems to be
confirmed by https://issues.apache.org/jira/browse/AMQ-3027 and by Tamas
Cserveny's comment on https://issues.apache.org/jira/browse/AMQ-2106.

Hopefully we can get AMQ-3027 fixed in a future version.  (Add your vote to
the issue if you want it fixed.)  Till then, if your messages are all
persistent, you could periodically restart the broker; if you're lucky, all
the consumers will reconnect at just about the same time, and you'll get
even distribution across them.  (You could help your odds here by using the
consumersBeforeDispatchStarts and timeBeforeDispatchStarts policyEntry
attributes as described in http://activemq.apache.org/message-groups.html
to help make sure you don't start dispatching messages with just one
consumer connected.)

If some of your messages are non-persistent and you're not willing to lose
them in order to achieve rebalancing, you're probably out of luck; either
do what you're doing now (and deal with concurrent consumption), or don't
close the group and just deal with the uneven distribution when it occurs.
Disconnecting all the consumers probably won't help, because it becomes a
race condition for how many of them are connected when the broker starts
assigning message groups; if they don't all reconnect at essentially the
same time, the broker will just hand them all to the first one, and you'll
have even worse distribution than you did before.

Tim

On Thu, Dec 17, 2015 at 7:42 AM, sburczymucha <sb...@gmail.com>
wrote:

> Tim Bain wrote
> > Are all messages being immediately consumed, or is there a backlog of
> > messages for a given message group when you close the message group?  I
> > would expect the message group to be closed immediately when the message
> > with NMSXGroupSeg=-1 is sent (not when it is consumed), so if there is a
> > backlog I would expect the new consumer to start processing with the next
> > undispatched message (which would be in the 200s), even if the old
> > consumer
> > is still processing its current message plus the one in its prefetch
> > buffer.
>
> Yes, there are a backlog messages even after a group is closed. I thought
> that until a group is not closed (until message with NMSXGroupSeg=-1 is not
> processed and committed) it belongs to only one consumer.
>
> By the way, does NMSXGroupSeg needs to be in order or the NMSXGroupSeg
> order
> does not have any impact on processing? What about the situation when
> current NMSXGroupSeg has max value and next one has 1 (starting from
> scratch)?
>
>
> Tim Bain wrote
> > Why are you closing the message group?  Seems like you could avoid the
> > whole problem just by doing that...
>
> What I'm trying to achieve is to periodically rebalance message groups
> along
> currently connected consumers (number of consumers can vary from time to
> time). I want to avoid the situation when new consumers are idle because
> all
> groups are processed by other consumers. I also need to keep the order of
> messages within group. Is there a better way to do it? Disconnect all
> consumers and reconnect?
>
> Thank you,
>
>
>
> --
> View this message in context:
> http://activemq.2283324.n4.nabble.com/NMSXGroupID-problem-two-different-consumers-received-message-for-one-group-tp4705096p4705111.html
> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>

Re: NMSXGroupID problem - two different consumers received message for one group

Posted by sburczymucha <sb...@gmail.com>.
Tim Bain wrote
> Are all messages being immediately consumed, or is there a backlog of
> messages for a given message group when you close the message group?  I
> would expect the message group to be closed immediately when the message
> with NMSXGroupSeg=-1 is sent (not when it is consumed), so if there is a
> backlog I would expect the new consumer to start processing with the next
> undispatched message (which would be in the 200s), even if the old
> consumer
> is still processing its current message plus the one in its prefetch
> buffer.

Yes, there are a backlog messages even after a group is closed. I thought
that until a group is not closed (until message with NMSXGroupSeg=-1 is not
processed and committed) it belongs to only one consumer.

By the way, does NMSXGroupSeg needs to be in order or the NMSXGroupSeg order
does not have any impact on processing? What about the situation when
current NMSXGroupSeg has max value and next one has 1 (starting from
scratch)? 


Tim Bain wrote
> Why are you closing the message group?  Seems like you could avoid the
> whole problem just by doing that...

What I'm trying to achieve is to periodically rebalance message groups along
currently connected consumers (number of consumers can vary from time to
time). I want to avoid the situation when new consumers are idle because all
groups are processed by other consumers. I also need to keep the order of
messages within group. Is there a better way to do it? Disconnect all
consumers and reconnect?

Thank you,



--
View this message in context: http://activemq.2283324.n4.nabble.com/NMSXGroupID-problem-two-different-consumers-received-message-for-one-group-tp4705096p4705111.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.

Re: NMSXGroupID problem - two different consumers received message for one group

Posted by Tim Bain <tb...@alumni.duke.edu>.
Are all messages being immediately consumed, or is there a backlog of
messages for a given message group when you close the message group?  I
would expect the message group to be closed immediately when the message
with NMSXGroupSeg=-1 is sent (not when it is consumed), so if there is a
backlog I would expect the new consumer to start processing with the next
undispatched message (which would be in the 200s), even if the old consumer
is still processing its current message plus the one in its prefetch buffer.

Why are you closing the message group?  Seems like you could avoid the
whole problem just by doing that...
On Dec 17, 2015 5:46 AM, "sburczymucha" <sb...@gmail.com> wrote:

> Hello,
>
> I have a strange problem with message groups and multiple consumers - I've
> noticed several cases when message with the same NMSXGroupID was received
> by
> two different consumers, I'm not sure if this is my fault or some other
> edge
> case. How can I debug this?
>
> My setup:
>
> 1. Each consumer has own connection and session
> 2. Every 255 messages NMSXGroupSeq is reset to -1
> 3. Messages are processed in transaction
> 4. Prefetch is set to 1
> 5. ActiveMQ + NMS + listener
>
> Here is a sample output from multiple consumers and one group (with
> consumer
> ID, NMSMessageId, NMSXGroupID, NMSXGroupSeq and JMSXGroupFirstForConsumer),
> logged just after message was passed to consumer (and not yet commited):
>
> http://pastebin.com/raw/V1ZYUNWF
>
> And here are problematic assignments:
>
>
> Consumer 2 receives first message for 1 group
>
> CONSUMER#2 | 2015-12-17 12:33:38.9183 | NMSMessageId
> ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:1, NMSXGroupID 1,
> NMSXGroupSeq 1, JMSXGroupFirstForConsumer True
> CONSUMER#2 | 2015-12-17 12:33:39.9083 | NMSMessageId
> ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:2, NMSXGroupID 1,
> NMSXGroupSeq 2, JMSXGroupFirstForConsumer False
>
> Consumer 2 continues his job...
>
> Consumer 3 receives message for group 1, but Consumer 2 already has this
> group. NMSMessageId is from future?
>
> CONSUMER#2 | 2015-12-17 12:33:52.7433 | NMSMessageId
> ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:74, NMSXGroupID 1,
> NMSXGroupSeq 74, JMSXGroupFirstForConsumer False
> CONSUMER#2 | 2015-12-17 12:33:52.7573 | NMSMessageId
> ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:75, NMSXGroupID 1,
> NMSXGroupSeq 75, JMSXGroupFirstForConsumer False
> CONSUMER#3 | 2015-12-17 12:33:52.9413 | NMSMessageId
> ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:257, NMSXGroupID 1,
> NMSXGroupSeq 1, JMSXGroupFirstForConsumer True
> CONSUMER#2 | 2015-12-17 12:33:53.3623 | NMSMessageId
> ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:76, NMSXGroupID 1,
> NMSXGroupSeq 76, JMSXGroupFirstForConsumer False
>
> And from now on only Consumer 3 is processing messages from group 1
>
> CONSUMER#3 | 2015-12-17 12:33:53.4303 | NMSMessageId
> ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:77, NMSXGroupID 1,
> NMSXGroupSeq 77, JMSXGroupFirstForConsumer False
> CONSUMER#3 | 2015-12-17 12:33:53.6113 | NMSMessageId
> ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:78, NMSXGroupID 1,
> NMSXGroupSeq 78, JMSXGroupFirstForConsumer False
>
> And similar situation:
>
> CONSUMER#3 | 2015-12-17 12:34:13.8848 | NMSMessageId
> ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:233, NMSXGroupID 1,
> NMSXGroupSeq 233, JMSXGroupFirstForConsumer False
> CONSUMER#3 | 2015-12-17 12:34:14.0219 | NMSMessageId
> ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:234, NMSXGroupID 1,
> NMSXGroupSeq 234, JMSXGroupFirstForConsumer False
> CONSUMER#4 | 2015-12-17 12:34:14.3928 | NMSMessageId
> ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:513, NMSXGroupID 1,
> NMSXGroupSeq 1, JMSXGroupFirstForConsumer True
> CONSUMER#3 | 2015-12-17 12:34:14.6008 | NMSMessageId
> ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:235, NMSXGroupID 1,
> NMSXGroupSeq 235, JMSXGroupFirstForConsumer False
> CONSUMER#4 | 2015-12-17 12:34:14.7238 | NMSMessageId
> ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:236, NMSXGroupID 1,
> NMSXGroupSeq 236, JMSXGroupFirstForConsumer False
> CONSUMER#4 | 2015-12-17 12:34:14.9708 | NMSMessageId
> ID:Developer-QADW3-49592-635859524291587812-1:0:2:1:237, NMSXGroupID 1,
> NMSXGroupSeq 237, JMSXGroupFirstForConsumer False
>
> Again, NMSMessageId is from future, Consumer 3 was processing messages from
> group 1 and in the same time Consumer 4 started to process messages
>
> There are more situations like this in output file
>
>
> Does any one have an idea, what might be wrong with this?
>
> Thank you,
> Best Regards,
>
>
>
> --
> View this message in context:
> http://activemq.2283324.n4.nabble.com/NMSXGroupID-problem-two-different-consumers-received-message-for-one-group-tp4705096.html
> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>