Posted to users@activemq.apache.org by Jelmer Marinus <je...@hotmail.com> on 2022/10/04 14:32:22 UTC

Apache Artemis 2.25 message counters differ from actual number of messages

Hi,

I have a 3-node Apache Artemis 2.25 symmetrical cluster setup. Each cluster node has a primary and a backup node. Replication is used to keep the primary and backup in sync. Each node runs in a Docker container and uses Java 18 and the ZGC garbage collector.

When I run 3 parallel producers which connect to specific primary nodes and I send (e.g.) 1000 messages to each node, the message counters seem to show the wrong results, e.g. 1399, 1000, 1000 for a total of 3399 messages. I would expect a 1000, 1000, 1000 distribution for a total of 3000 messages.
When I start consuming the messages I receive a total of 3000 messages and afterwards the message counters are back to 0 on each node (0,0,0).
Sometimes messages also seem to get lost in redistribution. When I consume through one specific node it is possible to get only 2000 of the 3000 messages. The remaining 1000 messages do not seem to be on the queues anymore. My consumers do not use message selectors/filters.

Has anyone else encountered these problems and is there anything we can do about it?

Best regards,
Jelmer

Re: Apache Artemis 2.25 message counters differ from actual number of messages

Posted by Clebert Suconic <cl...@gmail.com>.

When I developed the page counters, I did it because users demanded them.
I did not want to have them, and would rather just rely on the number of
pages.


I'm seeing issues with the page counters under replication. I have not
been able to identify any issues without the use of replication.
So, are you using replication as well?


And if your system can be patient, also look at the number of pages until
we fix it.
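
As a rough cross-check while the counters are unreliable, the page files on disk can be counted per address. The following is only a sketch in Python, under the assumption that each paged address has its own folder inside the broker's paging directory, holding numbered `*.page` files and an `address.txt` file with the address name. The function name `count_pages` and the example path are hypothetical and not taken from this thread.

```python
from pathlib import Path


def count_pages(paging_dir):
    """Return {address_label: number_of_page_files} for each folder in paging_dir."""
    counts = {}
    for folder in sorted(Path(paging_dir).iterdir()):
        if not folder.is_dir():
            continue
        # Assumption: address.txt holds the address name; otherwise use the folder name.
        marker = folder / "address.txt"
        label = marker.read_text().strip() if marker.is_file() else folder.name
        # Count only the page files themselves, not other files in the folder.
        counts[label] = sum(1 for _ in folder.glob("*.page"))
    return counts


# Example call (hypothetical path; use the paging directory of the actual broker):
# print(count_pages("/var/lib/artemis/data/paging"))
```

A page count that returns to zero after all messages are consumed is consistent with the messages themselves being intact even when the counter is off.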

On Tue, Oct 4, 2022 at 10:52 AM Jelmer Marinus <je...@hotmail.com>
wrote:

> I think so but I'll check just to be sure. Could that also be the cause of
> messages getting lost in redistribution?
>
-- 
Clebert Suconic

Re: Apache Artemis 2.25 message counters differ from actual number of messages

Posted by Jelmer Marinus <je...@hotmail.com>.

I think so but I'll check just to be sure. Could that also be the cause of messages getting lost in redistribution?
________________________________
From: Clebert Suconic <cl...@gmail.com>
Sent: Tuesday, October 4, 2022 4:42 PM
To: users@activemq.apache.org <us...@activemq.apache.org>
Subject: Re: Apache Artemis 2.25 message counters differ from actual number of messages

do you have your destinations paging? I have a page-counter issue I'm
currently working on.



--
Clebert Suconic

Re: Apache Artemis 2.25 message counters differ from actual number of messages

Posted by Clebert Suconic <cl...@gmail.com>.

do you have your destinations paging? I have a page-counter issue I'm
currently working on.



-- 
Clebert Suconic

Re: Apache Artemis 2.25 message counters differ from actual number of messages

Posted by Clebert Suconic <cl...@gmail.com>.

If you could provide a way to reproduce the issue it would be very helpful.

On Thu, Oct 6, 2022 at 2:46 AM Jelmer Marinus
<je...@hotmail.com> wrote:
>
> We experienced the issue on both 2.24 and 2.25. Since we downgraded to 2.20 we haven't experienced the problem anymore.
>
> When we were producing to the address on the nodes there were no consumers on the underlying anycast queue. There were other producers/consumers present in the cluster but they were on different addresses/queues. Our addresses all have a one-to-one relationship with an anycast JMS queue.



-- 
Clebert Suconic

Re: Apache Artemis 2.25 message counters differ from actual number of messages

Posted by Jelmer Marinus <je...@hotmail.com>.

We experienced the issue on both 2.24 and 2.25. Since we downgraded to 2.20 we haven't experienced the problem anymore.

When we were producing to the address on the nodes there were no consumers on the underlying anycast queue. There were other producers/consumers present in the cluster but they were on different addresses/queues. Our addresses all have a one-to-one relationship with an anycast JMS queue.
________________________________
From: Clebert Suconic <cl...@gmail.com>
Sent: Wednesday, October 5, 2022 3:51 PM
To: users@activemq.apache.org <us...@activemq.apache.org>
Subject: Re: Apache Artemis 2.25 message counters differ from actual number of messages

what version are you using:   beware of ARTEMIS-3862 Short lived
subscription makes address size inconsistent


Are you sure 2.20 would fix it? I am not aware of any difference that
would make it so.


do you have the replica taking over on your tests?




--
Clebert Suconic

Re: Apache Artemis 2.25 message counters differ from actual number of messages

Posted by Clebert Suconic <cl...@gmail.com>.

what version are you using:   beware of ARTEMIS-3862 Short lived
subscription makes address size inconsistent


Are you sure 2.20 would fix it? I am not aware of any difference that
would make it so.


do you have the replica taking over on your tests?

On Wed, Oct 5, 2022 at 4:31 AM Jelmer Marinus
<je...@hotmail.com> wrote:
>
> The message counter we are inspecting is retrieved by requesting the "messageCount" of the queue using the "activemq.management" management queue.
> When this was off we checked the Hawtio web console and looked at the "Durable message count" of the queue. Both counters showed the same number.
>
> An actual test we did showed this.
>
> Node        Produced    messageCount      Durable message count
> Node 1      5000        6642                6642
> Node 2      5000        5000                5000
> Node 3      5000        5000                5000
>
> We were able to consume 15,000 (3 times 5000) messages and afterwards all message counters were at zero (0).
>
> Sometimes we got a good (5000,5000,5000) result, and the wrong count wasn't tied to a specific node, e.g.
>
> Node        Produced    messageCount      Durable message count
> Node 1      5000        5000                did not check
> Node 2      5000        5000                did not check
> Node 3      5000        6626                did not check
>
> We downgraded to 2.20 and the problem seems to be gone. We encountered it on 2.25 and also on 2.24.
>
> We use replication to sync our primary and backup nodes.
> In the log we saw an "AMQ222038 Starting paging on address...." so I assume there was paging going on during this test.
>



-- 
Clebert Suconic

Re: Apache Artemis 2.25 message counters differ from actual number of messages

Posted by Jelmer Marinus <je...@hotmail.com>.

The message counter we are inspecting is retrieved by requesting the "messageCount" of the queue using the "activemq.management" management queue.
When this was off we checked the Hawtio web console and looked at the "Durable message count" of the queue. Both counters showed the same number.
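
For anyone who wants to read the same two counters outside of JMS, here is a sketch of a different route: a Jolokia "read" request sent to the broker's web console over HTTP. This is not the management-queue call described above. The broker name `node1`, the address/queue name `orders`, the endpoint URL and the credentials are all hypothetical placeholders, and the `Origin` header is an assumption about the broker's Jolokia access policy.

```python
import base64
import json
import urllib.request


def queue_object_name(broker, address, queue, routing_type="anycast"):
    """Build the JMX ObjectName under which Artemis registers a queue control."""
    return (
        f'org.apache.activemq.artemis:broker="{broker}",component=addresses,'
        f'address="{address}",subcomponent=queues,'
        f'routing-type="{routing_type}",queue="{queue}"'
    )


def read_request(broker, address, queue, attribute="MessageCount"):
    """Return the Jolokia 'read' payload for one queue attribute."""
    return {
        "type": "read",
        "mbean": queue_object_name(broker, address, queue),
        "attribute": attribute,
    }


def fetch_attribute(base_url, user, password, payload):
    """POST the payload to a Jolokia endpoint and return the 'value' field."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        base_url,
        data=json.dumps(payload).encode(),  # supplying data makes this a POST
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
            # Assumption: the broker's Jolokia access policy accepts this origin.
            "Origin": "http://localhost",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["value"]


# Example call (hypothetical endpoint and credentials):
# fetch_attribute("http://localhost:8161/console/jolokia", "admin", "admin",
#                 read_request("node1", "orders", "orders", "DurableMessageCount"))
```

Reading both `MessageCount` and `DurableMessageCount` on each of the three nodes right after producing, and again after consuming, gives the same numbers as in the tables below without going through the web console.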

An actual test we did showed this.

Node        Produced    messageCount      Durable message count
Node 1      5000        6642                6642  
Node 2      5000        5000                5000
Node 3      5000        5000                5000

We were able to consume 15,000 (3 times 5000) messages and afterwards all message counters were at zero (0).

Sometimes we got a good (5000,5000,5000) result, and the wrong count wasn't tied to a specific node, e.g.

Node        Produced    messageCount      Durable message count
Node 1      5000        5000                did not check
Node 2      5000        5000                did not check
Node 3      5000        6626                did not check
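
To make the size of the discrepancy explicit, the two tables above can be reduced to a per-node surplus (reported minus produced). A small Python sketch using only the numbers from those tables; the function name `surplus` and the run labels are illustrative.

```python
def surplus(produced, reported):
    """Return per-node (reported - produced) and the cluster-wide total."""
    per_node = {node: reported[node] - produced[node] for node in produced}
    return per_node, sum(per_node.values())


produced = {"Node 1": 5000, "Node 2": 5000, "Node 3": 5000}
first_table = {"Node 1": 6642, "Node 2": 5000, "Node 3": 5000}
second_table = {"Node 1": 5000, "Node 2": 5000, "Node 3": 6626}

for label, reported in (("first table", first_table), ("second table", second_table)):
    per_node, total = surplus(produced, reported)
    print(label, per_node, "total surplus:", total)
```

In both cases exactly one node over-reports (by 1642 and 1626 messages respectively) while the other two match what was produced.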

We downgraded to 2.20 and the problem seems to be gone. We encountered it on 2.25 and also on 2.24.

We use replication to sync our primary and backup nodes.
In the log we saw an "AMQ222038 Starting paging on address...." so I assume there was paging going on during this test.

________________________________
From: Justin Bertram <jb...@apache.org>
Sent: Tuesday, October 4, 2022 7:30 PM
To: users@activemq.apache.org <us...@activemq.apache.org>
Subject: Re: Apache Artemis 2.25 message counters differ from actual number of messages

> When I run 3 parallel producers which connect to specific primary nodes
> and I send (e.g.) 1000 messages to each node the message counters seem to
> show the wrong results, e.g. 1399, 1000, 1000 for a total number of 3399
> messages. I would expect to get a 1000, 1000, 1000 distribution for a total
> of 3000 messages.

Which specific "message counters" are you inspecting? Do you have any
consumers connected when you send the messages or do you only connect the
consumers later after all the messages are sent?


Justin


Re: Apache Artemis 2.25 message counters differ from actual number of messages

Posted by Justin Bertram <jb...@apache.org>.

> When I run 3 parallel producers which connect to specific primary nodes
> and I send (e.g.) 1000 messages to each node the message counters seem to
> show the wrong results, e.g. 1399, 1000, 1000 for a total number of 3399
> messages. I would expect to get a 1000, 1000, 1000 distribution for a total
> of 3000 messages.

Which specific "message counters" are you inspecting? Do you have any
consumers connected when you send the messages or do you only connect the
consumers later after all the messages are sent?


Justin
