Posted to users@activemq.apache.org by oseymen <oz...@tdpg.com> on 2011/06/10 17:34:24 UTC

Message loss in network of brokers - transactional send

I seem to be losing some messages in a network of brokers topology in the
following setup:

- One producer pushes messages to a persistent queue (let’s call it
“IN.QUEUE.Broker1”) on one broker (broker1) inside a transaction. The
producer commits every 100 messages.

- IN.QUEUE is a composite queue which forwards the messages to another queue
in broker1 (let’s call this “Q1”). The network of brokers is configured to
forward messages from Q1 to broker2. Q1 is listed in
staticallyIncludedDestinations so it will always forward.

- There are no consumers. I look at the message counts in the AMQ web
interface once all messages are in.

- The second broker (broker2) is connected to broker1 via the network of
brokers (configuration below).

My test case:
1. Purge all messages in all queues.
2. Start the producer and send 10,000 messages to IN.QUEUE in broker1. At
this point, I start to see messages forwarded to broker2.
3. Stop/kill broker2. I am running the brokers in a console, so in order to
kill the broker I just kill the console. At this point I see messages
accumulating in broker1.
4. Start broker2.
5. When all 10,000 messages have been sent, I look at the total number of
messages in each broker; they should be the same. I just ran the test again
and I have 9981 messages in broker2 instead of 10,000. I can reproduce this
whenever I run this test.

I also looked at https://issues.apache.org/jira/browse/AMQ-1845. My issue
seems to be similar to it, so I converted my code from the Spring JMS
template to Apache NMS, but the issue is still there.

Can you please advise whether this is a known issue or whether something is
wrong in my configuration?

CONFIGURATION in BROKER1:

<destinationInterceptors>
  <virtualDestinationInterceptor>
    <virtualDestinations>
      <compositeQueue name="IN.Broker1">
        <forwardTo>
          <queue physicalName="Q1" />
        </forwardTo>
      </compositeQueue>
    </virtualDestinations>
  </virtualDestinationInterceptor>
</destinationInterceptors>

<networkConnectors>
  <networkConnector
      uri="static:(tcp://localhost:61617)"
      name="FromB1ToB2"
      conduitSubscriptions="false"
      decreaseNetworkConsumerPriority="false"
      prefetchSize="1">
    <staticallyIncludedDestinations>
      <queue physicalName="Q1" />
    </staticallyIncludedDestinations>
  </networkConnector>
</networkConnectors>

PRODUCER CODE:

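using Apache.NMS;
using Apache.NMS.ActiveMQ;

// ServerUri, destination, numberOfMessagesToSend and GetMessageContent()
// are assumed to be defined elsewhere in the test program.
var textMessage = GetMessageContent();
var connectionFactory = new ConnectionFactory(ServerUri);

using (var connection = connectionFactory.CreateConnection())
{
	connection.Start();

	// Transactional session: sends take effect only when Commit() is called.
	using (var session = connection.CreateSession(AcknowledgementMode.Transactional))
	{
		var queue = session.GetQueue(destination);
		using (var producer = session.CreateProducer(queue))
		{
			producer.DeliveryMode = MsgDeliveryMode.Persistent;

			for (int i = 1; i <= numberOfMessagesToSend; i++)
			{
				var message = producer.CreateTextMessage(textMessage);
				producer.Send(message);

				// Commit in batches of 100 messages.
				if (i % 100 == 0)
				{
					session.Commit();
				}
			}

			// Commit the last partial batch when the total is not a multiple of 100.
			if (numberOfMessagesToSend % 100 != 0)
			{
				session.Commit();
			}
		}
	}
}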

Cheers,
Ozan

--
View this message in context: http://activemq.2283324.n4.nabble.com/Message-loss-in-network-of-brokers-transactional-send-tp3588714p3588714.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.

Re: Message loss in network of brokers - transactional send

Posted by oseymen <oz...@tdpg.com>.
Possibly similar to https://issues.apache.org/jira/browse/AMQ-3473

--
View this message in context: http://activemq.2283324.n4.nabble.com/Message-loss-in-network-of-brokers-transactional-send-tp3588714p3766852.html

Re: Message loss in network of brokers - transactional send

Posted by oseymen <oz...@tdpg.com>.
Just created a JIRA for this: https://issues.apache.org/jira/browse/AMQ-3469.
This might actually be a bug.

--
View this message in context: http://activemq.2283324.n4.nabble.com/Message-loss-in-network-of-brokers-transactional-send-tp3588714p3764707.html

Re: Message loss in network of brokers - transactional send

Posted by oseymen <oz...@tdpg.com>.
Broker1 configuration:
http://activemq.2283324.n4.nabble.com/file/n3762374/activemq_-_broker1.xml
Broker2 configuration:
http://activemq.2283324.n4.nabble.com/file/n3762374/activemq_-_broker2.xml

--
View this message in context: http://activemq.2283324.n4.nabble.com/Message-loss-in-network-of-brokers-transactional-send-tp3588714p3762374.html

Re: Message loss in network of brokers - transactional send

Posted by oseymen <oz...@tdpg.com>.
So let me summarize this thread from the beginning:

Please see this architecture diagram:
http://activemq.2283324.n4.nabble.com/file/n3762364/ActiveMQ_-_Composite_Queues.png

I am sending 3000 messages from my producer into broker1 and observing the
message counts in the Transit and Indexing queues. I start my producer and,
while messages are flowing, I start killing brokers one by one at random.

My first problem was message loss in the first composite operation:
Q.Index.Transit.DC2 & Q.Index.Transit.DC3. This problem was solved by sending
transactional messages from the producer to Q.Index.Replication.

However, another problem (which I explained in my previous message) came up:
the admin console and JConsole report more messages than there actually are
in the Q.A.Indexing and Q.B.Indexing queues.

Take a look at this screenshot:
http://activemq.2283324.n4.nabble.com/file/n3762364/ActiveMQ_-_Admin_Console.png

All queues had 3003 messages (instead of 3000). When I consumed all messages
in Q.A.Indexing with my consumer, it successfully consumed 3000 messages (as
expected), but the admin console still reports that there are an additional
3 messages pending in the queue. When I click on "Browse" for this queue in
the admin console, it reports that there are no messages. When I restart
this broker, the pending message count corrects itself and reports 0.

Receiving duplicate messages is not a problem for me - I can deal with them
in my consumer. But JConsole and the admin console reporting that there are
still messages to be consumed is a problem for the monitoring side. I have
no way of knowing whether there really are 3 messages left and the consumer
is experiencing problems, or the consumer is alive but there are no messages
to consume.
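
One way of dealing with duplicates on the consumer side can be sketched as
follows: remember the most recently seen message ids and skip any id that
shows up again. This is only an illustration in plain Java with no ActiveMQ
dependency; the class name SeenMessageIds and the capacity are hypothetical,
and a real consumer would feed it the message id (or a business key) of each
message it receives.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Bounded memory of message ids; hypothetical helper, not part of ActiveMQ. */
final class SeenMessageIds {
    private final Map<String, Boolean> seen;

    SeenMessageIds(final int capacity) {
        // Insertion-ordered map that drops its oldest entry once it grows
        // beyond the requested capacity.
        this.seen = new LinkedHashMap<String, Boolean>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns true the first time an id is offered, false for a repeat. */
    boolean firstTime(String messageId) {
        return seen.putIfAbsent(messageId, Boolean.TRUE) == null;
    }

    public static void main(String[] args) {
        SeenMessageIds ids = new SeenMessageIds(1000);
        String[] received = {"ID:x:1", "ID:x:2", "ID:x:2", "ID:x:3"};
        for (String id : received) {
            System.out.println(id + (ids.firstTime(id) ? " processed" : " skipped"));
        }
    }
}
```

The memory is bounded, so a duplicate that arrives after more than
"capacity" other messages would not be caught, and the set is lost when the
consumer restarts unless it is persisted somewhere.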

I'd appreciate any help in this matter. Am I using composite destinations
incorrectly? I can also do the same design with Camel, which works fine, but
I'd like to use native AMQ features to do this.

Here are the configuration files for both brokers:
Broker1:  http://activemq.2283324.n4.nabble.com/file/n3762364/activemq.xml
Broker2:  http://activemq.2283324.n4.nabble.com/file/n3762364/activemq.xml

Thanks,
Ozan

--
View this message in context: http://activemq.2283324.n4.nabble.com/Message-loss-in-network-of-brokers-transactional-send-tp3588714p3762364.html

Re: Message loss in network of brokers - transactional send

Posted by oseymen <oz...@tdpg.com>.
>> "Also, I think a send transaction would help, it should encompass all of
the composite destinations."

That is correct. When I send messages in a transaction, I get no message
loss - but the pending message count in both JMX and the admin console shows
the wrong number.
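
Why a transaction helps here can be pictured with a toy model: copies for
every composite target are staged first and only become visible together on
commit, so a failure before the commit leaves no target with a copy that the
others lack. This is a sketch of the all-or-nothing idea only, not of how
ActiveMQ implements transactions; the class TransactionalFanOut and its
methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy all-or-nothing fan-out; hypothetical, not ActiveMQ code. */
final class TransactionalFanOut {
    /** Messages visible to consumers, per target queue. */
    final Map<String, List<String>> committed = new LinkedHashMap<>();
    /** Messages staged by the current transaction, per target queue. */
    private final Map<String, List<String>> pending = new LinkedHashMap<>();

    TransactionalFanOut(String... targets) {
        for (String target : targets) {
            committed.put(target, new ArrayList<>());
            pending.put(target, new ArrayList<>());
        }
    }

    /** Stages one copy of the message for every target; nothing is visible yet. */
    void send(String body) {
        for (List<String> staged : pending.values()) {
            staged.add(body);
        }
    }

    /** Makes all staged copies visible on every target. */
    void commit() {
        for (Map.Entry<String, List<String>> entry : pending.entrySet()) {
            committed.get(entry.getKey()).addAll(entry.getValue());
            entry.getValue().clear();
        }
    }

    /** Simulated failure before commit: staged copies vanish from every target. */
    void rollback() {
        for (List<String> staged : pending.values()) {
            staged.clear();
        }
    }

    public static void main(String[] args) {
        TransactionalFanOut broker = new TransactionalFanOut("Q.A", "Q.B");
        broker.send("m1");
        broker.commit();
        broker.send("m2");
        broker.rollback(); // m2 reaches neither queue
        System.out.println(broker.committed);
    }
}
```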

My setup is still the same - 2 brokers, connected with a network of brokers.
(Just to remind you:) Messages are flowing from broker1 to broker2. On
broker2, I have configured the replicated queue as a composite queue so that
I can divert incoming messages into multiple queues in broker2.

When I send 3000 messages to broker1 (in a transaction) and start killing
brokers one by one at random while messages are flowing, I end up with a
pending message count of 3001 or 3002. When I run a consumer on broker2, I
can consume 3000 messages, which is perfect. But the admin console reports
there is 1 message pending. When I browse the messages in the queue using
the admin console, it displays no messages. So even though there are no
messages pending, somehow JMX and the admin console think that there is 1
message pending. When I restart broker2, the pending message count corrects
itself and reports 0 in the admin console.

Have you ever encountered this problem before? The pending message counter
is very important to me for monitoring purposes. If it says 1 message is
pending even though it is not, I will get monitoring nightmares.

I'd really appreciate any help or information.

Cheers,
Ozan

--
View this message in context: http://activemq.2283324.n4.nabble.com/Message-loss-in-network-of-brokers-transactional-send-tp3588714p3760397.html

Re: Message loss in network of brokers - transactional send

Posted by Gary Tully <ga...@gmail.com>.
The only disadvantage is the obvious one, and there will only be a
duplicate dispatch in this case if the message is already acked.

On 11 August 2011 11:18, oseymen <oz...@tdpg.com> wrote:
> Hi there,
>
> Is it possible to let me know the disadvantages of disabling audit in kahadb
> (apart from the obvious - duplicates won't be suppressed) please?
>
> Thanks,
> Ozan



-- 
http://fusesource.com
http://blog.garytully.com

Re: Message loss in network of brokers - transactional send

Posted by oseymen <oz...@tdpg.com>.
Hi there,

Is it possible to let me know the disadvantages of disabling audit in kahadb
(apart from the obvious - duplicates won't be suppressed) please?

Thanks,
Ozan

--
View this message in context: http://activemq.2283324.n4.nabble.com/Message-loss-in-network-of-brokers-transactional-send-tp3588714p3735439.html

Re: Message loss in network of brokers - transactional send

Posted by oseymen <oz...@tdpg.com>.
Thanks Gary. I can confirm that sending with transactions works perfectly,
but it is slower than AUTO for my scenario, where messages are generated one
by one and need to be sent one by one.

What are the disadvantages of disabling audit in kahadb (apart from the
obvious - duplicates won't be suppressed)?

Am I correct in assuming that composite destinations are not generally used
in "zero tolerance for message loss" scenarios with audit enabled?

The reason I was implementing composite destinations (virtual topics in
this case) was to make AMQ future-proof, i.e. any other consumers
implemented in the future that are interested in the same messages won't
require reconfiguration & restart of the broker - they will just come in and
start listening on a queue of their own backed by the Virtual Topic. An
example of this might be a consumer that handles custom statistics, or
consumers for a separate system that are also interested in the same
messages. How do you normally handle such situations?

I will repeat my tests with maxFailoverProducersToTrack as soon as possible.

Thanks again Gary
Ozan

--
View this message in context: http://activemq.2283324.n4.nabble.com/Message-loss-in-network-of-brokers-transactional-send-tp3588714p3664704.html

Re: Message loss in network of brokers - transactional send

Posted by Gary Tully <ga...@gmail.com>.
That is a problem: if broker death occurs between the first send to a
composite destination and the last send (all of which occur on the
broker), the resend will be suppressed, but there is no guarantee that
each of the composite dests got the message.

The duplicate suppression is not aware of composite destinations.
It should be possible to disable the duplicate suppression on the
persistence adapter using
<kahaDB maxFailoverProducersToTrack="0" />

Also, I think a send transaction would help; it should encompass all
of the composite destinations.

On 13 July 2011 09:38, oseymen <oz...@tdpg.com> wrote:
> Thanks Gary.
>
> I've tested below with Fuse 5.5 (apache-activemq-5.5.0-fuse-00-27) and
> unfortunately the issue is there. However what it boils down to is this
> "suppressing duplicate message send" message in composite destinations.
>
> In order to prove this is the case, I started eliminating components one by
> one. I removed network-of-brokers from my setup. I took the vanilla
> (default) activemq.xml from the distribution and setup a virtual topic with
> 3 queues. I started sending 3000 messages with my producer which simply
> sends messages in auto-acknowledge mode using failover transport. While
> producer is running, I killed and restarted  activemq multiple times
> (killed: prematurely. Just close the console window in which AMQ was
> running).
>
> I am seeing 3000 in one queue and less messages (~2998) on other queues. In
> the log file I have DEBUG statements saying "suppressing duplicate message
> send...". So after AMQ restart, AMQ is suppressing the message to other
> queues thinking that it is duplicate by looking at the last stored sequence
> id. In this case last stored sequence id is correct but this stops message
> propagation to other queues in composite destination setup.
>
> I searched this in Google and spotted Gary's comment on
> https://issues.apache.org/jira/browse/AMQ-2800. Gary says "duplicate message
> sends can occur with the non transactional use of the failover: transport.
> It can happen if a send is in progress when a failover reconnection occurs
> back to the same broker (as if there was a temp network partition) and the
> send reply is not received. A non transactional client will resend the
> message which needs to be suppressed by the audit". He also recommends
> disabling audit. However this is for JDBCMessageStore as far as I understand
> and not for KahaDB.
>
> I am sure that this is the reason for my problems as well. I really couldn't
> understand how to disable audit for KahaDB but looking at the configuration
> schema, I applied following changes to the config but none of them fixed the
> problem:
> 1. enableAudit="false" to PolicyEntry for all topics and queues.
> 2. maxProducersToAudit="0" to PolicyEntry for all topics and queues.
>
> Does anyone have any ideas on how to fix this? If not, I will raise a JIRA.
>
> Thanks,
> Ozan



-- 
http://fusesource.com
http://blog.garytully.com

Re: Message loss in network of brokers - transactional send

Posted by oseymen <oz...@tdpg.com>.
Thanks Gary.

I've tested the setup below with Fuse 5.5 (apache-activemq-5.5.0-fuse-00-27)
and unfortunately the issue is still there. What it boils down to, however,
is the "suppressing duplicate message send" message with composite
destinations.

In order to prove this is the case, I started eliminating components one by
one. I removed the network of brokers from my setup. I took the vanilla
(default) activemq.xml from the distribution and set up a virtual topic with
3 queues. I started sending 3000 messages with my producer, which simply
sends messages in auto-acknowledge mode using the failover transport. While
the producer was running, I killed and restarted activemq multiple times
(killed: prematurely. Just close the console window in which AMQ was
running).

I am seeing 3000 in one queue and fewer messages (~2998) in the other
queues. In the log file I have DEBUG statements saying "suppressing
duplicate message send...". So after a restart, AMQ suppresses the message
to the other queues, treating it as a duplicate based on the last stored
sequence id. In this case the last stored sequence id is correct, but this
stops message propagation to the other queues in a composite destination
setup.
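
The failure mode described above can be reproduced in a small toy model. It
is plain Java and deliberately not ActiveMQ code: the class
CompositeAuditModel, its fields and the crashAfter parameter are all
hypothetical, and the real broker logic is certainly more involved. The
model only assumes what the log suggests, namely that a send is dropped as a
duplicate when its producer sequence id is not greater than the last stored
one, and that the fan-out to the target queues is not atomic.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a store with a producer-sequence audit; not ActiveMQ code. */
final class CompositeAuditModel {
    /** Durable queue contents, keyed by target queue name, in fan-out order. */
    final Map<String, List<Long>> queues = new LinkedHashMap<>();
    /** Highest producer sequence id written to any queue so far. */
    long lastStoredSeq = -1;
    final boolean auditEnabled;

    CompositeAuditModel(boolean auditEnabled, String... targets) {
        this.auditEnabled = auditEnabled;
        for (String target : targets) {
            queues.put(target, new ArrayList<>());
        }
    }

    /**
     * Fans one message out to every target queue in order.
     * crashAfter is the number of queue writes completed before a simulated
     * broker crash; pass queues.size() for a send that completes normally.
     * Returns true if the producer would receive a reply, false if the
     * broker "died" first and the producer has to resend.
     */
    boolean send(long seq, int crashAfter) {
        if (auditEnabled && seq <= lastStoredSeq) {
            return true; // treated as a duplicate: acknowledged, nothing written
        }
        int written = 0;
        for (List<Long> queue : queues.values()) {
            if (written == crashAfter) {
                return false; // crash in the middle of the fan-out
            }
            queue.add(seq);
            lastStoredSeq = Math.max(lastStoredSeq, seq);
            written++;
        }
        return true;
    }

    public static void main(String[] args) {
        CompositeAuditModel broker = new CompositeAuditModel(true, "A", "B", "C");
        broker.send(1, 3);          // normal send
        if (!broker.send(2, 1)) {   // crash after writing to A only
            broker.send(2, 3);      // failover resend of the same message
        }
        System.out.println(broker.queues);
    }
}
```

In this model, with the audit enabled, the resend is swallowed and queues B
and C never receive message 2. With the audit disabled, the resend goes
through and queue A ends up with message 2 twice, which matches the
trade-off between message loss and duplicates discussed in this thread.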

I searched for this on Google and spotted Gary's comment on
https://issues.apache.org/jira/browse/AMQ-2800. Gary says "duplicate message
sends can occur with the non transactional use of the failover: transport.
It can happen if a send is in progress when a failover reconnection occurs
back to the same broker (as if there was a temp network partition) and the
send reply is not received. A non transactional client will resend the
message which needs to be suppressed by the audit". He also recommends
disabling audit. However, as far as I understand, this is for
JDBCMessageStore and not for KahaDB.

I am sure that this is the reason for my problems as well. I couldn't work
out how to disable audit for KahaDB, but looking at the configuration
schema, I applied the following changes to the config; neither of them fixed
the problem:
1. enableAudit="false" on the PolicyEntry for all topics and queues.
2. maxProducersToAudit="0" on the PolicyEntry for all topics and queues.

Does anyone have any ideas on how to fix this? If not, I will raise a JIRA.

Thanks,
Ozan

--
View this message in context: http://activemq.2283324.n4.nabble.com/Message-loss-in-network-of-brokers-transactional-send-tp3588714p3664510.html

Re: Message loss in network of brokers - transactional send

Posted by Gary Tully <ga...@gmail.com>.
If you can reproduce this issue with the current 5.5 release, can you
raise a jira issue to track this?
Thanks.

On 16 June 2011 13:45, oseymen <oz...@tdpg.com> wrote:
> Here is some more info:
>
> I enabled debug logging and can see where the problem is (however I still
> don't know what the solution is).
>
> I ran my test again and send 5000 messages to one broker which is configured
> to store-and-forward the message to broker2. Two brokers are connected via
> network of brokers.
>
> After killing brokers randomly, I ended up 4997 messages in broker2. I wrote
> a consumer that will consume all messages from broker2 and create a
> spreadsheet with all properties. Using this spreadsheet I can pinpoint which
> messages have failed using sequential ids. One of the messages that has
> failed is #3751.
>
> Looking at the debug log I can see:
>
> (line 976) bridging (broker1 -> broker2) messageId =
> ID:HAM-NB-073-56843-634438260819794239-1:0:1:1:3748
> (line 979) bridging (broker1 -> broker2) messageId =
> ID:HAM-NB-073-56843-634438260819794239-1:0:1:1:3749
> (line 983) bridging (broker1 -> broker2) messageId =
> ID:HAM-NB-073-56843-634438260819794239-1:0:1:1:3750
>
> Then broker1 is killed.
> When it comes back, it says:
> (line 1311) last stored sequence id set: 3751
> (line 1312) suppressing duplicate message send
> [ID:HAM-NB-073-56843-634438260819794239-1:0:1:1:3751] with
> producerSequenceId [3751] less than last stored: 3751
>
> There is not message sent information in the log for 3751!!!
>
> I've attached the full log for your perusal. I'd appreciate any help to
> solve this problem.
>
> This test was done with apache-activemq-5.4.2-fuse-02-00.
>
> http://activemq.2283324.n4.nabble.com/file/n3602343/activemq.log.4
> activemq.log.4



-- 
http://fusesource.com
http://blog.garytully.com

Re: Message loss in network of brokers - transactional send

Posted by oseymen <oz...@tdpg.com>.
Here is some more info:

I enabled debug logging and can see where the problem is (however, I still
don't know what the solution is).

I ran my test again and sent 5000 messages to one broker, which is
configured to store-and-forward the messages to broker2. The two brokers are
connected via a network of brokers.

After killing brokers at random, I ended up with 4997 messages in broker2. I
wrote a consumer that consumes all messages from broker2 and creates a
spreadsheet with all their properties. Using this spreadsheet I can pinpoint
which messages have failed using the sequential ids. One of the messages
that failed is #3751.
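
The same gap check can also be done in code instead of a spreadsheet. The
sketch below is plain Java with hypothetical names (SequenceGaps,
sequenceOf, missing) and assumes that the last colon-separated field of each
message id is the producer's sequence number, as it appears to be in the ids
in the log excerpt below, and that all messages come from a single producer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Finds missing producer sequence numbers; hypothetical helper. */
final class SequenceGaps {
    /** Extracts the last ':'-separated field of an id such as "ID:host-1:0:1:1:3748". */
    static long sequenceOf(String messageId) {
        return Long.parseLong(messageId.substring(messageId.lastIndexOf(':') + 1));
    }

    /** Returns the sequence numbers in [first, last] absent from the consumed ids. */
    static List<Long> missing(Iterable<String> consumedIds, long first, long last) {
        Set<Long> present = new HashSet<>();
        for (String id : consumedIds) {
            present.add(sequenceOf(id));
        }
        List<Long> gaps = new ArrayList<>();
        for (long seq = first; seq <= last; seq++) {
            if (!present.contains(seq)) {
                gaps.add(seq);
            }
        }
        return gaps;
    }

    public static void main(String[] args) {
        List<String> consumed = Arrays.asList(
                "ID:HAM-NB-073-56843-634438260819794239-1:0:1:1:3749",
                "ID:HAM-NB-073-56843-634438260819794239-1:0:1:1:3750",
                "ID:HAM-NB-073-56843-634438260819794239-1:0:1:1:3752");
        System.out.println(missing(consumed, 3749, 3752));
    }
}
```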

Looking at the debug log I can see:

(line 976) bridging (broker1 -> broker2) messageId =
ID:HAM-NB-073-56843-634438260819794239-1:0:1:1:3748
(line 979) bridging (broker1 -> broker2) messageId =
ID:HAM-NB-073-56843-634438260819794239-1:0:1:1:3749
(line 983) bridging (broker1 -> broker2) messageId =
ID:HAM-NB-073-56843-634438260819794239-1:0:1:1:3750

Then broker1 is killed.
When it comes back, it says:
(line 1311) last stored sequence id set: 3751
(line 1312) suppressing duplicate message send
[ID:HAM-NB-073-56843-634438260819794239-1:0:1:1:3751] with
producerSequenceId [3751] less than last stored: 3751

There is no "message sent" information in the log for 3751!

I've attached the full log for your perusal. I'd appreciate any help in
solving this problem.

This test was done with apache-activemq-5.4.2-fuse-02-00.

http://activemq.2283324.n4.nabble.com/file/n3602343/activemq.log.4



--
View this message in context: http://activemq.2283324.n4.nabble.com/Message-loss-in-network-of-brokers-transactional-send-tp3588714p3602343.html

Re: Message loss in network of brokers - transactional send

Posted by oseymen <oz...@tdpg.com>.
Hi,

Based on the test below and my further tests, where I kill the target broker
during the store-and-forward operation, I am seeing messages lost/dropped.

Basically, this problem occurs when messages are sent to a queue on one
broker which is configured to be "staticallyincluded" to forward messages to
another broker, and the target broker is killed during the message
forwarding operation.

It is as if store-and-forward is removing the message from the queue before
making sure that the message is persisted in the target broker.

Do you think this is caused by the transactional send from the producer to
the original queue on the first broker?

Can you please advise which configuration options I should set in order to
achieve reliable delivery in a network of brokers?

Thanks,
Ozan

--
View this message in context: http://activemq.2283324.n4.nabble.com/Message-loss-in-network-of-brokers-transactional-send-tp3588714p3590852.html