Posted to users@activemq.apache.org by Frédéric Beaulieu <fb...@pagesjaunes.fr> on 2014/07/18 14:42:20 UTC
3-machine cluster + replicated leveldb + durable subs => massive
message loss on restart
Hello,
This is my first post, so I'll try to give as much information as possible.
My problem in a few words: I installed a 3-server ActiveMQ + Zookeeper +
_replicated_ leveldb store. I use 1 topic and 3 durable subscribers, each with a
selector (filter). I send 1500 messages (500 for each durable subscriber). If I
consume them right away, everything is OK. But if I shut down either the ActiveMQ
master or the full cluster, most of my messages are gone after the restart!
There are no exceptions in the ActiveMQ logs.
Now, more detailed reproduction steps:
1) I installed a 3-machine Zookeeper cluster (no password). See the attached
zoo.cfg.
2) I unpacked ActiveMQ 5.9.10 on 3 servers, using the attached
activemq.xml configuration file. I start with an EMPTY data / leveldb-data
dir.
3) To work around this bug (https://issues.apache.org/jira/browse/AMQ-5105), I
applied the workaround proposed there.
4) I start my 3 ActiveMQ servers (bin/activemq start). See all 3 attached
activemq.log files.
5) I create 3 durable subscribers on my Log.Raw topic (using the Jolokia API):
curl --user admin:admin -s -XPOST \
'http://rechrds01t.bbo1t.local:8161/api/jolokia/exec' -d '
{
"type":"exec",
"mbean":"org.apache.activemq:brokerName=task,type=Broker",
"operation":"createDurableSubscriber",
"arguments":[
"Log.Raw.ORDO.1",
"Log.Raw.ORDO",
"Log.Raw",
"domain='"'"'ORDO'"'"'"
]
}';echo
curl --user admin:admin -s -XPOST \
'http://rechrds01t.bbo1t.local:8161/api/jolokia/exec' -d '
{
"type":"exec",
"mbean":"org.apache.activemq:brokerName=task,type=Broker",
"operation":"createDurableSubscriber",
"arguments":[
"Log.Raw.BO.1",
"Log.Raw.BO",
"Log.Raw",
"domain='"'"'BO'"'"'"
]
}';echo
curl --user admin:admin -s -XPOST \
'http://rechrds01t.bbo1t.local:8161/api/jolokia/exec' -d '
{
"type":"exec",
"mbean":"org.apache.activemq:brokerName=task,type=Broker",
"operation":"createDurableSubscriber",
"arguments":[
"Log.Raw.PARU.1",
"Log.Raw.PARU",
"Log.Raw",
"domain='"'"'PARU'"'"'"
]
}';echo
6) I send 1500 messages: 500 with domain='BO', 500 with domain='PARU', and 500
with domain='ORDO'.
NIO port, sync mode, AUTO_ACKNOWLEDGE session.
7) I check (using hawt.io or the ActiveMQ admin UI) that each of my 3 durable
subs has 500 enqueued messages.
(If I try to consume them without stopping any ActiveMQ server, all is OK:
1500 messages received.)
8) I stop (bin/activemq stop) the ActiveMQ master (rechrds01t.bbo1t.local,
for example).
=> rechrds02t.bbo1t.local becomes master.
9) When I check the queue size of the 3 durable subs on the new master, it
shows that most of my messages have disappeared; only a handful remain.
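The three curl calls in step 5 differ only in the domain name, so they could also be generated by a small script, which avoids the nested shell quoting around the selector. The following is a minimal sketch in Python, not something from the attached files: the URL, MBean name and argument values are copied from the post, while the helper names (subscriber_payload, post) are made up for illustration. The argument labels in the comments assume the order clientId, subscriberName, topicName, selector.

```python
import base64
import json
import urllib.request

# Values copied from the post; the helper names below are illustrative only.
JOLOKIA_URL = "http://rechrds01t.bbo1t.local:8161/api/jolokia/exec"
BROKER_MBEAN = "org.apache.activemq:brokerName=task,type=Broker"
TOPIC = "Log.Raw"
DOMAINS = ["ORDO", "BO", "PARU"]


def subscriber_payload(domain):
    """Build the Jolokia exec request body for one durable subscriber."""
    return {
        "type": "exec",
        "mbean": BROKER_MBEAN,
        "operation": "createDurableSubscriber",
        # Same four arguments, in the same order, as the curl calls.
        "arguments": [
            f"{TOPIC}.{domain}.1",  # client id (assumed meaning)
            f"{TOPIC}.{domain}",    # subscription name (assumed meaning)
            TOPIC,                  # topic name
            f"domain='{domain}'",   # JMS selector
        ],
    }


def post(payload, user="admin", password="admin"):
    """POST one payload with HTTP basic auth. Needs a reachable broker."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        JOLOKIA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only print the request bodies; call post(...) to actually send them.
    for domain in DOMAINS:
        print(json.dumps(subscriber_payload(domain)))
```

Running the script as-is only prints the three JSON bodies, so they can be compared against the curl payloads before anything is sent to the broker.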
I googled a lot about this and found nothing really useful...
I also attached a text file that lists the leveldb-data content at various
steps.
This is fully reproducible. What am I missing? Is this a bug?
Thanks a lot if you can help.
zoo.cfg <http://activemq.2283324.n4.nabble.com/file/n4683411/zoo.cfg>
activemq.xml
<http://activemq.2283324.n4.nabble.com/file/n4683411/activemq.xml>
activemq.log-on-rechrds01t
<http://activemq.2283324.n4.nabble.com/file/n4683411/activemq.log-on-rechrds01t>
activemq.log-on-rechrds02t
<http://activemq.2283324.n4.nabble.com/file/n4683411/activemq.log-on-rechrds02t>
activemq.log-on-rechrds03t
<http://activemq.2283324.n4.nabble.com/file/n4683411/activemq.log-on-rechrds03t>
leveldb-ls-lR.txt
<http://activemq.2283324.n4.nabble.com/file/n4683411/leveldb-ls-lR.txt>
--
View this message in context: http://activemq.2283324.n4.nabble.com/3-machine-cluster-replicated-leveldb-durable-subs-massive-message-loss-on-restart-tp4683411.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: 3-machine cluster + replicated leveldb + durable subs =>
massive message loss on restart
Posted by Frédéric Beaulieu <fb...@pagesjaunes.fr>.
Found a workaround...
Topics are replaced by queues + persistent messages.
Durable subscribers are replaced by other queues + Camel routing to feed
them.
Camel routes can be updated live using a JMX/Jolokia call.
Another benefit of using Camel is that I can suspend the routing process.
For reference, here's my camel.xml file (works fine):
<beans
    xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="
        http://camel.apache.org/schema/spring
        http://camel.apache.org/schema/spring/camel-spring.xsd
        http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans.xsd">
    <camelContext id="camel" xmlns="http://camel.apache.org/schema/spring">
        <route>
            <description>From Log.Raw to specialized Log.Raw queues</description>
            <from uri="activemq:queue:Log.Raw"/>
            <choice>
                <when>
                    <spel>#{request.headers['domain'].equals('ORDO')}</spel>
                    <to uri="activemq:queue:Log.Raw.ORDO"/>
                </when>
                <when>
                    <spel>#{request.headers['domain'].equals('BO')}</spel>
                    <to uri="activemq:queue:Log.Raw.BO"/>
                </when>
                <otherwise>
                    <to uri="activemq:queue:Log.Raw.Trash"/>
                </otherwise>
            </choice>
        </route>
    </camelContext>
    <bean id="activemq"
          class="org.apache.activemq.camel.component.ActiveMQComponent">
        <property name="connectionFactory">
            <bean class="org.apache.activemq.ActiveMQConnectionFactory">
                <property name="brokerURL" value="vm://task?create=false"/>
                <property name="userName" value="${activemq.username}"/>
                <property name="password" value="${activemq.password}"/>
            </bean>
        </property>
    </bean>
</beans>
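Read as plain logic, the choice block in this camel.xml is a lookup from the 'domain' header to a target queue, with Log.Raw.Trash as the fallback. The sketch below (Python, purely illustrative, with made-up names ROUTES, FALLBACK and destination) mirrors that decision. One thing it makes visible: the file as posted has no branch for 'PARU', so such messages would end up in Log.Raw.Trash, assuming the posted file is complete.

```python
# Queue names are copied from camel.xml; the Python names are illustrative.
ROUTES = {
    "ORDO": "Log.Raw.ORDO",
    "BO": "Log.Raw.BO",
}
FALLBACK = "Log.Raw.Trash"


def destination(headers):
    """Return the queue a message would be routed to, given its headers.

    A missing 'domain' header falls through to the fallback here; the SpEL
    expressions in camel.xml may behave differently on a null header.
    """
    return ROUTES.get(headers.get("domain"), FALLBACK)


if __name__ == "__main__":
    for domain in ("ORDO", "BO", "PARU"):
        print(domain, "->", destination({"domain": domain}))
```

Adding a domain then amounts to adding one more when/to pair in camel.xml, which corresponds to one more entry in ROUTES in this sketch.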
--
View this message in context: http://activemq.2283324.n4.nabble.com/3-machine-cluster-replicated-leveldb-durable-subs-massive-message-loss-on-restart-tp4683411p4683473.html