Posted to users@activemq.apache.org by Hiram Chirino <hi...@hiramchirino.com> on 2013/11/01 19:39:03 UTC
Re: Replicated LevelDB Store not working

Hi Antonio,

Just wanted to bring to your attention an issue that Guillaume opened
up against the replicated leveldb store:
https://issues.apache.org/jira/browse/AMQ-4837

Perhaps the bug behind that issue was also causing your error.  If
you get a chance, could you try out the following build?
https://repository.apache.org/content/repositories/snapshots/org/apache/activemq/apache-activemq/5.10-SNAPSHOT/apache-activemq-5.10-20131101.162431-14-bin.tar.gz


On Wed, Oct 30, 2013 at 11:44 AM, Hiram Chirino <hi...@hiramchirino.com> wrote:
> On Wed, Oct 30, 2013 at 6:12 AM, Antonio Terreno
> <an...@gmail.com> wrote:
>> Hi all,
>> we are having some trouble setting up a three-node cluster of ActiveMQ
>> v5.9.0 with LevelDB and ZooKeeper, as described on this page:
>> http://activemq.apache.org/replicated-leveldb-store.html.
>>
>> The configuration of the 3 servers is in this gist:
>> https://gist.github.com/aterreno/7229464. (the only difference between the
>> 3 instances is the hostname, we use the IP of the host)
>>
>
> That looks good.
>
>> First of all, performance seems worse than with a local KahaDB store (JMS
>> failover URL with no replica),
>
> Yeah, we really have not done much benchmarking to compare the two yet.
> But it is probably due to the extra work needed to replicate the data to the
> slaves.  Are you hitting any CPU/Network/Disk bottlenecks?
>
>> but more importantly,
>> whenever we try to take down the master with the command bin/activemq stop,
>> the slave that tries to become master gets IOExceptions from LevelDB.
>
> That's not expected.  You're using the pure Java LevelDB driver, which
> might still have some bugs in it.
> It might be interesting to see if this failure is isolated to that driver.
> If you get a chance, could you copy the following jar into your
> distribution's 'lib' directory:
>     http://repo1.maven.org/maven2/org/fusesource/leveldbjni/leveldbjni-all/1.8/leveldbjni-all-1.8.jar
>
> That should get you to use the JNI implementation.  You'll know it's
> being used when you see the following message logged:
>     INFO | Using the JNI LevelDB implementation.
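
In case it helps, here is a rough shell sketch of the two steps above:
fetching the jar into 'lib' and checking the log for the JNI message.
The function names and the log location (data/activemq.log under the
install directory) are assumptions for illustration, not something
confirmed in this thread, so adjust them to your layout.

```shell
#!/bin/sh
# Jar URL as given in this thread.
LEVELDBJNI_URL="http://repo1.maven.org/maven2/org/fusesource/leveldbjni/leveldbjni-all/1.8/leveldbjni-all-1.8.jar"

# Hypothetical helper: download the JNI jar into <install dir>/lib.
# $1 = ActiveMQ install directory (assumed to contain a 'lib' directory).
install_leveldbjni() {
    curl -fsSL -o "$1/lib/leveldbjni-all-1.8.jar" "$LEVELDBJNI_URL"
}

# Hypothetical helper: succeed only if the given log file contains the
# message that indicates the JNI LevelDB implementation is in use.
# $1 = path to the broker log (the path is an assumption).
uses_jni_leveldb() {
    grep -q "Using the JNI LevelDB implementation" "$1"
}
```

After restarting the broker, something like
`uses_jni_leveldb /path/to/activemq/data/activemq.log && echo "JNI driver in use"`
would confirm the switch.
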
>
>> (I've just reproduced it by taking down master, putting back master and
>> killing the master that got elected)
>>
>> I feel like this might happen because we don't stop the master properly.
>> The log for the shutdown is this one:
>>
>> ./bin/activemq stop
> <snip>
>> INFO: failed to resolve jmxUrl for pid:53586, using default JMX url
>> Connecting to JMX URL: service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi
>> .Stopping broker: localhost
>> . FINISHED
>>
>> Another frequent error we get when trying to stop ActiveMQ is:
>>
>> ACTIVEMQ_OPTS_MEMORY="-Xms3G -Xmx3G" ./bin/activemq stop
>> INFO: Loading '/root/.activemqrc'
>> INFO: Using java '/opt/molsfw/java/latest7/bin/java'
>> INFO: Waiting at least 30 seconds for regular process termination of pid
>> '55182' :
>> Error occurred during initialization of VM
>> Could not reserve enough space for object heap
>> .............................
>> INFO: Regular shutdown not successful,  sending SIGKILL to process with pid
>> '55182'
>
> That seems really weird since we are just doing a JMX remote call to
> stop the running server.
>
>
>> And while testing the failover/resilience we are sending messages to this
>> connection string:
>> "failover://(tcp://10.251.76.45:61616,tcp://10.251.76.58:61616,tcp://10.251.76.60:61616)"
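
A small sketch of assembling that failover URL from a host list, so it
cannot pick up stray whitespace from line wrapping in mail. The helper
name is hypothetical; the hosts and port are the ones from this thread.

```shell
#!/bin/sh
# Hypothetical helper: join broker hosts into one failover URL.
# $1 = port, remaining arguments = broker hosts.
failover_url() {
    port="$1"; shift
    url=""
    for host in "$@"; do
        # Prepend a comma only when url already holds an entry.
        url="${url:+$url,}tcp://$host:$port"
    done
    printf 'failover://(%s)\n' "$url"
}

failover_url 61616 10.251.76.45 10.251.76.58 10.251.76.60
```

The resulting string has no spaces between the entries or before the
closing parenthesis.
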
>>
>> I hope that the problem is clear, but I'll reiterate: we have 3 machines,
>> 10.251.76.45 (#1), 10.251.76.58 (#2) and 10.251.76.60 (#3).
>> We have a fully working ZooKeeper cluster
>> (10.251.76.39:2181,10.251.76.40:2181,10.251.76.52:2181) and we want to have
>> high availability by leveraging the latest version of AMQ & LevelDB persistence.
>>
>> What is the problem with this configuration?
>> https://gist.github.com/aterreno/7229464
>
>
> Yep, seems simple enough.  I don't see anything wrong with the
> configuration.  It just seems like there might be some bugs in the
> implementation still.  Thanks for reporting it.  Hopefully we can get
> to the bottom of it soon.
>
>>
>> Thanks a lot,
>>
>> toni
>
>
>
> --
> Hiram Chirino
>
> Engineering | Red Hat, Inc.
>
> hchirino@redhat.com | fusesource.com | redhat.com
>
> skype: hiramchirino | twitter: @hiramchirino
>
> blog: Hiram Chirino's Bit Mojo
