Posted to users@activemq.apache.org by "James A. Robinson" <ji...@highwire.org> on 2015/05/22 01:56:07 UTC

Re: AMQ-5082 (replicated LevelDB)

I've had a dev cluster running for a little while now, and twice I've seen
interruptions where the cluster didn't recover and didn't select a new master.

I had hoped AMQ-5082 fixed that issue but it looks like there might be
additional problems.  How many of you folks are running replicated leveldb,
and what issues, if any, have you been seeing with failover or leader
election?

I've got a tcpdump running to capture the zookeeper traffic and I'm hoping
that will give me a clue as to what is going wrong.  But if anybody else is
seeing these issues, hearing what you see could also help me debug
this...

Here's what I captured from zookeeper.  I'll also note that even *deleting*
the nodes didn't trigger any sort of activity from activemq, which
indicates deeply flawed client-side logic... :-(

/activemq/amq-dev-1/000000000041
{"id":"amq-dev-1","container":null,"address":null,"position":-1,"weight":1,"elected":null}

/activemq/amq-dev-1/000000000043
{"id":"amq-dev-1","container":null,"address":null,"position":0,"weight":1,"elected":null}

/activemq/amq-dev-1/000000000044
{"id":"amq-dev-1","container":null,"address":null,"position":2936,"weight":1,"elected":null}
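
For anyone comparing their own dumps against the one above, here is a minimal sketch that parses the three records and lists the fields that differ between them. The records are copied verbatim from the capture. The names `records` and `summarize` are made up for illustration, and the script only reports which fields are null; it assumes nothing about how ActiveMQ interprets them.

```python
import json

# Records copied verbatim from the ZooKeeper capture above (path -> JSON payload).
records = {
    "/activemq/amq-dev-1/000000000041":
        '{"id":"amq-dev-1","container":null,"address":null,"position":-1,"weight":1,"elected":null}',
    "/activemq/amq-dev-1/000000000043":
        '{"id":"amq-dev-1","container":null,"address":null,"position":0,"weight":1,"elected":null}',
    "/activemq/amq-dev-1/000000000044":
        '{"id":"amq-dev-1","container":null,"address":null,"position":2936,"weight":1,"elected":null}',
}


def summarize(records):
    """Return one row per znode, ordered by path, with the fields of interest."""
    rows = []
    for path, payload in sorted(records.items()):
        data = json.loads(payload)
        rows.append({
            "seq": path.rsplit("/", 1)[1],   # trailing sequence number of the znode
            "position": data["position"],
            "address": data["address"],
            "elected": data["elected"],
        })
    return rows


rows = summarize(records)
for row in rows:
    print(row["seq"], row["position"], row["address"], row["elected"])

# True when no record carries an address or an elected value.
nothing_set = all(r["address"] is None and r["elected"] is None for r in rows)
print("address and elected are null on every record:", nothing_set)
```

In this capture the only field that varies across the three nodes is `position`. Both `address` and `elected` are null on every record.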