You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@activemq.apache.org by "Yurii Kartsev (JIRA)" <ji...@apache.org> on 2016/03/16 23:34:34 UTC

[jira] [Created] (AMQ-6216) Slave can't become master due to infinite "Store update waiting on 1 replica(s) to catch up to log position XXX" in replicated LevelDB Store mode.

Yurii Kartsev created AMQ-6216:
----------------------------------

             Summary: Slave can't become master due to infinite "Store update waiting on 1 replica(s) to catch up to log position XXX" in replicated LevelDB Store mode.
                 Key: AMQ-6216
                 URL: https://issues.apache.org/jira/browse/AMQ-6216
             Project: ActiveMQ
          Issue Type: Bug
    Affects Versions: 5.13.2
            Reporter: Yurii Kartsev


Good afternoon dear ActiveMQ support.

I wanted to test newest ActiveMQ version for stability in Replicated LevelDB Store mode in order to know whether we can upgrade our production copy (since it has similar issues as well), but facing an issue. I am able to reproduce it, so it's not sporadic. It happens when under load I manually shut down master node and new master can not be selected having infinite logging as follows (log taken from initial slave which is promoted to master later):{code}user@machine ~/Documents/apache-activemq-5.13.2/bin $ tail -f ../data/activemq.log 
2016-03-16 18:07:34,026 | INFO  | Client environment:os.version=3.13.0-37-generic | org.apache.zookeeper.ZooKeeper | main
2016-03-16 18:07:34,027 | INFO  | Client environment:user.name=yury | org.apache.zookeeper.ZooKeeper | main
2016-03-16 18:07:34,028 | INFO  | Client environment:user.home=/home/yury | org.apache.zookeeper.ZooKeeper | main
2016-03-16 18:07:34,031 | INFO  | Client environment:user.dir=/home/yury/Documents/apache-activemq-5.13.2/bin | org.apache.zookeeper.ZooKeeper | main
2016-03-16 18:07:34,033 | INFO  | Initiating client connection, connectString=10.100.211.233:2181 sessionTimeout=2000 watcher=org.apache.activemq.leveldb.replicated.groups.ZKClient@4770c18a | org.apache.zookeeper.ZooKeeper | main
2016-03-16 18:07:34,049 | WARN  | SASL configuration failed: javax.security.auth.login.LoginException: No JAAS configuration section named 'Client' was found in specified JAAS configuration file: '/home/yury/Documents/apache-activemq-5.13.2//conf/login.config'. Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it. | org.apache.zookeeper.ClientCnxn | main-SendThread(10.100.211.233:2181)
2016-03-16 18:07:34,052 | INFO  | Opening socket connection to server 10.100.211.233/10.100.211.233:2181 | org.apache.zookeeper.ClientCnxn | main-SendThread(10.100.211.233:2181)
2016-03-16 18:07:34,052 | WARN  | unprocessed event state: AuthFailed | org.apache.activemq.leveldb.replicated.groups.ZKClient | main-EventThread
2016-03-16 18:07:34,056 | INFO  | Socket connection established to 10.100.211.233/10.100.211.233:2181, initiating session | org.apache.zookeeper.ClientCnxn | main-SendThread(10.100.211.233:2181)
2016-03-16 18:07:34,066 | INFO  | Session establishment complete on server 10.100.211.233/10.100.211.233:2181, sessionid = 0x1537b47772c0007, negotiated timeout = 4000 | org.apache.zookeeper.ClientCnxn | main-SendThread(10.100.211.233:2181)
2016-03-16 18:07:34,682 | INFO  | Using the pure java LevelDB implementation. | org.apache.activemq.leveldb.LevelDBClient | ActiveMQ BrokerService[localhost] Task-1
2016-03-16 18:07:34,694 | INFO  | Attaching to master: tcp://10.100.211.233:61619 | org.apache.activemq.leveldb.replicated.SlaveLevelDBStore | ActiveMQ BrokerService[localhost] Task-1
2016-03-16 18:07:34,703 | INFO  | Slave started | org.apache.activemq.leveldb.replicated.MasterElector | ActiveMQ BrokerService[localhost] Task-1
2016-03-16 18:07:34,800 | INFO  | Slave skipping download of: log/0000000000000000.log | org.apache.activemq.leveldb.replicated.SlaveLevelDBStore | hawtdispatch-DEFAULT-1
2016-03-16 18:07:34,812 | INFO  | Slave requested: 00000000006399f2.index/CURRENT | org.apache.activemq.leveldb.replicated.SlaveLevelDBStore | hawtdispatch-DEFAULT-1
2016-03-16 18:07:34,815 | INFO  | Slave requested: 00000000006399f2.index/000003.log | org.apache.activemq.leveldb.replicated.SlaveLevelDBStore | hawtdispatch-DEFAULT-1
2016-03-16 18:07:34,816 | INFO  | Slave requested: 00000000006399f2.index/MANIFEST-000002 | org.apache.activemq.leveldb.replicated.SlaveLevelDBStore | hawtdispatch-DEFAULT-1
2016-03-16 18:07:34,826 | INFO  | Attaching... Downloaded 0.02/28.37 kb and 1/3 files | org.apache.activemq.leveldb.replicated.SlaveLevelDBStore | hawtdispatch-DEFAULT-1
2016-03-16 18:07:34,827 | INFO  | Attaching... Downloaded 28.32/28.37 kb and 2/3 files | org.apache.activemq.leveldb.replicated.SlaveLevelDBStore | hawtdispatch-DEFAULT-1
2016-03-16 18:07:34,829 | INFO  | Attaching... Downloaded 28.37/28.37 kb and 3/3 files | org.apache.activemq.leveldb.replicated.SlaveLevelDBStore | hawtdispatch-DEFAULT-1
2016-03-16 18:07:34,830 | INFO  | Attached | org.apache.activemq.leveldb.replicated.SlaveLevelDBStore | hawtdispatch-DEFAULT-1
2016-03-16 18:15:11,459 | INFO  | Not enough cluster members have reported their update positions yet. | org.apache.activemq.leveldb.replicated.MasterElector | main-EventThread
2016-03-16 18:15:11,527 | INFO  | Slave stopped | org.apache.activemq.leveldb.replicated.MasterElector | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:11,529 | INFO  | Not enough cluster members have reported their update positions yet. | org.apache.activemq.leveldb.replicated.MasterElector | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:11,534 | INFO  | Not enough cluster members have reported their update positions yet. | org.apache.activemq.leveldb.replicated.MasterElector | main-EventThread
2016-03-16 18:15:11,594 | INFO  | Promoted to master | org.apache.activemq.leveldb.replicated.MasterElector | main-EventThread
2016-03-16 18:15:11,616 | INFO  | Using the pure java LevelDB implementation. | org.apache.activemq.leveldb.LevelDBClient | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:12,066 | INFO  | Master started: tcp://10.211.55.5:61619 | org.apache.activemq.leveldb.replicated.MasterElector | ActiveMQ BrokerService[localhost] Task-3
2016-03-16 18:15:13,069 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:14,071 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:15,072 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:16,074 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:17,076 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:18,078 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:19,080 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:20,081 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:21,083 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:22,085 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:23,087 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:24,089 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:25,091 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:26,092 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:27,095 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:28,097 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:29,098 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:30,100 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:31,101 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:32,102 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:33,103 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:34,104 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:35,106 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:36,108 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:37,110 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:38,112 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:39,113 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:40,115 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:41,117 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:42,119 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:43,120 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:44,121 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:45,122 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:46,123 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:47,124 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:48,125 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:49,126 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:50,128 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:51,129 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:52,130 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:53,131 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:54,132 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:55,133 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:56,134 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:57,136 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:58,137 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:15:59,139 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:16:00,140 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:16:01,142 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:16:02,144 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:16:03,145 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2
2016-03-16 18:16:04,146 | WARN  | Store update waiting on 1 replica(s) to catch up to log position 12074259.  | org.apache.activemq.leveldb.replicated.MasterLevelDBStore | ActiveMQ BrokerService[localhost] Task-2

...{code}

Steps to reproduce:

1. Start one ZK node. Set up ActiveMQ cluster in Replicated LevelDB Store mode as described [here|http://activemq.apache.org/replicated-leveldb-store.html]. My cluster contains of 3 nodes: Mac OS X + Windows + Linux Mint. In my case Mac OS X node was started first and becomes initial master.
2. Start a process which constantly adds messages to a single queue using URL as "failover(...)".
3. In the middle of that process shut down current master node (activemq stop)
4. Observe log from above on one of the slave nodes.

My activeMQ and levelDB is a fresh copy - just downloaded and installed it. My ZK version is 3.4.6.

Please ask for any additional data you need - I will be able to provide it.

Thank you very much for your help in advance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)