You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@rocketmq.apache.org by GitBox <gi...@apache.org> on 2019/12/11 03:34:22 UTC
[GitHub] [rocketmq] henrypoter opened a new issue #1641: INDEX_OUT_OF_RANGE
caused broker & nameserver shutdown
henrypoter opened a new issue #1641: INDEX_OUT_OF_RANGE caused broker & nameserver shutdown
URL: https://github.com/apache/rocketmq/issues/1641
1. Please describe the issue you observed:
- What did you do (The steps to reproduce)?
rocketmq版本 4.6.0。使用dledger模式部署了一个broker分片,1主2从。
其中有一个从节点在压测过程中broker和namserver进程终止了
错误信息
`[rocketmq@y2_web212 rocketmqlogs]$ vi broker_default.log
2019-12-05 03:38:01 ERROR EntryDispatcher-n0-n1 - [Push-n1]Error in EntryDispatcher-n0-n1 writeIndex=29222072 compareIndex=-1
io.openmessaging.storage.dledger.exception.DLedgerException: [code=410,name=INDEX_OUT_OF_RANGE,desc=] 29221072 should between 300627707-313007211
at io.openmessaging.storage.dledger.utils.PreConditions.check(PreConditions.java:41) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.store.file.DLedgerMmapFileStore.get(DLedgerMmapFileStore.java:479) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doAppendInner(DLedgerEntryPusher.java:389) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doCheckAppendResponse(DLedgerEntryPusher.java:434) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doAppend(DLedgerEntryPusher.java:461) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doWork(DLedgerEntryPusher.java:602) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.ShutdownAbleThread.run(ShutdownAbleThread.java:87) [dledger-0.1.jar:na]
2019-12-05 03:38:02 INFO DLedgerFlushDataService - Flush data cost=691 ms
2019-12-05 03:38:02 ERROR StateMaintainer - Send heartBeat request failed info[group=RaftNode00,term=1,code=200,local=n0,remote=n1,leader=n0]
org.apache.rocketmq.remoting.exception.RemotingConnectException: connect to 192.168.111.213:40911 failed
at org.apache.rocketmq.remoting.netty.NettyRemotingClient.invokeAsync(NettyRemotingClient.java:531) ~[rocketmq-remoting-4.6.0.jar:4.6.0]
at io.openmessaging.storage.dledger.DLedgerRpcNettyService.heartBeat(DLedgerRpcNettyService.java:118) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerLeaderElector.sendHeartbeats(DLedgerLeaderElector.java:252) [dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerLeaderElector.maintainAsLeader(DLedgerLeaderElector.java:320) [dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerLeaderElector.maintainState(DLedgerLeaderElector.java:506) [dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerLeaderElector.access$200(DLedgerLeaderElector.java:40) [dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerLeaderElector$StateMaintainer.doWork(DLedgerLeaderElector.java:548) [dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.ShutdownAbleThread.run(ShutdownAbleThread.java:87) [dledger-0.1.jar:na]
2019-12-05 03:38:03 WARN EntryDispatcher-n0-n1 - [Push-n1]Retry to push entry at 29221072
2019-12-05 03:38:03 ERROR EntryDispatcher-n0-n1 - [Push-n1]Error in EntryDispatcher-n0-n1 writeIndex=29222072 compareIndex=-1
io.openmessaging.storage.dledger.exception.DLedgerException: [code=410,name=INDEX_OUT_OF_RANGE,desc=] 29221072 should between 300627707-313007409
at io.openmessaging.storage.dledger.utils.PreConditions.check(PreConditions.java:41) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.store.file.DLedgerMmapFileStore.get(DLedgerMmapFileStore.java:479) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doAppendInner(DLedgerEntryPusher.java:389) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doCheckAppendResponse(DLedgerEntryPusher.java:434) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doAppend(DLedgerEntryPusher.java:461) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doWork(DLedgerEntryPusher.java:602) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.ShutdownAbleThread.run(ShutdownAbleThread.java:87) [dledger-0.1.jar:na]
2019-12-05 03:38:03 INFO QuorumAckChecker - [n0][LEADER] term=1 ledgerBegin=300627707 ledgerEnd=313007409 committed=313007409 watermarks={1:{"n0":313007409,"n1":29221071,"n2":313007409}}
2019-12-05 03:38:03 WARN EntryDispatcher-n0-n1 - [Push-n1]Retry to push entry at 29221072
2019-12-05 03:38:03 ERROR EntryDispatcher-n0-n1 - [Push-n1]Error in EntryDispatcher-n0-n1 writeIndex=29222072 compareIndex=-1
io.openmessaging.storage.dledger.exception.DLedgerException: [code=410,name=INDEX_OUT_OF_RANGE,desc=] 29221072 should between 300627707-313007431
at io.openmessaging.storage.dledger.utils.PreConditions.check(PreConditions.java:41) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.store.file.DLedgerMmapFileStore.get(DLedgerMmapFileStore.java:479) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doAppendInner(DLedgerEntryPusher.java:389) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doCheckAppendResponse(DLedgerEntryPusher.java:434) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doAppend(DLedgerEntryPusher.java:461) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doWork(DLedgerEntryPusher.java:602) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.ShutdownAbleThread.run(ShutdownAbleThread.java:87) [dledger-0.1.jar:na]
2019-12-05 03:38:04 WARN EntryDispatcher-n0-n1 - [Push-n1]Retry to push entry at 29221072
2019-12-05 03:38:04 ERROR EntryDispatcher-n0-n1 - [Push-n1]Error in EntryDispatcher-n0-n1 writeIndex=29222072 compareIndex=-1
io.openmessaging.storage.dledger.exception.DLedgerException: [code=410,name=INDEX_OUT_OF_RANGE,desc=] 29221072 should between 300627707-313007473
at io.openmessaging.storage.dledger.utils.PreConditions.check(PreConditions.java:41) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.store.file.DLedgerMmapFileStore.get(DLedgerMmapFileStore.java:479) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doAppendInner(DLedgerEntryPusher.java:389) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doCheckAppendResponse(DLedgerEntryPusher.java:434) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doAppend(DLedgerEntryPusher.java:461) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doWork(DLedgerEntryPusher.java:602) ~[dledger-0.1.jar:na]
at io.openmessaging.storage.dledger.ShutdownAbleThread.run(ShutdownAbleThread.java:87) [dledger-0.1.jar:na]
2019-12-05 03:38:04 WARN EntryDispatcher-n0-n1 - [Push-n1]Retry to push entry at 29221072
2019-12-05 03:38:04 ERROR EntryDispatcher-n0-n1 - [Push-n1]Error in EntryDispatcher-n0-n1 writeIndex=29222072 compareIndex=-1
?ERROR `
重启broker和NameServer后恢复log
`
[rocketmq@y2zomq213 rocketmqlogs]$ tail -fn 111111111 *.log | grep recover
2019-12-05 14:20:24 INFO main - Begin to recover data from entryIndex=629360492 fileIndex=17 fileSize=20 fileName=/home/rocketmq/store/dledger-n1/data/00000000426275504128
2019-12-05 14:20:25 INFO main - Trying to recover data file /home/rocketmq/store/dledger-n1/data/00000000427349245952
2019-12-05 14:20:26 INFO main - Trying to recover data file /home/rocketmq/store/dledger-n1/data/00000000428422987776
at io.openmessaging.storage.dledger.store.file.DLedgerMmapFileStore.recover(DLedgerMmapFileStore.java:216) ~[dledger-0.1.jar:na]
at org.apache.rocketmq.store.dledger.DLedgerCommitLog.recover(DLedgerCommitLog.java:253) [rocketmq-store-4.6.0.jar:4.6.0]
at org.apache.rocketmq.store.dledger.DLedgerCommitLog.recoverNormally(DLedgerCommitLog.java:305) [rocketmq-store-4.6.0.jar:4.6.0]
at org.apache.rocketmq.store.DefaultMessageStore.recover(DefaultMessageStore.java:1334) [rocketmq-store-4.6.0.jar:4.6.0]
2019-12-05 22:28:18 INFO main - Begin to recover data from entryIndex=629360492 fileIndex=17 fileSize=20 fileName=/home/rocketmq/store/dledger-n1/data/00000000426275504128
2019-12-05 22:28:19 INFO main - Trying to recover data file /home/rocketmq/store/dledger-n1/data/00000000427349245952
2019-12-05 22:28:20 INFO main - Trying to recover data file /home/rocketmq/store/dledger-n1/data/00000000428422987776
at io.openmessaging.storage.dledger.store.file.DLedgerMmapFileStore.recover(DLedgerMmapFileStore.java:216) ~[dledger-0.1.jar:na]
at org.apache.rocketmq.store.dledger.DLedgerCommitLog.recover(DLedgerCommitLog.java:253) [rocketmq-store-4.6.0.jar:4.6.0]
at org.apache.rocketmq.store.dledger.DLedgerCommitLog.recoverNormally(DLedgerCommitLog.java:305) [rocketmq-store-4.6.0.jar:4.6.0]
at org.apache.rocketmq.store.DefaultMessageStore.recover(DefaultMessageStore.java:1334) [rocketmq-store-4.6.0.jar:4.6.0]
2019-12-03 10:47:56 INFO main - recover next consume queue file, /home/rocketmq/store/consumequeue/uat-topic/0/00000000000012000000
2019-12-03 10:47:56 INFO main - recover next consume queue file, /home/rocketmq/store/consumequeue/uat-topic/0/00000000000018000000
2019-12-03 10:47:56 INFO main - recover current consume queue file over, /home/rocketmq/store/consumequeue/uat-topic/0/00000000000018000000 0 0 0
2019-12-03 10:47:56 INFO main - recover current consume queue queue over /home/rocketmq/store/consumequeue/uat-topic/0/00000000000018000000 22122820
2019-12-03 10:47:56 INFO main - recover next consume queue file, /home/rocketmq/store/consumequeue/uat-topic/1/00000000000012000000
2019-12-03 10:47:56 INFO main - recover next consume queue file, /home/rocketmq/store/consumequeue/uat-topic/1/00000000000018000000
2019-12-03 10:47:56 INFO main - recover current consume queue file over, /home/rocketmq/store/consumequeue/uat-topic/1/00000000000018000000 0 0 0
2019-12-03 10:47:56 INFO main - recover current consume queue queue over /home/rocketmq/store/consumequeue/uat-topic/1/00000000000018000000 22123000
2019-12-03 10:47:56 INFO main - recover next consume queue file, /home/rocketmq/store/consumequeue/uat-topic/2/00000000000012000000
2019-12-03 10:47:56 INFO main - recover next consume queue f`
- What did you expect to see?
2. Please tell us about your environment:
3. Other information (e.g. detailed explanation, logs, related issues, suggestions how to fix, etc):
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services