Posted to dev@rocketmq.apache.org by GitBox <gi...@apache.org> on 2020/02/16 10:33:41 UTC

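[GitHub] [rocketmq] TerrellChen opened a new issue #1778: The data is not synchronized when a new node without data is added in dledger mode

TerrellChen opened a new issue #1778: The data is not synchronized when a new node without data is added in dledger mode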
URL: https://github.com/apache/rocketmq/issues/1778
 
 
   In an already deployed and normally running 3-node DLedger cluster, I stopped one slave process, emptied its store/dledger_store directory, and restarted the process, hoping to simulate adding an empty node to a DLedger cluster.
   However, no data was ever synchronized to this node.
   The master's broker_default.log contains a related exception, raised in the try-catch block of io.openmessaging.storage.dledger.DLedgerEntryPusher.EntryDispatcher#doWork:
   `2020-02-16 18:22:24 ERROR EntryDispatcher-n0-n2 - [Push-n2]Error in EntryDispatcher-n0-n2 writeIndex=1435933765 compareIndex=-1
   io.openmessaging.storage.dledger.exception.DLedgerException: [code=410,name=INDEX_OUT_OF_RANGE,desc=] 1435933765 should between 1551149011-1815212110
           at io.openmessaging.storage.dledger.utils.PreConditions.check(PreConditions.java:41) ~[dledger-0.1.jar:na]
           at io.openmessaging.storage.dledger.store.file.DLedgerMmapFileStore.get(DLedgerMmapFileStore.java:479) ~[dledger-0.1.jar:na]
           at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doAppendInner(DLedgerEntryPusher.java:389) ~[dledger-0.1.jar:na]
           at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doAppend(DLedgerEntryPusher.java:464) ~[dledger-0.1.jar:na]
           at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doWork(DLedgerEntryPusher.java:602) ~[dledger-0.1.jar:na]`
   After manually restarting the master once, data synchronization proceeds normally again.
   What is the cause of this problem? And what is the correct procedure for adding a node to the cluster? Is there any documentation?
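
For readers decoding the exception above, here is a minimal sketch of the range check it reports, using the three numbers from the log. The class and method names (`IndexRangeCheck`, `inRange`, `missingEntries`) are hypothetical and are not DLedger source code; the inclusive bounds are an assumption based on the wording of the message.

```java
// Hypothetical illustration of the INDEX_OUT_OF_RANGE condition; not DLedger source code.
final class IndexRangeCheck {

    /** Returns true when index lies in the (assumed inclusive) range [begin, end]. */
    static boolean inRange(long index, long begin, long end) {
        return index >= begin && index <= end;
    }

    /** Entries between the requested index and the oldest retained index; 0 if none are missing. */
    static long missingEntries(long index, long begin) {
        return Math.max(0L, begin - index);
    }

    public static void main(String[] args) {
        long writeIndex = 1435933765L;   // index the dispatcher tried to push
        long ledgerBegin = 1551149011L;  // oldest index still held by the leader
        long ledgerEnd = 1815212110L;    // newest index held by the leader

        System.out.println(inRange(writeIndex, ledgerBegin, ledgerEnd)); // prints false
        System.out.println(missingEntries(writeIndex, ledgerBegin));     // prints 115215246
    }
}
```

With these values the requested index is about 115 million entries older than anything the leader still holds, which is consistent with the push failing on every retry until something resets the dispatcher's position.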

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [rocketmq] RongtongJin commented on issue #1778: The data is not synchronized when a new node without data is added in dledger mode

Posted by GitBox <gi...@apache.org>.
RongtongJin commented on issue #1778: The data is not synchronized when a new node without data is added in dledger mode
URL: https://github.com/apache/rocketmq/issues/1778#issuecomment-586720938
 
 
   I tried to reproduce the issue by following your description, but could not. I will check the code tomorrow. It would help if you could tell me how to reproduce it.


[GitHub] [rocketmq] RongtongJin commented on issue #1778: The data is not synchronized when a new node without data is added in dledger mode

Posted by GitBox <gi...@apache.org>.
RongtongJin commented on issue #1778: The data is not synchronized when a new node without data is added in dledger mode
URL: https://github.com/apache/rocketmq/issues/1778#issuecomment-587459541
 
 
   Yes, this can happen when a follower falls far behind the leader while a thread is cleaning up space at the same time. I will open a PR against the dledger repo to fix the issue.
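
The interleaving described above can be sketched as a small single-threaded simulation. Everything here is an assumption made for illustration: `Store`, `cleanOldest`, and `pushBatch` are hypothetical names, the store holds no real data, and the code is neither DLedger's implementation nor the fix in the planned PR.

```java
// Hypothetical simulation of "follower far behind + space cleaning"; not DLedger source code.
final class CleanRaceSketch {

    /** A leader-side log that only tracks which index range it still retains. */
    static final class Store {
        private long begin;
        private long end;

        Store(long begin, long end) {
            this.begin = begin;
            this.end = end;
        }

        /** Returns the entry (here just its index) or fails if it is no longer retained. */
        synchronized long get(long index) {
            if (index < begin || index > end) {
                throw new IllegalStateException(index + " should be between " + begin + "-" + end);
            }
            return index;
        }

        /** Simulates the space cleaner deleting the oldest entries. */
        synchronized void cleanOldest(long entries) {
            begin = Math.min(end, begin + entries);
        }

        synchronized long end() {
            return end;
        }
    }

    /** Pushes up to batch entries starting at writeIndex and returns the next writeIndex. */
    static long pushBatch(Store leader, long writeIndex, int batch) {
        for (int i = 0; i < batch && writeIndex <= leader.end(); i++) {
            leader.get(writeIndex); // fails once the cleaner has moved past writeIndex
            writeIndex++;
        }
        return writeIndex;
    }

    public static void main(String[] args) {
        Store leader = new Store(0L, 1000L);
        long writeIndex = pushBatch(leader, 0L, 10); // follower has received indexes 0..9
        leader.cleanOldest(500L);                    // cleaner drops indexes 0..499
        try {
            pushBatch(leader, writeIndex, 10);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());      // prints 10 should be between 500-1000
        }
    }
}
```

In this sketch the dispatcher keeps its old `writeIndex` after the cleaner has advanced the beginning of the log, so every subsequent push fails in the same way until the position is reset.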


[GitHub] [rocketmq] TerrellChen commented on issue #1778: The data is not synchronized when a new node without data is added in dledger mode

Posted by GitBox <gi...@apache.org>.
TerrellChen commented on issue #1778: The data is not synchronized when a new node without data is added in dledger mode
URL: https://github.com/apache/rocketmq/issues/1778#issuecomment-587438168
 
 
   @RongtongJin 
   I just reproduced it. I think the key point is space cleaning: the exception always happened after a space clean triggered by the physicRatio or something else.
   While data is being synchronized between the master and a clean slave, is it possible that the requested index no longer exists after the oldest commitLog/index files are cleaned, causing this exception?
   
   `2020-02-18 19:57:58 INFO DLedgerCleanSpaceService - unmap file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002244120412160 OK
   2020-02-18 19:57:58 INFO DLedgerCleanSpaceService - close file channel /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002244120412160 OK
   2020-02-18 19:57:58 INFO DLedgerCleanSpaceService - delete file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002244120412160 OK, W:1073741824 M:1073741824, 96
   2020-02-18 19:57:59 INFO DLedgerCleanSpaceService - unmap file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002245194153984 OK
   2020-02-18 19:57:59 INFO DLedgerCleanSpaceService - close file channel /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002245194153984 OK
   2020-02-18 19:57:59 INFO DLedgerCleanSpaceService - delete file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002245194153984 OK, W:1073741824 M:1073741824, 98
   2020-02-18 19:58:00 WARN AdminBrokerThread_6 - matched, but hold failed, request pos=0 fileFromOffset=2244120412160
   2020-02-18 19:58:00 INFO DLedgerCleanSpaceService - unmap file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002246267895808 OK
   2020-02-18 19:58:00 INFO DLedgerCleanSpaceService - close file channel /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002246267895808 OK
   2020-02-18 19:58:00 INFO DLedgerCleanSpaceService - delete file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002246267895808 OK, W:1073741824 M:1073741824, 97
   2020-02-18 19:58:00 INFO DLedgerFlushDataService - Flush data cost=696 ms
   2020-02-18 19:58:00 INFO DLedgerFlushDataService - Flush data cost=507 ms
   2020-02-18 19:58:00 INFO DLedgerCleanSpaceService - unmap file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002247341637632 OK
   2020-02-18 19:58:00 INFO DLedgerCleanSpaceService - close file channel /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002247341637632 OK
   2020-02-18 19:58:00 INFO DLedgerCleanSpaceService - delete file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002247341637632 OK, W:1073741824 M:1073741824, 97
   2020-02-18 19:58:01 INFO QuorumAckChecker - [n0][LEADER] term=6 ledgerBegin=892303262 ledgerEnd=1696913917 committed=1696913917 watermarks={6:{"n0":1696913917,"n1":899552530,"n2":1696913917}}
   2020-02-18 19:58:01 INFO DLedgerCleanSpaceService - unmap file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002248415379456 OK
   2020-02-18 19:58:01 INFO DLedgerCleanSpaceService - close file channel /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002248415379456 OK
   2020-02-18 19:58:01 INFO DLedgerFlushDataService - Flush data cost=524 ms
   2020-02-18 19:58:01 INFO DLedgerCleanSpaceService - delete file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002248415379456 OK, W:1073741824 M:1073741824, 95
   2020-02-18 19:58:01 INFO DLedgerCleanSpaceService - unmap file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002249489121280 OK
   2020-02-18 19:58:01 INFO DLedgerCleanSpaceService - close file channel /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002249489121280 OK
   2020-02-18 19:58:02 INFO DLedgerCleanSpaceService - delete file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002249489121280 OK, W:1073741824 M:1073741824, 129
   2020-02-18 19:58:02 INFO DLedgerCleanSpaceService - unmap file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002250562863104 OK
   2020-02-18 19:58:02 INFO DLedgerCleanSpaceService - close file channel /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002250562863104 OK
   2020-02-18 19:58:02 INFO DLedgerCleanSpaceService - delete file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002250562863104 OK, W:1073741824 M:1073741824, 114
   2020-02-18 19:58:02 INFO DLedgerCleanSpaceService - unmap file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002251636604928 OK
   2020-02-18 19:58:02 INFO DLedgerCleanSpaceService - close file channel /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002251636604928 OK
   2020-02-18 19:58:02 INFO DLedgerCleanSpaceService - delete file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002251636604928 OK, W:1073741824 M:1073741824, 112
   2020-02-18 19:58:03 INFO DLedgerCleanSpaceService - unmap file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002252710346752 OK
   2020-02-18 19:58:03 INFO DLedgerCleanSpaceService - close file channel /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002252710346752 OK
   2020-02-18 19:58:03 WARN EntryDispatcher-n0-n1 - matched, but hold failed, request pos=933544355 fileFromOffset=2252710346752
   2020-02-18 19:58:03 ERROR EntryDispatcher-n0-n1 - [Push-n1]Error in EntryDispatcher-n0-n1 writeIndex=899589775 compareIndex=-1
   io.openmessaging.storage.dledger.exception.DLedgerException: [code=414,name=DISK_ERROR,desc=] Get null data for 899589775
   	at io.openmessaging.storage.dledger.utils.PreConditions.check(PreConditions.java:41) ~[dledger-0.1.jar:na]
   	at io.openmessaging.storage.dledger.store.file.DLedgerMmapFileStore.get(DLedgerMmapFileStore.java:489) ~[dledger-0.1.jar:na]
   	at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doAppendInner(DLedgerEntryPusher.java:389) ~[dledger-0.1.jar:na]
   	at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doAppend(DLedgerEntryPusher.java:464) ~[dledger-0.1.jar:na]
   	at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doWork(DLedgerEntryPusher.java:602) ~[dledger-0.1.jar:na]
   	at io.openmessaging.storage.dledger.ShutdownAbleThread.run(ShutdownAbleThread.java:87) [dledger-0.1.jar:na]
   2020-02-18 19:58:03 INFO DLedgerCleanSpaceService - delete file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002252710346752 OK, W:1073741824 M:1073741824, 100
   2020-02-18 19:58:03 INFO DLedgerCleanSpaceService - unmap file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002253784088576 OK
   2020-02-18 19:58:03 INFO DLedgerCleanSpaceService - close file channel /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002253784088576 OK
   2020-02-18 19:58:03 INFO DLedgerCleanSpaceService - delete file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/data/00000002253784088576 OK, W:1073741824 M:1073741824, 0
   2020-02-18 19:58:03 INFO DLedgerCleanSpaceService - Clean space count=10 timeUp=false checkExpired=true forceClean=true enableForceClean=true diskFull=false storeBaseRatio=0.8500003765681342 dataRatio=0.8500003765681342
   2020-02-18 19:58:03 INFO DLedgerCleanSpaceService - unmap file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/index/00000000028521267200 OK
   2020-02-18 19:58:03 INFO DLedgerCleanSpaceService - close file channel /data/rocketmq/broker/store/dledger_store/dledger-n0/index/00000000028521267200 OK
   2020-02-18 19:58:03 INFO DLedgerCleanSpaceService - delete file[REF:0] /data/rocketmq/broker/store/dledger_store/dledger-n0/index/00000000028521267200 OK, W:167772160 M:167772160, 13
   2020-02-18 19:58:03 ERROR EntryDispatcher-n0-n1 - [Push-n1]Error in EntryDispatcher-n0-n1 writeIndex=899589775 compareIndex=-1
   io.openmessaging.storage.dledger.exception.DLedgerException: [code=410,name=INDEX_OUT_OF_RANGE,desc=] 899589775 should between 900518572-1696925904
   	at io.openmessaging.storage.dledger.utils.PreConditions.check(PreConditions.java:41) ~[dledger-0.1.jar:na]
   	at io.openmessaging.storage.dledger.store.file.DLedgerMmapFileStore.get(DLedgerMmapFileStore.java:479) ~[dledger-0.1.jar:na]
   	at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doAppendInner(DLedgerEntryPusher.java:389) ~[dledger-0.1.jar:na]
   	at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doAppend(DLedgerEntryPusher.java:464) ~[dledger-0.1.jar:na]
   	at io.openmessaging.storage.dledger.DLedgerEntryPusher$EntryDispatcher.doWork(DLedgerEntryPusher.java:602) ~[dledger-0.1.jar:na]
   	at io.openmessaging.storage.dledger.ShutdownAbleThread.run(ShutdownAbleThread.java:87) [dledger-0.1.jar:na]`


[GitHub] [rocketmq] RongtongJin closed issue #1778: The data is not synchronized when a new node without data is added in dledger mode

Posted by GitBox <gi...@apache.org>.
RongtongJin closed issue #1778: The data is not synchronized when a new node without data is added in dledger mode
URL: https://github.com/apache/rocketmq/issues/1778
 
 
   


[GitHub] [rocketmq] TerrellChen closed issue #1778: The data is not synchronized when a new node without data is added in dledger mode

Posted by GitBox <gi...@apache.org>.
TerrellChen closed issue #1778: The data is not synchronized when a new node without data is added in dledger mode
URL: https://github.com/apache/rocketmq/issues/1778
 
 
   


[GitHub] [rocketmq] RongtongJin commented on issue #1778: The data is not synchronized when a new node without data is added in dledger mode

Posted by GitBox <gi...@apache.org>.
RongtongJin commented on issue #1778: The data is not synchronized when a new node without data is added in dledger mode
URL: https://github.com/apache/rocketmq/issues/1778#issuecomment-598671079
 
 
   link https://github.com/openmessaging/openmessaging-storage-dledger/pull/50


[GitHub] [rocketmq] TerrellChen commented on issue #1778: The data is not synchronized when a new node without data is added in dledger mode

Posted by GitBox <gi...@apache.org>.
TerrellChen commented on issue #1778: The data is not synchronized when a new node without data is added in dledger mode
URL: https://github.com/apache/rocketmq/issues/1778#issuecomment-588047700
 
 
   @RongtongJin That would be nice! I will try to work on this.


[GitHub] [rocketmq] TerrellChen commented on issue #1778: The data is not synchronized when a new node without data is added in dledger mode

Posted by GitBox <gi...@apache.org>.
TerrellChen commented on issue #1778: The data is not synchronized when a new node without data is added in dledger mode
URL: https://github.com/apache/rocketmq/issues/1778#issuecomment-586890560
 
 
   @RongtongJin Thank you for your help.
   Since I cleaned the cluster and built a new one, this situation doesn't occur anymore.
   If I find a way to reproduce it, I'll let you know.



[GitHub] [rocketmq] RongtongJin commented on issue #1778: The data is not synchronized when a new node without data is added in dledger mode

Posted by GitBox <gi...@apache.org>.
RongtongJin commented on issue #1778: The data is not synchronized when a new node without data is added in dledger mode
URL: https://github.com/apache/rocketmq/issues/1778#issuecomment-587999113
 
 
   Hi @TerrellChen, are you interested in making a PR to solve this issue?
