Posted to dev@rocketmq.apache.org by GitBox <gi...@apache.org> on 2021/01/11 03:31:01 UTC

[GitHub] [rocketmq] iamqq23ue opened a new issue #2582: RocketMQ4.8.0 dledger cluster will switch master when the message accumulation is close to 30 million

iamqq23ue opened a new issue #2582:
URL: https://github.com/apache/rocketmq/issues/2582


   When the message accumulation approaches 30 million, RocketMQ occupies most of the memory on my machine. I suspect this is what causes the master switch.
   In addition, after the switch, the original master (now a slave) has a much lower TPS than the other two nodes, so it can never fully catch up on synchronization.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [rocketmq] iamqq23ue commented on issue #2582: RocketMQ4.8.0 dledger cluster will switch master to slave when the message accumulation is close to 30 million

Posted by GitBox <gi...@apache.org>.
iamqq23ue commented on issue #2582:
URL: https://github.com/apache/rocketmq/issues/2582#issuecomment-814627422


   brokerClusterName = RaftCluster
   brokerName=RaftNode00
   flushDiskType=SYNC_FLUSH
   brokerIP1=XXX
   listenPort=XXX
   namesrvAddr=XXX
   storePathRootDir=/rtmqdata/rocketmq/480
   enableDLegerCommitLog=true
   waitTimeMillsInSendQueue=2900
   dLegerGroup=RaftNode00
   dLegerPeers=XXX
   ## must be unique
   dLegerSelfId=n1
   sendMessageThreadPoolNums=16
   
   
   Just a normal three-node cluster. One of the nodes is configured as above.
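For readers comparing their own setup against the config above, here is a minimal sketch that parses a broker properties file and sanity-checks the DLedger keys. The peer addresses in `SAMPLE` are hypothetical (the post redacts them as XXX), and the `id-host:port;id-host:port` layout of `dLegerPeers` is an assumption about the usual format. The helper names are illustrative and not part of RocketMQ.

```python
def parse_broker_conf(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        conf[key.strip()] = value.strip()
    return conf


def peer_ids(peers):
    """Extract node ids from an 'id-host:port;id-host:port' string (assumed format)."""
    return [entry.split("-", 1)[0] for entry in peers.split(";") if entry]


def check_dledger(conf):
    """Return a list of problems found in the DLedger-related settings."""
    problems = []
    if conf.get("enableDLegerCommitLog") != "true":
        problems.append("enableDLegerCommitLog is not true")
    ids = peer_ids(conf.get("dLegerPeers", ""))
    if len(set(ids)) != len(ids):
        problems.append("duplicate ids in dLegerPeers")
    if conf.get("dLegerSelfId") not in ids:
        problems.append("dLegerSelfId is not listed in dLegerPeers")
    return problems


# Hypothetical values: the real addresses were redacted as XXX in the post.
SAMPLE = """
brokerClusterName = RaftCluster
brokerName=RaftNode00
enableDLegerCommitLog=true
dLegerGroup=RaftNode00
dLegerPeers=n0-10.0.0.1:40911;n1-10.0.0.2:40911;n2-10.0.0.3:40911
## must be unique
dLegerSelfId=n1
"""

if __name__ == "__main__":
    # Prints the list of detected problems (empty when none are found).
    print(check_dledger(parse_broker_conf(SAMPLE)))
```

This only checks internal consistency of one node's file. It says nothing about memory behaviour or the master switch itself.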





[GitHub] [rocketmq] iamqq23ue commented on issue #2582: RocketMQ4.8.0 dledger cluster will switch master to slave when the message accumulation is close to 30 million

Posted by GitBox <gi...@apache.org>.
iamqq23ue commented on issue #2582:
URL: https://github.com/apache/rocketmq/issues/2582#issuecomment-811626745


   > > > Are there any logs available for troubleshooting?
   > > 
   > > 
   > > It can be reproduced reliably, as long as enough messages accumulate. You can reproduce it yourself. What I observed at the time was that when the message accumulation reached about 30 million, the Java process occupied more than 20G of memory, and the master stopped responding for a period of time. My guess is that the accumulated messages exceed the memory size, which causes a pause for a while.
   > 
   > I will try to reproduce the issue.
   > In addition, does this issue exist in lower versions of DLedger?
   
   I did not find this problem in version 4.7.1.





[GitHub] [rocketmq] yuz10 commented on issue #2582: RocketMQ4.8.0 dledger cluster will switch master to slave when the message accumulation is close to 30 million

Posted by GitBox <gi...@apache.org>.
yuz10 commented on issue #2582:
URL: https://github.com/apache/rocketmq/issues/2582#issuecomment-962937490


   I can't reproduce it either. My virtual machine has 8 GB of memory, and the master doesn't change even when the accumulation reaches 70,000,000 messages.
   
    
   ![image](https://user-images.githubusercontent.com/14816818/140712469-d722a0d9-94a7-485f-ac6a-a44d9c757545.png)
   





[GitHub] [rocketmq] maixiaohai commented on issue #2582: RocketMQ4.8.0 dledger cluster will switch master to slave when the message accumulation is close to 30 million

Posted by GitBox <gi...@apache.org>.
maixiaohai commented on issue #2582:
URL: https://github.com/apache/rocketmq/issues/2582#issuecomment-812019411


   @iamqq23ue can you post your broker config and OS config? Our cluster uses the same version and has recently hit some switch cases too. I want to reproduce it.





[GitHub] [rocketmq] iamqq23ue commented on issue #2582: RocketMQ4.8.0 dledger cluster will switch master to slave when the message accumulation is close to 30 million

Posted by GitBox <gi...@apache.org>.
iamqq23ue commented on issue #2582:
URL: https://github.com/apache/rocketmq/issues/2582#issuecomment-855691106


   > I can't reproduce your case. IMO the config is important.
   > ![image](https://user-images.githubusercontent.com/3734319/113813947-6d0d8c80-97a3-11eb-937e-a8d7abf28a0d.png)
   
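   That reminds me: how much memory do your machines have? My guess is that the switch is triggered once enough messages have accumulated for most of the memory to be used as cache. I have 32G of memory with about 20G available, so the switch happens at around 30 million accumulated messages.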





[GitHub] [rocketmq] maixiaohai commented on issue #2582: RocketMQ4.8.0 dledger cluster will switch master to slave when the message accumulation is close to 30 million

Posted by GitBox <gi...@apache.org>.
maixiaohai commented on issue #2582:
URL: https://github.com/apache/rocketmq/issues/2582#issuecomment-814607592


   I can't reproduce your case. IMO the config is important.
   ![image](https://user-images.githubusercontent.com/3734319/113813947-6d0d8c80-97a3-11eb-937e-a8d7abf28a0d.png)
   





[GitHub] [rocketmq] iamqq23ue commented on issue #2582: RocketMQ4.8.0 dledger cluster will switch master to slave when the message accumulation is close to 30 million

Posted by GitBox <gi...@apache.org>.
iamqq23ue commented on issue #2582:
URL: https://github.com/apache/rocketmq/issues/2582#issuecomment-812288377


   You do not need any special configuration.
   The trigger condition is accumulating enough messages that the cache takes up all your memory.
   Just keep producing messages without consuming them, and you can reproduce the problem.
   
   





[GitHub] [rocketmq] maixiaohai commented on issue #2582: RocketMQ4.8.0 dledger cluster will switch master to slave when the message accumulation is close to 30 million

Posted by GitBox <gi...@apache.org>.
maixiaohai commented on issue #2582:
URL: https://github.com/apache/rocketmq/issues/2582#issuecomment-812017659


   mark





[GitHub] [rocketmq] RongtongJin commented on issue #2582: RocketMQ4.8.0 dledger cluster will switch master to slave when the message accumulation is close to 30 million

Posted by GitBox <gi...@apache.org>.
RongtongJin commented on issue #2582:
URL: https://github.com/apache/rocketmq/issues/2582#issuecomment-759894439


   Are there any logs available for troubleshooting?





[GitHub] [rocketmq] maixiaohai commented on issue #2582: RocketMQ4.8.0 dledger cluster will switch master to slave when the message accumulation is close to 30 million

Posted by GitBox <gi...@apache.org>.
maixiaohai commented on issue #2582:
URL: https://github.com/apache/rocketmq/issues/2582#issuecomment-857309883


   > > I can't reproduce your case. IMO the config is important.
   > > ![image](https://user-images.githubusercontent.com/3734319/113813947-6d0d8c80-97a3-11eb-937e-a8d7abf28a0d.png)
   > 
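   > That reminds me: how much memory do your machines have? My guess is that the switch is triggered once enough messages have accumulated for most of the memory to be used as cache. I have 32G of memory with about 20G available, so the switch happens at around 30 million accumulated messages.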
   
   128G





[GitHub] [rocketmq] yzezzm commented on issue #2582: RocketMQ4.8.0 dledger cluster will switch master to slave when the message accumulation is close to 30 million

Posted by GitBox <gi...@apache.org>.
yzezzm commented on issue #2582:
URL: https://github.com/apache/rocketmq/issues/2582#issuecomment-855640529


   @RongtongJin 
   try this
   sudo sysctl -w vm.min_free_kbytes=





[GitHub] [rocketmq] iamqq23ue commented on issue #2582: RocketMQ4.8.0 dledger cluster will switch master to slave when the message accumulation is close to 30 million

Posted by GitBox <gi...@apache.org>.
iamqq23ue commented on issue #2582:
URL: https://github.com/apache/rocketmq/issues/2582#issuecomment-759909621


   > Are there any logs available for troubleshooting?
   
   It can be reproduced reliably, as long as enough messages accumulate. You can reproduce it yourself. What I observed at the time was that when the message accumulation reached about 30 million, the Java process occupied more than 20G of memory, and the master stopped responding for a period of time. My guess is that the accumulated messages exceed the memory size, which causes a pause for a while.
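The figures in this comment (about 30 million messages, more than 20G of memory) allow a rough back-of-the-envelope estimate. The sketch below assumes "20G" means 20 GiB and that all of it is message data, which ignores heap, indexes and other overhead. It also takes the reporter's guess, that the problem appears once accumulated messages fill memory, as a working hypothesis rather than an established cause. The function names and the 40 GiB example are illustrative only.

```python
GIB = 1024 ** 3


def bytes_per_message(memory_bytes, message_count):
    """Average footprint per message if memory_bytes holds message_count messages."""
    return memory_bytes / message_count


def messages_to_fill(memory_bytes, per_message_bytes):
    """Approximate number of messages needed to occupy memory_bytes."""
    return round(memory_bytes / per_message_bytes)


if __name__ == "__main__":
    # Reported observation: ~30 million messages alongside ~20G of memory use.
    per_msg = bytes_per_message(20 * GIB, 30_000_000)
    print(f"approx. {per_msg:.0f} bytes per message")

    # Under the same hypothesis, scale the threshold to another memory size.
    # 40 GiB is an arbitrary illustrative value, not taken from the thread.
    print(f"approx. {messages_to_fill(40 * GIB, per_msg):,} messages for 40 GiB")
```

If the hypothesis holds, the accumulation needed to trigger the switch should grow roughly in proportion to the memory available for caching, which would be one way to compare results across machines of different sizes.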





[GitHub] [rocketmq] RongtongJin commented on issue #2582: RocketMQ4.8.0 dledger cluster will switch master to slave when the message accumulation is close to 30 million

Posted by GitBox <gi...@apache.org>.
RongtongJin commented on issue #2582:
URL: https://github.com/apache/rocketmq/issues/2582#issuecomment-811618825


   > > Are there any logs available for troubleshooting?
   > 
   > It can be reproduced reliably, as long as enough messages accumulate. You can reproduce it yourself. What I observed at the time was that when the message accumulation reached about 30 million, the Java process occupied more than 20G of memory, and the master stopped responding for a period of time. My guess is that the accumulated messages exceed the memory size, which causes a pause for a while.
   
   I will try to reproduce the issue.
   In addition, does this issue exist in lower versions of DLedger?





[GitHub] [rocketmq] yzezzm commented on issue #2582: RocketMQ4.8.0 dledger cluster will switch master to slave when the message accumulation is close to 30 million

Posted by GitBox <gi...@apache.org>.
yzezzm commented on issue #2582:
URL: https://github.com/apache/rocketmq/issues/2582#issuecomment-856390851


   > > I can't reproduce your case. IMO the config is important.
   > > ![image](https://user-images.githubusercontent.com/3734319/113813947-6d0d8c80-97a3-11eb-937e-a8d7abf28a0d.png)
   > 
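   > That reminds me: how much memory do your machines have? My guess is that the switch is triggered once enough messages have accumulated for most of the memory to be used as cache. I have 32G of memory with about 20G available, so the switch happens at around 30 million accumulated messages.
   
   This needs to be tuned. On my 64G machine I set it to 8G, and after that the scanD values in the sar -B output basically disappeared.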
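A small sketch of the unit conversion involved, assuming the setting being tuned here is `vm.min_free_kbytes` (the comment does not name it explicitly) and that "8G" means 8 GiB. The kernel parameter is expressed in kilobytes, so the value has to be converted before it is passed to `sysctl`. The helper names are illustrative, and the code only builds the command string; it does not change any system setting.

```python
def gib_to_kb(gib):
    """Convert GiB to the kilobyte unit used by vm.min_free_kbytes."""
    return int(gib * 1024 * 1024)


def sysctl_command(gib):
    """Build (but do not run) the sysctl command for a given size in GiB."""
    return f"sysctl -w vm.min_free_kbytes={gib_to_kb(gib)}"


if __name__ == "__main__":
    # 8 GiB is the value mentioned for a 64G machine in the comment above.
    print(sysctl_command(8))
```

A suitable value depends on the machine's total memory and workload, so the 8 GiB figure should be read as one commenter's setting rather than a general recommendation.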

