Posted to dev@rocketmq.apache.org by GitBox <gi...@apache.org> on 2021/09/28 12:18:44 UTC

[GitHub] [rocketmq] qsrg opened a new issue #3388: HA not avaiable when slave's commitLog not match master

qsrg opened a new issue #3388:
URL: https://github.com/apache/rocketmq/issues/3388


   Versions: 4.3.2, 4.4.0
   The slave's machine was down for several days, and the slave was started again after the machine was repaired. In this situation the master cannot send commitLog data to the slave, so a SYNC_MASTER broker returns SLAVE_NOT_AVAILABLE when messages are sent to it. Searching the master's broker.log shows 'Slave fall behind master:xxx'.
   
   After analysis, I found that the slave's commitLog offset no longer exists in the master's commitLog. As a result, the master calls getCommitLogData with an out-of-range offset, gets null back, and then only sends heartbeats.
   
   The relevant source code is as follows:
   ```
   SelectMappedBufferResult selectResult =
       HAConnection.this.haService.getDefaultMessageStore().getCommitLogData(this.nextTransferFromWhere);
   if (selectResult != null) {
       int size = selectResult.getSize();
       if (size > HAConnection.this.haService.getDefaultMessageStore().getMessageStoreConfig().getHaTransferBatchSize()) {
           size = HAConnection.this.haService.getDefaultMessageStore().getMessageStoreConfig().getHaTransferBatchSize();
       }

       long thisOffset = this.nextTransferFromWhere;
       this.nextTransferFromWhere += size;

       selectResult.getByteBuffer().limit(size);
       this.selectMappedBufferResult = selectResult;

       // Build Header
       this.byteBufferHeader.position(0);
       this.byteBufferHeader.limit(headerSize);
       this.byteBufferHeader.putLong(thisOffset);
       this.byteBufferHeader.putInt(size);
       this.byteBufferHeader.flip();

       this.lastWriteOver = this.transferData();
   } else {
       service.getWaitNotifyObject().wakeupAll();
       HAConnection.this.haService.getWaitNotifyObject().allWaitForRunning(100);
   }
   ```
   I think that when the master receives slaveRequestOffset, it should check that slaveRequestOffset is valid before computing nextTransferFromWhere, rather than just assigning it:
   ```
   this.nextTransferFromWhere = HAConnection.this.slaveRequestOffset;
   ```
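
A minimal sketch of what such a validity check could look like, assuming the master can only serve offsets in the closed range [minOffset, maxOffset]. The class `OffsetRangeCheck`, the enum `OffsetState`, the method `classify`, and the sample offsets are hypothetical names and values chosen for illustration; none of them are part of RocketMQ.

```java
/** Hypothetical helper (not RocketMQ code): classifies the offset a slave asks for. */
final class OffsetRangeCheck {

    /** Where the requested offset falls relative to the master's commitLog. */
    enum OffsetState { BELOW_MIN, IN_RANGE, ABOVE_MAX }

    private OffsetRangeCheck() {
    }

    /**
     * Assumption: the master holds commitLog bytes for [minOffset, maxOffset],
     * so any request outside that range cannot be answered with data.
     */
    static OffsetState classify(long slaveRequestOffset, long minOffset, long maxOffset) {
        if (slaveRequestOffset < minOffset) {
            return OffsetState.BELOW_MIN; // data already deleted on the master
        }
        if (slaveRequestOffset > maxOffset) {
            return OffsetState.ABOVE_MAX; // slave claims more data than the master has
        }
        return OffsetState.IN_RANGE;
    }

    public static void main(String[] args) {
        long minOffset = 3_221_225_472L; // illustrative value only
        long maxOffset = 5_500_000_000L; // illustrative value only
        System.out.println(classify(1_200_000_000L, minOffset, maxOffset)); // prints BELOW_MIN
        System.out.println(classify(4_000_000_000L, minOffset, maxOffset)); // prints IN_RANGE
        System.out.println(classify(6_000_000_000L, minOffset, maxOffset)); // prints ABOVE_MAX
    }
}
```

Separating the two out-of-range cases may be useful because they likely call for different handling: a request below minOffset points at expired data, while a request above maxOffset suggests the two commitLogs have diverged.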
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [rocketmq] RongtongJin closed issue #3388: HA may not avaiable when slave's commitLog not match master

Posted by GitBox <gi...@apache.org>.
RongtongJin closed issue #3388:
URL: https://github.com/apache/rocketmq/issues/3388


   





[GitHub] [rocketmq] qsrg edited a comment on issue #3388: HA may not avaiable when slave's commitLog not match master

Posted by GitBox <gi...@apache.org>.
qsrg edited a comment on issue #3388:
URL: https://github.com/apache/rocketmq/issues/3388#issuecomment-929833299


   Manually checking whether the commitLog data is correct before starting the slave is inconvenient. Between the slave's machine breaking down and being repaired, we did not modify the data in the slave's store directory. The mismatch may be due to the deletion policy (for example, it is not yet time to run deleteExpiredFiles at startup) or to other reasons, so adding a check is necessary.
   
   I propose adding the following check when the master first receives slaveRequestOffset: if slaveRequestOffset > maxOffset or slaveRequestOffset < minOffset, send from the start of the last commitLog file. The slave's dispatchReadRequest method would also need to change so that it accepts the corrected offset from the master.
   Is this idea appropriate?
   
   ```
   if (-1 == this.nextTransferFromWhere) {
       if (0 == HAConnection.this.slaveRequestOffset) {
           long masterOffset = HAConnection.this.haService.getDefaultMessageStore().getCommitLog().getMaxOffset();
           masterOffset =
               masterOffset
                   - (masterOffset % HAConnection.this.haService.getDefaultMessageStore().getMessageStoreConfig()
                   .getMapedFileSizeCommitLog());

           if (masterOffset < 0) {
               masterOffset = 0;
           }
           this.nextTransferFromWhere = masterOffset;
       } else {
           // changes
           long maxOffset = HAConnection.this.haService.getDefaultMessageStore().getCommitLog().getMaxOffset();
           long minOffset = HAConnection.this.haService.getDefaultMessageStore().getCommitLog().getMinOffset();
           if (HAConnection.this.slaveRequestOffset > maxOffset || HAConnection.this.slaveRequestOffset < minOffset) {
               long masterOffset =
                   maxOffset
                       - (maxOffset % HAConnection.this.haService.getDefaultMessageStore().getMessageStoreConfig()
                       .getMapedFileSizeCommitLog());

               if (masterOffset < 0) {
                   masterOffset = 0;
               }
               this.nextTransferFromWhere = masterOffset;
           } else {
               this.nextTransferFromWhere = HAConnection.this.slaveRequestOffset;
           }
       }

       log.info("master transfer data from " + this.nextTransferFromWhere + " to slave[" + HAConnection.this.clientAddr
           + "], and slave request " + HAConnection.this.slaveRequestOffset);
   }
   ```
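
The start-offset arithmetic in the snippet above can be tried out in isolation. The following is a sketch only: the class `TransferStart`, its methods `alignToFileStart` and `choose`, and all sample numbers (including the 1 GiB file size) are hypothetical stand-ins, and the broker calls are replaced by plain parameters.

```java
/** Hypothetical helper (not RocketMQ code) mirroring the start-offset choice above. */
final class TransferStart {

    private TransferStart() {
    }

    /** Rounds maxOffset down to the first byte of the commitLog file that contains it. */
    static long alignToFileStart(long maxOffset, long fileSize) {
        long aligned = maxOffset - (maxOffset % fileSize);
        return Math.max(aligned, 0L); // same guard as "if (masterOffset < 0)"
    }

    /** Picks the offset the master would start transferring from. */
    static long choose(long slaveRequestOffset, long minOffset, long maxOffset, long fileSize) {
        boolean emptySlave = slaveRequestOffset == 0;
        boolean outOfRange = slaveRequestOffset > maxOffset || slaveRequestOffset < minOffset;
        if (emptySlave || outOfRange) {
            return alignToFileStart(maxOffset, fileSize);
        }
        return slaveRequestOffset;
    }

    public static void main(String[] args) {
        long fileSize = 1024L * 1024 * 1024; // assumed commitLog file size of 1 GiB
        long minOffset = 3 * fileSize;       // illustrative: the first three files have expired
        long maxOffset = 5_500_000_000L;     // illustrative: inside the sixth file

        // Stale slave: its offset is below minOffset, so the master falls back to the last file.
        System.out.println(choose(1_200_000_000L, minOffset, maxOffset, fileSize)); // prints 5368709120
        // Healthy slave: its offset is in range, so it is used unchanged.
        System.out.println(choose(4_000_000_000L, minOffset, maxOffset, fileSize)); // prints 4000000000
    }
}
```

With these numbers the stale slave is restarted at 5368709120, the first byte of the file containing maxOffset, which is well beyond the 1200000000 bytes it already holds.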
   





[GitHub] [rocketmq] RongtongJin commented on issue #3388: HA may not avaiable when slave's commitLog not match master

Posted by GitBox <gi...@apache.org>.
RongtongJin commented on issue #3388:
URL: https://github.com/apache/rocketmq/issues/3388#issuecomment-932057774


   > Manually checking whether the commitLog data is correct before starting the slave is inconvenient. Between the slave's machine breaking down and being repaired, we did not modify the data in the slave's store directory. The mismatch may be due to the deletion policy (for example, it is not yet time to run deleteExpiredFiles at startup) or to other reasons, so adding a check is necessary.
   > 
   > I propose adding the following check when the master first receives slaveRequestOffset: if slaveRequestOffset > maxOffset or slaveRequestOffset < minOffset, send from the start of the last commitLog file. The slave's dispatchReadRequest method would also need to change so that it accepts the corrected offset from the master. Is this idea appropriate?
   > 
   > ```
   > if (-1 == this.nextTransferFromWhere) {
   >                         if (0 == HAConnection.this.slaveRequestOffset) {
   >                             long masterOffset = HAConnection.this.haService.getDefaultMessageStore().getCommitLog().getMaxOffset();
   >                             masterOffset =
   >                                 masterOffset
   >                                     - (masterOffset % HAConnection.this.haService.getDefaultMessageStore().getMessageStoreConfig()
   >                                     .getMapedFileSizeCommitLog());
   > 
   >                             if (masterOffset < 0) {
   >                                 masterOffset = 0;
   >                             }
   >                             this.nextTransferFromWhere = masterOffset;
   >                         } else {
   >                            //changes
   >                             long maxOffset = HAConnection.this.haService.getDefaultMessageStore().getCommitLog().getMaxOffset();
   >                             long minOffset = HAConnection.this.haService.getDefaultMessageStore().getCommitLog().getMinOffset();
   >                             if (HAConnection.this.slaveRequestOffset> maxOffset || HAConnection.this.slaveRequestOffset< minOffset) {
   >                                 long masterOffset =
   >                                         maxOffset
   >                                                 - (maxOffset % HAConnection.this.haService.getDefaultMessageStore().getMessageStoreConfig()
   >                                                 .getMapedFileSizeCommitLog());
   > 
   >                                 if (masterOffset < 0) {
   >                                     masterOffset = 0;
   >                                 }
   >                                 this.nextTransferFromWhere = masterOffset;
   > 
   >                             }else {
   >                                 this.nextTransferFromWhere = HAConnection.this.slaveRequestOffset;
   >                             }
   > 
   >                         }
   > 
   >                         log.info("master transfer data from " + this.nextTransferFromWhere + " to slave[" + HAConnection.this.clientAddr
   >                             + "], and slave request " + HAConnection.this.slaveRequestOffset);
   >                     }
   > ```
   
   This would make the commitLog discontinuous. I think it is best to repair it manually in this case.
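
One way to picture the discontinuity is as a hole between the last byte the slave already holds and the offset at which the master would resume. This is a rough sketch, assuming the slave's data ends at `slaveMaxOffset` and the master restarts at `transferStart`; the class `CommitLogGap`, the method `missingBytes`, and the sample offsets are hypothetical and not part of RocketMQ.

```java
/** Hypothetical helper (not RocketMQ code): sizes the hole left in the slave's commitLog. */
final class CommitLogGap {

    private CommitLogGap() {
    }

    /**
     * Bytes the slave would never receive if the master resumes at transferStart
     * while the slave's existing data ends at slaveMaxOffset.
     */
    static long missingBytes(long slaveMaxOffset, long transferStart) {
        return Math.max(0L, transferStart - slaveMaxOffset);
    }

    public static void main(String[] args) {
        long slaveMaxOffset = 1_200_000_000L; // illustrative: where the stale slave stopped
        long transferStart = 5_368_709_120L;  // illustrative: start of the master's last file
        System.out.println(missingBytes(slaveMaxOffset, transferStart)); // prints 4168709120
    }
}
```

Any non-zero result means the slave's commitLog would contain a range of offsets with no data behind it.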





[GitHub] [rocketmq] panzhi33 commented on issue #3388: HA may not avaiable when slave's commitLog not match master

Posted by GitBox <gi...@apache.org>.
panzhi33 commented on issue #3388:
URL: https://github.com/apache/rocketmq/issues/3388#issuecomment-930078672


   The case HAConnection.this.slaveRequestOffset > maxOffset is a problem: ReputMessageService will go down. For a slave broker like this one that has been offline for a long time, it is recommended to rebuild the slave and delete the slave's store directory.
   





[GitHub] [rocketmq] RongtongJin commented on issue #3388: HA may not avaiable when slave's commitLog not match master

Posted by GitBox <gi...@apache.org>.
RongtongJin commented on issue #3388:
URL: https://github.com/apache/rocketmq/issues/3388#issuecomment-929776864


   In this case it is difficult to correct the offset automatically, so the slave is usually rebuilt and the slave's store directory should be deleted.




