Posted to dev@rocketmq.apache.org by GitBox <gi...@apache.org> on 2018/09/14 12:45:11 UTC

[GitHub] suiyuzeng opened a new issue #467: Message missed after recovering from abnormal shutdown

URL: https://github.com/apache/rocketmq/issues/467
 
 
      One message was lost after recovering from an abnormal shutdown. The main log lines for the problem follow.
      Slave shutdown:
   2018-09-11 16:37:55.950 WARN ShutdownHook - shutdown ReputMessageService, but commitlog have not finish to be dispatched, CL: 80549937152 reputFromOffset: 80549937024
   2018-09-11 16:37:55.964 WARN ShutdownHook - the store may be wrong, so shutdown abnormally, and keep abort file.
      The slave then recovered and synced messages from the master, which produced the following error log:
   2018-09-11 16:46:46.976 WARN ReputMessageService - [BUG]logic queue order maybe wrong, expectLogicOffset: 1050988860 currentLogicOffset: 1050988840 Topic: role_change QID: Diff: 20
   2018-09-11 16:46:46.976 WARN ReputMessageService - [BUG]logic queue order maybe wrong, expectLogicOffset: 1050988880 currentLogicOffset: 1050988860 Topic: role_change QID: Diff: 20
   2018-09-11 16:46:46.977 WARN ReputMessageService - [BUG]logic queue order maybe wrong, expectLogicOffset: 1050988900 currentLogicOffset: 1050988880 Topic: role_change QID: Diff: 20
      After the master was shut down, one message could not be consumed from the slave.
   
      Analysis:
      During the abnormal shutdown, some messages had not been dispatched. After recovery, ReputMessageService did not reput these messages because duplicationEnable was not enabled. Is the parameter "duplicationEnable" meant for this problem? As far as I can tell, enabling it would not resolve the problem either, as it is not saved.
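
      The numbers in the logs above can be checked with a little arithmetic. The sketch below is only an illustration: the class and method names are hypothetical, and it assumes that each consume queue entry occupies 20 bytes, which is what the "Diff: 20" in the log suggests.

```java
// Illustrative arithmetic on the values taken from the logs above.
// ReputGapCheck and its methods are hypothetical names, not RocketMQ code.
final class ReputGapCheck {
    // Assumption: one consume queue entry is 20 bytes (matches "Diff: 20").
    static final int ASSUMED_CQ_UNIT_SIZE = 20;

    // Bytes of commit log that had not been dispatched at shutdown.
    static long undispatchedBytes(long commitLogMaxOffset, long reputFromOffset) {
        return commitLogMaxOffset - reputFromOffset;
    }

    // Number of consume queue entries the queue is behind by.
    static long missingEntries(long expectLogicOffset, long currentLogicOffset) {
        return (expectLogicOffset - currentLogicOffset) / ASSUMED_CQ_UNIT_SIZE;
    }

    public static void main(String[] args) {
        // CL: 80549937152, reputFromOffset: 80549937024
        System.out.println(undispatchedBytes(80549937152L, 80549937024L));
        // expectLogicOffset: 1050988860, currentLogicOffset: 1050988840
        System.out.println(missingEntries(1050988860L, 1050988840L));
    }
}
```

      Under that assumption, 128 bytes of commit log were left undispatched and the consume queue stays exactly one entry behind, which is consistent with a single message being unavailable.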
   
      Solution:
      A: Persist confirmOffset (for example, in the checkpoint file) and have ReputMessageService reput messages from the saved confirmOffset after recovery.
      B: In the recover() method, while recovering the consume queue, obtain the maximum physical offset that has already been reput. Then reput the messages from that offset up to the maximum physical offset of the commit log. This works whether or not duplicationEnable is enabled.
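
      To make the two options concrete, here is a minimal sketch of both ideas. It is not RocketMQ code: ReputRecoverySketch, CqEntry, and all method names are hypothetical, the checkpoint is modelled as a standalone 8-byte file, and a consume queue is modelled as a plain list of (commit log offset, size) entries.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical model of the two proposals; none of these names exist in RocketMQ.
final class ReputRecoverySketch {

    // Simplified consume queue entry: position and length of a message in the commit log.
    static final class CqEntry {
        final long commitLogOffset;
        final int size;

        CqEntry(long commitLogOffset, int size) {
            this.commitLogOffset = commitLogOffset;
            this.size = size;
        }
    }

    // Option A: persist the confirm offset so it survives a restart.
    static void saveConfirmOffset(Path checkpoint, long confirmOffset) throws IOException {
        byte[] bytes = ByteBuffer.allocate(Long.BYTES).putLong(confirmOffset).array();
        Files.write(checkpoint, bytes);
    }

    // Option A: read the persisted confirm offset back after recovery.
    static long loadConfirmOffset(Path checkpoint) throws IOException {
        return ByteBuffer.wrap(Files.readAllBytes(checkpoint)).getLong();
    }

    // Option B: the largest commit log position already covered by any consume queue entry.
    static long maxReputedPhyOffset(List<List<CqEntry>> consumeQueues) {
        long max = 0L;
        for (List<CqEntry> queue : consumeQueues) {
            for (CqEntry entry : queue) {
                max = Math.max(max, entry.commitLogOffset + entry.size);
            }
        }
        return max;
    }

    // Option B: where reput should start; never beyond the end of the commit log.
    static long reputStartOffset(long maxReputedPhyOffset, long commitLogMaxOffset) {
        return Math.min(maxReputedPhyOffset, commitLogMaxOffset);
    }
}
```

      In this model, option B needs no extra persisted state, because the start offset is derived from what the consume queues already contain. Option A depends on the checkpoint being flushed before the crash.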
        
      Which option is more in line with the overall design? Or is there a better way?
      Thanks.
