Posted to issues@rocketmq.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/02/12 09:06:02 UTC

[jira] [Commented] (ROCKETMQ-332) MappedFileQueue is not thread safe, which will cause message loss.

    [ https://issues.apache.org/jira/browse/ROCKETMQ-332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360460#comment-16360460 ] 

ASF GitHub Bot commented on ROCKETMQ-332:
-----------------------------------------

Jason918 opened a new pull request #227: [ROCKETMQ-332] fix concurrent bug in MappedFileQueue#findMappedFileByOffset, which m…
URL: https://github.com/apache/rocketmq/pull/227
 
 
   
   ## What is the purpose of the change
   
   Fix a concurrency bug in MappedFileQueue#findMappedFileByOffset that may cause message loss.
   
   
   ## Brief changelog
   
   The original bug occurs only while the mappedFileQueue is deleting mappedFiles from the head of the queue. The main idea of this fix is therefore to check whether the firstMappedFile in the queue has changed during the lookup. If it has, we may have picked the wrong mappedFile, and we handle this by retrying.
   
   Finally, if the lookup still fails after 3 attempts, we find the mappedFile by iterating through all the mappedFiles in the queue to ensure that the right result is returned (solution from zhouxinyu).
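
The retry-then-scan idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in the pull request: `SegmentSketch`, `SegmentQueueSketch`, `MAX_RETRIES` and every method name are hypothetical stand-ins, and the segment list is assumed to be contiguous and backed by a `CopyOnWriteArrayList`.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical stand-in for a mapped file; it only records its offset range. */
final class SegmentSketch {
    final long startOffset; // offset of the first byte held by this segment
    final long size;        // number of bytes held by this segment

    SegmentSketch(long startOffset, long size) {
        this.startOffset = startOffset;
        this.size = size;
    }

    boolean contains(long offset) {
        return offset >= startOffset && offset < startOffset + size;
    }
}

/** Illustrative queue of fixed-size segments; not the RocketMQ class. */
final class SegmentQueueSketch {
    private static final int MAX_RETRIES = 3; // assumed to match the "3 times" in the description
    private final List<SegmentSketch> segments = new CopyOnWriteArrayList<>();
    private final long segmentSize;

    SegmentQueueSketch(long segmentSize) {
        this.segmentSize = segmentSize;
    }

    synchronized void append(long startOffset) {
        segments.add(new SegmentSketch(startOffset, segmentSize));
    }

    /** Simulates expired-file deletion from the head of the queue. */
    synchronized void deleteHead() {
        if (!segments.isEmpty()) {
            segments.remove(0);
        }
    }

    SegmentSketch findByOffset(long offset) {
        for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
            SegmentSketch first = getOrNull(0);
            if (first == null) {
                return null; // empty queue
            }
            // The index is only meaningful if the head does not move while we read the list.
            int index = (int) ((offset - first.startOffset) / segmentSize);
            SegmentSketch candidate = getOrNull(index);
            if (first == getOrNull(0)) {
                // Head unchanged: the index computation was consistent.
                return (candidate != null && candidate.contains(offset)) ? candidate : null;
            }
            // Head changed under us: the candidate may be the wrong segment, so retry.
        }
        // Fallback after the retries: scan every segment.
        for (SegmentSketch s : segments) {
            if (s.contains(offset)) {
                return s;
            }
        }
        return null;
    }

    private SegmentSketch getOrNull(int index) {
        if (index < 0) {
            return null;
        }
        try {
            return segments.get(index);
        } catch (IndexOutOfBoundsException e) {
            return null; // the list shrank or the index is past the tail
        }
    }

    public static void main(String[] args) {
        SegmentQueueSketch queue = new SegmentQueueSketch(1000);
        queue.append(0);
        queue.append(1000);
        queue.append(2000);
        queue.deleteHead();
        System.out.println(queue.findByOffset(1500).startOffset);
    }
}
```

The final scan is slower than the index computation, but it does not depend on the position of the head, which is why it serves as the last resort in this sketch.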
   
   
   ## Verifying this change
   
   Follow this checklist to help us incorporate your contribution quickly and easily:
   
   - [x] Make sure there is a [JIRA issue](https://issues.apache.org/jira/projects/ROCKETMQ/issues/) filed for the change (usually before you start working on it). Trivial changes like typos do not require a JIRA issue. Your pull request should address just this issue, without pulling in other changes - one PR resolves one issue. 
   - [x] Format the pull request title like `[ROCKETMQ-XXX] Fix UnknownException when host config not exist`. Each commit in the pull request should have a meaningful subject line and body.
   - [x] Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
   - [x] Write the necessary unit tests to verify your logic correction; prefer mocking when cross-module dependencies exist. If a new feature or significant change is committed, please remember to add an integration test in the [test module](https://github.com/apache/rocketmq/tree/master/test).
   - [x] Run `mvn -B clean apache-rat:check findbugs:findbugs checkstyle:checkstyle` to make sure basic checks pass. Run `mvn clean install -DskipITs` to make sure the unit tests pass. Run `mvn clean test-compile failsafe:integration-test` to make sure the integration tests pass.
   - [x] If this contribution is large, please file an [Apache Individual Contributor License Agreement](http://www.apache.org/licenses/#clas).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> MappedFileQueue is not thread safe, which will cause message loss.
> ------------------------------------------------------------------
>
>                 Key: ROCKETMQ-332
>                 URL: https://issues.apache.org/jira/browse/ROCKETMQ-332
>             Project: Apache RocketMQ
>          Issue Type: Bug
>          Components: rocketmq-store
>    Affects Versions: 4.0.0-incubating, 4.1.0-incubating
>            Reporter: Jas0n918
>            Assignee: yukon
>            Priority: Major
>         Attachments: rocketmq.log
>
>
> In RocketMQ V3.5.8, there is a readWriteLock in com.alibaba.rocketmq.store.MapedFileQueue, which guarantees thread safety. But in the new org.apache.rocketmq.store.MappedFileQueue, there is no concurrency control mechanism at all.
> When a consumer is fetching messages (with no large lag), the broker calls
> org.apache.rocketmq.broker.processor.PullMessageProcessor#processRequest ==>
> org.apache.rocketmq.store.DefaultMessageStore#getMessage  ==>
> org.apache.rocketmq.store.ConsumeQueue#getIndexBuffer ==>
> org.apache.rocketmq.store.MappedFileQueue#findMappedFileByOffset
> However, findMappedFileByOffset is not thread safe, because
> org.apache.rocketmq.store.MappedFileQueue#deleteExpiredFile may be running concurrently (so the size of mappedFiles may change). This can make ConsumeQueue#getIndexBuffer return null, which leads to
> nextBeginOffset = nextOffsetCorrection(offset, consumeQueue.rollNextFile(offset));
> which skips the whole consumeQueue file, so any messages left in this ConsumeQueue file will not be consumed by the client.
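
To make the cost of that skip concrete, here is a small arithmetic sketch. It assumes that rollNextFile rounds a queue offset up to the first entry of the next file, and that each consumeQueue file holds 300,000 index entries; both the rounding formula and the entry count are assumptions made for illustration, not values taken from this report, and `RollNextFileSketch` is a hypothetical name.

```java
/** Illustrative arithmetic only; the formula and sizes are assumptions, not values from the report. */
final class RollNextFileSketch {
    // Assumed number of index entries stored in one consume queue file.
    static final long ENTRIES_PER_FILE = 300_000L;

    /** Assumed behaviour: round the index up to the first entry of the next file. */
    static long rollNextFile(long index) {
        return index + ENTRIES_PER_FILE - index % ENTRIES_PER_FILE;
    }

    /** Number of entries the consumer jumps over if it is moved to the next file. */
    static long skippedEntries(long index) {
        return rollNextFile(index) - index;
    }

    public static void main(String[] args) {
        long offset = 300_005L; // the consumer has read 5 entries of the second file
        System.out.println("next offset: " + rollNextFile(offset));
        System.out.println("entries skipped: " + skippedEntries(offset));
    }
}
```

Under these assumptions, a single spurious null from getIndexBuffer at offset 300005 moves the consumer to offset 600000, so the 299995 entries in between are never delivered.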



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)