You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@rocketmq.apache.org by GitBox <gi...@apache.org> on 2021/04/22 06:40:30 UTC

[GitHub] [rocketmq] iamqq23ue opened a new pull request #2707: [ISSUE #2706] Fix the problem of returning SEND_OK after flush failed

iamqq23ue opened a new pull request #2707:
URL: https://github.com/apache/rocketmq/pull/2707


           modified:  broker/src/main/java/org/apache/rocketmq/broker/processor/SendMessageProcessor.java
   	modified:   client/src/main/java/org/apache/rocketmq/client/impl/producer/DefaultMQProducerImpl.java
   	modified:   client/src/main/java/org/apache/rocketmq/client/producer/SendStatus.java
   	modified:   common/src/main/java/org/apache/rocketmq/common/protocol/ResponseCode.java
   	modified:   store/src/main/java/org/apache/rocketmq/store/CommitLog.java
   	modified:   store/src/main/java/org/apache/rocketmq/store/MappedFile.java
   	modified:   store/src/main/java/org/apache/rocketmq/store/MappedFileQueue.java
   	modified:   store/src/main/java/org/apache/rocketmq/store/PutMessageStatus.java
   
   
   **Make sure set the target branch to `develop`**
   
   ## What is the purpose of the change
   You can see from the code below. When flush fails, only an error is reported, but the client is not notified 。
   If there is an error, a try catch will be performed .Then log.error("Error occurred when force data to disk.", e);
   But there is no error code returned to the client, and the client still thinks that the send is successful.
   
   public int flush(final int flushLeastPages) {
           if (this.isAbleToFlush(flushLeastPages)) {
               if (this.hold()) {
                   int value = getReadPosition();
   
                   try {
                       //We only append data to fileChannel or mappedByteBuffer, never both.
                       if (writeBuffer != null || this.fileChannel.position() != 0) {
                           this.fileChannel.force(false);
                       } else {
                           this.mappedByteBuffer.force();
                       }
                   } catch (Throwable e) {
                       log.error("Error occurred when force data to disk.", e);
                   }
   
                   this.flushedPosition.set(value);
                   this.release();
               } else {
                   log.warn("in flush, hold failed, flush offset = " + this.flushedPosition.get());
                   this.flushedPosition.set(getReadPosition());
               }
           }
           return this.getFlushedPosition();
       }
   
   ## Brief changelog
   
   So I add an error code FLUSH_DISK_FAILED, and then notify the client to deal with it 。
   
   I have used mvn to package and run it locally, which can fix this problem 。
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [rocketmq] iamqq23ue closed pull request #2707: [ISSUE #2706] Fix the problem of returning SEND_OK after flush failed

Posted by GitBox <gi...@apache.org>.
iamqq23ue closed pull request #2707:
URL: https://github.com/apache/rocketmq/pull/2707


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [rocketmq] coveralls edited a comment on pull request #2707: [ISSUE #2706] Fix the problem of returning SEND_OK after flush failed

Posted by GitBox <gi...@apache.org>.
coveralls edited a comment on pull request #2707:
URL: https://github.com/apache/rocketmq/pull/2707#issuecomment-789392884


   
   [![Coverage Status](https://coveralls.io/builds/39033915/badge)](https://coveralls.io/builds/39033915)
   
   Coverage decreased (-0.02%) to 51.847% when pulling **b335ba550a4fcd19755df7f7aa0090443e5b485e on iamqq23ue:master** into **9eae1d676b1533efcc2af636331816700bb1522e on apache:develop**.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [rocketmq] iamqq23ue commented on pull request #2707: [ISSUE #2706] Fix the problem of returning SEND_OK after flush failed

Posted by GitBox <gi...@apache.org>.
iamqq23ue commented on pull request #2707:
URL: https://github.com/apache/rocketmq/pull/2707#issuecomment-920815802


   It should be that the code changed after I fixed the bug. I will try to deal with it. 
   
   
   ------------------&nbsp;原始邮件&nbsp;------------------
   发件人:                                                                                                                        "apache/rocketmq"                                                                                    ***@***.***&gt;;
   发送时间:&nbsp;2021年9月16日(星期四) 晚上7:10
   ***@***.***&gt;;
   ***@***.******@***.***&gt;;
   主题:&nbsp;Re: [apache/rocketmq] [ISSUE #2706]  Fix the problem of returning SEND_OK after flush failed  (#2707)
   
   
   
   
   
    
   The pr has codes conflicts, please review and resolve it in your local enviroments. @iamqq23ue
    
   —
   You are receiving this because you were mentioned.
   Reply to this email directly, view it on GitHub, or unsubscribe.
   Triage notifications on the go with GitHub Mobile for iOS or Android.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [rocketmq] iamqq23ue edited a comment on pull request #2707: [ISSUE #2706] Fix the problem of returning SEND_OK after flush failed

Posted by GitBox <gi...@apache.org>.
iamqq23ue edited a comment on pull request #2707:
URL: https://github.com/apache/rocketmq/pull/2707#issuecomment-824596217


   But there is a try catch,code as below:
   catch (Throwable e) {            
        log.error("Error occurred when force data to disk.", e);          
      }
   In my test,I can see the error "Error occurred when force data to disk." in log。But the exception is caught so it does not affect code execution
   
   
   ------------------&nbsp;原始邮件&nbsp;------------------
   发件人:                                                                                                                        "apache/rocketmq"                                                                                    ***@***.***&gt;;
   发送时间:&nbsp;2021年4月22日(星期四) 下午3:00
   ***@***.***&gt;;
   ***@***.***&gt;;"State ***@***.***&gt;;
   主题:&nbsp;Re: [apache/rocketmq] [ISSUE #2706]  Fix the problem of returning SEND_OK after flush failed  (#2707)
   
   
   
   
   
      
   Good catch! But I feel that the change is a bit complicated. When the flushing error occurs, the location flushedPosition does not move and will return to the client FLUSH_DISK_TIMEOUT.
     
   this.flushedPosition.set(value);
    You can see that this line of code will be executed no matter what。So even if an error is reported, the position will change。
     
   You're right. But when an exception is thrown, we can not set  flushedPosition.
    
   —
   You are receiving this because you modified the open/close state.
   Reply to this email directly, view it on GitHub, or unsubscribe.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [rocketmq] RongtongJin commented on pull request #2707: [ISSUE #2706] Fix the problem of returning SEND_OK after flush failed

Posted by GitBox <gi...@apache.org>.
RongtongJin commented on pull request #2707:
URL: https://github.com/apache/rocketmq/pull/2707#issuecomment-824592777


   > > Good catch! But I feel that the change is a bit complicated. When the flushing error occurs, the location flushedPosition does not move and will return to the client FLUSH_DISK_TIMEOUT.
   > 
   > **this.flushedPosition.set(value);**
   > You can see that this line of code will be executed no matter what。So even if an error is reported, the position will change。
   
   You're right. But when an exception is thrown, we can not set  flushedPosition.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [rocketmq] zongtanghu commented on pull request #2707: [ISSUE #2706] Fix the problem of returning SEND_OK after flush failed

Posted by GitBox <gi...@apache.org>.
zongtanghu commented on pull request #2707:
URL: https://github.com/apache/rocketmq/pull/2707#issuecomment-920806551


   The pr has codes conflicts, please review and resolve it in your local enviroments. @iamqq23ue 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [rocketmq] iamqq23ue commented on pull request #2707: [ISSUE #2706] Fix the problem of returning SEND_OK after flush failed

Posted by GitBox <gi...@apache.org>.
iamqq23ue commented on pull request #2707:
URL: https://github.com/apache/rocketmq/pull/2707#issuecomment-824596217


   But there is a try catchcatch (Throwable e) {                 log.error("Error occurred when force data to disk.", e);             }
   In my test,I can see the error "Error occurred when force data to disk." in log,but there is no exception.
   
   
   ------------------&nbsp;原始邮件&nbsp;------------------
   发件人:                                                                                                                        "apache/rocketmq"                                                                                    ***@***.***&gt;;
   发送时间:&nbsp;2021年4月22日(星期四) 下午3:00
   ***@***.***&gt;;
   ***@***.***&gt;;"State ***@***.***&gt;;
   主题:&nbsp;Re: [apache/rocketmq] [ISSUE #2706]  Fix the problem of returning SEND_OK after flush failed  (#2707)
   
   
   
   
   
      
   Good catch! But I feel that the change is a bit complicated. When the flushing error occurs, the location flushedPosition does not move and will return to the client FLUSH_DISK_TIMEOUT.
     
   this.flushedPosition.set(value);
    You can see that this line of code will be executed no matter what。So even if an error is reported, the position will change。
     
   You're right. But when an exception is thrown, we can not set  flushedPosition.
    
   —
   You are receiving this because you modified the open/close state.
   Reply to this email directly, view it on GitHub, or unsubscribe.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [rocketmq] coveralls commented on pull request #2707: #2706 fix

Posted by GitBox <gi...@apache.org>.
coveralls commented on pull request #2707:
URL: https://github.com/apache/rocketmq/pull/2707#issuecomment-789392884


   
   [![Coverage Status](https://coveralls.io/builds/37602067/badge)](https://coveralls.io/builds/37602067)
   
   Coverage increased (+0.1%) to 51.992% when pulling **b335ba550a4fcd19755df7f7aa0090443e5b485e on iamqq23ue:master** into **9eae1d676b1533efcc2af636331816700bb1522e on apache:develop**.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [rocketmq] duhenglucky closed pull request #2707: [ISSUE #2706] Fix the problem of returning SEND_OK after flush failed

Posted by GitBox <gi...@apache.org>.
duhenglucky closed pull request #2707:
URL: https://github.com/apache/rocketmq/pull/2707


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [rocketmq] iamqq23ue commented on pull request #2707: [ISSUE #2706] Fix the problem of returning SEND_OK after flush failed

Posted by GitBox <gi...@apache.org>.
iamqq23ue commented on pull request #2707:
URL: https://github.com/apache/rocketmq/pull/2707#issuecomment-927297099


   The new pull request is #3382。The code is basically unchanged, slightly adjusted according to the new version 。


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [rocketmq] iamqq23ue commented on pull request #2707: [ISSUE #2706] Fix the problem of returning SEND_OK after flush failed

Posted by GitBox <gi...@apache.org>.
iamqq23ue commented on pull request #2707:
URL: https://github.com/apache/rocketmq/pull/2707#issuecomment-876090169


   > > > Good catch! But I feel that the change is a bit complicated. When the flushing error occurs, the location flushedPosition does not move and will return to the client FLUSH_DISK_TIMEOUT.
   > > 
   > > 
   > > **this.flushedPosition.set(value);**
   > > You can see that this line of code will be executed no matter what。So even if an error is reported, the position will change。
   > 
   > You're right. But when an exception is thrown, we can not set flushedPosition.
   
   
   请问什么时候可以修复这个bug呢,目前想考虑多个单master组成集群的架构,主要困难就是这个bug可能导致数据丢失。


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [rocketmq] iamqq23ue commented on pull request #2707: [ISSUE #2706] Fix the problem of returning SEND_OK after flush failed

Posted by GitBox <gi...@apache.org>.
iamqq23ue commented on pull request #2707:
URL: https://github.com/apache/rocketmq/pull/2707#issuecomment-824491734


   Thanks。But in my test,there is no FLUSH_DISK_TIMEOUT,my client get SEND_OK.In fact,I first tested the impact of the file system read only and found this problem, and then I went to analyze the code.
   
   
   
   
   ------------------&nbsp;原始邮件&nbsp;------------------
   发件人:                                                                                                                        "apache/rocketmq"                                                                                    ***@***.***&gt;;
   发送时间:&nbsp;2021年4月22日(星期四) 上午10:17
   ***@***.***&gt;;
   ***@***.******@***.***&gt;;
   主题:&nbsp;Re: [apache/rocketmq] [ISSUE #2706]  Fix the problem of returning SEND_OK after flush failed  (#2707)
   
   
   
   
   
    
   @RongtongJin commented on this pull request.
    
   Good catch! But I feel that the change is a bit complicated. When the flushing error occurs, the location flushedPosition does not move and will return to the client FLUSH_DISK_TIMEOUT.
    
   —
   You are receiving this because you authored the thread.
   Reply to this email directly, view it on GitHub, or unsubscribe.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [rocketmq] iamqq23ue commented on pull request #2707: [ISSUE #2706] Fix the problem of returning SEND_OK after flush failed

Posted by GitBox <gi...@apache.org>.
iamqq23ue commented on pull request #2707:
URL: https://github.com/apache/rocketmq/pull/2707#issuecomment-921405018


   > The pr has codes conflicts, please review and resolve it in your local enviroments. @iamqq23ue
   
   May I ask which branch  should I  modify.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [rocketmq] iamqq23ue commented on pull request #2707: [ISSUE #2706] Fix the problem of returning SEND_OK after flush failed

Posted by GitBox <gi...@apache.org>.
iamqq23ue commented on pull request #2707:
URL: https://github.com/apache/rocketmq/pull/2707#issuecomment-824581923


   > Good catch! But I feel that the change is a bit complicated. When the flushing error occurs, the location flushedPosition does not move and will return to the client FLUSH_DISK_TIMEOUT.
   
   
    **this.flushedPosition.set(value);**
   You can see that this line of code will be executed no matter what。So even if an error is reported, the position will change。


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org