Posted to dev@rocketmq.apache.org by GitBox <gi...@apache.org> on 2021/11/16 03:57:47 UTC

[GitHub] [rocketmq] cserwen opened a new issue #3491: [POP] When only consuming but not producing, the pull request may be held by broker

cserwen opened a new issue #3491:
URL: https://github.com/apache/rocketmq/issues/3491


   **BUG REPORT**
   
   1. Please describe the issue you observed:
   
   - What did you do (The steps to reproduce)?
   Consumers only consume messages and producers don't produce msgs.
   
   - What did you expect to see?
   Consumers consume msgs at a normal speed.
   
   - What did you see instead?
   - Consumers sometimes consume normally and sometimes are blocked. 
   - Consume TPS is less than 200, which is clearly a problem.
   
   2. Please tell us about your environment:
   Linux
   
   3. Other information (e.g. detailed explanation, logs, related issues, suggestions how to fix, etc):
   I found that the cause may be related to the `QueueLockManager` class.
   - The client sends a `pop-msg` request to the broker.
   - Sometimes this request acquires the lock for the queue and returns msgs immediately. But sometimes it fails to acquire the lock, and then the request is put into `pollingMap`.
   - The requests in `pollingMap` are executed only when new msgs for this topic arrive or when the requests time out.
   
   So for a topic, if no new msgs are produced, the consumers that subscribe to it can't consume normally, because their `pop-msg` requests may be held.
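
To make the sequence above concrete, here is a minimal sketch in Java. It is a simplified model and not the broker code: the class `PopHoldSketch`, its methods, and the `"SERVED"`/`"HELD"` results are made up for illustration, and only the idea of a per-queue lock plus a `pollingMap` of parked requests comes from the report.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Simplified model of the reported behaviour. All names here are hypothetical;
// this is not the broker's QueueLockManager or PopMessageProcessor.
class PopHoldSketch {
    // One lock flag per queue key.
    private final Map<String, AtomicBoolean> locks = new ConcurrentHashMap<>();
    // Requests that failed to get the lock, parked per queue key.
    private final Map<String, Queue<String>> pollingMap = new ConcurrentHashMap<>();

    boolean tryLock(String key) {
        return locks.computeIfAbsent(key, k -> new AtomicBoolean(false))
                    .compareAndSet(false, true);
    }

    void unlock(String key) {
        AtomicBoolean flag = locks.get(key);
        if (flag != null) {
            flag.set(false);
        }
    }

    // A pop request is served only if it wins the lock; otherwise it is parked.
    String pop(String key, String requestId) {
        if (tryLock(key)) {
            try {
                return "SERVED";
            } finally {
                unlock(key);
            }
        }
        pollingMap.computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>()).add(requestId);
        return "HELD";
    }

    // In this model a parked request is released only when a message arrives
    // (timeouts are left out to keep the sketch short).
    String onMessageArrived(String key) {
        Queue<String> parked = pollingMap.get(key);
        return parked == null ? null : parked.poll();
    }

    int heldCount(String key) {
        Queue<String> parked = pollingMap.get(key);
        return parked == null ? 0 : parked.size();
    }

    public static void main(String[] args) {
        PopHoldSketch broker = new PopHoldSketch();
        String key = "topic@consumer@0";
        broker.tryLock(key);                       // something else holds the lock
        System.out.println(broker.pop(key, "r1")); // lock busy, so r1 is parked
        broker.unlock(key);                        // the lock is free again...
        System.out.println(broker.heldCount(key)); // ...but r1 is still parked
    }
}
```

In this model, releasing the lock does nothing for the parked request; with no producer calling `onMessageArrived`, it would wait for the timeout.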


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [rocketmq] cserwen edited a comment on issue #3491: [POP] When only consuming but not producing, the pull request may be held by broker

Posted by GitBox <gi...@apache.org>.
cserwen edited a comment on issue #3491:
URL: https://github.com/apache/rocketmq/issues/3491#issuecomment-971508756


   > org.apache.rocketmq.broker.processor.PopMessageProcessor#popMsgFromQueue Offset can also be submitted in this method, so not only the threads of PopBufferMergeService can submit offset.
   
   Ok, how should we solve this problem?
   Besides, I found another bug. If the broker restarts while the consumer is running, the lines linked below will be executed. Then the consumeOffset will be set to 0 and the consumer consumes from the beginning. The log is as follows:
   ```log
   2021-11-17 20:16:12 WARN PullMessageThread_21 - Pop initial offset, because store is no correct, pop-1@consumer@3, 5384577->null
   2021-11-17 20:16:12 WARN PullMessageThread_69 - Pop initial offset, because store is no correct, pop-1@consumer@2, 5384587->null
   2021-11-17 20:16:12 WARN PullMessageThread_3 - Pop initial offset, because store is no correct, pop-1@consumer@0, 26128469->null
   2021-11-17 20:16:12 WARN PullMessageThread_31 - Pop initial offset, because store is no correct, pop-1@consumer@1, 12384558->null
   2021-11-17 20:16:12 WARN PullMessageThread_78 - Pop initial offset, because store is no correct, %RETRY%consumer_pop-1@consumer@0, 143690->null
   ```
   **But the actual consumeOffset is correct.** 
   
   Is it right to set the offset to 0 when `getMessage()` returns null?
   https://github.com/apache/rocketmq/blob/4506f34e24714ec4d6ac37babd8f096632fd6b1c/store/src/main/java/org/apache/rocketmq/store/DefaultMessageStore.java#L526
   
   





[GitHub] [rocketmq] odbozhou commented on issue #3491: [POP] When only consuming but not producing, the pull request may be held by broker

Posted by GitBox <gi...@apache.org>.
odbozhou commented on issue #3491:
URL: https://github.com/apache/rocketmq/issues/3491#issuecomment-971131721


   I agree with you, but I think the main purpose of this lock is to capture the offset state at the moment the offset is rolled back, which makes troubleshooting easier. It does increase lock contention. Do you have a better way to balance performance against keeping that diagnostic information?








[GitHub] [rocketmq] cserwen commented on issue #3491: [POP] When only consuming but not producing, the pull request may be held by broker

Posted by GitBox <gi...@apache.org>.
cserwen commented on issue #3491:
URL: https://github.com/apache/rocketmq/issues/3491#issuecomment-970091401


   I found that `commitOffset` also acquires the lock. If I delete the related lines, the consumer can consume normally. This probably makes the lock contention worse.
   https://github.com/apache/rocketmq/blob/4506f34e24714ec4d6ac37babd8f096632fd6b1c/broker/src/main/java/org/apache/rocketmq/broker/processor/PopBufferMergeService.java#L343





[GitHub] [rocketmq] odbozhou commented on issue #3491: [POP] When only consuming but not producing, the pull request may be held by broker

Posted by GitBox <gi...@apache.org>.
odbozhou commented on issue #3491:
URL: https://github.com/apache/rocketmq/issues/3491#issuecomment-971497686


   org.apache.rocketmq.broker.processor.PopMessageProcessor#popMsgFromQueue
   The offset can also be committed in this method, so the threads of PopBufferMergeService are not the only ones that commit offsets.








[GitHub] [rocketmq] cserwen edited a comment on issue #3491: [POP] When only consuming but not producing, the pull request may be held by broker

Posted by GitBox <gi...@apache.org>.
cserwen edited a comment on issue #3491:
URL: https://github.com/apache/rocketmq/issues/3491#issuecomment-971255903


   > I agree with you, but I think the main function of this lock is to save the offset scene at the time when the offset is rolled back, which is convenient for checking the problem, it does increase the lock competition. Do you have a better way to balance performance and problem site information?
   
   I don't understand when the offset rollback would happen, so I can't see what this lock is for. Also, only one thread commits the offset, so I don't understand why the lock is acquired here.
   My proposed solution: use the `PopLongPollingService` thread to scan the requests in `pollingMap` and check whether the lock for each request can be acquired. If it can, we `wakeUp` the request early.
   The code is here: https://github.com/apache/rocketmq/blob/4506f34e24714ec4d6ac37babd8f096632fd6b1c/broker/src/main/java/org/apache/rocketmq/broker/processor/PopMessageProcessor.java#L808
   
   Something like this:
   ```java
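   // Proposed addition: if the queue lock is currently free, wake the request now
   // instead of waiting for a new message or the timeout.
   if (!PopMessageProcessor.this.queueLockManager.isLock(key)) {
       totalPollingNum.decrementAndGet();
       wakeUp(first);
       continue;
   }
   // End of the proposed addition; the existing code follows.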
   
   if (!first.isTimeout()) {
       if (popQ.add(first)) {
           break;
       } else {
           POP_LOGGER.info("polling, add fail again: {}", first);
       }
   }
   ```
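
For anyone who wants to try the idea outside the broker, here is a self-contained sketch of the same scan. It is only a model under stated assumptions: `EarlyWakeSketch`, `Parked`, and `scan` are hypothetical names, the lock state is passed in as a predicate instead of calling `queueLockManager.isLock(key)`, and "waking" a request is reduced to returning its id.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Predicate;

// Hypothetical stand-alone model of the proposed early wake-up scan.
class EarlyWakeSketch {
    // A parked pop request: which lock it needs and when it times out.
    static final class Parked {
        final String id;
        final String lockKey;
        final long expireAtMillis;

        Parked(String id, String lockKey, long expireAtMillis) {
            this.id = id;
            this.lockKey = lockKey;
            this.expireAtMillis = expireAtMillis;
        }
    }

    // One pass over the parked requests. A request is removed and reported as
    // woken if its lock is free right now or if it has timed out; all other
    // requests stay parked for the next pass.
    static List<String> scan(Queue<Parked> parked, Predicate<String> isLocked, long nowMillis) {
        List<String> woken = new ArrayList<>();
        Iterator<Parked> it = parked.iterator();
        while (it.hasNext()) {
            Parked p = it.next();
            boolean lockFree = !isLocked.test(p.lockKey);
            boolean timedOut = nowMillis >= p.expireAtMillis;
            if (lockFree || timedOut) {
                it.remove();
                woken.add(p.id);
            }
        }
        return woken;
    }

    public static void main(String[] args) {
        Queue<Parked> parked = new ConcurrentLinkedQueue<>();
        parked.add(new Parked("r1", "queue-a", 10_000L));
        parked.add(new Parked("r2", "queue-b", 10_000L));
        // Only queue-a is locked, so only r2 is woken on this pass.
        System.out.println(scan(parked, key -> key.equals("queue-a"), 0L));
    }
}
```

Note that checking the lock and waking the request are two separate steps, so a woken request can still lose the lock to another thread and be parked again. The scan shortens the hold time; it does not guarantee that the request gets the lock.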





[GitHub] [rocketmq] cserwen commented on issue #3491: [POP] When only consuming but not producing, the pull request may be held by broker

Posted by GitBox <gi...@apache.org>.
cserwen commented on issue #3491:
URL: https://github.com/apache/rocketmq/issues/3491#issuecomment-971255903


   > I agree with you, but I think the main function of this lock is to save the offset scene at the time when the offset is rolled back, which is convenient for checking the problem, it does increase the lock competition. Do you have a better way to balance performance and problem site information?
   
   I don't understand when the offset rollback would happen, so I can't see what this lock is for.








[GitHub] [rocketmq] cserwen commented on issue #3491: [POP] When only consuming but not producing, the pull request may be held by broker

Posted by GitBox <gi...@apache.org>.
cserwen commented on issue #3491:
URL: https://github.com/apache/rocketmq/issues/3491#issuecomment-971508756


   > org.apache.rocketmq.broker.processor.PopMessageProcessor#popMsgFromQueue Offset can also be submitted in this method, so not only the threads of PopBufferMergeService can submit offset.
   
   Ok, how should we solve this problem?

