Posted to dev@rocketmq.apache.org by GitBox <gi...@apache.org> on 2020/06/29 07:51:53 UTC

[GitHub] [rocketmq] gaoyf commented on issue #1918: About consumers stopping pulling messages

gaoyf commented on issue #1918:
URL: https://github.com/apache/rocketmq/issues/1918#issuecomment-650994400


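   @xiaoyuzxcasd I was browsing through the issues recently and came across yours. I ran into something similar before. If you still have your logs, **search them for OutOfMemoryError and check whether any of the stack traces involve PullMessageService**.
   Now, about the log in your screenshot: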
   ![image](https://user-images.githubusercontent.com/10137071/85985285-1c4f5300-ba1d-11ea-8963-6c8ba37b6651.png)
   1 The code that produces the log line in this screenshot is:
   ```
   if (pq.isPullExpired()) {
       switch (this.consumeType()) {
           case CONSUME_ACTIVELY:
               break;
           case CONSUME_PASSIVELY:
               pq.setDropped(true);
               if (this.removeUnnecessaryMessageQueue(mq, pq)) {
                   it.remove();
                   changed = true;
                   log.error("[BUG]doRebalance, {}, remove unnecessary mq, {}, because pull is pause, so try to fixed it",
                       consumerGroup, mq);
               }
               break;
           default:
               break;
       }
   }
   ```
   That is, `pq.isPullExpired()` returned true. Its implementation is:
   ```
   public boolean isPullExpired() {
       return (System.currentTimeMillis() - this.lastPullTimestamp) > PULL_MAX_IDLE_TIME;
   }
   ```
   In other words, lastPullTimestamp has not been updated for longer than PULL_MAX_IDLE_TIME. So where does lastPullTimestamp get updated? See [DefaultMQPushConsumerImpl.pullMessage()](https://github.com/apache/rocketmq/blob/master/client/src/main/java/org/apache/rocketmq/client/impl/consumer/DefaultMQPushConsumerImpl.java#L220).
   
   2 The thread that executes this method is [PullMessageService](https://github.com/apache/rocketmq/blob/master/client/src/main/java/org/apache/rocketmq/client/impl/consumer/PullMessageService.java#L96).
   At this point the cause should be clear: **the message-pulling thread stopped because of an OOM, so the lastPullTimestamp field is no longer updated**.
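   The mechanism can be reproduced in isolation with a minimal sketch. Everything here is a hypothetical stand-in rather than RocketMQ code: `FakeProcessQueue`, `markPulled`, `startPullLoop` and the 50 ms idle limit are invented for illustration, the OOM is simulated by throwing the error directly, and the sketch assumes the pull loop only catches `Exception`, so an `Error` ends the thread.

   ```java
   import java.util.concurrent.atomic.AtomicLong;

   // Hypothetical stand-ins; these are not the RocketMQ classes.
   final class StalePullTimestampDemo {

       // Minimal imitation of the lastPullTimestamp / isPullExpired() pair.
       static final class FakeProcessQueue {
           private final long maxIdleMillis;
           private final AtomicLong lastPullTimestamp = new AtomicLong(System.currentTimeMillis());

           FakeProcessQueue(long maxIdleMillis) {
               this.maxIdleMillis = maxIdleMillis;
           }

           void markPulled() {
               lastPullTimestamp.set(System.currentTimeMillis());
           }

           boolean isPullExpired() {
               return System.currentTimeMillis() - lastPullTimestamp.get() > maxIdleMillis;
           }
       }

       // Starts a loop that refreshes the timestamp, then fails with a simulated OOM.
       static Thread startPullLoop(FakeProcessQueue pq, int pullsBeforeError) {
           Thread t = new Thread(() -> {
               int pulls = 0;
               while (true) {
                   try {
                       if (pulls == pullsBeforeError) {
                           throw new OutOfMemoryError("simulated, no memory is actually exhausted");
                       }
                       pulls++;
                       pq.markPulled();
                       Thread.sleep(10);
                   } catch (InterruptedException e) {
                       return;
                   } catch (Exception e) {
                       // Assumption: only Exception is handled, so an Error escapes the loop.
                   }
               }
           }, "fake-pull-service");
           // Report the death of the thread in one line instead of a full stack trace.
           t.setUncaughtExceptionHandler((thread, error) ->
               System.out.println(thread.getName() + " died: " + error));
           t.start();
           return t;
       }

       // Returns true if the queue looks "pull expired" once the loop thread is dead.
       static boolean expiresAfterLoopDies(long maxIdleMillis) {
           FakeProcessQueue pq = new FakeProcessQueue(maxIdleMillis);
           Thread loop = startPullLoop(pq, 3);
           try {
               loop.join(5000);
               Thread.sleep(maxIdleMillis * 2 + 20);
           } catch (InterruptedException e) {
               Thread.currentThread().interrupt();
               return false;
           }
           return !loop.isAlive() && pq.isPullExpired();
       }

       public static void main(String[] args) {
           System.out.println("expired after loop died: " + expiresAfterLoopDies(50));
       }
   }
   ```

   Once the loop thread is gone nothing calls `markPulled()` again, so the expiry check eventually turns true, which is the state the rebalance code quoted earlier reacts to.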
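   3 Why would an OOM happen?
   3.1 From your screenshot it looks like there are two brokers and the topic has 8 queues on each broker, so 16 queues in total. RocketMQ's default flow control lets each queue cache at most 1000 pulled messages, so 16*1000=16000 messages. A single message can be as large as 4M, so the client could cache up to 64000M, i.e. 62.5G.
   Of course, in practice a single message is usually smaller than 512K, which by the same calculation gives a maximum cache of 8000M, i.e. 7.8G.
   So if the JVM heap is not configured to be large, it is quite possible to run out of memory.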
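   The arithmetic above can be checked with a small helper. This is only a sketch: `PullCacheEstimate` and its method names are hypothetical, and the inputs (2 brokers, 8 queues each, 1000 messages per queue, 4M or 512K per message) are simply the figures quoted in this comment, treating M as MiB and G as GiB.

   ```java
   // Hypothetical helper; the inputs are the figures quoted in the comment above.
   final class PullCacheEstimate {

       // Upper bound on bytes cached client-side if every queue is full of maximum-size messages.
       static long worstCaseBytes(int brokers, int queuesPerBroker, int maxMessagesPerQueue, long bytesPerMessage) {
           return (long) brokers * queuesPerBroker * maxMessagesPerQueue * bytesPerMessage;
       }

       static double toGiB(long bytes) {
           return bytes / (1024.0 * 1024.0 * 1024.0);
       }

       public static void main(String[] args) {
           long fourMiB = 4L * 1024 * 1024;
           long halfMiB = 512L * 1024;
           System.out.printf("4M messages:   %.1f G%n", toGiB(worstCaseBytes(2, 8, 1000, fourMiB)));
           System.out.printf("512K messages: %.1f G%n", toGiB(worstCaseBytes(2, 8, 1000, halfMiB)));
       }
   }
   ```

   Plugging in your own queue count and typical message size gives a rough upper bound to compare against the configured heap size.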
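   3.2 There is another possibility: the business logic that runs when a message arrives loads a large amount of data from a data source into memory. That is what happened in my case. After receiving a message, the business code pulled too much data into memory at once and hit an OOM, which caused the message-pulling thread to exit.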


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org