Posted to dev@rocketmq.apache.org by GitBox <gi...@apache.org> on 2019/11/04 05:31:31 UTC

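[GitHub] [rocketmq] guyuezheng opened a new issue #1568: RocketMQ transactional messages: corrupted message bytes cause endless repeated check callbacks (already repeating for 1 day) and message loss

guyuezheng opened a new issue #1568: RocketMQ transactional messages: corrupted message bytes cause endless repeated check callbacks (already repeating for 1 day) and message loss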
URL: https://github.com/apache/rocketmq/issues/1568
 
 
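   In production we found that RocketMQ transactional messages were being checked back endlessly, which made transactional messages unusable and caused message loss. We are currently on version 4.3.1, but as far as I can see later versions have not fixed this.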
   
   
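   We have already found the main cause, which is as follows.
   The transaction check first processes a batch of half messages, triggering the check callbacks, and only afterwards updates the offset on RMQ_SYS_TRANS_HALF_TOPIC. If any exception is thrown during the batch processing, the offset update on RMQ_SYS_TRANS_HALF_TOPIC is skipped, so every round reports the same error and calls back the same batch of transactional messages, regardless of whether they are already in RMQ_SYS_TRANS_OP_HALF_TOPIC.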
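
   The stuck-offset behaviour described above can be sketched with a small model. This is only an illustration under stated assumptions, not RocketMQ code: the class StuckOffsetDemo, its fields and its methods are hypothetical, and a null list entry stands in for a half message that could not be decoded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of one check queue; this is not RocketMQ code.
class StuckOffsetDemo {
    long consumeOffset = 0;                            // stands in for the half-topic consume offset
    final List<String> calledBack = new ArrayList<>(); // records every check callback issued

    // One check round: process the batch first, update the offset last.
    // A null entry stands in for a half message whose bytes could not be decoded.
    void checkRound(List<String> halfMessages) {
        long newOffset = consumeOffset;
        for (int i = (int) consumeOffset; i < halfMessages.size(); i++) {
            String msg = halfMessages.get(i);
            calledBack.add(msg.trim()); // throws NullPointerException on the null entry
            newOffset = i + 1;
        }
        consumeOffset = newOffset;      // skipped whenever the loop above throws
    }

    // The caller catches the error, logs it and simply retries on the next round.
    void runRounds(List<String> halfMessages, int rounds) {
        for (int r = 0; r < rounds; r++) {
            try {
                checkRound(halfMessages);
            } catch (RuntimeException e) {
                System.out.println("round " + r + " failed: " + e);
            }
        }
    }

    public static void main(String[] args) {
        StuckOffsetDemo demo = new StuckOffsetDemo();
        demo.runRounds(Arrays.asList("tx-1", "tx-2", null, "tx-3"), 3);
        System.out.println("offset=" + demo.consumeOffset + " callbacks=" + demo.calledBack);
    }
}
```

   In this model the offset stays at 0 after every round, tx-1 and tx-2 are called back again each time, and tx-3 behind the bad entry is never reached.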
   
   The broker's error log for the check callback is as follows (once per minute):
   java.lang.NullPointerException
   	at org.apache.rocketmq.broker.transaction.queue.TransactionalMessageBridge.getMessage(TransactionalMessageBridge.java:136)
   	at org.apache.rocketmq.broker.transaction.queue.TransactionalMessageBridge.getHalfMessage(TransactionalMessageBridge.java:108)
   	at org.apache.rocketmq.broker.transaction.queue.TransactionalMessageServiceImpl.pullHalfMsg(TransactionalMessageServiceImpl.java:379)
   	at org.apache.rocketmq.broker.transaction.queue.TransactionalMessageServiceImpl.getHalfMsg(TransactionalMessageServiceImpl.java:444)
   	at org.apache.rocketmq.broker.transaction.queue.TransactionalMessageServiceImpl.check(TransactionalMessageServiceImpl.java:164)
   	at org.apache.rocketmq.broker.transaction.TransactionalMessageCheckService.onWaitEnd(TransactionalMessageCheckService.java:76)
   	at org.apache.rocketmq.common.ServiceThread.waitForRunning(ServiceThread.java:121)
   	at org.apache.rocketmq.broker.transaction.TransactionalMessageCheckService.run(TransactionalMessageCheckService.java:65)
   	at java.lang.Thread.run(Thread.java:745)
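   Details of the error:
   1) Our underlying message bytes are corrupted, so org.apache.rocketmq.common.message.MessageDecoder#decode(java.nio.ByteBuffer, boolean, boolean, boolean) throws while converting them; the exception is swallowed and null is returned.
   2) org.apache.rocketmq.broker.transaction.queue.TransactionalMessageBridge#decodeMsgList adds that null straight into the returned List<MessageExt> foundList.
   3) As a result, line 136 of org.apache.rocketmq.broker.transaction.queue.TransactionalMessageBridge#getMessage throws a NullPointerException when it reads the message's store timestamp.
   4) org.apache.rocketmq.broker.transaction.queue.TransactionalMessageServiceImpl#check hits that NullPointerException while fetching the half messages pending check, so transactionalMessageBridge.updateConsumeOffset(messageQueue, newOffset); is never called.
   5) The same sequence repeats every round, so the check callback fires endlessly and transactional messages become unusable.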
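
   Steps 1) to 3) can be reproduced in isolation with the sketch below. Everything in it is an assumption made for illustration: the Msg class, the 8-byte timestamp / 4-byte length wire format and all method names are hypothetical and do not match RocketMQ's real message layout or API. It also shows one possible guard (skipping entries that decode to null), which is only a suggestion and not a fix confirmed by the project.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the decode path; this is not RocketMQ code.
class DecodeNullDemo {
    static final class Msg {
        final long storeTimestamp;
        final String body;
        Msg(long storeTimestamp, String body) { this.storeTimestamp = storeTimestamp; this.body = body; }
    }

    // Assumed wire format: 8-byte timestamp, 4-byte body length, body bytes.
    static ByteBuffer encode(long ts, String body) {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(12 + bytes.length);
        buf.putLong(ts).putInt(bytes.length).put(bytes);
        buf.flip();
        return buf;
    }

    // A damaged entry: the declared body length is larger than the bytes that follow.
    static ByteBuffer corrupted() {
        ByteBuffer buf = ByteBuffer.allocate(14);
        buf.putLong(3L).putInt(1000).put(new byte[] {1, 2});
        buf.flip();
        return buf;
    }

    // Step 1: a decode failure is swallowed and reported only as null.
    static Msg decode(ByteBuffer buf) {
        try {
            long ts = buf.getLong();
            int len = buf.getInt();
            if (len < 0 || len > buf.remaining()) {
                throw new IllegalArgumentException("bad body length " + len);
            }
            byte[] body = new byte[len];
            buf.get(body);
            return new Msg(ts, new String(body, StandardCharsets.UTF_8));
        } catch (RuntimeException e) {
            return null;
        }
    }

    // Step 2: the null is added to the result list unchecked.
    static List<Msg> decodeListUnsafe(List<ByteBuffer> raw) {
        List<Msg> found = new ArrayList<>();
        for (ByteBuffer b : raw) {
            found.add(decode(b.duplicate()));
        }
        return found;
    }

    // Possible guard: drop entries that failed to decode so callers never see null.
    static List<Msg> decodeListSkippingNulls(List<ByteBuffer> raw) {
        List<Msg> found = new ArrayList<>();
        for (ByteBuffer b : raw) {
            Msg m = decode(b.duplicate());
            if (m != null) {
                found.add(m);
            }
        }
        return found;
    }

    // Step 3: reading the store timestamp dereferences every entry.
    static long latestStoreTimestamp(List<Msg> msgs) {
        long latest = Long.MIN_VALUE;
        for (Msg m : msgs) {
            latest = Math.max(latest, m.storeTimestamp); // NullPointerException if m is null
        }
        return latest;
    }

    public static void main(String[] args) {
        List<ByteBuffer> raw = Arrays.asList(encode(1L, "a"), corrupted(), encode(5L, "b"));
        System.out.println(latestStoreTimestamp(decodeListSkippingNulls(raw)));
        try {
            latestStoreTimestamp(decodeListUnsafe(raw));
        } catch (NullPointerException e) {
            System.out.println("unsafe list failed: " + e);
        }
    }
}
```

   Skipping the damaged entry lets the rest of the batch be processed, but a real fix would also have to decide how the consume offset moves past the corrupted bytes; the sketch does not cover that.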
   
