Posted to dev@rocketmq.apache.org by GitBox <gi...@apache.org> on 2022/07/01 03:06:51 UTC

[GitHub] [rocketmq-connect] sunxiaojian commented on issue #180: Specify RocketMQ connect domain model

sunxiaojian commented on issue #180:
URL: https://github.com/apache/rocketmq-connect/issues/180#issuecomment-1171879829

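   1.  In WorkerSourceTask, each record goes through poll, transform, converter, producer:
        <1.> poll and transform are user-defined; exceptions raised there should be caught and handled inside the custom plugin logic.
        <2.> converter is the forward serialization step. Because what gets serialized is always a standard connectRecord, it rarely fails; the problems usually appear on the sink side, when the data cannot be deserialized.
        <3.> A producer send failure is essentially either a timeout or service unavailability, not a logic error. The priority then is to make sure the offset is not committed, so that the record can be processed normally once the service recovers. Sending it to dismissFailedMsgs or customerFailedMsgs therefore adds no value. Because of the system's failure retries, one record may also be sent to dismissFailedMsgs or customerFailedMsgs several times, so there is no guarantee that a record is successfully processed exactly once.
        If the built-in offset logic is not used and offsets are maintained by custom logic, then dismissFailedMsgs and customerFailedMsgs have to be used together with commit and commitRecord, and the data has to be stored separately. That takes the record off the original processing path, so it does not work for data that must preserve ordering.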
   
          So, taken as a whole, isn't the core job on the source side simply to maintain the offset correctly?
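
   To make the "do not commit the offset on send failure" idea concrete, here is a minimal sketch. It is not the rocketmq-connect implementation: OffsetGatedSource, runOnce and the Predicate-based send are hypothetical names used only for illustration, and an in-memory list stands in for the upstream system.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch; none of these names come from rocketmq-connect. */
final class OffsetGatedSource {
    private final List<String> records;               // stands in for the upstream system
    private int committedOffset = 0;                  // position to resume from on the next pass
    final List<String> delivered = new ArrayList<>(); // records the "producer" accepted

    OffsetGatedSource(List<String> records) {
        this.records = records;
    }

    /**
     * Runs one pass starting at the committed offset.
     * send returns false to simulate a timeout or an unavailable broker.
     * The pass stops at the first failure, so ordering is preserved.
     */
    int runOnce(Predicate<String> send) {
        for (int i = committedOffset; i < records.size(); i++) {
            String record = records.get(i);
            if (!send.test(record)) {
                // Leave the offset untouched: the same record is retried next pass.
                break;
            }
            delivered.add(record);
            committedOffset = i + 1; // advance only after a successful send
        }
        return committedOffset;
    }
}
```

   In this sketch a failed send needs no side channel for failed messages: the record stays behind the committed offset and is picked up again, in order, on the next pass.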
       
        
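   2. In WorkerSinkTask, each record goes through consumer, converter, transform, put:
   
        Dead-letter queue handling is already defined, so failed records can be held temporarily in the dead-letter queue.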
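
   As a rough illustration of the sink-side flow, the sketch below routes records that fail in converter or transform to a dead-letter list instead of stopping the task. It is an assumption-laden toy, not the actual dead-letter queue code in rocketmq-connect: SinkWithDeadLetter and process are hypothetical names, and plain in-memory lists stand in for put and for the dead-letter topic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch; none of these names come from rocketmq-connect. */
final class SinkWithDeadLetter {
    final List<Integer> written = new ArrayList<>();    // stands in for put() on the target system
    final List<String> deadLetters = new ArrayList<>(); // stands in for the dead-letter topic

    /** consumer -> converter -> transform -> put, with failures parked in deadLetters. */
    void process(List<String> rawMessages,
                 Function<String, Integer> converter,
                 Function<Integer, Integer> transform) {
        for (String raw : rawMessages) {
            try {
                Integer value = transform.apply(converter.apply(raw));
                written.add(value);
            } catch (RuntimeException e) {
                // Deserialization or transform failed: keep the raw message for later inspection.
                deadLetters.add(raw);
            }
        }
    }
}
```

   For example, calling process with the messages "1", "x", "3", Integer::parseInt as the converter and a transform that multiplies by 10 leaves 10 and 30 in written and "x" in deadLetters.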


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org