You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@rocketmq.apache.org by GitBox <gi...@apache.org> on 2022/08/11 08:03:13 UTC

[GitHub] [rocketmq-connect] oudb opened a new pull request, #247: rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT

oudb opened a new pull request, #247:
URL: https://github.com/apache/rocketmq-connect/pull/247

   ## What is the purpose of the change
   
   需要一个通用适配器connector,快速地让现存的大量的kafka connector运行在rocketmq-connect,使得数据在rocketmq导入导出。
   
   ## Brief changelog
   增加2个Connector,一个是SourceConnector:org.apache.rocketmq.connect.kafka.connector.KafkaRocketmqSourceConnector,用来适配Kafka Source Connector。一个是SinkConnector:org.apache.rocketmq.connect.kafka.connector.KafkaRocketmqSinkConnector,用来适配Kafka Sink Connector
   
   ## Verifying this change
   
   参考ReadMe的快速开始,运行kafka-file-connector
   
   Follow this checklist to help us incorporate your contribution quickly and easily. Notice, `it would be helpful if you could finish the following 5 checklist(the last one is not necessary)before request the community to review your PR`.
   
   - [x] Make sure there is a [Github issue](https://github.com/apache/rocketmq-connect/issues) filed for the change (usually before you start working on it). Trivial changes like typos do not require a Github issue. Your pull request should address just this issue, without pulling in other changes - one PR resolves one issue. 
   - [x] Format the pull request title like `[ISSUE #123] Fix UnknownException when host config not exist`. Each commit in the pull request should have a meaningful subject line and body.
   - [x] Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
   - [x] Write necessary unit-test(over 80% coverage) to verify your logic correction, more mock a little better when cross module dependency exist. If the new feature or significant change is committed, please remember to add integration-test in [test module](https://github.com/apache/rocketmq/tree/master/test).
   - [x] Run `mvn -B clean apache-rat:check findbugs:findbugs checkstyle:checkstyle` to make sure basic checks pass. Run `mvn clean install -DskipITs` to make sure unit-test pass. Run `mvn clean test-compile failsafe:integration-test`  to make sure integration-test pass.
   - [ ] If this contribution is large, please file an [Apache Individual Contributor License Agreement](http://www.apache.org/licenses/#clas).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [rocketmq-connect] odbozhou commented on pull request #247: rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
odbozhou commented on PR #247:
URL: https://github.com/apache/rocketmq-connect/pull/247#issuecomment-1212753394

   @sunxiaojian debezium适配的时候好像有适配kafka source端,看一下能不能融合到一起。
   @oudb 非常好的想法,期待这个特性可以尽快合入,一起把这个特性做的更好。


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [rocketmq-connect] sunxiaojian commented on pull request #247: rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
sunxiaojian commented on PR #247:
URL: https://github.com/apache/rocketmq-connect/pull/247#issuecomment-1212709878

   > ## What is the purpose of the change
   > 需要一个通用适配器connector,快速地让现存的大量的kafka connector运行在rocketmq-connect,使得数据在rocketmq导入导出。
   > 
   > ## Brief changelog
   > 增加2个Connector,一个是SourceConnector:org.apache.rocketmq.connect.kafka.connector.KafkaRocketmqSourceConnector,用来适配Kafka Source Connector。一个是SinkConnector:org.apache.rocketmq.connect.kafka.connector.KafkaRocketmqSinkConnector,用来适配Kafka Sink Connector
   > 
   > ## Verifying this change
   > 参考ReadMe的快速开始,运行kafka-file-connector
   > 
   > Follow this checklist to help us incorporate your contribution quickly and easily. Notice, `it would be helpful if you could finish the following 5 checklist(the last one is not necessary)before request the community to review your PR`.
   > 
   > * [x]  Make sure there is a [Github issue](https://github.com/apache/rocketmq-connect/issues) filed for the change (usually before you start working on it). Trivial changes like typos do not require a Github issue. Your pull request should address just this issue, without pulling in other changes - one PR resolves one issue.
   > * [x]  Format the pull request title like `[ISSUE #123] Fix UnknownException when host config not exist`. Each commit in the pull request should have a meaningful subject line and body.
   > * [x]  Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
   > * [x]  Write necessary unit-test(over 80% coverage) to verify your logic correction, more mock a little better when cross module dependency exist. If the new feature or significant change is committed, please remember to add integration-test in [test module](https://github.com/apache/rocketmq/tree/master/test).
   > * [x]  Run `mvn -B clean apache-rat:check findbugs:findbugs checkstyle:checkstyle` to make sure basic checks pass. Run `mvn clean install -DskipITs` to make sure unit-test pass. Run `mvn clean test-compile failsafe:integration-test`  to make sure integration-test pass.
   > * [ ]  If this contribution is large, please file an [Apache Individual Contributor License Agreement](http://www.apache.org/licenses/#clas).
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [rocketmq-connect] oudb commented on pull request #247: rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
oudb commented on PR #247:
URL: https://github.com/apache/rocketmq-connect/pull/247#issuecomment-1213020735

   是的,该PR专注于 kafka source -> rocketmqConnect -> kafka sink。与其他 RocketMQ Connector交互,我希望是通过transform来提供, 名字暂定为KafkaRocketmqTransformation, 比如kafka connect source -> KafkaRocketmqTransformation ->RocketMQ Connect-> RocketMQ Connect sink 。通过KafkaRocketmqTransformation还可以达到Kafka的transform和RocketMQ的transform混合使用。


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [rocketmq-connect] sunxiaojian closed pull request #247: rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
sunxiaojian closed pull request #247: rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT
URL: https://github.com/apache/rocketmq-connect/pull/247


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [rocketmq-connect] oudb commented on pull request #247: rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
oudb commented on PR #247:
URL: https://github.com/apache/rocketmq-connect/pull/247#issuecomment-1214831839

   我提交该pr的之前看过debezium的kafka connect adaptor,有以下几点我还是想提交一个通用的灵活的适配器:
   (1)debezium的kafka connect adaptor专注于debezium的适配,缺少通过kafka插件组件来提供通用和灵活
   (2)直接适配其他的kafka connector可能还有其他地方没有考虑到,比如source connector的位点序列再反序列化的类型丢失,如Long序列化再反序列化变为Integer。
   (3)debezium的kafka connect adaptor为了和其他rokectmq原生connector和插件交互,scheme之间做个映射。 但 我想的是大多数用户使用一个通用的适配器都是kafka source connector和kafka sink connector配对使用的,很少会和rokectmq原生connector和其他插件混合使用,这不但引入开销而且必须保证scheme能做到一一映射,可能会后期升级有影响。
   
   关于scheme和transforms适配,我想的是通过在rokectmq connect的transforms层面去做,这样用户觉得必要的时候(比如和rokectmq原生connector和其插件配合的时候)才去做这个配置。
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [rocketmq-connect] sunxiaojian commented on pull request #247: rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
sunxiaojian commented on PR #247:
URL: https://github.com/apache/rocketmq-connect/pull/247#issuecomment-1214971620

   > 我提交该pr的之前看过debezium的kafka connect adaptor,有以下几点我还是想提交一个通用的灵活的适配器: (1)debezium的kafka connect adaptor专注于debezium的适配,缺少通过kafka插件组件来提供通用和灵活 (2)直接适配其他的kafka connector可能还有其他地方没有考虑到,比如source connector的位点序列再反序列化的类型丢失,如Long序列化再反序列化变为Integer。 (3)debezium的kafka connect adaptor为了和其他rokectmq原生connector和插件交互,scheme之间做个映射。 但 我想的是大多数用户使用一个通用的适配器都是kafka source connector和kafka sink connector配对使用的,很少会和rokectmq原生connector和其他插件混合使用,这不但引入开销而且必须保证scheme能做到一一映射,可能会后期升级有影响。
   > 
   > 关于scheme和transforms适配,我想的是通过在rokectmq connect的transforms层面去做,这样用户觉得必要的时候(比如和rokectmq原生connector和其插件配合的时候)才去做这个配置。
   
   通用性比较认同,这个也是计划要去支持;
   丢失类型问题这个好像不存在,source 的 offset 无论是rocketmq 还是kafka 定义的都是 object类型,不是固定类型,所以在返回offset不会直接拿固定类型,具体什么类型的是有包含业务逻辑的插件内部来决定的, 这个是在转换过程中遇到问题了吗?
   关于升级问题,应该是个逃不开的问题,无论是运行在 rocketmq 上面的 kafka connect,还是直接运行在kafka 上的kafka connect 都面临这个问题,还是但是好在都是api层面的变更,不会非常频繁;并且保证api变更时项目修改最小就可以了,
   其实无论在transform层面做schema转换还是直接转换逻辑应该差别不大,在做transform 转换时可以考虑是否把debezium的 adaptor 中 schema的转换直接提成transform就可以了
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [rocketmq-connect] sunxiaojian commented on pull request #247: rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
sunxiaojian commented on PR #247:
URL: https://github.com/apache/rocketmq-connect/pull/247#issuecomment-1212710270

   > ## What is the purpose of the change
   > 需要一个通用适配器connector,快速地让现存的大量的kafka connector运行在rocketmq-connect,使得数据在rocketmq导入导出。
   > 
   > ## Brief changelog
   > 增加2个Connector,一个是SourceConnector:org.apache.rocketmq.connect.kafka.connector.KafkaRocketmqSourceConnector,用来适配Kafka Source Connector。一个是SinkConnector:org.apache.rocketmq.connect.kafka.connector.KafkaRocketmqSinkConnector,用来适配Kafka Sink Connector
   > 
   > ## Verifying this change
   > 参考ReadMe的快速开始,运行kafka-file-connector
   > 
   > Follow this checklist to help us incorporate your contribution quickly and easily. Notice, `it would be helpful if you could finish the following 5 checklist(the last one is not necessary)before request the community to review your PR`.
   > 
   > * [x]  Make sure there is a [Github issue](https://github.com/apache/rocketmq-connect/issues) filed for the change (usually before you start working on it). Trivial changes like typos do not require a Github issue. Your pull request should address just this issue, without pulling in other changes - one PR resolves one issue.
   > * [x]  Format the pull request title like `[ISSUE #123] Fix UnknownException when host config not exist`. Each commit in the pull request should have a meaningful subject line and body.
   > * [x]  Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
   > * [x]  Write necessary unit-test(over 80% coverage) to verify your logic correction, more mock a little better when cross module dependency exist. If the new feature or significant change is committed, please remember to add integration-test in [test module](https://github.com/apache/rocketmq/tree/master/test).
   > * [x]  Run `mvn -B clean apache-rat:check findbugs:findbugs checkstyle:checkstyle` to make sure basic checks pass. Run `mvn clean install -DskipITs` to make sure unit-test pass. Run `mvn clean test-compile failsafe:integration-test`  to make sure integration-test pass.
   > * [ ]  If this contribution is large, please file an [Apache Individual Contributor License Agreement](http://www.apache.org/licenses/#clas).
   
   先创建个issue吧,用于纪录和讨论


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [rocketmq-connect] sunxiaojian commented on pull request #247: rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
sunxiaojian commented on PR #247:
URL: https://github.com/apache/rocketmq-connect/pull/247#issuecomment-1212965320

   > @sunxiaojian debezium适配的时候好像有适配kafka source端,看一下能不能融合到一起。 @oudb 非常好的想法,期待这个特性可以尽快合入,一起把这个特性做的更好。
   
   ok , 大概review了一下功能,非常不错的能力,和debezium中的source 和 sink 兼容实现有所不同;
   当前PR实现,应该是为了满足 kafka source -> rocketmqConnect -> kafka sink 这一种场景,  source和sink限制在必须是kafka connect的插件,debezium中的实现是 希望做到 kafka connect source -> RocketMQ Connect -> kafka connect sink 或者 kafka connect source -> RocketMQ Connect -> RocketMQ Connect sink 或者 RocketMQ connect source -> RocketMQ Connect ->Kafka connect sink  这三种形式兼容,当然,这完全可以作为两种能力来提供,只为了满足用户可以无缝将kafka connect的插件迁移到 rocketmq connect而无需做任何适配, 持续兼容和维护下去,会是一个突出的功能


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [rocketmq-connect] sunxiaojian commented on pull request #247: rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
sunxiaojian commented on PR #247:
URL: https://github.com/apache/rocketmq-connect/pull/247#issuecomment-1213109191

   > 是的,该PR专注于 kafka source -> rocketmqConnect -> kafka sink。与其他 RocketMQ Connector交互,我希望是通过transform来提供, 名字暂定为KafkaRocketmqTransformation, 比如kafka connect source -> KafkaRocketmqTransformation ->RocketMQ Connect-> RocketMQ Connect sink 。通过KafkaRocketmqTransformation还可以达到Kafka的transform和RocketMQ的transform混合使用。
   
   可以看一下debezium 本项目中debezium下面的kafka connect adaptor, 整体来看


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [rocketmq-connect] sunxiaojian merged pull request #247: [ISSUES #261] rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT

Posted by GitBox <gi...@apache.org>.
sunxiaojian merged PR #247:
URL: https://github.com/apache/rocketmq-connect/pull/247


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@rocketmq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org