Posted to gsoc@community.apache.org by "Hongsheng Zhong (Jira)" <ji...@apache.org> on 2023/03/27 08:41:00 UTC
[jira] [Updated] (GSOC-140) Apache ShardingSphere: Add ShardingSphere Kafka source connector

     [ https://issues.apache.org/jira/browse/GSOC-140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hongsheng Zhong updated GSOC-140:
---------------------------------
    Description: 
h2. Apache ShardingSphere

Apache ShardingSphere is positioned as a Database Plus and aims to build a standard layer and ecosystem above heterogeneous databases. It focuses on reusing existing databases and their respective upper layers, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by the fragmentation of underlying databases.

{*}Page{*}: [https://shardingsphere.apache.org|https://shardingsphere.apache.org/]
{*}Github{*}: [https://github.com/apache/shardingsphere] 

h2. Background
The community recently added a CDC (change data capture) [feature|https://github.com/apache/shardingsphere/issues/22500]. After a client logs in, the change feed is published over the established network connection, where it can then be consumed.

Since Kafka is a popular distributed event streaming platform, it is useful to import the change feed into Kafka for later processing.

h2. Task
 # Get familiar with ShardingSphere CDC client usage: create a publication and subscribe to the change feed.
 # Get familiar with Kafka connector development, develop a source connector, and integrate it with ShardingSphere CDC. Persist the change feed to Kafka topics properly.
 # Add unit tests and E2E integration tests.
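
As a rough illustration of task 2, the sketch below shows the kind of mapping a source connector task might apply to each change event before handing it to Kafka: derive a topic per table and attach a source partition and source offset so that progress can be resumed. It is plain Java and deliberately does not use the Kafka Connect API or the ShardingSphere CDC client. `ChangeEvent`, `OutboundRecord`, `ChangeFeedMapper`, the `position` field, and the topic naming scheme are all hypothetical names chosen for this example, not part of either project.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical shape of one change event; the real CDC client record type may differ.
final class ChangeEvent {
    final String database;
    final String table;
    final String operation;          // "INSERT", "UPDATE" or "DELETE"
    final long position;             // assumed monotonically increasing feed position
    final Map<String, Object> after; // row image after the change (empty for DELETE)

    ChangeEvent(String database, String table, String operation, long position, Map<String, Object> after) {
        this.database = database;
        this.table = table;
        this.operation = operation;
        this.position = position;
        this.after = Collections.unmodifiableMap(new LinkedHashMap<>(after));
    }
}

// Plain-Java stand-in for the data a Kafka Connect source record would carry.
final class OutboundRecord {
    final String topic;
    final Map<String, Object> sourcePartition;
    final Map<String, Object> sourceOffset;
    final Map<String, Object> value;

    OutboundRecord(String topic, Map<String, Object> sourcePartition,
                   Map<String, Object> sourceOffset, Map<String, Object> value) {
        this.topic = topic;
        this.sourcePartition = sourcePartition;
        this.sourceOffset = sourceOffset;
        this.value = value;
    }
}

final class ChangeFeedMapper {
    private final String topicPrefix;

    ChangeFeedMapper(String topicPrefix) {
        this.topicPrefix = topicPrefix;
    }

    // Assumed naming scheme: one topic per table, "<prefix>.<database>.<table>".
    String topicFor(ChangeEvent event) {
        return topicPrefix + "." + event.database + "." + event.table;
    }

    OutboundRecord toRecord(ChangeEvent event) {
        // The partition identifies which table the event came from.
        Map<String, Object> partition = new LinkedHashMap<>();
        partition.put("database", event.database);
        partition.put("table", event.table);

        // The offset records how far the feed has been read for that partition.
        Map<String, Object> offset = new LinkedHashMap<>();
        offset.put("position", event.position);

        Map<String, Object> value = new LinkedHashMap<>();
        value.put("op", event.operation);
        value.put("after", event.after);

        return new OutboundRecord(topicFor(event), partition, offset, value);
    }

    // Converts a batch in feed order, as a poll loop would do once per iteration.
    List<OutboundRecord> toRecords(List<ChangeEvent> batch) {
        List<OutboundRecord> records = new ArrayList<>(batch.size());
        for (ChangeEvent event : batch) {
            records.add(toRecord(event));
        }
        return records;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("order_id", 1);
        ChangeEvent event = new ChangeEvent("sharding_db", "t_order", "INSERT", 42L, row);
        OutboundRecord record = new ChangeFeedMapper("shardingsphere").toRecord(event);
        System.out.println(record.topic + " @ " + record.sourceOffset.get("position"));
    }
}
```

In an actual connector this mapping would presumably sit inside the task's poll loop and produce Kafka Connect source records instead of `OutboundRecord`; the Kafka Connect development guide listed under References describes that API.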

h2. Relevant Skills
 # Java language
 # Basic knowledge of CDC and Kafka
 # Maven

h2. References
 * [https://github.com/apache/shardingsphere/issues/22500]
 * [https://kafka.apache.org/documentation/#connect_development]
 * [https://github.com/apache/kafka/tree/trunk/connect/file/src]
 * [https://github.com/confluentinc/kafka-connect-jdbc]

h2. Local Test Steps
 # Modify `conf/server.yaml` and uncomment `cdc-server-port: 33071` to enable CDC. (Refer to step 2.)
 # Configure the proxy by following `Prerequisites` and `Procedure` in [build|https://shardingsphere.apache.org/document/5.3.1/en/user-manual/shardingsphere-proxy/migration/build/] (a newer version can be used too; the current stable version is 5.3.1).
 # Start the proxy server; this also starts the CDC server.
 # Download the ShardingSphere source code from https://github.com/apache/shardingsphere , then modify and run `org.apache.shardingsphere.data.pipeline.cdc.client.example.Bootstrap`. By default, `Bootstrap` prints `records:`.
 # Execute some INSERT/UPDATE/DELETE SQL statements in the proxy to generate a change feed, then check the `Bootstrap` console.
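
For the last step, a small JDBC helper along the following lines could generate one change event of each kind. This is only a sketch under assumptions: the table `t_order` with columns `order_id`, `user_id` and `status` is hypothetical and must already exist in the proxy's logical database, the class name `ChangeFeedGenerator` is made up for this example, and a JDBC driver matching the proxy's frontend protocol has to be on the classpath. The JDBC URL, user and password are passed as arguments because they depend on the local proxy configuration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

final class ChangeFeedGenerator {

    // One statement per change type, all touching the same (hypothetical) row.
    static List<String> sampleStatements(int orderId) {
        return Arrays.asList(
            "INSERT INTO t_order (order_id, user_id, status) VALUES (" + orderId + ", 1, 'NEW')",
            "UPDATE t_order SET status = 'PAID' WHERE order_id = " + orderId,
            "DELETE FROM t_order WHERE order_id = " + orderId);
    }

    // Runs the statements against the proxy through plain JDBC.
    static void run(String jdbcUrl, String user, String password, int orderId) throws SQLException {
        try (Connection connection = DriverManager.getConnection(jdbcUrl, user, password);
             Statement statement = connection.createStatement()) {
            for (String sql : sampleStatements(orderId)) {
                statement.executeUpdate(sql);
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 3) {
            // Without connection details, only print what would be executed.
            for (String sql : sampleStatements(1)) {
                System.out.println(sql);
            }
            return;
        }
        run(args[0], args[1], args[2], 1);
    }
}
```

Run without arguments, the helper only prints the three statements; run with a JDBC URL, user and password, it executes them so that the `Bootstrap` console should show the corresponding records.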

h2. Mentor

Hongsheng Zhong, PMC of Apache ShardingSphere, zhonghongsheng@apache.org

Xinze Guo, Committer of Apache ShardingSphere, azexin@apache.org

 

  was:
h2. Apache ShardingSphere

Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

{*}Page{*}: [https://shardingsphere.apache.org|https://shardingsphere.apache.org/]
{*}Github{*}: [https://github.com/apache/shardingsphere] 
h2. Background

The community just added CDC (change data capture) [feature|https://github.com/apache/shardingsphere/issues/22500] recently. Change feed will be published in created network connection after logging in, then it could be consumed.

Since Kafka is popular distributed event streaming platform, it's useful to import change feed into Kafka for later processing.
h2. Task
 # Familiar with ShardingSphere CDC client usage, create publication and subscribe change feed.
 # Familiar with Kafka connector development, develop source connector, integrate with ShardingSphere CDC. Persist change feed to Kafka topics properly.
 # Add unit test and E2E integration test.

h2. Relevant Skills

1. Java language

2. Basic knowledge of CDC and Kafka

3. Maven
h3. References
 * [https://github.com/apache/shardingsphere/issues/22500]
 * [https://kafka.apache.org/documentation/#connect_development]
 * [https://github.com/apache/kafka/tree/trunk/connect/file/src]
 * [https://github.com/confluentinc/kafka-connect-jdbc]

h3. Mentor

Hongsheng Zhong, PMC of Apache ShardingSphere, zhonghongsheng@apache.org

Xinze Guo, Committer of Apache ShardingSphere, azexin@apache.org

 


> Apache ShardingSphere: Add ShardingSphere Kafka source connector
> ----------------------------------------------------------------
>
>                 Key: GSOC-140
>                 URL: https://issues.apache.org/jira/browse/GSOC-140
>             Project: Comdev GSOC
>          Issue Type: Improvement
>            Reporter: Hongsheng Zhong
>            Priority: Major
>              Labels: ShardingSphere, full-time, gsoc2023, mentor



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
