Posted to user@flink.apache.org by Mason Chen <ma...@gmail.com> on 2022/06/24 01:43:38 UTC

[DISCUSS] Contribution of Multi Cluster Kafka Source

Hi community,

We have been working on a Multi Cluster Kafka Source and are looking to
contribute it upstream. I've given a talk about the features and design at
a Flink meetup: https://youtu.be/H1SYOuLcUTI.

The main features that it provides are:
1. Reading multiple Kafka clusters within a single source.
2. Dynamically adjusting the clusters and topics the source consumes from,
without a Flink job restart.

Some of the challenging use cases that these features solve are:
1. Transparent Kafka cluster migration without Flink job restart.
2. Transparent Kafka topic migration without Flink job restart.
3. Direct integration with Hybrid Source.

In addition, it is designed to wrap and manage the existing KafkaSource
components to enable these features, so it can continue to benefit from
KafkaSource improvements and bug fixes. It can be considered a form of
composite source.
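
To illustrate the composite idea, here is a minimal Python sketch. It is not
the proposed Flink API: ClusterReader, MultiClusterSource, and apply_metadata
are hypothetical names chosen for this example, and the reader only reports
its subscriptions instead of consuming from Kafka. It shows one wrapped reader
per cluster being added, updated, or closed as the desired cluster-to-topics
map changes, without recreating the outer source.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class ClusterReader:
    """Hypothetical stand-in for one wrapped, per-cluster Kafka source."""

    cluster: str
    topics: Set[str]
    closed: bool = False

    def poll(self) -> List[Tuple[str, str]]:
        # A real reader would fetch records; this one only lists
        # the (cluster, topic) pairs it is subscribed to.
        return [(self.cluster, topic) for topic in sorted(self.topics)]

    def close(self) -> None:
        self.closed = True


class MultiClusterSource:
    """Composite that owns one ClusterReader per active cluster."""

    def __init__(self) -> None:
        self._readers: Dict[str, ClusterReader] = {}

    def apply_metadata(self, desired: Dict[str, Set[str]]) -> None:
        """Reconcile running readers with the desired cluster -> topics map."""
        # Close and drop readers for clusters that are no longer wanted.
        for cluster in list(self._readers):
            if cluster not in desired:
                self._readers.pop(cluster).close()
        # Create readers for new clusters; update topics on existing ones.
        for cluster, topics in desired.items():
            reader = self._readers.get(cluster)
            if reader is None:
                self._readers[cluster] = ClusterReader(cluster, set(topics))
            else:
                reader.topics = set(topics)

    def poll(self) -> List[Tuple[str, str]]:
        # Merge output from every wrapped reader in a stable order.
        out: List[Tuple[str, str]] = []
        for cluster in sorted(self._readers):
            out.extend(self._readers[cluster].poll())
        return out


# A cluster migration expressed as three metadata updates on one source.
source = MultiClusterSource()
source.apply_metadata({"cluster-a": {"orders"}})
source.apply_metadata({"cluster-a": {"orders"}, "cluster-b": {"orders"}})
source.apply_metadata({"cluster-b": {"orders"}})
```

In this sketch the outer object is never rebuilt during the migration; only
the set of wrapped readers changes, which is the property the features above
rely on.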

I think the contribution of this source could benefit a lot of users who
have asked on the mailing list in the past about how Flink handles Kafka
migrations and topic removal. I would love to hear and address your thoughts
and feedback, and if possible drive a FLIP!

Best,
Mason

Re: [DISCUSS] Contribution of Multi Cluster Kafka Source

Posted by Mason Chen <ma...@gmail.com>.
Hi all,

Circling back on this, I have created a first draft document in Confluence:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-246%3A+Multi+Cluster+Kafka+Source

Looking forward to hearing all your feedback in this email thread!

Best,
Mason

On Thu, Jun 30, 2022 at 6:57 AM Thomas Weise <th...@apache.org> wrote:

> Hi Mason,
>
> I added mason6345 to the Flink confluence space, you should be able to
> add a FLIP now.
>
> Looking forward to the contribution!
>
> Thomas
>
> On Thu, Jun 30, 2022 at 9:25 AM Martijn Visser <ma...@apache.org>
> wrote:
> >
> > Hi Mason,
> >
> > I'm sure there's a PMC (*hint*) out there who can grant you access to
> > create a FLIP. Looking forward to it, this sounds like an improvement
> that
> > users are looking forward to.
> >
> > Best regards,
> >
> > Martijn
> >
> > On Tue, Jun 28, 2022 at 09:21, Mason Chen <ma...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > Thanks for the feedback! I'm adding the users, who responded in the
> user
> > > mailing list, to this thread.
> > >
> > > @Qingsheng - Yes, I would prefer to reuse the existing Kafka connector
> > > module. It makes a lot of sense since the dependencies are the same
> and the
> > > implementation can also extend and improve some of the test utilities
> you
> > > have been working on for the FLIP 27 Kafka Source. I will enumerate the
> > > migration steps in the FLIP template.
> > >
> > > @Ryan - I don't have a public branch available yet, but I would
> appreciate
> > > your review on the FLIP design! When the FLIP design is approved by
> devs
> > > and the community, I can start to commit our implementation to a fork.
> > >
> > > @Andrew - Yup, one of the requirements of the connector is to read
> > > multiple clusters within a single source, so it should be able to work
> well
> > > with your use case.
> > >
> > > @Devs - what do I need to get started on the FLIP design? I see the
> FLIP
> > > template and I have an account (mason6345), but I don't have access to
> > > create a page.
> > >
> > > Best,
> > > Mason
> > >
> > >
> > >
> > >
> > > On Sun, Jun 26, 2022 at 8:08 PM Qingsheng Ren <re...@apache.org>
> wrote:
> > >
> > >> Hi Mason,
> > >>
> > >> It sounds like an exciting enhancement to the Kafka source and will
> > >> benefit a lot of users I believe.
> > >>
> > >> Would you prefer to reuse the existing flink-connector-kafka module or
> > >> create a new one for the new multi-cluster feature? Personally I
> prefer the
> > >> former one because users won’t need to introduce another dependency
> module
> > >> to their projects in order to use the feature.
> > >>
> > >> Thanks for the effort on this and looking forward to your FLIP!
> > >>
> > >> Best,
> > >> Qingsheng
> > >>
> > >> > On Jun 24, 2022, at 09:43, Mason Chen <ma...@gmail.com>
> wrote:
> > >> >
> > >> > Hi community,
> > >> >
> > >> > We have been working on a Multi Cluster Kafka Source and are
> looking to
> > >> > contribute it upstream. I've given a talk about the features and
> design
> > >> at
> > >> > a Flink meetup: https://youtu.be/H1SYOuLcUTI.
> > >> >
> > >> > The main features that it provides is:
> > >> > 1. Reading multiple Kafka clusters within a single source.
> > >> > 2. Adjusting the clusters and topics the source consumes from
> > >> dynamically,
> > >> > without Flink job restart.
> > >> >
> > >> > Some of the challenging use cases that these features solve are:
> > >> > 1. Transparent Kafka cluster migration without Flink job restart.
> > >> > 2. Transparent Kafka topic migration without Flink job restart.
> > >> > 3. Direct integration with Hybrid Source.
> > >> >
> > >> > In addition, this is designed with wrapping and managing the
> existing
> > >> > KafkaSource components to enable these features, so it can continue
> to
> > >> > benefit from KafkaSource improvements and bug fixes. It can be
> > >> considered
> > >> > as a form of a composite source.
> > >> >
> > >> > I think the contribution of this source could benefit a lot of
> users who
> > >> > have asked in the mailing list about Flink handling Kafka
> migrations and
> > >> > removing topics in the past. I would love to hear and address your
> > >> thoughts
> > >> > and feedback, and if possible drive a FLIP!
> > >> >
> > >> > Best,
> > >> > Mason
> > >>
> > >>
>

Re: [DISCUSS] Contribution of Multi Cluster Kafka Source

Posted by Thomas Weise <th...@apache.org>.
Hi Mason,

I added mason6345 to the Flink Confluence space; you should be able to
add a FLIP now.

Looking forward to the contribution!

Thomas

On Thu, Jun 30, 2022 at 9:25 AM Martijn Visser <ma...@apache.org> wrote:
>
> Hi Mason,
>
> I'm sure there's a PMC (*hint*) out there who can grant you access to
> create a FLIP. Looking forward to it, this sounds like an improvement that
> users are looking forward to.
>
> Best regards,
>
> Martijn
>
> On Tue, Jun 28, 2022 at 09:21, Mason Chen <ma...@gmail.com> wrote:
>
> > Hi all,
> >
> > Thanks for the feedback! I'm adding the users, who responded in the user
> > mailing list, to this thread.
> >
> > @Qingsheng - Yes, I would prefer to reuse the existing Kafka connector
> > module. It makes a lot of sense since the dependencies are the same and the
> > implementation can also extend and improve some of the test utilities you
> > have been working on for the FLIP 27 Kafka Source. I will enumerate the
> > migration steps in the FLIP template.
> >
> > @Ryan - I don't have a public branch available yet, but I would appreciate
> > your review on the FLIP design! When the FLIP design is approved by devs
> > and the community, I can start to commit our implementation to a fork.
> >
> > @Andrew - Yup, one of the requirements of the connector is to read
> > multiple clusters within a single source, so it should be able to work well
> > with your use case.
> >
> > @Devs - what do I need to get started on the FLIP design? I see the FLIP
> > template and I have an account (mason6345), but I don't have access to
> > create a page.
> >
> > Best,
> > Mason
> >
> >
> >
> >
> > On Sun, Jun 26, 2022 at 8:08 PM Qingsheng Ren <re...@apache.org> wrote:
> >
> >> Hi Mason,
> >>
> >> It sounds like an exciting enhancement to the Kafka source and will
> >> benefit a lot of users I believe.
> >>
> >> Would you prefer to reuse the existing flink-connector-kafka module or
> >> create a new one for the new multi-cluster feature? Personally I prefer the
> >> former one because users won’t need to introduce another dependency module
> >> to their projects in order to use the feature.
> >>
> >> Thanks for the effort on this and looking forward to your FLIP!
> >>
> >> Best,
> >> Qingsheng
> >>
> >> > On Jun 24, 2022, at 09:43, Mason Chen <ma...@gmail.com> wrote:
> >> >
> >> > Hi community,
> >> >
> >> > We have been working on a Multi Cluster Kafka Source and are looking to
> >> > contribute it upstream. I've given a talk about the features and design
> >> at
> >> > a Flink meetup: https://youtu.be/H1SYOuLcUTI.
> >> >
> >> > The main features that it provides is:
> >> > 1. Reading multiple Kafka clusters within a single source.
> >> > 2. Adjusting the clusters and topics the source consumes from
> >> dynamically,
> >> > without Flink job restart.
> >> >
> >> > Some of the challenging use cases that these features solve are:
> >> > 1. Transparent Kafka cluster migration without Flink job restart.
> >> > 2. Transparent Kafka topic migration without Flink job restart.
> >> > 3. Direct integration with Hybrid Source.
> >> >
> >> > In addition, this is designed with wrapping and managing the existing
> >> > KafkaSource components to enable these features, so it can continue to
> >> > benefit from KafkaSource improvements and bug fixes. It can be
> >> considered
> >> > as a form of a composite source.
> >> >
> >> > I think the contribution of this source could benefit a lot of users who
> >> > have asked in the mailing list about Flink handling Kafka migrations and
> >> > removing topics in the past. I would love to hear and address your
> >> thoughts
> >> > and feedback, and if possible drive a FLIP!
> >> >
> >> > Best,
> >> > Mason
> >>
> >>

Re: [DISCUSS] Contribution of Multi Cluster Kafka Source

Posted by Martijn Visser <ma...@apache.org>.
Hi Mason,

I'm sure there's a PMC member (*hint*) out there who can grant you access to
create a FLIP. Looking forward to it; this sounds like an improvement that
users are eager to have.

Best regards,

Martijn

On Tue, Jun 28, 2022 at 09:21, Mason Chen <ma...@gmail.com> wrote:

> Hi all,
>
> Thanks for the feedback! I'm adding the users, who responded in the user
> mailing list, to this thread.
>
> @Qingsheng - Yes, I would prefer to reuse the existing Kafka connector
> module. It makes a lot of sense since the dependencies are the same and the
> implementation can also extend and improve some of the test utilities you
> have been working on for the FLIP 27 Kafka Source. I will enumerate the
> migration steps in the FLIP template.
>
> @Ryan - I don't have a public branch available yet, but I would appreciate
> your review on the FLIP design! When the FLIP design is approved by devs
> and the community, I can start to commit our implementation to a fork.
>
> @Andrew - Yup, one of the requirements of the connector is to read
> multiple clusters within a single source, so it should be able to work well
> with your use case.
>
> @Devs - what do I need to get started on the FLIP design? I see the FLIP
> template and I have an account (mason6345), but I don't have access to
> create a page.
>
> Best,
> Mason
>
>
>
>
> On Sun, Jun 26, 2022 at 8:08 PM Qingsheng Ren <re...@apache.org> wrote:
>
>> Hi Mason,
>>
>> It sounds like an exciting enhancement to the Kafka source and will
>> benefit a lot of users I believe.
>>
>> Would you prefer to reuse the existing flink-connector-kafka module or
>> create a new one for the new multi-cluster feature? Personally I prefer the
>> former one because users won’t need to introduce another dependency module
>> to their projects in order to use the feature.
>>
>> Thanks for the effort on this and looking forward to your FLIP!
>>
>> Best,
>> Qingsheng
>>
>> > On Jun 24, 2022, at 09:43, Mason Chen <ma...@gmail.com> wrote:
>> >
>> > Hi community,
>> >
>> > We have been working on a Multi Cluster Kafka Source and are looking to
>> > contribute it upstream. I've given a talk about the features and design
>> at
>> > a Flink meetup: https://youtu.be/H1SYOuLcUTI.
>> >
>> > The main features that it provides is:
>> > 1. Reading multiple Kafka clusters within a single source.
>> > 2. Adjusting the clusters and topics the source consumes from
>> dynamically,
>> > without Flink job restart.
>> >
>> > Some of the challenging use cases that these features solve are:
>> > 1. Transparent Kafka cluster migration without Flink job restart.
>> > 2. Transparent Kafka topic migration without Flink job restart.
>> > 3. Direct integration with Hybrid Source.
>> >
>> > In addition, this is designed with wrapping and managing the existing
>> > KafkaSource components to enable these features, so it can continue to
>> > benefit from KafkaSource improvements and bug fixes. It can be
>> considered
>> > as a form of a composite source.
>> >
>> > I think the contribution of this source could benefit a lot of users who
>> > have asked in the mailing list about Flink handling Kafka migrations and
>> > removing topics in the past. I would love to hear and address your
>> thoughts
>> > and feedback, and if possible drive a FLIP!
>> >
>> > Best,
>> > Mason
>>
>>

Re: [DISCUSS] Contribution of Multi Cluster Kafka Source

Posted by Mason Chen <ma...@gmail.com>.
Hi all,

Thanks for the feedback! I'm adding the users who responded on the user
mailing list to this thread.

@Qingsheng - Yes, I would prefer to reuse the existing Kafka connector
module. It makes a lot of sense since the dependencies are the same and the
implementation can also extend and improve some of the test utilities you
have been working on for the FLIP-27 Kafka Source. I will enumerate the
migration steps in the FLIP template.

@Ryan - I don't have a public branch available yet, but I would appreciate
your review on the FLIP design! When the FLIP design is approved by devs
and the community, I can start to commit our implementation to a fork.

@Andrew - Yup, one of the requirements of the connector is to read multiple
clusters within a single source, so it should be able to work well with
your use case.

@Devs - what do I need to get started on the FLIP design? I see the FLIP
template and I have an account (mason6345), but I don't have access to
create a page.

Best,
Mason




On Sun, Jun 26, 2022 at 8:08 PM Qingsheng Ren <re...@apache.org> wrote:

> Hi Mason,
>
> It sounds like an exciting enhancement to the Kafka source and will
> benefit a lot of users I believe.
>
> Would you prefer to reuse the existing flink-connector-kafka module or
> create a new one for the new multi-cluster feature? Personally I prefer the
> former one because users won’t need to introduce another dependency module
> to their projects in order to use the feature.
>
> Thanks for the effort on this and looking forward to your FLIP!
>
> Best,
> Qingsheng
>
> > On Jun 24, 2022, at 09:43, Mason Chen <ma...@gmail.com> wrote:
> >
> > Hi community,
> >
> > We have been working on a Multi Cluster Kafka Source and are looking to
> > contribute it upstream. I've given a talk about the features and design
> at
> > a Flink meetup: https://youtu.be/H1SYOuLcUTI.
> >
> > The main features that it provides is:
> > 1. Reading multiple Kafka clusters within a single source.
> > 2. Adjusting the clusters and topics the source consumes from
> dynamically,
> > without Flink job restart.
> >
> > Some of the challenging use cases that these features solve are:
> > 1. Transparent Kafka cluster migration without Flink job restart.
> > 2. Transparent Kafka topic migration without Flink job restart.
> > 3. Direct integration with Hybrid Source.
> >
> > In addition, this is designed with wrapping and managing the existing
> > KafkaSource components to enable these features, so it can continue to
> > benefit from KafkaSource improvements and bug fixes. It can be considered
> > as a form of a composite source.
> >
> > I think the contribution of this source could benefit a lot of users who
> > have asked in the mailing list about Flink handling Kafka migrations and
> > removing topics in the past. I would love to hear and address your
> thoughts
> > and feedback, and if possible drive a FLIP!
> >
> > Best,
> > Mason
>
>

Re: [DISCUSS] Contribution of Multi Cluster Kafka Source

Posted by Qingsheng Ren <re...@apache.org>.
Hi Mason,

It sounds like an exciting enhancement to the Kafka source and will benefit a lot of users I believe. 

Would you prefer to reuse the existing flink-connector-kafka module or create a new one for the new multi-cluster feature? Personally I prefer the former one because users won’t need to introduce another dependency module to their projects in order to use the feature. 

Thanks for the effort on this and looking forward to your FLIP!

Best, 
Qingsheng

> On Jun 24, 2022, at 09:43, Mason Chen <ma...@gmail.com> wrote:
> 
> Hi community,
> 
> We have been working on a Multi Cluster Kafka Source and are looking to
> contribute it upstream. I've given a talk about the features and design at
> a Flink meetup: https://youtu.be/H1SYOuLcUTI.
> 
> The main features that it provides is:
> 1. Reading multiple Kafka clusters within a single source.
> 2. Adjusting the clusters and topics the source consumes from dynamically,
> without Flink job restart.
> 
> Some of the challenging use cases that these features solve are:
> 1. Transparent Kafka cluster migration without Flink job restart.
> 2. Transparent Kafka topic migration without Flink job restart.
> 3. Direct integration with Hybrid Source.
> 
> In addition, this is designed with wrapping and managing the existing
> KafkaSource components to enable these features, so it can continue to
> benefit from KafkaSource improvements and bug fixes. It can be considered
> as a form of a composite source.
> 
> I think the contribution of this source could benefit a lot of users who
> have asked in the mailing list about Flink handling Kafka migrations and
> removing topics in the past. I would love to hear and address your thoughts
> and feedback, and if possible drive a FLIP!
> 
> Best,
> Mason