Posted to user@flink.apache.org by "Roberts, Ben (Senior Developer) via user" <us...@flink.apache.org> on 2023/03/30 16:33:41 UTC

KafkaSource consumer group

Hi,

Is there a way to run multiple Flink jobs with the same Kafka group.id and have them join the same consumer group?

It seems that setting the group.id using KafkaSource.builder().set_group_id() does not have the effect of creating an actual consumer group in Kafka.

Running the same Flink job twice with the same group.id, consuming from the same topic, results in both jobs receiving every message from the topic, rather than each message going to only one of the jobs (as would normally be expected for consumers in a Kafka consumer group).

Is this a design choice, and is there a way to configure it so messages can be split across two jobs using the same “group.id”?

Thanks in advance,
Ben

Information in this email including any attachments may be privileged, confidential and is intended exclusively for the addressee. The views expressed may not be official policy, but the personal views of the originator. If you have received it in error, please notify the sender by return e-mail and delete it from your system. You should not reproduce, distribute, store, retransmit, use or disclose its contents to anyone. Please note we reserve the right to monitor all e-mail communication through our internal and external networks. SKY and the SKY marks are trademarks of Sky Limited and Sky International AG and are used under licence.

Sky UK Limited (Registration No. 2906991), Sky-In-Home Service Limited (Registration No. 2067075), Sky Subscribers Services Limited (Registration No. 2340150) and Sky CP Limited (Registration No. 9513259) are direct or indirect subsidiaries of Sky Limited (Registration No. 2247735). All of the companies mentioned in this paragraph are incorporated in England and Wales and share the same registered office at Grant Way, Isleworth, Middlesex TW7 5QD

Re: KafkaSource consumer group

Posted by "Tzu-Li (Gordon) Tai" <tz...@apache.org>.
Sorry, I meant to say "Hi Ben" :-)


Re: KafkaSource consumer group

Posted by "Tzu-Li (Gordon) Tai" <tz...@apache.org>.
Hi Robert,

This is a design choice. Flink's KafkaSource doesn't rely on consumer
groups for assigning partitions / rebalancing / offset tracking. It
manually assigns whatever partitions are in the specified topic across its
consumer instances, and rebalances only when the Flink job / KafkaSource is
rescaled.
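
To make the consequence concrete, here is a minimal sketch in plain Python (not Flink code) of what static assignment means for two jobs. It assumes a simple modulo rule for spreading partitions over subtasks; the real KafkaSource uses its own internal formula, and the names assign_partitions, job_a and job_b are made up for illustration. The only point is that each job assigns the full partition set to its own readers without consulting the other job or the group.id.

```python
from collections import defaultdict


def assign_partitions(partitions, parallelism):
    """Spread all partitions of a topic over one job's source subtasks.

    Assumption: plain modulo assignment, used here only for illustration.
    """
    assignment = defaultdict(list)
    for partition in partitions:
        assignment[partition % parallelism].append(partition)
    return dict(assignment)


topic_partitions = [0, 1, 2, 3]

# Two independent jobs with the same group.id each do their own assignment;
# neither one knows about the other.
job_a = assign_partitions(topic_partitions, parallelism=2)
job_b = assign_partitions(topic_partitions, parallelism=2)

print(job_a)  # {0: [0, 2], 1: [1, 3]}
print(job_b)  # {0: [0, 2], 1: [1, 3]}
```

Both jobs end up covering partitions 0 to 3, which matches the behaviour described in the question: every message is seen by both jobs.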

Is there a specific reason that you need two Flink jobs for this? I believe
the Flink-way of doing this would be to have one job read the topic, and
then you'd do a stream split if you want to have two different branches of
processing business logic.
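
As a rough illustration of that single-job shape, the sketch below uses plain Python rather than the Flink API: one loop stands in for the single source, and a hypothetical route function decides which branch handles each record. The record fields and branch names are invented for the example; in a real job the same decision would live in the operators placed after the source (for instance two filter calls on the same stream).

```python
def route(record):
    """Hypothetical routing rule: pick a branch from the record's 'type' field."""
    return "orders" if record["type"] == "order" else "other"


def run_job(records):
    """Read every record once and hand it to exactly one branch."""
    branches = {"orders": [], "other": []}
    for record in records:  # stands in for the single Kafka source
        branches[route(record)].append(record["id"])
    return branches


records = [
    {"id": 1, "type": "order"},
    {"id": 2, "type": "click"},
    {"id": 3, "type": "order"},
]
print(run_job(records))  # {'orders': [1, 3], 'other': [2]}
```

Each record is read once and ends up in exactly one branch, which is the effect the two-jobs-in-one-group setup was trying to achieve.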

Thanks,
Gordon
