Posted to user@flink.apache.org by Debasish Ghosh <gh...@gmail.com> on 2020/02/26 13:46:07 UTC

Kafka consumer group id and Flink

 Hello -

Can someone please point me to documentation or a code snippet showing how
Flink uses the Kafka consumer group property "group.id"? In the message
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-kafka-group-question-td8185.html#none
I see the following:

> Internally, the Flink Kafka connectors don’t use the consumer group
> management functionality because they are using lower-level APIs
> (SimpleConsumer in 0.8, and KafkaConsumer#assign(…) in 0.9) on each
> parallel instance for more control on individual partition consumption. So,
> essentially, the “group.id” setting in the Flink Kafka connector is only
> used for committing offsets back to ZK / Kafka brokers.


This message is quite old - has anything changed since then? Looks like
this property is a mandatory setting though.

If I have multiple Flink streaming jobs, and each job tracks its offsets
individually and saves them through the internal checkpoint mechanism, is
there no need to specify a different group.id for each job? And when two
jobs read the same topic but have different business logic, will that work
correctly even though the consumers are in the same consumer group?

Thanks for any help.
regards.
-- 
Debasish Ghosh
http://manning.com/ghosh2
http://manning.com/ghosh

Twttr: @debasishg
Blog: http://debasishg.blogspot.com
Code: http://github.com/debasishg

Re: Kafka consumer group id and Flink

Posted by Benchao Li <li...@gmail.com>.
Hi Debasish,

AFAIK, the Flink Kafka connector still works like that. The community will
keep the documentation updated.

The Flink Kafka connector has three main startup modes (plus specific
offsets and a specific timestamp):
- earliest-offset: always consumes from the earliest offset, regardless of
the offset currently committed for "group.id".
- group-offsets: consumes from the offsets kept in Kafka for "group.id" (or
in ZooKeeper for older Kafka versions).
- latest-offset: always consumes from the latest offset, regardless of the
offset currently committed for "group.id".
It also automatically commits offsets to Kafka under the "group.id" you
specified.
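
To make the three modes concrete, here is a minimal sketch in plain Java.
It is only a model of the behaviour described above, not the connector's
code: StartupModeSketch, startOffset and the fallback parameter are names
made up for illustration. In the DataStream API, as far as I know, the
corresponding calls on the consumer are setStartFromEarliest(),
setStartFromGroupOffsets() (the default) and setStartFromLatest(). The
fallback parameter stands in for what "auto.offset.reset" would resolve to
when no offset has been committed yet for the group.

```java
import java.util.OptionalLong;

// Illustrative model only; these are NOT the Flink connector's real classes.
class StartupModeSketch {

    enum StartupMode { EARLIEST_OFFSET, GROUP_OFFSETS, LATEST_OFFSET }

    /**
     * Picks the first offset to read for a single partition.
     *
     * @param earliest  smallest offset still available in the partition
     * @param latest    offset of the next record to be written
     * @param committed offset stored in Kafka for "group.id", if any
     * @param fallback  offset to use when nothing was committed (assumption:
     *                  whatever "auto.offset.reset" would resolve to)
     */
    static long startOffset(StartupMode mode, long earliest, long latest,
                            OptionalLong committed, long fallback) {
        switch (mode) {
            case EARLIEST_OFFSET:
                return earliest;                    // ignores the committed offset
            case LATEST_OFFSET:
                return latest;                      // ignores the committed offset
            case GROUP_OFFSETS:
                return committed.orElse(fallback);  // only mode that reads it
            default:
                throw new IllegalArgumentException("unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        OptionalLong committed = OptionalLong.of(42L);
        for (StartupMode mode : StartupMode.values()) {
            System.out.println(mode + " -> "
                    + startOffset(mode, 0L, 100L, committed, 100L));
        }
    }
}
```

Only the group-offsets branch looks at what was committed for "group.id";
the other two modes ignore it when choosing where to start.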


On top of that, there is checkpointing: if checkpointing is enabled and the
job is restored from a checkpoint, it always starts from the offsets stored
in that checkpoint, no matter which mode you set.
By default it also commits offsets to Kafka for "group.id" whenever a
checkpoint succeeds.
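
The interplay between checkpoints and committed offsets can be sketched
with a small plain-Java model. Again, this is a simplification under my own
assumptions and not how the connector is implemented:
CheckpointOffsetSketch and its methods are hypothetical names, and a single
partition with a single offset is tracked. It shows two things: an offset
restored from a checkpoint takes precedence over the configured start
position, and the offset written to Kafka only moves when a checkpoint
completes.

```java
import java.util.OptionalLong;

// Illustrative single-partition model; not the Flink connector's real code.
class CheckpointOffsetSketch {
    private long position;                 // next offset this source will read
    private long lastSnapshot = -1L;       // offset captured by the latest snapshot
    private OptionalLong committedToKafka = OptionalLong.empty();

    /** A restored checkpoint offset, if present, wins over the configured start. */
    CheckpointOffsetSketch(long configuredStart, OptionalLong restoredFromCheckpoint) {
        this.position = restoredFromCheckpoint.orElse(configuredStart);
    }

    /** Simulates reading n records. */
    void read(int n) {
        position += n;
    }

    /** Simulates the state snapshot taken when a checkpoint is triggered. */
    void snapshot() {
        lastSnapshot = position;
    }

    /** Simulates the commit back to Kafka once the checkpoint has succeeded. */
    void checkpointCompleted() {
        if (lastSnapshot >= 0) {
            committedToKafka = OptionalLong.of(lastSnapshot);
        }
    }

    long position() {
        return position;
    }

    OptionalLong committedToKafka() {
        return committedToKafka;
    }

    public static void main(String[] args) {
        CheckpointOffsetSketch fresh = new CheckpointOffsetSketch(0L, OptionalLong.empty());
        fresh.read(10);
        fresh.snapshot();
        fresh.read(5);                      // progress made after the snapshot
        fresh.checkpointCompleted();
        System.out.println("committed: " + fresh.committedToKafka());

        // Restart with "latest" (100) configured, but a checkpoint at offset 10.
        CheckpointOffsetSketch restored = new CheckpointOffsetSketch(100L, OptionalLong.of(10L));
        System.out.println("restored position: " + restored.position());
    }
}
```

In this model the committed offset is only a record of progress; the job
itself recovers from the checkpointed value.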

Different jobs won't affect each other's consumption. However, if you
specify the same "group.id" for different jobs, the offsets they save to
Kafka may overwrite each other and become unreliable.
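
One simple way to avoid that is to derive a distinct "group.id" per job
when building the consumer properties. The sketch below uses only
java.util.Properties; the class name, the "flink-" prefix, the job names
and the broker address are all assumptions chosen for illustration, so
adapt them to your own conventions.

```java
import java.util.Properties;

// Hypothetical helper: builds Kafka consumer properties with a per-job group.id.
class JobKafkaProps {

    static Properties forJob(String bootstrapServers, String jobName) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // A distinct group.id per job keeps each job's committed offsets separate.
        props.setProperty("group.id", "flink-" + jobName);
        return props;
    }

    public static void main(String[] args) {
        // Two jobs reading the same topic with different business logic.
        Properties billing = forJob("localhost:9092", "billing");
        Properties audit = forJob("localhost:9092", "audit");
        System.out.println(billing.getProperty("group.id"));
        System.out.println(audit.getProperty("group.id"));
    }
}
```

The resulting Properties object is what you would pass to the Kafka
consumer of each job.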


-- 

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: libenchao@gmail.com; libenchao@pku.edu.cn