You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Giriraj <gr...@gmail.com> on 2018/06/28 12:26:21 UTC

Flink kafka consumers don't honor group.id

It seems that Flink kafka consumer don't honor group.id while consumers are
added dynamically. Lets say I have some flink kafka consumers reading from a
topic and I dynamically add some new Flink kafka consumers with same
group.id, kafka messages are getting duplicated to existing as well as new
consumers. It violates the exactly once semantics as all the consumers
belong to same consumer group. 

I am of the view that its an existing issue. 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-kafka-group-question-td8185.html#none
<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-kafka-group-question-td8185.html#none>  
https://stackoverflow.com/questions/38639019/flink-kafka-consumer-groupid-not-working
<https://stackoverflow.com/questions/38639019/flink-kafka-consumer-groupid-not-working>  
Is there any plan to provide the fix/support for group.id in upcoming
releases?
Let me know if there is any way to deal with it for now.



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Flink kafka consumers don't honor group.id

Posted by Giriraj <gr...@gmail.com>.
Hi Gordon,

Gordon:If I understood you correctly, what you are doing is, while a job
with a Kafka consumer is already running, you want to start a new job also
with a Kafka consumer as the source and uses the same group.id so that the
topic's messages are routed between the two jobs.

Is this correct? If so, could you briefly explain what your use case is and
why you want to do this?
Giriraj: You got it almost correctly. The so called "new job" above is like
a new instance of the same job. We are trying to scale the job by spawning
more instances of same job. i.e. we are abstracting the flink job as a
microservice and when load increases on this service/job, we would like to
spawn a new instance of same job/service. That is why we are expecting when
group.id is same in both the jobs, messages should get delivered only to 
one of the job consumers. I have also given thought of scaling job by
increasing parallelism(canceling and starting job with increased
parallelism). But  former approach looked cleaner,  intuitive and seamless
to us. 

I would really appreciate your thoughts about it. 

Apology for the delay in response. 

--
Giriraj



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Flink kafka consumers don't honor group.id

Posted by "Tzu-Li (Gordon) Tai" <tz...@apache.org>.
Hi Giriraj,

The fact that the Flink Kafka Consumer doesn't use the group.id property,
is an expected behavior.
Since the partition-to-subtask assignment of the Flink Kafka Consumer needs
to be deterministic in Flink, the consumer uses static assignment instead
of the more high-level consumer group dynamic assignments.

If I understood you correctly, what you are doing is, while a job with a
Kafka consumer is already running, you want to start a new job also with a
Kafka consumer as the source and uses the same group.id so that the topic's
messages are routed between the two jobs.
Is this correct? If so, could you briefly explain what your use case is and
why you want to do this?
Perhaps this should be tackled from a different angle when designed in
Flink.

Cheers,
Gordon

On Thu, Jun 28, 2018 at 8:26 PM Giriraj <gr...@gmail.com> wrote:

> It seems that Flink kafka consumer don't honor group.id while consumers
> are
> added dynamically. Lets say I have some flink kafka consumers reading from
> a
> topic and I dynamically add some new Flink kafka consumers with same
> group.id, kafka messages are getting duplicated to existing as well as new
> consumers. It violates the exactly once semantics as all the consumers
> belong to same consumer group.
>
> I am of the view that its an existing issue.
>
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-kafka-group-question-td8185.html#none
> <
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-kafka-group-question-td8185.html#none>
>
>
> https://stackoverflow.com/questions/38639019/flink-kafka-consumer-groupid-not-working
> <
> https://stackoverflow.com/questions/38639019/flink-kafka-consumer-groupid-not-working>
>
> Is there any plan to provide the fix/support for group.id in upcoming
> releases?
> Let me know if there is any way to deal with it for now.
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>