Posted to user@flink.apache.org by Valentin <va...@aseno.de> on 2017/05/17 17:35:20 UTC

FlinkKafkaConsumer using Kafka-GroupID?

Hi there,

As far as I understand, the Flink Kafka connectors don’t use Kafka’s consumer group management feature. Here is the post I got this information from:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-kafka-group-question-td8185.html#none

For various reasons we cannot set up a Flink cluster environment, but we still need to ensure high availability, e.g. if one node goes down, the second should keep running.


My question:
- Is there any way to run 2 separate (standalone) Flink apps that consume from a single Kafka topic so that each message is processed only once? This is what I could achieve with 2 native Kafka consumers in the same consumer group.

Many thanks in advance
Valentin 
 

Re: FlinkKafkaConsumer using Kafka-GroupID?

Posted by "Tzu-Li (Gordon) Tai" <tz...@apache.org>.
Hi Valentin!

Your understanding is correct: the Kafka connectors do not use the consumer group functionality to distribute messages across multiple instances of a FlinkKafkaConsumer source. Instead, the connector itself determines which instances are assigned which Kafka partitions, based on a simple round-robin distribution.
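To make the round-robin idea concrete, here is a minimal sketch in plain Java. It is only a toy model under the assumption that partition i goes to source instance i % parallelism; it is not the connector's actual code, and the class and method names (RoundRobinAssignment, partitionsFor) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class RoundRobinAssignment {

    // Returns the partitions that source instance `subtaskIndex` would read,
    // ASSUMING partition i is given to instance (i % parallelism).
    // This is an illustrative model, not the connector's implementation.
    public static List<Integer> partitionsFor(int subtaskIndex, int parallelism, int numPartitions) {
        if (parallelism <= 0 || subtaskIndex < 0 || subtaskIndex >= parallelism) {
            throw new IllegalArgumentException("invalid subtask index or parallelism");
        }
        List<Integer> assigned = new ArrayList<>();
        for (int p = subtaskIndex; p < numPartitions; p += parallelism) {
            assigned.add(p);
        }
        return assigned;
    }

    public static void main(String[] args) {
        int parallelism = 2;    // parallel source instances within ONE job
        int numPartitions = 5;  // partitions of the topic
        for (int subtask = 0; subtask < parallelism; subtask++) {
            System.out.println("subtask " + subtask + " -> "
                    + partitionsFor(subtask, parallelism, numPartitions));
        }
    }
}
```

Note that the assignment is computed within a single job. Two separately started apps would each compute it on their own, so each app would read every partition rather than splitting the topic between them.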

> Is there any chance to run 2 different flink (standalone) apps consuming messages from a single kafka-topic only once? This is what I could do by using 2 native Kafka-Consumers within the same consumer-group.

Therefore, I don’t think this is possible with the FlinkKafkaConsumers. However, this is exactly what Flink’s checkpointing and savepoints are designed for.
If your single app fails, the consumer can simply restart from the offsets stored in the latest checkpoint or savepoint.
In other words, with Flink’s streaming fault tolerance mechanism, you get exactly-once guarantees across 2 different runs of the app.
The FlinkKafkaConnector docs should explain this thoroughly [1].
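
The restart behaviour described above can be illustrated with a small self-contained simulation. This is a toy model in plain Java, not Flink code: the "topic" is an in-memory list, the state is a running sum, and all names (CheckpointReplaySketch, Checkpoint, run) are hypothetical. The point it tries to show is that the offset and the state are snapshotted together, so a second run that resumes from the snapshot ends up with each record counted exactly once in the state.

```java
import java.util.Arrays;
import java.util.List;

public class CheckpointReplaySketch {

    // A snapshot pairs the next offset to read with the state computed so far.
    public static final class Checkpoint {
        public final int nextOffset;
        public final long sum;

        public Checkpoint(int nextOffset, long sum) {
            this.nextOffset = nextOffset;
            this.sum = sum;
        }
    }

    // Consumes `log` starting at `from`, snapshotting after every
    // `checkpointEvery` records. If `crashAtOffset` is reached, the run
    // "fails" and only the last completed checkpoint survives.
    // Pass -1 as crashAtOffset for a run without failure.
    public static Checkpoint run(List<Integer> log, Checkpoint from,
                                 int checkpointEvery, int crashAtOffset) {
        int offset = from.nextOffset;
        long sum = from.sum;
        Checkpoint last = from;
        while (offset < log.size()) {
            if (offset == crashAtOffset) {
                return last; // progress since the last checkpoint is lost
            }
            sum += log.get(offset);
            offset++;
            if (offset % checkpointEvery == 0) {
                last = new Checkpoint(offset, sum);
            }
        }
        return new Checkpoint(offset, sum);
    }

    public static void main(String[] args) {
        List<Integer> log = Arrays.asList(1, 2, 3, 4, 5, 6, 7);

        // First run fails before reading offset 5; the checkpoint taken
        // after 3 records is what survives.
        Checkpoint recovered = run(log, new Checkpoint(0, 0L), 3, 5);

        // Second run resumes from that checkpoint and finishes the log.
        Checkpoint done = run(log, recovered, 3, -1);

        System.out.println("resumed at offset " + recovered.nextOffset
                + ", final sum " + done.sum);
    }
}
```

In this model the records at offsets 3 and 4 are read twice (once per run), but because the state is rolled back together with the offset, they contribute to the final sum only once. The guarantee therefore concerns the checkpointed state; side effects sent to external systems before the failure may still be repeated on replay.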

Does this address your concerns?

Cheers,
Gordon

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/connectors/kafka.html#kafka-consumers-and-fault-tolerance

