You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Ayrat Hudaygulov (Jira)" <ji...@apache.org> on 2020/08/24 15:17:00 UTC

[jira] [Created] (FLINK-19039) Parallel Flink Kafka Consumers compete with each other

Ayrat Hudaygulov created FLINK-19039:
----------------------------------------

             Summary: Parallel Flink Kafka Consumers compete with each other
                 Key: FLINK-19039
                 URL: https://issues.apache.org/jira/browse/FLINK-19039
             Project: Flink
          Issue Type: Improvement
          Components: Connectors / Kafka
    Affects Versions: 1.11.1
            Reporter: Ayrat Hudaygulov


If I'll run multiple Flink instances with same consumer group id they will not re-balance partitions with each other, but rather each instance take all partitions, effectively not working in parallel at all, and multiplying amount of messages processed.

 

This is because FlinkKafkaConsumer has its own re-balancing mechanism for current parallelism level and then just calls:

`consumerTmp.assign(newPartitionAssignments){color:#cc7832};{color}`

 

[https://github.com/apache/flink/blob/59714b9d6addb1dbf2171cab937a0e3fec52f2b1/flink-connectors/flink-connector-kafka-0.10/src/main/java/org/apache/flink/streaming/connectors/kafka/internal/KafkaConsumerThread.java#L422]

 

I suppose there has to be a way to fallback to default kafka mechanism of re-balancing to respect consumer group id, but it's not presented in Flink at all.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)