You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by wang xuchen <be...@gmail.com> on 2019/05/31 01:48:16 UTC

Question about Kafka Flink consumer parallelism

Hi Flink users,

I am trying to figure out how leverage parallelism to improve throughput of
a Kafka consumer. From my research, I understand the scenario when *kafka
partitions (=<>) # consumer and * to use rebalance spread messages evenly
across workers.

Also use setParallelism(#) to achieve the similar effect as adding more
bolts in Storm`s speak. In storm, there is an offsetManager to handle
multiple outstanding offsets due to parallelism.

Does Flink also has mechanism to manage multiple offset  when
setParallelism is used and make sure the offset is committed 'in order'?

From my own experiments, looks like it has something to do with whether
checkpointing is enabled and the interval of checkpoint if it is enabled.

when setParallelism is used, if one thread is stuck, how does Flink decide
what is the  number of uncommitted offset?

Thanks in advance.
Ben