Posted to user@flink.apache.org by Dipanjan Mazumder <ja...@yahoo.com> on 2022/01/19 08:01:54 UTC

Distributing Data to multiple kafka partitions while having a single flink instance or worker or slot

Hi,
Problem:
    - Currently I am using Flink as an embedded library in one of my applications. Eventually the application will be the job and will be deployed in a Flink cluster, but right now it is not a cluster, just a standalone single process running Flink within the same process.
    - It is obvious that the parallelism for Flink is currently set to 1.
    - The process is fetching data from Kafka and using Kafka as the sink, i.e. sending the Flink-processed output back to Kafka.
    - While the sink topic has three partitions, the process is producing data to only one of the partitions, which seems to be the expected behaviour.
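For context, my understanding (an assumption on my part, to be confirmed) is that the Flink Kafka producer's default FlinkFixedPartitioner pins each sink subtask to a single fixed partition, roughly partitions[parallelInstanceId % partitions.length]. A minimal standalone sketch of that formula (the class and method names below are mine, not Flink's):

```java
// Standalone illustration (not Flink API) of the fixed-partitioner formula:
// each sink subtask always writes to one and the same partition.
public class FixedPartitionerDemo {

    // Mirrors the assumed assignment: subtask i -> partitions[i % partitions.length]
    static int fixedPartition(int parallelInstanceId, int[] partitions) {
        return partitions[parallelInstanceId % partitions.length];
    }

    public static void main(String[] args) {
        int[] partitions = {0, 1, 2}; // topic with three partitions
        int subtask = 0;              // parallelism 1 => only subtask 0 exists
        for (int record = 0; record < 6; record++) {
            // every record from subtask 0 lands on the same partition
            System.out.println("record " + record + " -> partition "
                    + fixedPartition(subtask, partitions));
        }
    }
}
```

With parallelism 1 there is only subtask 0, so every record goes to partitions[0], which matches the single-partition behaviour I am seeing.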

Ask:
    - Even though I am running a single process with Flink embedded in it, I want to produce data to all the partitions the way any normal Kafka producer does by default, i.e. publishing to the topic in a round-robin fashion.
    - The ask is: how can I do this? What are the alternatives, apart from increasing the parallelism of the Flink process (which won't work, since there is just a single process embedding Flink)? What is the best way to achieve this?
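For comparison, what I am after is the behaviour of a plain Kafka producer with keyless records, where the producer's own partitioner spreads records across partitions. My understanding (again an assumption, to be confirmed) is that passing an empty custom partitioner to the Flink Kafka producer delegates partitioning to the Kafka client itself. A round-robin assignment, sketched standalone (the class and names are mine, for illustration only):

```java
// Standalone illustration of round-robin partition assignment, the behaviour
// I would like for keyless records (hypothetical demo class, not Flink API).
public class RoundRobinDemo {
    private int counter = 0;

    // Each call cycles to the next partition: 0, 1, 2, 0, 1, 2, ...
    int nextPartition(int numPartitions) {
        int p = counter % numPartitions;
        counter++;
        return p;
    }

    public static void main(String[] args) {
        RoundRobinDemo rr = new RoundRobinDemo();
        for (int record = 0; record < 6; record++) {
            System.out.println("record " + record + " -> partition "
                    + rr.nextPartition(3));
        }
    }
}
```

This way all three partitions of the topic would receive data even from a single producing process.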


Regards,
Dipanjan