Posted to user@flink.apache.org by Aissa Elaffani <ai...@gmail.com> on 2020/08/10 03:11:11 UTC

multiple kafka topics

Hello Guys,
I am working on a Flink application in which I consume data from Apache
Kafka. The data is published to three topics of the cluster, and I need to
read from all of them, so I suppose I can create three
FlinkKafkaConsumer sources. The data I am consuming all has the same
format {Id_sensor: ..., Id_equipement: ..., Date: ..., Value: {...}, ...};
the problem is that the "Value" field changes from topic to topic. In the
first topic the value is a temperature, "Value":{"temperature":26}; the
second topic contains oil data, "Value":{"oil_data":26}; and in the third
topic the value field is "Value":{"Pitch": ..., "Roll": ..., "Yaw": ...}.
So I created three FlinkKafkaConsumer sources and defined a
DeserializationSchema for each topic's data. The problem is that I want to
do some aggregations on all of this data together in order to apply a
function. So I am wondering whether it is a problem to join the three
streams together into one stream, then do my aggregation by a field, then
apply the function, and finally sink the result. And if so, am I going to
have a problem sinking the data, given that, as I explained, the value
field differs from one topic to another? Can anyone give me an explanation,
please? I would be so grateful. Thank you for your time!
Aissa
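
For concreteness, the setup described above (three FlinkKafkaConsumer
sources mapped onto one shared record type, then unioned, keyed and
aggregated) could be sketched roughly as below. All class, topic and field
names are illustrative assumptions, the JSON parsing is left as a stub, and
this is not code from the original message; keeping the variable "Value"
payload in a generic map is just one way to let readings from all three
topics share a single type and a single sink.

import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class ThreeTopicsSketch {

    // Shared record type: the variable "Value" payload is kept as a generic
    // map, so readings from all three topics fit into one stream and one sink.
    public static class SensorReading {
        public String idSensor;
        public String idEquipement;
        public String date;
        public Map<String, Double> values = new HashMap<>();
    }

    // Stub parser; a real job would parse the JSON here (or use one
    // DeserializationSchema<SensorReading> per topic, as in the question).
    private static SensorReading parse(String json) {
        SensorReading r = new SensorReading();
        r.idEquipement = "";
        return r;
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092");
        props.setProperty("group.id", "sensors");

        // One FlinkKafkaConsumer per topic (topic names are made up here).
        DataStream<SensorReading> temperature = env
                .addSource(new FlinkKafkaConsumer<>("temperature", new SimpleStringSchema(), props))
                .map(ThreeTopicsSketch::parse);
        DataStream<SensorReading> oil = env
                .addSource(new FlinkKafkaConsumer<>("oil", new SimpleStringSchema(), props))
                .map(ThreeTopicsSketch::parse);
        DataStream<SensorReading> motion = env
                .addSource(new FlinkKafkaConsumer<>("motion", new SimpleStringSchema(), props))
                .map(ThreeTopicsSketch::parse);

        // union() merges the three streams into one, so a single keyBy /
        // aggregation / sink can follow.
        temperature.union(oil, motion)
                .keyBy(r -> r.idEquipement)
                // .window(...).aggregate(...)   // the aggregation / apply step
                .print();                        // stand-in for the real sink

        env.execute("three-kafka-topics-sketch");
    }
}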

Re: multiple kafka topics

Posted by Dmytro Dragan <dd...@softserveinc.com>.
Hi Aissa,


  1.  To join the 3 streams you can chain 2 co-flatmap functions (a sketch follows this list):
https://stackoverflow.com/questions/54277910/how-do-i-join-two-streams-in-apache-flink

  2.  If your aggregation function can also be applied partially, you can chain 2 window joins (a sketch follows further below):
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/joining.html

  3.  Create a table from each stream and do one SQL join (a sketch appears at the end of this message):
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/tableApi.html#joins
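
For the first option (chaining two connect/co-flatmap steps, as in the
Stack Overflow answer linked above), a minimal sketch of one such step is
below, assuming both sides carry (equipment id, value) pairs; the third
stream would be attached by repeating the same pattern on the result. The
types and names are assumptions for illustration, not from the thread.

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction;
import org.apache.flink.util.Collector;

public class CoFlatMapJoinSketch {

    // Buffers the latest value from each side (keyed by equipment id) and
    // emits a combined (id, left, right) record once both sides have been seen.
    public static class LatestJoin extends
            RichCoFlatMapFunction<Tuple2<String, Double>, Tuple2<String, Double>,
                                  Tuple3<String, Double, Double>> {

        private transient ValueState<Double> left;
        private transient ValueState<Double> right;

        @Override
        public void open(Configuration parameters) {
            left = getRuntimeContext().getState(new ValueStateDescriptor<>("left", Double.class));
            right = getRuntimeContext().getState(new ValueStateDescriptor<>("right", Double.class));
        }

        @Override
        public void flatMap1(Tuple2<String, Double> value,
                             Collector<Tuple3<String, Double, Double>> out) throws Exception {
            left.update(value.f1);
            if (right.value() != null) {
                out.collect(Tuple3.of(value.f0, value.f1, right.value()));
            }
        }

        @Override
        public void flatMap2(Tuple2<String, Double> value,
                             Collector<Tuple3<String, Double, Double>> out) throws Exception {
            right.update(value.f1);
            if (left.value() != null) {
                out.collect(Tuple3.of(value.f0, left.value(), value.f1));
            }
        }
    }

    // Joins temperature with oil; the motion stream would be attached by
    // repeating the same connect/keyBy/flatMap pattern on the result.
    public static DataStream<Tuple3<String, Double, Double>> joinTemperatureAndOil(
            DataStream<Tuple2<String, Double>> temperature,
            DataStream<Tuple2<String, Double>> oil) {

        return temperature
                .connect(oil)
                .keyBy(t -> t.f0, o -> o.f0)   // key both sides by the equipment id
                .flatMap(new LatestJoin());
    }
}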

It is important to understand which type of join suits your scenario, e.g. whether one of the streams could be missing some data, and what the maximum acceptable delay is.
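
For the second option (chaining two window joins), a sketch of the first
join under the same (equipment id, value) assumptions is below;
processing-time windows are used only to keep the example self-contained,
and event-time windows would additionally need timestamps and watermarks.

import org.apache.flink.api.common.functions.JoinFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class ChainedWindowJoinSketch {

    // First join: temperature with oil on the equipment id, per 10-second
    // window. The result can be joined with the third (motion) stream the
    // same way, i.e. joined.join(motion).where(...).equalTo(...).window(...).apply(...)
    public static DataStream<Tuple3<String, Double, Double>> joinTemperatureAndOil(
            DataStream<Tuple2<String, Double>> temperature,
            DataStream<Tuple2<String, Double>> oil) {

        return temperature
                .join(oil)
                .where(t -> t.f0)               // equipment id on the temperature side
                .equalTo(o -> o.f0)             // equipment id on the oil side
                .window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
                .apply(new JoinFunction<Tuple2<String, Double>, Tuple2<String, Double>,
                                        Tuple3<String, Double, Double>>() {
                    @Override
                    public Tuple3<String, Double, Double> join(Tuple2<String, Double> t,
                                                               Tuple2<String, Double> o) {
                        return Tuple3.of(t.f0, t.f1, o.f1);
                    }
                });
    }
}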


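For the third option (registering each stream as a table and writing one
SQL join), a sketch with made-up view and column names is below; the motion
stream is reduced to a single "pitch" column for brevity.

import static org.apache.flink.table.api.Expressions.$;

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public class SqlJoinSketch {

    public static Table joinAll(StreamExecutionEnvironment env,
                                DataStream<Tuple2<String, Double>> temperature,
                                DataStream<Tuple2<String, Double>> oil,
                                DataStream<Tuple2<String, Double>> motion) {

        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        // Register each stream as a view with named columns (f0/f1 renamed).
        tEnv.createTemporaryView("Temperature", temperature, $("id"), $("temperature"));
        tEnv.createTemporaryView("Oil", oil, $("id"), $("oil_data"));
        tEnv.createTemporaryView("Motion", motion, $("id"), $("pitch"));

        // One SQL join across the three views on the equipment id.
        return tEnv.sqlQuery(
                "SELECT t.id, t.temperature, o.oil_data, m.pitch " +
                "FROM Temperature AS t " +
                "JOIN Oil AS o ON t.id = o.id " +
                "JOIN Motion AS m ON t.id = m.id");
    }
}

A regular join like this keeps state for both inputs indefinitely, which is
exactly the kind of trade-off the note above about join types and maximum
delay refers to.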