Posted to users@kafka.apache.org by Debraj Manna <su...@gmail.com> on 2019/11/08 10:26:24 UTC
Running Kafka Stream Application in YARN
Hi
Is there any documentation or link I can refer to for the steps for
deploying the Kafka Streams application in YARN?
Kafka Client - 0.11.0.3
Kafka Broker - 2.2.1
YARN - 2.6.0
Re: Running stateful Kafka Stream Application on topic with multiple
partitions
Posted by "Matthias J. Sax" <ma...@confluent.io>.
You need to partition the input data correctly, such that all records
with the same key go to the same partition. In this case, all records with
the same key will be processed by the same task, and thus each key is
stored in one shard only.
-Matthias
On 11/15/19 4:28 PM, Gioacchino Vino wrote:
> Hi experts,
>
>
> I don't understand a Kafka behavior and I'm here to ask for an explanation.
>
> My processing task is pretty simple and it's quite similar to a
> change-log one.
>
> The record value contains a key/value pair: if the new value differs
> from the stored one, it is forwarded to the output topic and the state
> store is updated; otherwise nothing happens. There is also a punctuate
> task that forwards all stored data to the output topic periodically
> (every 30 seconds).
>
> The input topic has 6 partitions.
>
> The observed behavior is that the punctuate task sends the same
> key/value pair 6 times. I figured out that there are 6 state store
> instances, one for each topic partition, and this produces the undesired
> behavior of emitting the same key/value pair 6 times, but I want it only
> once.
>
> I tried to use a single partition for the input topic and in this
> scenario I got the correct behavior: the punctuate task sends no
> duplicate pairs.
>
> The issue is that I don't want to use an input topic with a single
> partition because that topic collects data from a large number of
> producers.
>
>
> Any better explanations?
>
> Any comments or advice?
>
>
> Thanks a lot in advance,
>
> Gioacchino
>
>
>
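The point about key-based partitioning can be illustrated with a small
sketch. It is plain Java with no Kafka dependency and only models the
behavior described in this thread: one in-memory map stands in for each
per-partition state store. The class and method names are hypothetical,
and String.hashCode() is an assumed stand-in for the hash a real
key-based partitioner would apply to the serialized record key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only: plain Java, not the Kafka Streams API.
class KeyPartitioningSketch {

    // Hypothetical stand-in for a key-based partitioner. String.hashCode()
    // is an assumption made to keep the sketch self-contained.
    static int partitionForKey(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // One map per partition models one state store shard per stream task.
    static List<Map<String, String>> newStores(int numPartitions) {
        List<Map<String, String>> stores = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            stores.add(new HashMap<>());
        }
        return stores;
    }

    // Counts the shards that contain the key. A punctuation that forwards
    // every shard's contents would emit the key this many times.
    static int shardsHoldingKey(List<Map<String, String>> stores, String key) {
        int count = 0;
        for (Map<String, String> store : stores) {
            if (store.containsKey(key)) {
                count++;
            }
        }
        return count;
    }

    // Records are spread over partitions without regard to the key,
    // as happens when the key lives only inside the record value.
    static int simulateRoundRobin(String key, int numRecords, int numPartitions) {
        List<Map<String, String>> stores = newStores(numPartitions);
        for (int i = 0; i < numRecords; i++) {
            stores.get(i % numPartitions).put(key, "v" + i);
        }
        return shardsHoldingKey(stores, key);
    }

    // Records are routed by key, so every update lands in the same shard.
    static int simulateKeyed(String key, int numRecords, int numPartitions) {
        List<Map<String, String>> stores = newStores(numPartitions);
        for (int i = 0; i < numRecords; i++) {
            stores.get(partitionForKey(key, numPartitions)).put(key, "v" + i);
        }
        return shardsHoldingKey(stores, key);
    }

    public static void main(String[] args) {
        // 12 updates for one key spread over 6 partitions: 6 shards hold it.
        System.out.println("round-robin copies: " + simulateRoundRobin("sensor-1", 12, 6));
        // The same updates routed by key: exactly 1 shard holds it.
        System.out.println("keyed copies: " + simulateKeyed("sensor-1", 12, 6));
    }
}
```

In a real deployment the equivalent step would be to have producers set
the record key (or to repartition the stream by that key before the
stateful processor), so that the 6 partitions can stay in place.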
Running stateful Kafka Stream Application on topic with multiple
partitions
Posted by Gioacchino Vino <gi...@gmail.com>.
Hi experts,
I don't understand a Kafka behavior and I'm here to ask for an explanation.
My processing task is pretty simple and it's quite similar to a
change-log one.
The record value contains a key/value pair: if the new value differs
from the stored one, it is forwarded to the output topic and the state
store is updated; otherwise nothing happens. There is also a punctuate
task that forwards all stored data to the output topic periodically
(every 30 seconds).
The input topic has 6 partitions.
The observed behavior is that the punctuate task sends the same
key/value pair 6 times. I figured out that there are 6 state store
instances, one for each topic partition, and this produces the undesired
behavior of emitting the same key/value pair 6 times, but I want it only
once.
I tried to use a single partition for the input topic and in this
scenario I got the correct behavior: the punctuate task sends no
duplicate pairs.
The issue is that I don't want to use an input topic with a single
partition because that topic collects data from a large number of
producers.
Any better explanations?
Any comments or advice?
Thanks a lot in advance,
Gioacchino
Re: Running Kafka Stream Application in YARN
Posted by Ryanne Dolan <ry...@gmail.com>.
> Why that? Just because there is explicit documentation?
Just that they target YARN.
Ryanne
On Thu, Nov 14, 2019, 1:59 AM Matthias J. Sax <ma...@confluent.io> wrote:
> Why that? Just because there is explicit documentation?
>
>
> @Debraj: Kafka Streams can be deployed as a regular Java application.
> Hence, any tutorial on how to run a Java application on YARN should help.
>
>
> -Matthias
>
> On 11/11/19 10:33 AM, Ryanne Dolan wrote:
> > Consider using Flink, Spark, or Samza instead.
> >
> > Ryanne
> >
> > On Fri, Nov 8, 2019, 4:27 AM Debraj Manna <su...@gmail.com>
> wrote:
> >
> >> Hi
> >>
> >> Is there any documentation or link I can refer to for the steps for
> >> deploying the Kafka Streams application in YARN?
> >>
> >> Kafka Client - 0.11.0.3
> >> Kafka Broker - 2.2.1
> >> YARN - 2.6.0
> >>
> >
>
>
Re: Running Kafka Stream Application in YARN
Posted by "Matthias J. Sax" <ma...@confluent.io>.
Why that? Just because there is explicit documentation?
@Debraj: Kafka Streams can be deployed as a regular Java application.
Hence, any tutorial on how to run a Java application on YARN should help.
-Matthias
On 11/11/19 10:33 AM, Ryanne Dolan wrote:
> Consider using Flink, Spark, or Samza instead.
>
> Ryanne
>
> On Fri, Nov 8, 2019, 4:27 AM Debraj Manna <su...@gmail.com> wrote:
>
>> Hi
>>
>> Is there any documentation or link I can refer to for the steps for
>> deploying the Kafka Streams application in YARN?
>>
>> Kafka Client - 0.11.0.3
>> Kafka Broker - 2.2.1
>> YARN - 2.6.0
>>
>
Re: Running Kafka Stream Application in YARN
Posted by Ryanne Dolan <ry...@gmail.com>.
Consider using Flink, Spark, or Samza instead.
Ryanne
On Fri, Nov 8, 2019, 4:27 AM Debraj Manna <su...@gmail.com> wrote:
> Hi
>
> Is there any documentation or link I can refer to for the steps for
> deploying the Kafka Streams application in YARN?
>
> Kafka Client - 0.11.0.3
> Kafka Broker - 2.2.1
> YARN - 2.6.0
>
Re: Running Kafka Stream Application in YARN
Posted by "Matthias J. Sax" <ma...@confluent.io>.
I would check out the YARN docs on how to run a Java application. There
should not be a difference. (Of course, Kafka Streams might be stateful
though).
-Matthias
On 11/9/19 3:26 AM, Debraj Manna wrote:
> Does anyone have an update on this?
>
> On Fri, 8 Nov 2019, 15:56 Debraj Manna, <su...@gmail.com> wrote:
>
>> Hi
>>
>> Is there any documentation or link I can refer to for the steps for
>> deploying the Kafka Streams application in YARN?
>>
>> Kafka Client - 0.11.0.3
>> Kafka Broker - 2.2.1
>> YARN - 2.6.0
>>
>
Re: Running Kafka Stream Application in YARN
Posted by Debraj Manna <su...@gmail.com>.
Does anyone have an update on this?
On Fri, 8 Nov 2019, 15:56 Debraj Manna, <su...@gmail.com> wrote:
> Hi
>
> Is there any documentation or link I can refer to for the steps for
> deploying the Kafka Streams application in YARN?
>
> Kafka Client - 0.11.0.3
> Kafka Broker - 2.2.1
> YARN - 2.6.0
>