Posted to users@kafka.apache.org by Raffaele Esposito <ra...@gmail.com> on 2020/05/17 12:55:25 UTC

Is the zombie instance problem consumer-related?

The Confluent blog post on transactions in Apache Kafka
<https://www.confluent.io/blog/transactions-apache-kafka/> states:

Finally, in distributed environments, applications will crash
or—worse!—temporarily lose connectivity to the rest of the system.
Typically, new instances are automatically started to replace the ones
which were deemed lost. Through this process, we may have multiple
instances processing the same input topics and writing to the same output
topics, causing duplicate outputs and violating the exactly once processing
semantics. We call this the problem of “zombie instances.”
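
As I understand that post, the fencing it describes hangs off the
producer's transactional.id: two instances started with the same id cannot
both keep writing. A minimal sketch of what I mean (broker address, topic,
and ids are illustrative, not from the post):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.errors.ProducerFencedException;

    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("key.serializer",
        "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer",
        "org.apache.kafka.common.serialization.StringSerializer");
    // Stable id per logical instance; a replacement instance reuses it.
    props.put("transactional.id", "my-processor-0");

    KafkaProducer<String, String> producer = new KafkaProducer<>(props);
    // Bumps the producer epoch on the broker, fencing any older
    // ("zombie") producer that registered the same transactional.id.
    producer.initTransactions();

    try {
        producer.beginTransaction();
        producer.send(new ProducerRecord<>("output-topic", "key", "value"));
        producer.commitTransaction();
    } catch (ProducerFencedException e) {
        // This instance is the zombie: a newer one took over. Shut down.
        producer.close();
    }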

But consider Kafka Streams, for example, where different threads, each with
its own consumer and producer, consume from the same topic and write to the
same topic.

Usually, when developing processor logic (consume, process, produce), we
may actually want replicas to read from the same topics, though from
different partitions, and to write to the same topic.
This is how Kafka allows scaling.
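
In code, the consume-process-produce loop I have in mind looks roughly
like this, reusing the transactional producer from the sketch above and
assuming a consumer with enable.auto.commit=false; the process() helper is
a hypothetical stand-in for the application logic:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.TopicPartition;

    consumer.subscribe(Collections.singletonList("input-topic"));
    producer.initTransactions();

    while (true) {
        ConsumerRecords<String, String> records =
            consumer.poll(Duration.ofMillis(100));
        if (records.isEmpty()) continue;

        producer.beginTransaction();
        Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
        for (ConsumerRecord<String, String> record : records) {
            producer.send(new ProducerRecord<>(
                "output-topic", record.key(), process(record.value())));
            // Next offset to consume is the last processed offset + 1.
            offsets.put(new TopicPartition(record.topic(), record.partition()),
                        new OffsetAndMetadata(record.offset() + 1));
        }
        // Offsets are committed through the producer, inside the
        // transaction, so output and consumed position commit atomically.
        producer.sendOffsetsToTransaction(offsets, "my-group");
        producer.commitTransaction();
    }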

So, in my understanding, the zombie problem appears when consumers
belonging to the same group end up, because of some previous failure
(e.g. a network failure), consuming from the same partition.
Am I missing something?

Re: Is the zombie instance problem consumer-related?

Posted by "Matthias J. Sax" <mj...@apache.org>.
That is correct, but you don't need to worry about it. Kafka Streams
takes care of this problem for you.
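
For reference, in Kafka Streams you only switch the guarantee on and the
fencing happens internally; a minimal config sketch (application id and
broker address are illustrative):

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    // Enables transactions and transactional.id management (and thus
    // zombie fencing) for every producer Streams creates internally:
    props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG,
              StreamsConfig.EXACTLY_ONCE);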

-Matthias

On 5/17/20 5:55 AM, Raffaele Esposito wrote:
> [...]