Posted to users@kafka.apache.org by Paul van der Linden <pa...@sportr.co.uk> on 2017/06/13 15:21:49 UTC

kafka in unrecoverable state

Hi,

I'm trying to find out how to at least get my Kafka working again.
Something went wrong and Kafka's throughput has dropped to 0 messages. It
keeps looping on stabilizing consumer groups and failing on append
operations to the offsets partitions with a "not enough replicas" error.

The weird thing is that, after not being able to work this out, I went
pretty brutal (luckily I can afford to lose more messages):
- deleted all Kafka and ZooKeeper instances
- updated Kafka
- cleared all disks

Still Kafka is stuck in this unrecoverable error state. Does anyone have any
idea how to fix this?

Re: kafka in unrecoverable state

Posted by SenthilKumar K <se...@gmail.com>.

I observed the error below on one of the brokers, and it is unresponsive:

[2018-04-07 12:51:39,830] ERROR [Replica Manager on Broker 3]: Error
processing append operation on partition __consumer_offsets-27
(kafka.server.ReplicaManager)

org.apache.kafka.common.errors.NotEnoughReplicasException: Number of insync
replicas for partition __consumer_offsets-27 is [1], below required minimum
[2]


For the first 24 hours the cluster works well under ~60K messages/sec of
inbound and outbound load. After a day, the broker becomes unresponsive and
the Group Coordinator starts throwing "Offset commit failed with a retriable
exception. You should retry committing offsets. The underlying error was: The
coordinator is not available".


./bin/kafka-topics.sh --zookeeper localhost:2181/kafka --describe --topic __consumer_offsets

Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:3
Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer

Topic: __consumer_offsets Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3

Topic: __consumer_offsets Partition: 1 Leader: 1 Replicas: 1,2,3 Isr: 1

Initially the __consumer_offsets topic replicas were in sync.


Kafka Version : 0.11.0.
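
For anyone checking the same thing: a minimal sketch, in Python, of how the
describe output above could be scanned for partitions whose ISR has shrunk
below the required minimum. The regular expression assumes the line format
quoted in this message (other Kafka versions may print it differently), and
the names parse_describe and below_min_isr are made up for illustration. The
minimum of 2 is taken from the exception text above.

```python
import re
from typing import List, NamedTuple


class PartitionState(NamedTuple):
    topic: str
    partition: int
    leader: int
    replicas: List[int]
    isr: List[int]


# Matches the per-partition lines of `kafka-topics.sh --describe` in the
# format quoted above. The summary line (PartitionCount:...) does not match.
LINE_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def parse_describe(output: str) -> List[PartitionState]:
    """Parse per-partition lines; lines that do not match are skipped."""
    states = []
    for line in output.splitlines():
        m = LINE_RE.search(line)
        if m is None:
            continue
        states.append(PartitionState(
            topic=m["topic"],
            partition=int(m["partition"]),
            leader=int(m["leader"]),
            replicas=[int(x) for x in m["replicas"].split(",")],
            isr=[int(x) for x in m["isr"].split(",") if x],
        ))
    return states


def below_min_isr(states: List[PartitionState], min_insync_replicas: int) -> List[PartitionState]:
    """Return partitions whose ISR count is below the required minimum."""
    return [s for s in states if len(s.isr) < min_insync_replicas]


# The lines quoted in this message, used as sample input.
SAMPLE = """\
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:3 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
Topic: __consumer_offsets Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3
Topic: __consumer_offsets Partition: 1 Leader: 1 Replicas: 1,2,3 Isr: 1
"""

states = parse_describe(SAMPLE)
for s in below_min_isr(states, min_insync_replicas=2):
    print(f"{s.topic}-{s.partition}: isr={s.isr} replicas={s.replicas}")
```

With the two partition lines quoted above and a minimum of 2, both partitions
are reported, which is consistent with the NotEnoughReplicasException in the
broker log.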



On Fri, Aug 25, 2017 at 6:04 PM, Murad Mamedov <ma...@muradm.net> wrote:

> The first time it occurred, all replicas were in sync.
> But after restarting the clients and brokers, the exception started to occur
> immediately, and replicas kept falling out of sync.
> As explained in the issue, the bug is related to configuration and the timing
> of records.

Re: kafka in unrecoverable state

Posted by Murad Mamedov <ma...@muradm.net>.

The first time it occurred, all replicas were in sync.
But after restarting the clients and brokers, the exception started to occur
immediately, and replicas kept falling out of sync.
As explained in the issue, the bug is related to configuration and the timing
of records.

On Fri, Aug 25, 2017 at 10:31 AM, Dan Markhasin <mi...@gmail.com> wrote:

> If you run kafka-topics.sh --describe --topic __consumer_offsets, does it
> show that all replicas are in sync?



-- 
Regards,
*Murad M*
*M (tr): +90 (533) 4874329*
*E: mail@muradm.net <ma...@muradm.net>*

Re: kafka in unrecoverable state

Posted by Dan Markhasin <mi...@gmail.com>.

If you run kafka-topics.sh --describe --topic __consumer_offsets, does it
show that all replicas are in sync?

On 23 August 2017 at 23:11, Murad Mamedov <ma...@muradm.net> wrote:

> Hi David,
>
> Thanks for the reply. However, I don't have a problem with the number of
> replicas. I have 3 brokers, and topics are configured accordingly, especially
> __consumer_offsets:
>
> Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:3
> Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
>
> And everything was working fine for months, until today.
>
> Why would I want to change the replication factor? To what value?

Re: kafka in unrecoverable state

Posted by Murad Mamedov <ma...@muradm.net>.

Hi David,

Thanks for the reply. However, I don't have a problem with the number of
replicas. I have 3 brokers, and topics are configured accordingly, especially
__consumer_offsets:

Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:3
Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer

And everything was working fine for months, until today.

Why would I want to change the replication factor? To what value?

On Wed, Aug 23, 2017 at 11:19 PM, David Frederick <david.frederick@gmail.com> wrote:

> > NotEnoughReplicasException: Number of insync replicas for partition
> > __consumer_offsets-17 is [1], below required minimum [2]
>
> Please refer to
> https://stackoverflow.com/questions/37960767/how-to-change-the-replicas-of-kafka-topic.
> Hope it helps!




Re: kafka in unrecoverable state

Posted by David Frederick <da...@gmail.com>.

> NotEnoughReplicasException: Number of insync replicas for partition
> __consumer_offsets-17 is [1], below required minimum [2]

Please refer to
https://stackoverflow.com/questions/37960767/how-to-change-the-replicas-of-kafka-topic.
Hope it helps!


On Aug 23, 2017 5:17 AM, "Murad Mamedov" <ma...@muradm.net> wrote:

> Hi,
>
> Did you manage to find the root cause of this issue?
>
> Same thing happened here.
>
> Thanks in advance

Re: kafka in unrecoverable state

Posted by Murad Mamedov <ma...@muradm.net>.

Hi,

Did you manage to find the root cause of this issue?

Same thing happened here.

Thanks in advance

On Tue, Jun 13, 2017 at 7:50 PM, Paul van der Linden <pa...@sportr.co.uk>
wrote:

> I managed to solve it by:
> - stopping and deleting all data on kafka & zookeeper
> - stopping all consumers and producers
> - starting kafka & zookeeper, waiting till they are up
> - start all consumers & producers,
>
> Is there a better way to do this, without data loss and halting everything?




Re: kafka in unrecoverable state

Posted by Paul van der Linden <pa...@sportr.co.uk>.

I managed to solve it by:
- stopping and deleting all data on kafka & zookeeper
- stopping all consumers and producers
- starting kafka & zookeeper, waiting till they are up
- starting all consumers & producers

Is there a better way to do this, without data loss and without halting everything?
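
The "waiting till they are up" step could be scripted roughly as below. This
is only a sketch: it treats a successful TCP connection as "up", which is a
weaker check than the broker having registered and the partitions having
leaders. wait_for_port is a hypothetical helper, and any host and port passed
to it (for example localhost:2181 for ZooKeeper or localhost:9092 for Kafka)
are assumptions about the deployment, not something stated in this thread.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 60.0,
                  interval_s: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds.

    Retries every `interval_s` seconds and returns False if no connection
    succeeded before `timeout_s` seconds have passed.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # Open and immediately close a connection; success means
            # something is listening on that port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

A restart script might call wait_for_port for ZooKeeper first, then for each
broker, and only start consumers and producers once every call has returned
True.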

On Tue, Jun 13, 2017 at 4:28 PM, Paul van der Linden <pa...@sportr.co.uk>
wrote:

> A few lines of the logs:
>
> [2017-06-13 15:25:37,343] INFO [GroupCoordinator 0]: Stabilized group
> summarizer generation 701 (kafka.coordinator.GroupCoordinator)
> [2017-06-13 15:25:37,345] INFO [GroupCoordinator 0]: Assignment received
> from leader for group summarizer for generation 701
> (kafka.coordinator.GroupCoordinator)
> [2017-06-13 15:25:37,345] ERROR [Replica Manager on Broker 0]: Error
> processing append operation on partition __consumer_offsets-17
> (kafka.server.ReplicaManager)
> org.apache.kafka.common.errors.NotEnoughReplicasException: Number of
> insync replicas for partition __consumer_offsets-17 is [1], below required
> minimum [2]
> [2017-06-13 15:25:37,345] INFO [GroupCoordinator 0]: Preparing to
> restabilize group summarizer with old generation 701
> (kafka.coordinator.GroupCoordinator)
>
> This keeps happening, for all consumer offsets and all groups, etc

Re: kafka in unrecoverable state

Posted by Paul van der Linden <pa...@sportr.co.uk>.

A few lines of the logs:

[2017-06-13 15:25:37,343] INFO [GroupCoordinator 0]: Stabilized group
summarizer generation 701 (kafka.coordinator.GroupCoordinator)
[2017-06-13 15:25:37,345] INFO [GroupCoordinator 0]: Assignment received
from leader for group summarizer for generation 701
(kafka.coordinator.GroupCoordinator)
[2017-06-13 15:25:37,345] ERROR [Replica Manager on Broker 0]: Error
processing append operation on partition __consumer_offsets-17
(kafka.server.ReplicaManager)
org.apache.kafka.common.errors.NotEnoughReplicasException: Number of insync
replicas for partition __consumer_offsets-17 is [1], below required minimum
[2]
[2017-06-13 15:25:37,345] INFO [GroupCoordinator 0]: Preparing to
restabilize group summarizer with old generation 701
(kafka.coordinator.GroupCoordinator)

This keeps happening for all __consumer_offsets partitions and all groups.
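
To see how widespread the rejections are, the log can be tallied per
partition. A rough Python sketch, assuming the broker log text is available
as a string; count_rejected_appends is an illustrative name, not a Kafka
tool. Whitespace is collapsed first, so it also copes with line-wrapped
excerpts like the one above.

```python
import re
from collections import Counter

# Matches the exception message as it appears in the log excerpt above,
# after all runs of whitespace have been collapsed to single spaces.
NOT_ENOUGH_RE = re.compile(
    r"NotEnoughReplicasException: Number of insync replicas for partition "
    r"(?P<partition>\S+) is \[(?P<isr>\d+)\], below required minimum "
    r"\[(?P<min>\d+)\]"
)


def count_rejected_appends(log_text: str) -> Counter:
    """Count NotEnoughReplicasException occurrences.

    Keys are (partition, in-sync replica count, required minimum).
    """
    normalized = " ".join(log_text.split())  # undo mail/terminal line wrapping
    counts = Counter()
    for m in NOT_ENOUGH_RE.finditer(normalized):
        counts[(m["partition"], int(m["isr"]), int(m["min"]))] += 1
    return counts


# The ERROR entry from the excerpt above, wrapped exactly as posted.
SAMPLE = """\
[2017-06-13 15:25:37,345] ERROR [Replica Manager on Broker 0]: Error
processing append operation on partition __consumer_offsets-17
(kafka.server.ReplicaManager)
org.apache.kafka.common.errors.NotEnoughReplicasException: Number of insync
replicas for partition __consumer_offsets-17 is [1], below required minimum
[2]
"""

counts = count_rejected_appends(SAMPLE)
for (partition, isr, minimum), n in sorted(counts.items()):
    print(f"{partition}: {n} rejected append(s), isr={isr}, required={minimum}")
```

Run over a full broker log, the keys show which offsets partitions are
affected and whether the in-sync count is the same everywhere.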

On Tue, Jun 13, 2017 at 4:21 PM, Paul van der Linden <pa...@sportr.co.uk>
wrote:

> Hi,
>
> I'm trying to find out how to at least get my Kafka working again.
> Something went wrong and Kafka's throughput has dropped to 0 messages. It
> keeps looping on stabilizing consumer groups and failing on append
> operations to the offsets partitions with a "not enough replicas" error.
>
> The weird thing is that, after not being able to work this out, I went
> pretty brutal (luckily I can afford to lose more messages):
> - deleted all Kafka and ZooKeeper instances
> - updated Kafka
> - cleared all disks
>
> Still Kafka is stuck in this unrecoverable error state. Does anyone have any
> idea how to fix this?
>