Posted to users@kafka.apache.org by Qi Xu <sh...@gmail.com> on 2015/11/23 23:26:21 UTC

All brokers are running but some partitions' leader is -1

Hi folks,
We have a 10-node cluster with several topics. Each topic has about 256
partitions with a replication factor of 3. Now we have run into an issue where,
in some topics, a few partitions (< 10) have a leader of -1 and each of them has
only one in-sync replica.

From the Kafka manager, here's the snapshot:
[image: Inline image 2]

[image: Inline image 1]

here's the state log:
[2015-11-23 21:57:58,598] ERROR Controller 1 epoch 435499 initiated state
change for partition [userlogs,84] from OnlinePartition to OnlinePartition
failed (state.change.logger)
kafka.common.StateChangeFailedException: encountered error while electing
leader for partition [userlogs,84] due to: Preferred replica 0 for
partition [userlogs,84] is either not alive or not in the isr. Current
leader and ISR: [{"leader":-1,"leader_epoch":203,"isr":[1]}].
Caused by: kafka.common.StateChangeFailedException: Preferred replica 0 for
partition [userlogs,84] is either not alive or not in the isr. Current
leader and ISR: [{"leader":-1,"leader_epoch":203,"isr":[1]}]
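
For reference, the same leader/ISR state can also be read straight from
ZooKeeper; a minimal check (the ZooKeeper address is illustrative, assuming no
chroot) would be:

bin/zookeeper-shell.sh zk1:2181 get /brokers/topics/userlogs/partitions/84/state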

My questions are:
1) How could this happen, and how can I fix it or work around it?
2) Are 256 partitions too many? We have about 200+ cores for the Spark
Streaming job.

Thanks,
Qi

Re: All brokers are running but some partitions' leader is -1

Posted by Qi Xu <sh...@gmail.com>.
Great to know that. Thanks Gwen!

On Wed, Nov 25, 2015 at 12:03 PM, Gwen Shapira <gw...@confluent.io> wrote:

> 1. Yes, you can do a rolling upgrade of brokers from 0.8.2 to 0.9.0. The
> important thing is to upgrade the brokers before you upgrade any of the
> clients.
>
> 2. I'm not aware of issues with 0.9.0 and SparkStreaming. However,
> definitely do your own testing to make sure.
>
> On Wed, Nov 25, 2015 at 11:25 AM, Qi Xu <sh...@gmail.com> wrote:
>
> > Hi Gwen,
> > Yes, we're going to upgrade the 0.9.0 version. Regarding the upgrade, we
> > definitely don't want to have down time of our cluster.
> > So the upgrade will be machine by machine. Will the release 0.9.0 work
> with
> > the Aug's version together in the same Kafka cluster?
> > Also we currently run spark streaming job (with scala 2.10) against the
> > cluster. Any known issues of 0.9.0 are you aware of under this scenario?
> >
> > Thanks,
> > Tony
> >
> >
> > On Mon, Nov 23, 2015 at 5:41 PM, Gwen Shapira <gw...@confluent.io> wrote:
> >
> > > We fixed many many bugs since August. Since we are about to release
> 0.9.0
> > > (with SSL!), maybe wait a day and go with a released and tested
> version.
> > >
> > > On Mon, Nov 23, 2015 at 3:01 PM, Qi Xu <sh...@gmail.com> wrote:
> > >
> > > > Forgot to mention is that the Kafka version we're using is from Aug's
> > > > Trunk branch---which has the SSL support.
> > > >
> > > > Thanks again,
> > > > Qi
> > > >
> > > >
> > > > On Mon, Nov 23, 2015 at 2:29 PM, Qi Xu <sh...@gmail.com> wrote:
> > > >
> > > >> Loop another guy from our team.
> > > >>
> > > >> On Mon, Nov 23, 2015 at 2:26 PM, Qi Xu <sh...@gmail.com> wrote:
> > > >>
> > > >>> Hi folks,
> > > >>> We have a 10 node cluster and have several topics. Each topic has
> > about
> > > >>> 256 partitions with 3 replica factor. Now we run into an issue that
> > in
> > > some
> > > >>> topic, a few partition (< 10)'s leader is -1 and all of them has
> only
> > > one
> > > >>> synced partition.
> > > >>>
> > > >>> From the Kafka manager, here's the snapshot:
> > > >>> [image: Inline image 2]
> > > >>>
> > > >>> [image: Inline image 1]
> > > >>>
> > > >>> here's the state log:
> > > >>> [2015-11-23 21:57:58,598] ERROR Controller 1 epoch 435499 initiated
> > > >>> state change for partition [userlogs,84] from OnlinePartition to
> > > >>> OnlinePartition failed (state.change.logger)
> > > >>> kafka.common.StateChangeFailedException: encountered error while
> > > >>> electing leader for partition [userlogs,84] due to: Preferred
> replica
> > > 0 for
> > > >>> partition [userlogs,84] is either not alive or not in the isr.
> > Current
> > > >>> leader and ISR: [{"leader":-1,"leader_epoch":203,"isr":[1]}].
> > > >>> Caused by: kafka.common.StateChangeFailedException: Preferred
> > replica 0
> > > >>> for partition [userlogs,84] is either not alive or not in the isr.
> > > Current
> > > >>> leader and ISR: [{"leader":-1,"leader_epoch":203,"isr":[1]}]
> > > >>>
> > > >>> My question is:
> > > >>> 1) how could this happen and how can I fix it or work around it?
> > > >>> 2) Is 256 partitions too big? We have about 200+ cores for spark
> > > >>> streaming job.
> > > >>>
> > > >>> Thanks,
> > > >>> Qi
> > > >>>
> > > >>>
> > > >>
> > > >
> > >
> >
>

Re: All brokers are running but some partitions' leader is -1

Posted by Gwen Shapira <gw...@confluent.io>.
1. Yes, you can do a rolling upgrade of brokers from 0.8.2 to 0.9.0. The
important thing is to upgrade the brokers before you upgrade any of the
clients.

2. I'm not aware of issues with 0.9.0 and Spark Streaming. However,
definitely do your own testing to make sure.
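
For what it's worth, the rolling upgrade described in the 0.9.0 upgrade notes
looks roughly like the sketch below (a sketch only; adapt file locations and
broker ordering to your deployment):

# server.properties, before swapping in the 0.9.0 binaries on a broker:
inter.broker.protocol.version=0.8.2.X

# Then, one broker at a time: stop it, deploy 0.9.0, restart it, and wait
# for under-replicated partitions to clear before moving to the next broker.

# Once every broker is running 0.9.0, bump the protocol version and do one
# more rolling restart:
inter.broker.protocol.version=0.9.0.0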

On Wed, Nov 25, 2015 at 11:25 AM, Qi Xu <sh...@gmail.com> wrote:

> Hi Gwen,
> Yes, we're going to upgrade the 0.9.0 version. Regarding the upgrade, we
> definitely don't want to have down time of our cluster.
> So the upgrade will be machine by machine. Will the release 0.9.0 work with
> the Aug's version together in the same Kafka cluster?
> Also we currently run spark streaming job (with scala 2.10) against the
> cluster. Any known issues of 0.9.0 are you aware of under this scenario?
>
> Thanks,
> Tony
>
>
> On Mon, Nov 23, 2015 at 5:41 PM, Gwen Shapira <gw...@confluent.io> wrote:
>
> > We fixed many many bugs since August. Since we are about to release 0.9.0
> > (with SSL!), maybe wait a day and go with a released and tested version.
> >
> > On Mon, Nov 23, 2015 at 3:01 PM, Qi Xu <sh...@gmail.com> wrote:
> >
> > > Forgot to mention is that the Kafka version we're using is from Aug's
> > > Trunk branch---which has the SSL support.
> > >
> > > Thanks again,
> > > Qi
> > >
> > >
> > > On Mon, Nov 23, 2015 at 2:29 PM, Qi Xu <sh...@gmail.com> wrote:
> > >
> > >> Loop another guy from our team.
> > >>
> > >> On Mon, Nov 23, 2015 at 2:26 PM, Qi Xu <sh...@gmail.com> wrote:
> > >>
> > >>> Hi folks,
> > >>> We have a 10 node cluster and have several topics. Each topic has
> about
> > >>> 256 partitions with 3 replica factor. Now we run into an issue that
> in
> > some
> > >>> topic, a few partition (< 10)'s leader is -1 and all of them has only
> > one
> > >>> synced partition.
> > >>>
> > >>> From the Kafka manager, here's the snapshot:
> > >>> [image: Inline image 2]
> > >>>
> > >>> [image: Inline image 1]
> > >>>
> > >>> here's the state log:
> > >>> [2015-11-23 21:57:58,598] ERROR Controller 1 epoch 435499 initiated
> > >>> state change for partition [userlogs,84] from OnlinePartition to
> > >>> OnlinePartition failed (state.change.logger)
> > >>> kafka.common.StateChangeFailedException: encountered error while
> > >>> electing leader for partition [userlogs,84] due to: Preferred replica
> > 0 for
> > >>> partition [userlogs,84] is either not alive or not in the isr.
> Current
> > >>> leader and ISR: [{"leader":-1,"leader_epoch":203,"isr":[1]}].
> > >>> Caused by: kafka.common.StateChangeFailedException: Preferred
> replica 0
> > >>> for partition [userlogs,84] is either not alive or not in the isr.
> > Current
> > >>> leader and ISR: [{"leader":-1,"leader_epoch":203,"isr":[1]}]
> > >>>
> > >>> My question is:
> > >>> 1) how could this happen and how can I fix it or work around it?
> > >>> 2) Is 256 partitions too big? We have about 200+ cores for spark
> > >>> streaming job.
> > >>>
> > >>> Thanks,
> > >>> Qi
> > >>>
> > >>>
> > >>
> > >
> >
>

Re: All brokers are running but some partitions' leader is -1

Posted by Qi Xu <sh...@gmail.com>.
Hi Gwen,
Yes, we're going to upgrade to the 0.9.0 version. Regarding the upgrade, we
definitely don't want any downtime for our cluster, so the upgrade will be done
machine by machine. Will the 0.9.0 release work together with the August build
in the same Kafka cluster?
Also, we currently run a Spark Streaming job (with Scala 2.10) against the
cluster. Are there any known issues with 0.9.0 that you are aware of in this scenario?

Thanks,
Tony


On Mon, Nov 23, 2015 at 5:41 PM, Gwen Shapira <gw...@confluent.io> wrote:

> We fixed many many bugs since August. Since we are about to release 0.9.0
> (with SSL!), maybe wait a day and go with a released and tested version.
>
> On Mon, Nov 23, 2015 at 3:01 PM, Qi Xu <sh...@gmail.com> wrote:
>
> > Forgot to mention is that the Kafka version we're using is from Aug's
> > Trunk branch---which has the SSL support.
> >
> > Thanks again,
> > Qi
> >
> >
> > On Mon, Nov 23, 2015 at 2:29 PM, Qi Xu <sh...@gmail.com> wrote:
> >
> >> Loop another guy from our team.
> >>
> >> On Mon, Nov 23, 2015 at 2:26 PM, Qi Xu <sh...@gmail.com> wrote:
> >>
> >>> Hi folks,
> >>> We have a 10 node cluster and have several topics. Each topic has about
> >>> 256 partitions with 3 replica factor. Now we run into an issue that in
> some
> >>> topic, a few partition (< 10)'s leader is -1 and all of them has only
> one
> >>> synced partition.
> >>>
> >>> From the Kafka manager, here's the snapshot:
> >>> [image: Inline image 2]
> >>>
> >>> [image: Inline image 1]
> >>>
> >>> here's the state log:
> >>> [2015-11-23 21:57:58,598] ERROR Controller 1 epoch 435499 initiated
> >>> state change for partition [userlogs,84] from OnlinePartition to
> >>> OnlinePartition failed (state.change.logger)
> >>> kafka.common.StateChangeFailedException: encountered error while
> >>> electing leader for partition [userlogs,84] due to: Preferred replica
> 0 for
> >>> partition [userlogs,84] is either not alive or not in the isr. Current
> >>> leader and ISR: [{"leader":-1,"leader_epoch":203,"isr":[1]}].
> >>> Caused by: kafka.common.StateChangeFailedException: Preferred replica 0
> >>> for partition [userlogs,84] is either not alive or not in the isr.
> Current
> >>> leader and ISR: [{"leader":-1,"leader_epoch":203,"isr":[1]}]
> >>>
> >>> My question is:
> >>> 1) how could this happen and how can I fix it or work around it?
> >>> 2) Is 256 partitions too big? We have about 200+ cores for spark
> >>> streaming job.
> >>>
> >>> Thanks,
> >>> Qi
> >>>
> >>>
> >>
> >
>

Re: All brokers are running but some partitions' leader is -1

Posted by Qi Xu <sh...@gmail.com>.
Thanks a lot Prabhjot!
The issue was mitigated by running the preferred replica leader election
tool! Before that, I noticed that the cluster simply could not elect
leaders---when I created a new topic, that topic was not available for a
long time, until the preferred replica leader election finished.
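
For reference, the tool can be invoked along these lines (the ZooKeeper
address and file name below are illustrative):

bin/kafka-preferred-replica-election.sh --zookeeper zk1:2181

# or, to target only the stuck partitions, pass a JSON file:
bin/kafka-preferred-replica-election.sh --zookeeper zk1:2181 \
  --path-to-json-file stuck-partitions.json

# stuck-partitions.json:
{"partitions": [{"topic": "userlogs", "partition": 84}]}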

For the 3 steps above:
1. The replicas are evenly distributed.
2. There's some imbalance in load among the brokers, but it's not
significant. I guess some brokers may have gone down and come back up
again---we have an agent that restarts them automatically.
3. Spark runs on a separate set of machines. The Kafka servers' CPU/memory
usage is well below 50%.



On Mon, Nov 23, 2015 at 11:18 PM, Prabhjot Bharaj <pr...@gmail.com>
wrote:

> Hi,
>
> With the information provided, these are the steps I can think of (based on
> the experience I had with kafka):-
>
> 1. do a describe on the topic. See if the partitions and replicas are
> evenly distributed amongst all. If not, you might want to try the 'Reassign
> Partitions Tool' -
>
> https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-6.ReassignPartitionsTool
> 2. is/are some partition(s) getting more data than others leading to an
> imbalance of disk space amongst the nodes in the cluster, to an extent that
> the kafka server process goes down on one or more machines in the cluster ?
> 3. From what I understand, your kafka and spark machines are the same ?? !!
> how much memory usage the replica-0 has when your spark cluster is running
> full throttle ?
>
> Workaround -
> Try running the Preferred Replica Leader Election Tool -
>
> https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-1.PreferredReplicaLeaderElectionTool
> to make some replica (the one that you noticed earlier when the cluster was
> all good) as the leader for this partition
>
> Regards,
> Prabhjot
>
> On Tue, Nov 24, 2015 at 7:11 AM, Gwen Shapira <gw...@confluent.io> wrote:
>
> > We fixed many many bugs since August. Since we are about to release 0.9.0
> > (with SSL!), maybe wait a day and go with a released and tested version.
> >
> > On Mon, Nov 23, 2015 at 3:01 PM, Qi Xu <sh...@gmail.com> wrote:
> >
> > > Forgot to mention is that the Kafka version we're using is from Aug's
> > > Trunk branch---which has the SSL support.
> > >
> > > Thanks again,
> > > Qi
> > >
> > >
> > > On Mon, Nov 23, 2015 at 2:29 PM, Qi Xu <sh...@gmail.com> wrote:
> > >
> > >> Loop another guy from our team.
> > >>
> > >> On Mon, Nov 23, 2015 at 2:26 PM, Qi Xu <sh...@gmail.com> wrote:
> > >>
> > >>> Hi folks,
> > >>> We have a 10 node cluster and have several topics. Each topic has
> about
> > >>> 256 partitions with 3 replica factor. Now we run into an issue that
> in
> > some
> > >>> topic, a few partition (< 10)'s leader is -1 and all of them has only
> > one
> > >>> synced partition.
> > >>>
> > >>> From the Kafka manager, here's the snapshot:
> > >>> [image: Inline image 2]
> > >>>
> > >>> [image: Inline image 1]
> > >>>
> > >>> here's the state log:
> > >>> [2015-11-23 21:57:58,598] ERROR Controller 1 epoch 435499 initiated
> > >>> state change for partition [userlogs,84] from OnlinePartition to
> > >>> OnlinePartition failed (state.change.logger)
> > >>> kafka.common.StateChangeFailedException: encountered error while
> > >>> electing leader for partition [userlogs,84] due to: Preferred replica
> > 0 for
> > >>> partition [userlogs,84] is either not alive or not in the isr.
> Current
> > >>> leader and ISR: [{"leader":-1,"leader_epoch":203,"isr":[1]}].
> > >>> Caused by: kafka.common.StateChangeFailedException: Preferred
> replica 0
> > >>> for partition [userlogs,84] is either not alive or not in the isr.
> > Current
> > >>> leader and ISR: [{"leader":-1,"leader_epoch":203,"isr":[1]}]
> > >>>
> > >>> My question is:
> > >>> 1) how could this happen and how can I fix it or work around it?
> > >>> 2) Is 256 partitions too big? We have about 200+ cores for spark
> > >>> streaming job.
> > >>>
> > >>> Thanks,
> > >>> Qi
> > >>>
> > >>>
> > >>
> > >
> >
>
>
>
> --
> ---------------------------------------------------------
> "There are only 10 types of people in the world: Those who understand
> binary, and those who don't"
>

Re: All brokers are running but some partitions' leader is -1

Posted by Prabhjot Bharaj <pr...@gmail.com>.
Hi,

With the information provided, these are the steps I can think of (based on
my experience with Kafka):

1. Do a describe on the topic and check whether the partitions and replicas
are evenly distributed across all brokers. If not, you might want to try the
'Reassign Partitions Tool' (see the sketch after this list) -
https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-6.ReassignPartitionsTool
2. Are some partitions getting more data than others, leading to an
imbalance of disk space among the nodes in the cluster, to the extent that
the Kafka server process goes down on one or more machines in the cluster?
3. From what I understand, your Kafka and Spark machines are the same? How
much memory is replica 0 using when your Spark cluster is running at full
throttle?
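
A rough sketch of what step 1 can look like (topic name, ZooKeeper address,
and broker IDs below are illustrative; the reassignment JSON must be adapted
to your cluster):

bin/kafka-topics.sh --describe --zookeeper zk1:2181 --topic userlogs

# If replica placement looks skewed, generate a reassignment plan and then
# apply it:
bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
  --topics-to-move-json-file topics.json \
  --broker-list "0,1,2,3,4,5,6,7,8,9" --generate
bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
  --reassignment-json-file reassignment.json --execute

# topics.json:
{"topics": [{"topic": "userlogs"}], "version": 1}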

Workaround -
Try running the Preferred Replica Leader Election Tool -
https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-1.PreferredReplicaLeaderElectionTool
to make some replica (the one that you noticed earlier, when the cluster was
healthy) the leader for this partition.

Regards,
Prabhjot

On Tue, Nov 24, 2015 at 7:11 AM, Gwen Shapira <gw...@confluent.io> wrote:

> We fixed many many bugs since August. Since we are about to release 0.9.0
> (with SSL!), maybe wait a day and go with a released and tested version.
>
> On Mon, Nov 23, 2015 at 3:01 PM, Qi Xu <sh...@gmail.com> wrote:
>
> > Forgot to mention is that the Kafka version we're using is from Aug's
> > Trunk branch---which has the SSL support.
> >
> > Thanks again,
> > Qi
> >
> >
> > On Mon, Nov 23, 2015 at 2:29 PM, Qi Xu <sh...@gmail.com> wrote:
> >
> >> Loop another guy from our team.
> >>
> >> On Mon, Nov 23, 2015 at 2:26 PM, Qi Xu <sh...@gmail.com> wrote:
> >>
> >>> Hi folks,
> >>> We have a 10 node cluster and have several topics. Each topic has about
> >>> 256 partitions with 3 replica factor. Now we run into an issue that in
> some
> >>> topic, a few partition (< 10)'s leader is -1 and all of them has only
> one
> >>> synced partition.
> >>>
> >>> From the Kafka manager, here's the snapshot:
> >>> [image: Inline image 2]
> >>>
> >>> [image: Inline image 1]
> >>>
> >>> here's the state log:
> >>> [2015-11-23 21:57:58,598] ERROR Controller 1 epoch 435499 initiated
> >>> state change for partition [userlogs,84] from OnlinePartition to
> >>> OnlinePartition failed (state.change.logger)
> >>> kafka.common.StateChangeFailedException: encountered error while
> >>> electing leader for partition [userlogs,84] due to: Preferred replica
> 0 for
> >>> partition [userlogs,84] is either not alive or not in the isr. Current
> >>> leader and ISR: [{"leader":-1,"leader_epoch":203,"isr":[1]}].
> >>> Caused by: kafka.common.StateChangeFailedException: Preferred replica 0
> >>> for partition [userlogs,84] is either not alive or not in the isr.
> Current
> >>> leader and ISR: [{"leader":-1,"leader_epoch":203,"isr":[1]}]
> >>>
> >>> My question is:
> >>> 1) how could this happen and how can I fix it or work around it?
> >>> 2) Is 256 partitions too big? We have about 200+ cores for spark
> >>> streaming job.
> >>>
> >>> Thanks,
> >>> Qi
> >>>
> >>>
> >>
> >
>



-- 
---------------------------------------------------------
"There are only 10 types of people in the world: Those who understand
binary, and those who don't"

Re: All brokers are running but some partitions' leader is -1

Posted by Gwen Shapira <gw...@confluent.io>.
We fixed many many bugs since August. Since we are about to release 0.9.0
(with SSL!), maybe wait a day and go with a released and tested version.

On Mon, Nov 23, 2015 at 3:01 PM, Qi Xu <sh...@gmail.com> wrote:

> Forgot to mention is that the Kafka version we're using is from Aug's
> Trunk branch---which has the SSL support.
>
> Thanks again,
> Qi
>
>
> On Mon, Nov 23, 2015 at 2:29 PM, Qi Xu <sh...@gmail.com> wrote:
>
>> Loop another guy from our team.
>>
>> On Mon, Nov 23, 2015 at 2:26 PM, Qi Xu <sh...@gmail.com> wrote:
>>
>>> Hi folks,
>>> We have a 10 node cluster and have several topics. Each topic has about
>>> 256 partitions with 3 replica factor. Now we run into an issue that in some
>>> topic, a few partition (< 10)'s leader is -1 and all of them has only one
>>> synced partition.
>>>
>>> From the Kafka manager, here's the snapshot:
>>> [image: Inline image 2]
>>>
>>> [image: Inline image 1]
>>>
>>> here's the state log:
>>> [2015-11-23 21:57:58,598] ERROR Controller 1 epoch 435499 initiated
>>> state change for partition [userlogs,84] from OnlinePartition to
>>> OnlinePartition failed (state.change.logger)
>>> kafka.common.StateChangeFailedException: encountered error while
>>> electing leader for partition [userlogs,84] due to: Preferred replica 0 for
>>> partition [userlogs,84] is either not alive or not in the isr. Current
>>> leader and ISR: [{"leader":-1,"leader_epoch":203,"isr":[1]}].
>>> Caused by: kafka.common.StateChangeFailedException: Preferred replica 0
>>> for partition [userlogs,84] is either not alive or not in the isr. Current
>>> leader and ISR: [{"leader":-1,"leader_epoch":203,"isr":[1]}]
>>>
>>> My question is:
>>> 1) how could this happen and how can I fix it or work around it?
>>> 2) Is 256 partitions too big? We have about 200+ cores for spark
>>> streaming job.
>>>
>>> Thanks,
>>> Qi
>>>
>>>
>>
>

Re: All brokers are running but some partitions' leader is -1

Posted by Qi Xu <sh...@gmail.com>.
Forgot to mention that the Kafka version we're using is from August's trunk
branch---which has the SSL support.

Thanks again,
Qi


On Mon, Nov 23, 2015 at 2:29 PM, Qi Xu <sh...@gmail.com> wrote:

> Loop another guy from our team.
>
> On Mon, Nov 23, 2015 at 2:26 PM, Qi Xu <sh...@gmail.com> wrote:
>
>> Hi folks,
>> We have a 10 node cluster and have several topics. Each topic has about
>> 256 partitions with 3 replica factor. Now we run into an issue that in some
>> topic, a few partition (< 10)'s leader is -1 and all of them has only one
>> synced partition.
>>
>> From the Kafka manager, here's the snapshot:
>> [image: Inline image 2]
>>
>> [image: Inline image 1]
>>
>> here's the state log:
>> [2015-11-23 21:57:58,598] ERROR Controller 1 epoch 435499 initiated state
>> change for partition [userlogs,84] from OnlinePartition to OnlinePartition
>> failed (state.change.logger)
>> kafka.common.StateChangeFailedException: encountered error while electing
>> leader for partition [userlogs,84] due to: Preferred replica 0 for
>> partition [userlogs,84] is either not alive or not in the isr. Current
>> leader and ISR: [{"leader":-1,"leader_epoch":203,"isr":[1]}].
>> Caused by: kafka.common.StateChangeFailedException: Preferred replica 0
>> for partition [userlogs,84] is either not alive or not in the isr. Current
>> leader and ISR: [{"leader":-1,"leader_epoch":203,"isr":[1]}]
>>
>> My question is:
>> 1) how could this happen and how can I fix it or work around it?
>> 2) Is 256 partitions too big? We have about 200+ cores for spark
>> streaming job.
>>
>> Thanks,
>> Qi
>>
>>
>

Re: All brokers are running but some partitions' leader is -1

Posted by Qi Xu <sh...@gmail.com>.
Looping in another person from our team.

On Mon, Nov 23, 2015 at 2:26 PM, Qi Xu <sh...@gmail.com> wrote:

> Hi folks,
> We have a 10 node cluster and have several topics. Each topic has about
> 256 partitions with 3 replica factor. Now we run into an issue that in some
> topic, a few partition (< 10)'s leader is -1 and all of them has only one
> synced partition.
>
> From the Kafka manager, here's the snapshot:
> [image: Inline image 2]
>
> [image: Inline image 1]
>
> here's the state log:
> [2015-11-23 21:57:58,598] ERROR Controller 1 epoch 435499 initiated state
> change for partition [userlogs,84] from OnlinePartition to OnlinePartition
> failed (state.change.logger)
> kafka.common.StateChangeFailedException: encountered error while electing
> leader for partition [userlogs,84] due to: Preferred replica 0 for
> partition [userlogs,84] is either not alive or not in the isr. Current
> leader and ISR: [{"leader":-1,"leader_epoch":203,"isr":[1]}].
> Caused by: kafka.common.StateChangeFailedException: Preferred replica 0
> for partition [userlogs,84] is either not alive or not in the isr. Current
> leader and ISR: [{"leader":-1,"leader_epoch":203,"isr":[1]}]
>
> My question is:
> 1) how could this happen and how can I fix it or work around it?
> 2) Is 256 partitions too big? We have about 200+ cores for spark streaming
> job.
>
> Thanks,
> Qi
>
>
