
Posted to users@kafka.apache.org by Yashodhan Kocharekar <yk...@tibco.com> on 2016/04/09 06:03:54 UTC

Kafka safe Rolling restart

Hi, I am trying to write a script for a safe rolling restart of a
kafka_2.9.2-0.8.1.1 cluster. The high-level workflow is:

for each broker do
   1. move partition replica leadership from current_broker to others
   2. broker restart
   3. restore leadership to the broker

I have found a script to do 1:
https://gist.github.com/miguno/87d5b2411e3f93e80866
but I am not sure how to do 3.

Re: Kafka safe Rolling restart

Posted by Robert Christ <rc...@tivo.com>.

Hi Yashodhan,

I do this quite frequently and if I understand your
question correctly, it is the default behavior.

If you issue a normal TERM signal to the kafka process
(or call kafka-server-stop.sh) it will start controlled
shutdown which will migrate leadership for all the partitions
it is currently leading to other brokers in the ISR.  This
will not happen if there are no other brokers in the ISR, so
you probably don't want to start this until you have no
under-replicated partitions.  You can check with:

kafka-topics.sh --zookeeper xxx --describe --under-replicated-partitions
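
For a restart script, that check can be wrapped in a small gate. The following is only a sketch: it assumes kafka-topics.sh is on the PATH, that each under-replicated partition is printed as a line containing "Partition:", and the function names are made up for illustration.

```python
import subprocess


def under_replicated_lines(describe_output):
    """Return the partition lines found in the command's output.

    Assumption: every under-replicated partition is reported on its
    own line and that line contains the text "Partition:".
    """
    return [
        line.strip()
        for line in describe_output.splitlines()
        if "Partition:" in line
    ]


def cluster_is_fully_replicated(zookeeper, kafka_topics="kafka-topics.sh"):
    """Run the check from the message above; True means nothing was reported."""
    result = subprocess.run(
        [kafka_topics, "--zookeeper", zookeeper,
         "--describe", "--under-replicated-partitions"],
        capture_output=True, text=True, check=True,
    )
    return not under_replicated_lines(result.stdout)
```

A script would call cluster_is_fully_replicated("xxx") before each broker stop and refuse to continue while it returns False.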

After controlled shutdown completes the process should exit.
I have seen some cases where the broker appears to complete
shutdown and has moved leadership for all of its partitions
but it does not exit the process.  If this happens I have
just issued a hard KILL to the process.  It is possible that
I am just impatient and the process will eventually exit.
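
One way to script the "TERM first, KILL only if it hangs" behaviour is sketched below. The grace period, the polling interval and all function names are my own assumptions, not anything Kafka prescribes, and the liveness probe assumes a POSIX system.

```python
import os
import signal
import time


def pid_alive(pid):
    """Probe a process with signal 0 (POSIX); nothing is actually sent."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stop_broker(pid, grace_seconds=300.0, poll_seconds=1.0,
                kill=os.kill, alive=pid_alive, sleep=time.sleep):
    """Send TERM, wait up to grace_seconds, then fall back to KILL.

    Returns "clean" if the process exited on its own, "killed" otherwise.
    The kill/alive/sleep parameters exist only so the logic can be
    exercised without a real broker.
    """
    kill(pid, signal.SIGTERM)  # starts controlled shutdown
    waited = 0.0
    while waited < grace_seconds:
        if not alive(pid):
            return "clean"
        sleep(poll_seconds)
        waited += poll_seconds
    if not alive(pid):
        return "clean"
    kill(pid, signal.SIGKILL)  # last resort, as described above
    return "killed"
```

A generous grace period is probably the safer default, since it is unclear whether the hung broker would have exited eventually.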

When you restart the broker, it will catch up on all
the partitions for which it is in the replica list.
Hopefully it will quickly enter the ISR as well.
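
To see whether the restarted broker has rejoined the ISR everywhere, a script could parse the output of "kafka-topics.sh --describe". This is a rough sketch that assumes partition lines carry "Replicas: 1,2" and "Isr: 1,2" style fields; the function name is hypothetical.

```python
import re

# Assumed field layout of a partition line, e.g.
#   Topic: t  Partition: 0  Leader: 1  Replicas: 1,2  Isr: 1
_REPLICAS = re.compile(r"Replicas:\s*([\d,]+)")
_ISR = re.compile(r"Isr:\s*([\d,]*)")


def _ids(csv):
    """Turn "1,2,3" into {1, 2, 3}; an empty string gives an empty set."""
    return {int(x) for x in csv.split(",") if x}


def partitions_missing_from_isr(broker_id, describe_output):
    """List partition lines where broker_id is a replica but not in the ISR."""
    missing = []
    for line in describe_output.splitlines():
        replicas = _REPLICAS.search(line)
        isr = _ISR.search(line)
        if not replicas or not isr:
            continue  # topic summary lines and blanks have neither field
        if (broker_id in _ids(replicas.group(1))
                and broker_id not in _ids(isr.group(1))):
            missing.append(line.strip())
    return missing
```

An empty list for the restarted broker's id would mean it has caught up on every partition it replicates, so the script can move on to the next broker.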

Next, by default kafka has auto leader rebalancing enabled.
It is controlled by this parameter:

auto.leader.rebalance.enable  (default: true)

These two parameters also control the rebalancing:

leader.imbalance.check.interval.seconds   (default: 300)
leader.imbalance.per.broker.percentage     (default: 10)

So, on average, a rebalancing check should occur about 150 seconds
after your broker has returned (half of the 300 second check
interval).  This will move leadership back to your broker for the
partitions where it is the preferred leader, which just means it
shows up first in the replica list, provided it is also in the ISR.
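
The reasoning above can be written down as a few small functions. Treat this as a sketch: the names are hypothetical, and the imbalance calculation reflects my understanding of how the percentage threshold is applied (the share of a broker's preferred partitions it does not currently lead), which is an assumption rather than something stated in this thread.

```python
def preferred_leader(replicas):
    """The preferred leader is the first broker in the replica list."""
    return replicas[0]


def can_regain_leadership(broker_id, replicas, isr):
    """Leadership can move back only if the broker is preferred and in the ISR."""
    return preferred_leader(replicas) == broker_id and broker_id in isr


def expected_wait_seconds(check_interval_seconds=300):
    """Average wait until the next periodic check, assuming the broker
    comes back at a uniformly random point in the interval."""
    return check_interval_seconds / 2


def imbalance_percentage(broker_id, partitions):
    """partitions is a list of (current_leader, replicas) pairs.

    Assumed definition: of the partitions that prefer broker_id, the
    percentage currently led by some other broker.
    """
    preferred = [p for p in partitions if preferred_leader(p[1]) == broker_id]
    if not preferred:
        return 0.0
    not_led = sum(1 for leader, _ in preferred if leader != broker_id)
    return 100.0 * not_led / len(preferred)
```

Under that reading, a freshly restarted broker leads none of its preferred partitions, so its imbalance is 100 percent, well above the default threshold of 10.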

There is also the script:

kafka-preferred-replica-election.sh

which will trigger the election.  I have only tried it once
but it appeared to do the job.
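
If you would rather not wait for the automatic rebalance, step 3 of the workflow could invoke that script directly. The sketch below only builds the command line; the --zookeeper and --path-to-json-file options and the JSON layout are as I recall them from the 0.8.x tool, so check the script's help output before relying on them. The function names are made up.

```python
import json
import subprocess


def election_json(partitions):
    """Build the optional partition list from (topic, partition) pairs.

    Assumed layout: {"partitions": [{"topic": ..., "partition": ...}]}
    """
    return json.dumps({
        "partitions": [{"topic": t, "partition": p} for t, p in partitions]
    })


def election_command(zookeeper, json_path=None,
                     script="kafka-preferred-replica-election.sh"):
    """Return the argv list; without json_path all partitions are covered."""
    cmd = [script, "--zookeeper", zookeeper]
    if json_path:
        cmd += ["--path-to-json-file", json_path]
    return cmd


def trigger_election(zookeeper, json_path=None):
    """Run the election and raise if the script exits non-zero."""
    subprocess.run(election_command(zookeeper, json_path), check=True)
```

Calling trigger_election("xxx") once the restarted broker is back in the ISR should return leadership without waiting for the periodic check.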

Good luck,
  rob
