Posted to users@kafka.apache.org by Raghav <ra...@gmail.com> on 2018/06/01 05:31:01 UTC

How to gracefully stop Kafka

Hi

We have a 3-broker Kafka setup on 0.10.2.1. Our company environment requires
that we first stop all three brokers, then do some operations work that takes
about 1 hour, and then bring the Kafka brokers (version 1.1) up again.

To achieve this, we do the following:

1. Run *bin/kafka-server-stop.sh* at the same time on all three brokers.
2. Do operations on our environment for about 1 hour.
3. Run *bin/kafka-server-start.sh* at the same time on all three brokers.

Upon start, we observe that leadership for a lot of partitions is messed up.
The leader shows up as -1 for many partitions, and the ISR is empty. Because
of this our Kafka cluster is unusable, and even restarting the brokers
doesn't help.
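
A minimal sketch of how the state described above (leader -1, empty ISR)
could be detected automatically. It assumes the per-partition lines printed
by "bin/kafka-topics.sh --describe" look roughly like
"Topic: orders  Partition: 0  Leader: -1  Replicas: 1,2,3  Isr:"; the exact
layout varies between Kafka versions, and the function name and sample topic
are hypothetical.

```python
import re

# Assumed shape of one per-partition line from `kafka-topics.sh --describe`.
# Older releases print "Leader: -1" for a leaderless partition; some newer
# ones print "Leader: none". Adjust the pattern to your version's output.
PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+|none)\s+Replicas:\s*(?P<replicas>[\d,]*)\s*"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def find_offline_partitions(describe_output):
    """Return (topic, partition) pairs that have no leader or an empty ISR."""
    offline = []
    for line in describe_output.splitlines():
        match = PARTITION_LINE.search(line)
        if match is None:
            continue  # topic summary lines and blank lines are skipped
        isr = [broker for broker in match.group("isr").split(",") if broker]
        if match.group("leader") in ("-1", "none") or not isr:
            offline.append((match.group("topic"), int(match.group("partition"))))
    return offline
```

Feeding it the captured output of the describe command after a restart gives
a quick list of the partitions that still need attention.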

1. Could it be because we are not doing a rolling stop?
2. What's the best way to do a rolling stop?

Please advise.
Thanks.

R

Re: How to gracefully stop Kafka

Posted by Raghav <ra...@gmail.com>.
Thanks guys, I will try this and report back on whether it worked.

-- 
Raghav

Re: How to gracefully stop Kafka

Posted by "M. Manna" <ma...@gmail.com>.
Regarding graceful shutdown: I got a response from Jan on this in the past,
which I am simply quoting below:

"A graceful shutdown means the broker only shuts down once it is not the
leader of any partition.
Therefore you should not be able to gracefully shut down your entire
cluster."

That said, you should allow some flexibility in your startup. I always start
my testbed (3-node) the following way, and it works nicely:

1) Start each ZooKeeper node, allowing 5 seconds between startups.
2) When all ZooKeeper nodes are up, wait another 10 seconds.
3) Start all brokers, allowing 5 seconds between startups.

Provided that your index files aren't corrupted, it should always start up
normally.
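
The staggered startup above could be scripted roughly as follows. This is
only a sketch: the start_zk and start_broker callbacks are hypothetical
placeholders for however you launch a node (ssh, systemd, a wrapper around
the start scripts), and the 5 and 10 second gaps are simply the values from
the steps above, not tuned recommendations.

```python
import time


def staggered_start(nodes, start_node, gap_seconds, sleep=time.sleep):
    """Start nodes one at a time, pausing gap_seconds between consecutive starts."""
    for index, node in enumerate(nodes):
        if index > 0:
            sleep(gap_seconds)
        start_node(node)


def start_cluster(zk_nodes, brokers, start_zk, start_broker, sleep=time.sleep):
    """Bring up ZooKeeper first, let it settle, then bring up the brokers."""
    staggered_start(zk_nodes, start_zk, 5, sleep)     # step 1
    sleep(10)                                         # step 2
    staggered_start(brokers, start_broker, 5, sleep)  # step 3
```

The sleep function is passed in as a parameter so the ordering can be
checked without actually waiting.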

Regards,





Re: How to gracefully stop Kafka

Posted by Pena Quijada Alexander <a....@reply.it>.
Hi,

From my point of view, if you don't have a tool that helps you manage your broker services, then to do a rolling restart manually you should shut down one broker at a time.

This way, you give the controller time to rebalance the active replicas onto the healthy nodes.

The same procedure applies when you start your nodes up again.
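
A rough sketch of the one-broker-at-a-time idea, under stated assumptions:
stop_broker, start_broker and cluster_healthy are hypothetical callbacks you
would supply yourself (for example, cluster_healthy might check that no
partition is leaderless or under-replicated), and the polling interval and
timeout are arbitrary illustrative values.

```python
import time


def rolling_restart(brokers, stop_broker, start_broker, cluster_healthy,
                    poll_seconds=5.0, timeout_seconds=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Restart brokers one at a time, waiting for the cluster to recover
    before moving on to the next broker."""
    for broker in brokers:
        stop_broker(broker)
        start_broker(broker)
        deadline = clock() + timeout_seconds
        # Do not touch the next broker until the cluster reports healthy.
        while not cluster_healthy():
            if clock() >= deadline:
                raise TimeoutError(
                    f"cluster not healthy after restarting {broker}")
            sleep(poll_seconds)
```

Raising on timeout stops the procedure with at most one broker affected,
rather than continuing to take down further brokers while the cluster is
still recovering.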

Regards!

Alex
