Posted to users@kafka.apache.org by Tom Crayford <tc...@heroku.com> on 2016/01/06 16:14:24 UTC

Partition rebalancing after broker removal

Hi there,

Kafka's `kafka-reassign-partitions.sh` tool currently has no mechanism for
removing brokers. However, it does have the ability to generate partition
plans across arbitrary sets of brokers: use `--generate`, pass all the
topics in the cluster into it, and then pass the generated plan to
`--execute`.
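As a sketch of that workflow's input, the topic list fed to `--generate` (via `--topics-to-move-json-file`, with `--broker-list` naming only the brokers you want to keep) can be built with a small script. Topic names here are hypothetical:

```python
import json

# Hypothetical topic names; in practice the list would come from
# `kafka-topics.sh --list` output.
topics = ["orders", "payments", "audit-log"]

# The --topics-to-move-json-file format consumed by
# kafka-reassign-partitions.sh --generate.
topics_to_move = {
    "version": 1,
    "topics": [{"topic": t} for t in topics],
}

# This JSON would be written to e.g. topics-to-move.json and passed
# to --generate alongside --broker-list (the surviving broker ids).
print(json.dumps(topics_to_move, indent=2))
```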

This isn't ideal because, from my understanding, it potentially moves every
partition in the cluster around, but it should work fine, and it stops
Kafka from having partitions assigned to a broker that no longer exists.

Am I missing something there? Or is this a reasonable workaround until
better partition reassignment tools turn up in the future?

Thanks

Tom

Re: Partition rebalancing after broker removal

Posted by Luke Steensen <lu...@braintreepayments.com>.
No worries, glad to have the functionality! Thanks for your help.

Luke



Re: Partition rebalancing after broker removal

Posted by Gwen Shapira <gw...@confluent.io>.
Yep. That tool is not our best documented :(
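For reference, the ordering is only implicit in the file format: in a sketch like the following (topic name and broker ids hypothetical), the first id in each `replicas` array is that partition's preferred leader.

```python
import json

# Hypothetical reassignment plan in the format consumed by --execute.
# Each "replicas" list is ordered; its first broker id is the
# partition's preferred leader.
plan = {
    "version": 1,
    "partitions": [
        {"topic": "orders", "partition": 0, "replicas": [2, 3, 4]},  # 2 preferred
        {"topic": "orders", "partition": 1, "replicas": [3, 4, 2]},  # 3 preferred
    ],
}
print(json.dumps(plan, indent=2))
```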


Re: Partition rebalancing after broker removal

Posted by Luke Steensen <lu...@braintreepayments.com>.
Is the preferred leader the first replica in the list passed to the
reassignment tool? I don't see it specifically called out in the json file
format.



Re: Partition rebalancing after broker removal

Posted by Gwen Shapira <gw...@confluent.io>.
Ah, got it!

There's no easy way to transfer leadership on command, but you could use
the reassignment tool to change the preferred leader (and nothing else) and
then trigger preferred leader election.
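A sketch of that idea, with a hypothetical topic name and broker ids: rotate the broker being drained out of the first (preferred leader) slot of each replica list, without adding or removing any replicas, and feed the rewritten plan to `--execute`.

```python
def demote_broker(plan, broker_id):
    """Rewrite a reassignment plan so broker_id is never the preferred
    (first) replica, without changing any replica set."""
    for p in plan["partitions"]:
        replicas = p["replicas"]
        if len(replicas) > 1 and replicas[0] == broker_id:
            # Rotate the drained broker to the back of the list; the
            # second replica becomes the new preferred leader.
            p["replicas"] = replicas[1:] + [broker_id]
    return plan

# Hypothetical plan: broker 1 is currently preferred leader of partition 0.
plan = {
    "version": 1,
    "partitions": [
        {"topic": "orders", "partition": 0, "replicas": [1, 2, 3]},
        {"topic": "orders", "partition": 1, "replicas": [2, 1, 3]},
    ],
}
demote_broker(plan, broker_id=1)
print(plan["partitions"][0]["replicas"])  # → [2, 3, 1]
```

Once the reassignment completes, running the kafka-preferred-replica-election.sh tool would move leadership to the new first replicas.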

Gwen


Re: Partition rebalancing after broker removal

Posted by Luke Steensen <lu...@braintreepayments.com>.
Hi Gwen,

1. I sent a message to this list a couple days ago with the subject
"Controlled shutdown not relinquishing leadership of all partitions"
describing the issue I saw. Sorry there's not a lot of detail on the
controlled shutdown part, but I've had trouble reproducing outside of our
specific deployment.

2. Yes, that makes sense. Sorry, I was implicitly assuming tight timeouts
and at least one retry.

3. Right, my understanding is that it doesn't change the preferred leader,
it just triggers a more graceful leader election than would occur if the
broker were killed unexpectedly. I was basically asking if there's a way to
move leadership away from a broker independently of shutting it down. That
would really just be a workaround for the controlled shutdown issues we
experienced.

4. Yep, we rely on exactly this behavior when replacing nodes. It's very
helpful :)

Thanks!
Luke



Re: Partition rebalancing after broker removal

Posted by Gwen Shapira <gw...@confluent.io>.
Hi,

1. If you had problems with controlled shutdown, we need to know. Maybe
open a thread to discuss?
2. Controlled shutdown is only used to reduce the downtime involved in a
large number of leader elections. New leaders will get elected in any case.
3. Controlled (or uncontrolled) shutdown does not change the preferred
leader. That happens only on re-assignment.
4. #3 relies on the fact that if a brand new broker with absolutely no
data joins the cluster with id = "n", and the replica map shows that
broker "n" has certain partitions (because we never assigned them away),
the new broker will immediately become a follower for those partitions and
start replicating the missing data.
This makes automatic recovery much easier.

Gwen


Re: Partition rebalancing after broker removal

Posted by Luke Steensen <lu...@braintreepayments.com>.
Hello,

For #3, I assume this relies on controlled shutdown to transfer leadership
gracefully? Or is there some way to use partition reassignment to set the
preferred leader of each partition? I ask because we've run into some
problems relying on controlled shutdown, and having a separate verifiable
step would be nice.

Thanks,
Luke



Re: Partition rebalancing after broker removal

Posted by Gwen Shapira <gw...@confluent.io>.
Hi,

There was a Jira to add a "remove broker" option to the
partition-reassignment tool. I think it died in a long discussion trying to
solve a harder problem...

As for your work-around - it is acceptable.

A few improvements:
1. Manually edit the resulting assignment json to avoid unnecessary moves,
or even create your own assignment (either manually or using a small
script).
2. We don't throttle the partition move automatically, so it can easily
take over the network if you are not careful. Therefore, running the
reassignment tool multiple times to move partitions one by one is often
safer.
3. If you don't mean to permanently reduce the number of brokers but rather
to replace a broker, don't reassign. Just take down the existing broker and
give the new one the same ID.
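A sketch of the "small script" idea in point 1, with hypothetical broker ids: substitute a surviving broker for the removed one in each replica list, leaving every other replica where it is, so only the replaced replica's data actually moves.

```python
import itertools

def remove_broker(plan, dead_broker, live_brokers):
    """Rewrite a reassignment plan so dead_broker no longer appears,
    picking a live replacement per partition and leaving all other
    replicas in place (minimal data movement). Assumes each partition
    has at least one live broker not already in its replica list."""
    # Round-robin over live brokers so replacements spread evenly.
    candidates = itertools.cycle(sorted(live_brokers))
    for p in plan["partitions"]:
        if dead_broker in p["replicas"]:
            # First candidate not already replicating this partition.
            replacement = next(b for b in candidates
                               if b not in p["replicas"])
            # In-place substitution preserves replica order, so the
            # preferred (first) replica changes only where it was dead.
            p["replicas"] = [replacement if r == dead_broker else r
                             for r in p["replicas"]]
    return plan

# Hypothetical plan with broker 3 being decommissioned.
plan = {
    "version": 1,
    "partitions": [
        {"topic": "orders", "partition": 0, "replicas": [1, 2, 3]},
        {"topic": "payments", "partition": 0, "replicas": [3, 4, 5]},
    ],
}
remove_broker(plan, dead_broker=3, live_brokers=[1, 2, 4, 5])
print(plan["partitions"][0]["replicas"])  # → [1, 2, 4]
```

The rewritten plan can then be fed to `--execute`; per point 2, applying it a few partitions at a time avoids saturating the network.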

Hope this helps,

Gwen
