Posted to dev@kafka.apache.org by Li Kan <li...@gmail.com> on 2019/02/05 22:38:24 UTC

[DISCUSS] KIP-426: Persist Broker Id to Zookeeper

Hi, I have KIP-426, which is a small change to how the broker id is
automatically determined at startup. I am new to Kafka, so there are a
bunch of design trade-offs that I might be missing or find hard to decide,
and I'd like to get some suggestions. I expect (and am open) to modify (or
even totally rewrite) the KIP based on suggestions. Thanks.

-- 
Best,
Kan
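
For background, the auto-assignment mechanism the KIP builds on is, roughly: when broker.id is not set in server.properties, the broker generates an id above reserved.broker.max.id and persists it only in meta.properties on its own disks. The relevant broker configs (shown with their defaults) are:

    broker.id.generation.enable=true
    reserved.broker.max.id=1000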

Re: [DISCUSS] KIP-426: Persist Broker Id to Zookeeper

Posted by Harsha <ka...@harsha.io>.
Hi Eno,

A control plane needs to do this today because Kafka doesn't provide such a mapping.
I am not sure why we want every control plane to figure this out, rather than letting this mapping, which exists today in Kafka at the node level on disk, live at a global level in ZooKeeper.
If we implement this, any control plane will be much simpler, and the different environments won't all need to understand and re-implement this broker.id mapping.
I don't understand the duplication concern either; which control plane are we talking about?

Irrespective of which control plane a user ends up using, I want to understand the concerns about a broker.id-to-host mapping being available in ZooKeeper.  The broker id belongs to Kafka, not to the control plane.

Thanks,
Harsha
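
As an illustration of the kind of global mapping being discussed, one hypothetical shape for it could be a per-host znode (the path and value format here are assumptions for discussion, not something the KIP pins down):

    /brokers/id_mapping/broker-1.example.com  ->  {"broker.id": 1001}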


On Sat, Mar 2, 2019, at 3:50 AM, Eno Thereska wrote:
> Hi Harsha, Li Kan,
> 
> What Colin mentioned is what I see in practice as well (at AWS and in our
> own clusters). A control plane management tool decides the
> hostname-to-broker-ID mapping and can change it as it sees fit as brokers
> fail and new ones are brought in. That control plane usually already has
> a database of sorts that keeps track of existing broker IDs. So this work
> would duplicate what that control plane already does. It could also lead
> to extra work if that control plane decides to do something different
> from what the mapping in ZooKeeper has.
> 
> At a minimum I'd like to see the motivation expanded, along with a
> description of the cluster-management setup Li Kan has in mind.
> 
> Thanks
> Eno
> 
> On Sat, Mar 2, 2019 at 1:43 AM Harsha <ka...@harsha.io> wrote:
> 
> > Hi,
> >      Cluster management tools are more generic and are not aware of
> > Kafka-specific configs like broker.id.
> > Even if they are aware of broker.ids, those ids will be lost when a
> > disk is lost.
> >       Irrespective of these use cases, let's look at the problem in
> > isolation.
> > 1. Disks are the most common failure case in Kafka clusters.
> > 2. We store the auto-generated broker.id on disk, hence we lose this
> > broker.id mapping when disks fail.
> > 3. If we keep the previously generated broker.id mapping along with the
> > host in ZooKeeper, it's easier to retrieve that mapping on a new host.
> > This would remove the reassignment step and allow us to just copy the
> > data and start the new node with the previous broker.id, which is what
> > the KIP is proposing.
> > I want to understand: what are your concerns with moving this mapping,
> > which already exists on disk, to ZooKeeper?
> >
> > Thanks,
> > Harsha
> >
> > On Fri, Mar 1, 2019, at 11:11 AM, Colin McCabe wrote:
> > > On Wed, Feb 27, 2019, at 14:12, Harsha wrote:
> > > > Hi Colin,
> > > >               What we want is to preserve the broker.id so that we
> > > > can do an offline rebuild of a broker. In our case, bringing a
> > > > failed node back up through online Kafka replication puts producer
> > > > latencies at risk, since the new broker will keep all the other
> > > > leaders busy with its replication requests. For an offline rebuild,
> > > > we do not need to do a rebalance as long as we can recover the
> > > > broker.id.
> > > >           Overall, irrespective of this use case, we still want the
> > > > ability to retrieve the broker.id for an existing host. This will
> > > > make it easier to swap failed hosts with new ones while keeping the
> > > > existing hostname.
> > >
> > > Thanks for the explanation.  Shouldn't this be handled by the cluster
> > > management tool, though?  Kafka doesn't include a mechanism for
> > > re-creating nodes that failed.  That's up to Kubernetes, or Ansible,
> > > or whatever cluster provisioning framework you have in place.  This
> > > feels like the same kind of thing: managing how the cluster is
> > > provisioned.
> > >
> > > best,
> > > Colin
> > >
> > > >
> > > > Thanks,
> > > > Harsha
> > > > On Wed, Feb 27, 2019, at 11:53 AM, Colin McCabe wrote:
> > > > > Hi Li,
> > > > >
> > > > >  > The mechanism simplifies deployment because the same
> > > > >  > configuration can be used across all brokers; however, in a
> > > > >  > large system where disk failure is the norm, the meta file can
> > > > >  > often get lost, causing a new broker id to be allocated. This
> > > > >  > is problematic because the new broker id has no partitions
> > > > >  > assigned to it, so it can’t do anything, while the partitions
> > > > >  > assigned to the old one lose one replica.
> > > > >
> > > > > If all of the disks have failed, then the partitions will lose
> > > > > their replicas no matter what, right?  If any of the disks is
> > > > > still around, then there will be a meta file on the disk which
> > > > > contains the previous broker ID.  So I'm not sure that we need to
> > > > > change anything here.
> > > > >
> > > > > best,
> > > > > Colin
> > > > >
> > > > >
> > > > > On Tue, Feb 5, 2019, at 14:38, Li Kan wrote:
> > > > > > Hi, I have KIP-426, which is a small change to how the broker
> > > > > > id is automatically determined at startup. I am new to Kafka,
> > > > > > so there are a bunch of design trade-offs that I might be
> > > > > > missing or find hard to decide, and I'd like to get some
> > > > > > suggestions. I expect (and am open) to modify (or even totally
> > > > > > rewrite) the KIP based on suggestions. Thanks.
> > > > > >
> > > > > > --
> > > > > > Best,
> > > > > > Kan
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-426: Persist Broker Id to Zookeeper

Posted by Eno Thereska <en...@gmail.com>.
Hi Harsha, Li Kan,

What Colin mentioned is what I see in practice as well (at AWS and in our
own clusters). A control plane management tool decides the
hostname-to-broker-ID mapping and can change it as it sees fit as brokers
fail and new ones are brought in. That control plane usually already has a
database of sorts that keeps track of existing broker IDs. So this work
would duplicate what that control plane already does. It could also lead to
extra work if that control plane decides to do something different from
what the mapping in ZooKeeper has.

At a minimum I'd like to see the motivation expanded, along with a
description of the cluster-management setup Li Kan has in mind.

Thanks
Eno

On Sat, Mar 2, 2019 at 1:43 AM Harsha <ka...@harsha.io> wrote:

> Hi,
>      Cluster management tools are more generic and are not aware of
> Kafka-specific configs like broker.id.
> Even if they are aware of broker.ids, those ids will be lost when a disk
> is lost.
>       Irrespective of these use cases, let's look at the problem in
> isolation.
> 1. Disks are the most common failure case in Kafka clusters.
> 2. We store the auto-generated broker.id on disk, hence we lose this
> broker.id mapping when disks fail.
> 3. If we keep the previously generated broker.id mapping along with the
> host in ZooKeeper, it's easier to retrieve that mapping on a new host.
> This would remove the reassignment step and allow us to just copy the
> data and start the new node with the previous broker.id, which is what
> the KIP is proposing.
> I want to understand: what are your concerns with moving this mapping,
> which already exists on disk, to ZooKeeper?
>
> Thanks,
> Harsha
>
> On Fri, Mar 1, 2019, at 11:11 AM, Colin McCabe wrote:
> > On Wed, Feb 27, 2019, at 14:12, Harsha wrote:
> > > Hi Colin,
> > >               What we want is to preserve the broker.id so that we
> > > can do an offline rebuild of a broker. In our case, bringing a failed
> > > node back up through online Kafka replication puts producer latencies
> > > at risk, since the new broker will keep all the other leaders busy
> > > with its replication requests. For an offline rebuild, we do not need
> > > to do a rebalance as long as we can recover the broker.id.
> > >           Overall, irrespective of this use case, we still want the
> > > ability to retrieve the broker.id for an existing host. This will
> > > make it easier to swap failed hosts with new ones while keeping the
> > > existing hostname.
> >
> > Thanks for the explanation.  Shouldn't this be handled by the cluster
> > management tool, though?  Kafka doesn't include a mechanism for
> > re-creating nodes that failed.  That's up to Kubernetes, or Ansible, or
> > whatever cluster provisioning framework you have in place.  This feels
> > like the same kind of thing: managing how the cluster is provisioned.
> >
> > best,
> > Colin
> >
> > >
> > > Thanks,
> > > Harsha
> > > On Wed, Feb 27, 2019, at 11:53 AM, Colin McCabe wrote:
> > > > Hi Li,
> > > >
> > > >  > The mechanism simplifies deployment because the same
> > > >  > configuration can be used across all brokers; however, in a
> > > >  > large system where disk failure is the norm, the meta file can
> > > >  > often get lost, causing a new broker id to be allocated. This is
> > > >  > problematic because the new broker id has no partitions assigned
> > > >  > to it, so it can’t do anything, while the partitions assigned to
> > > >  > the old one lose one replica.
> > > >
> > > > If all of the disks have failed, then the partitions will lose
> > > > their replicas no matter what, right?  If any of the disks is still
> > > > around, then there will be a meta file on the disk which contains
> > > > the previous broker ID.  So I'm not sure that we need to change
> > > > anything here.
> > > >
> > > > best,
> > > > Colin
> > > >
> > > >
> > > > On Tue, Feb 5, 2019, at 14:38, Li Kan wrote:
> > > > > Hi, I have KIP-426, which is a small change to how the broker id
> > > > > is automatically determined at startup. I am new to Kafka, so
> > > > > there are a bunch of design trade-offs that I might be missing or
> > > > > find hard to decide, and I'd like to get some suggestions. I
> > > > > expect (and am open) to modify (or even totally rewrite) the KIP
> > > > > based on suggestions. Thanks.
> > > > >
> > > > > --
> > > > > Best,
> > > > > Kan
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-426: Persist Broker Id to Zookeeper

Posted by Harsha <ka...@harsha.io>.
Hi,
     Cluster management tools are more generic and are not aware of Kafka-specific configs like broker.id.
Even if they are aware of broker.ids, those ids will be lost when a disk is lost.
      Irrespective of these use cases, let's look at the problem in isolation.
1. Disks are the most common failure case in Kafka clusters.
2. We store the auto-generated broker.id on disk, hence we lose this broker.id mapping when disks fail.
3. If we keep the previously generated broker.id mapping along with the host in ZooKeeper, it's easier to retrieve that mapping on a new host. This would remove the reassignment step and allow us to just copy the data and start the new node with the previous broker.id,
which is what the KIP is proposing.
I want to understand: what are your concerns with moving this mapping, which already exists on disk, to ZooKeeper?

Thanks,
Harsha
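
To make the lookup-or-register flow in point 3 concrete, here is a minimal Python sketch using the kazoo ZooKeeper client. The znode path, JSON value format, and fallback id are illustrative assumptions; the KIP does not pin down an exact layout, and a real implementation would live inside the broker and handle concurrent registration:

    # Sketch only: the path and value format are assumptions, not the KIP's spec.
    import json
    import socket

    from kazoo.client import KazooClient

    MAPPING_PATH = "/brokers/id_mapping"  # hypothetical znode directory

    def broker_id_for_host(zk, hostname, newly_generated_id):
        """Return the previously persisted broker.id for this hostname,
        registering newly_generated_id if the host was never seen before."""
        node = f"{MAPPING_PATH}/{hostname}"
        if zk.exists(node):
            data, _stat = zk.get(node)
            return json.loads(data)["broker.id"]
        zk.ensure_path(MAPPING_PATH)
        # A real implementation would guard against the exists/create race.
        zk.create(node, json.dumps({"broker.id": newly_generated_id}).encode("utf-8"))
        return newly_generated_id

    zk = KazooClient(hosts="zk1:2181")
    zk.start()
    print(broker_id_for_host(zk, socket.gethostname(), newly_generated_id=1001))
    zk.stop()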

On Fri, Mar 1, 2019, at 11:11 AM, Colin McCabe wrote:
> On Wed, Feb 27, 2019, at 14:12, Harsha wrote:
> > Hi Colin,
> >               What we want is to preserve the broker.id so that we can
> > do an offline rebuild of a broker. In our case, bringing a failed node
> > back up through online Kafka replication puts producer latencies at
> > risk, since the new broker will keep all the other leaders busy with
> > its replication requests. For an offline rebuild, we do not need to do
> > a rebalance as long as we can recover the broker.id.
> >           Overall, irrespective of this use case, we still want the
> > ability to retrieve the broker.id for an existing host. This will make
> > it easier to swap failed hosts with new ones while keeping the existing
> > hostname.
> 
> Thanks for the explanation.  Shouldn't this be handled by the cluster
> management tool, though?  Kafka doesn't include a mechanism for
> re-creating nodes that failed.  That's up to Kubernetes, or Ansible, or
> whatever cluster provisioning framework you have in place.  This feels
> like the same kind of thing: managing how the cluster is provisioned.
> 
> best,
> Colin
> 
> > 
> > Thanks,
> > Harsha
> > On Wed, Feb 27, 2019, at 11:53 AM, Colin McCabe wrote:
> > > Hi Li,
> > > 
> > >  > The mechanism simplifies deployment because the same configuration
> > >  > can be used across all brokers; however, in a large system where
> > >  > disk failure is the norm, the meta file can often get lost, causing
> > >  > a new broker id to be allocated. This is problematic because the
> > >  > new broker id has no partitions assigned to it, so it can’t do
> > >  > anything, while the partitions assigned to the old one lose one
> > >  > replica.
> > > 
> > > If all of the disks have failed, then the partitions will lose their 
> > > replicas no matter what, right?  If any of the disks is still around, 
> > > then there will be a meta file on the disk which contains the previous 
> > > broker ID.  So I'm not sure that we need to change anything here.
> > > 
> > > best,
> > > Colin
> > > 
> > > 
> > > On Tue, Feb 5, 2019, at 14:38, Li Kan wrote:
> > > > Hi, I have KIP-426, which is a small change to how the broker id is
> > > > automatically determined at startup. I am new to Kafka, so there are
> > > > a bunch of design trade-offs that I might be missing or find hard to
> > > > decide, and I'd like to get some suggestions. I expect (and am open)
> > > > to modify (or even totally rewrite) the KIP based on suggestions.
> > > > Thanks.
> > > > 
> > > > -- 
> > > > Best,
> > > > Kan
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-426: Persist Broker Id to Zookeeper

Posted by Colin McCabe <cm...@apache.org>.
On Wed, Feb 27, 2019, at 14:12, Harsha wrote:
> Hi Colin,
>               What we want is to preserve the broker.id so that we can
> do an offline rebuild of a broker. In our case, bringing a failed node
> back up through online Kafka replication puts producer latencies at risk,
> since the new broker will keep all the other leaders busy with its
> replication requests. For an offline rebuild, we do not need to do a
> rebalance as long as we can recover the broker.id.
>           Overall, irrespective of this use case, we still want the
> ability to retrieve the broker.id for an existing host. This will make it
> easier to swap failed hosts with new ones while keeping the existing
> hostname.

Thanks for the explanation.  Shouldn't this be handled by the cluster management tool, though?  Kafka doesn't include a mechanism for re-creating nodes that failed.  That's up to Kubernetes, or Ansible, or whatever cluster provisioning framework you have in place.  This feels like the same kind of thing: managing how the cluster is provisioned.

best,
Colin

> 
> Thanks,
> Harsha
> On Wed, Feb 27, 2019, at 11:53 AM, Colin McCabe wrote:
> > Hi Li,
> > 
> >  > The mechanism simplifies deployment because the same configuration
> >  > can be used across all brokers; however, in a large system where disk
> >  > failure is the norm, the meta file can often get lost, causing a new
> >  > broker id to be allocated. This is problematic because the new broker
> >  > id has no partitions assigned to it, so it can’t do anything, while
> >  > the partitions assigned to the old one lose one replica.
> > 
> > If all of the disks have failed, then the partitions will lose their 
> > replicas no matter what, right?  If any of the disks is still around, 
> > then there will be a meta file on the disk which contains the previous 
> > broker ID.  So I'm not sure that we need to change anything here.
> > 
> > best,
> > Colin
> > 
> > 
> > On Tue, Feb 5, 2019, at 14:38, Li Kan wrote:
> > > Hi, I have KIP-426, which is a small change to how the broker id is
> > > automatically determined at startup. I am new to Kafka, so there are a
> > > bunch of design trade-offs that I might be missing or find hard to
> > > decide, and I'd like to get some suggestions. I expect (and am open)
> > > to modify (or even totally rewrite) the KIP based on suggestions.
> > > Thanks.
> > > 
> > > -- 
> > > Best,
> > > Kan
> > >
> >
>

Re: [DISCUSS] KIP-426: Persist Broker Id to Zookeeper

Posted by Harsha <ka...@harsha.io>.
Hi Colin,
              What we want is to preserve the broker.id so that we can do an offline rebuild of a broker. In our case, bringing a failed node back up through online Kafka replication puts producer latencies at risk, since the new broker will keep all the other leaders busy with its replication requests. For an offline rebuild, we do not need to do a rebalance as long as we can recover the broker.id.
          Overall, irrespective of this use case, we still want the ability to retrieve the broker.id for an existing host. This will make it easier to swap failed hosts with new ones while keeping the existing hostname.

Thanks,
Harsha
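
As a sketch of what "start the new node with the previous broker.id" can look like operationally, assuming the failed host's id was 1001 and its partition data has been copied into place (both values are illustrative, not from the KIP):

    # server.properties on the replacement host (illustrative values).
    # broker.id pins the recovered id; log.dirs points at the copied data.
    broker.id=1001
    log.dirs=/var/kafka-logs
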
On Wed, Feb 27, 2019, at 11:53 AM, Colin McCabe wrote:
> Hi Li,
> 
>  > The mechanism simplifies deployment because the same configuration can
>  > be used across all brokers; however, in a large system where disk
>  > failure is the norm, the meta file can often get lost, causing a new
>  > broker id to be allocated. This is problematic because the new broker
>  > id has no partitions assigned to it, so it can’t do anything, while the
>  > partitions assigned to the old one lose one replica.
> 
> If all of the disks have failed, then the partitions will lose their 
> replicas no matter what, right?  If any of the disks is still around, 
> then there will be a meta file on the disk which contains the previous 
> broker ID.  So I'm not sure that we need to change anything here.
> 
> best,
> Colin
> 
> 
> On Tue, Feb 5, 2019, at 14:38, Li Kan wrote:
> > Hi, I have KIP-426, which is a small change to how the broker id is
> > automatically determined at startup. I am new to Kafka, so there are a
> > bunch of design trade-offs that I might be missing or find hard to
> > decide, and I'd like to get some suggestions. I expect (and am open) to
> > modify (or even totally rewrite) the KIP based on suggestions. Thanks.
> > 
> > -- 
> > Best,
> > Kan
> >
>

Re: [DISCUSS] KIP-426: Persist Broker Id to Zookeeper

Posted by Colin McCabe <cm...@apache.org>.
Hi Li,

 > The mechanism simplifies deployment because the same configuration can be
 > used across all brokers; however, in a large system where disk failure is
 > the norm, the meta file can often get lost, causing a new broker id to be
 > allocated. This is problematic because the new broker id has no partitions
 > assigned to it, so it can’t do anything, while the partitions assigned to
 > the old one lose one replica.

If all of the disks have failed, then the partitions will lose their replicas no matter what, right?  If any of the disks is still around, then there will be a meta file on the disk which contains the previous broker ID.  So I'm not sure that we need to change anything here.

best,
Colin
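
For context, the "meta file" referred to here is meta.properties, which the broker writes into each log directory; in a ZooKeeper-based cluster it looks roughly like this (the id value is just an example):

    version=0
    broker.id=1001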


On Tue, Feb 5, 2019, at 14:38, Li Kan wrote:
> Hi, I have KIP-426, which is a small change to how the broker id is
> automatically determined at startup. I am new to Kafka, so there are a
> bunch of design trade-offs that I might be missing or find hard to decide,
> and I'd like to get some suggestions. I expect (and am open) to modify (or
> even totally rewrite) the KIP based on suggestions. Thanks.
> 
> -- 
> Best,
> Kan
>

Re: [DISCUSS] KIP-426: Persist Broker Id to Zookeeper

Posted by Harsha <ka...@harsha.io>.
Thanks for the KIP, Kan.  I think the design will be simpler if we just deprecate storing broker.id in meta.properties and start storing it in ZooKeeper, as you suggested.

Thanks,
Harsha

On Tue, Feb 5, 2019, at 2:40 PM, Li Kan wrote:
> My bad, I forgot to include the link to the KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-426%3A+Persist+Broker+Id+to+Zookeeper
> 
> On Tue, Feb 5, 2019 at 2:38 PM Li Kan <li...@gmail.com> wrote:
> 
> > Hi, I have KIP-426, which is a small change to how the broker id is
> > automatically determined at startup. I am new to Kafka, so there are a
> > bunch of design trade-offs that I might be missing or find hard to
> > decide, and I'd like to get some suggestions. I expect (and am open) to
> > modify (or even totally rewrite) the KIP based on suggestions. Thanks.
> >
> > --
> > Best,
> > Kan
> >
> 
> 
> -- 
> Best,
> Kan
>

Re: [DISCUSS] KIP-426: Persist Broker Id to Zookeeper

Posted by Li Kan <li...@gmail.com>.
My bad, I forgot to include the link to the KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-426%3A+Persist+Broker+Id+to+Zookeeper

On Tue, Feb 5, 2019 at 2:38 PM Li Kan <li...@gmail.com> wrote:

> Hi, I have KIP-426, which is a small change to how the broker id is
> automatically determined at startup. I am new to Kafka, so there are a
> bunch of design trade-offs that I might be missing or find hard to decide,
> and I'd like to get some suggestions. I expect (and am open) to modify (or
> even totally rewrite) the KIP based on suggestions. Thanks.
>
> --
> Best,
> Kan
>


-- 
Best,
Kan