Posted to dev@kafka.apache.org by Stephane Maarek <st...@simplemachines.com.au> on 2017/06/23 00:48:09 UTC
Minimum Replication Factor
Hi all,
Interested in getting people’s opinion on something.
The problem I have is that some people launch Streams apps in our cluster but forget to set a replication factor > 1. It is then a pain to increase the topic’s RF once we notice that some topic partitions go offline when we reboot brokers.
I have two solutions for this, on which I’m interested in hearing your views:
1) Make the replication.factor in Kafka Streams “opinionated / smart” by changing the default to a dynamic min(3, # brokers).
2) Create a “minimum.replication.factor” in Kafka broker settings. If anyone tries to create a topic with an RF below the minimum, Kafka says no and doesn’t create the topic. That would ensure no topics get “miscreated” in production clusters and ease the pain on devs, devops and support alike.
Thoughts?
My preference goes towards 2).
Cheers!
Stephane
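Option 1) boils down to a one-line rule. A minimal sketch in Java, assuming the client can find out how many brokers are alive before it creates its internal topics; the class and method names are hypothetical and are not part of Kafka Streams:

```java
public final class DynamicReplicationDefault {

    /**
     * Returns min(preferred, liveBrokers), i.e. the "dynamic" default from
     * option 1 when preferred is 3. Hypothetical helper, not a Kafka API.
     */
    static short defaultReplicationFactor(int liveBrokers, int preferred) {
        if (liveBrokers < 1 || preferred < 1) {
            throw new IllegalArgumentException("broker count and preferred factor must be >= 1");
        }
        return (short) Math.min(preferred, liveBrokers);
    }
}
```

With 5 brokers and a preferred factor of 3 this yields 3; with a single broker it yields 1, so the resulting factor depends on the size of the cluster at the moment the topic is created.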
Re: Minimum Replication Factor
Posted by Eno Thereska <en...@gmail.com>.
Good discussion. Stephane, we did briefly think of your options 1 and 2 but didn’t get time to make a KIP and discuss broadly. A dynamic config is appealing; however, it has the drawback that you might end up with an unpredictable replication factor. A minimum replication factor also needs some discussion, since if users leave it large by default (e.g., 3), that has implications for how much storage and network bandwidth is used.
I’m curious to hear your experiences with Edo’s suggestion when you try it and any pros/cons.
Thanks
Eno
> On Jun 27, 2017, at 10:15 AM, Edoardo Comar <EC...@uk.ibm.com> wrote:
>
> James,
> the create.topic.policy exists in 0.10.2 and for example all intermediate
> topics created by Kafka Streams (which has its own Admin Client) go through
> it.
>
> Topics created via Zookeeper won't be subject to the policy, but that
> stays the same in 0.11
>
> For our users, we have kept zookeeper inaccessible to clients via network
> rules, and allow clients to use either the Kafka API (and by that I mean
> the wire protocol),
> for which a client needs to write a bit of code e.g. hacking Streams'
> AdminClient,
> or a REST interface (which goes through Kafka's API) to create topics,
> hence the policy applies.
>
> ciao,
> Edo
Re: Minimum Replication Factor
Posted by Edoardo Comar <EC...@uk.ibm.com>.
James,
the create.topic.policy exists in 0.10.2 and, for example, all intermediate
topics created by Kafka Streams (which has its own Admin Client) go through
it.
Topics created via ZooKeeper won't be subject to the policy, but that
stays the same in 0.11.
For our users, we have kept ZooKeeper inaccessible to clients via network
rules. To create topics, clients can use either the Kafka API (by which I
mean the wire protocol), for which a client needs to write a bit of code,
e.g. by hacking Streams' AdminClient, or a REST interface (which goes
through Kafka's API). Either way, the policy applies.
ciao,
Edo
--------------------------------------------------
Edoardo Comar
IBM Message Hub
IBM UK Ltd, Hursley Park, SO21 2JN
From: James Cheng <wu...@gmail.com>
To: dev@kafka.apache.org
Date: 27/06/2017 09:15
Subject: Re: Minimum Replication Factor
The create.topic.policy stuff I think is only used as part of the new
CreateTopic broker API that's coming in 0.11. That's one of the
administrative APIs which let you create topics by talking directly to the
broker, without needing to talk to zookeeper directly.
So this means your brokers will need to be at 0.11 for this to be applied.
And Kafka Streams will need to be updated to use this API in order for the
policy to be applied and I don't recall seeing that Kafka Streams was
updated to use this in 0.11.
And lastly, I think that it won't be applied if you use the
kafka-topics.sh script, because that still talks directly to zookeeper.
For us, we plan to run regular auditing scripts to notice topics with
replication factors that are too low, and notify us to increase them (or
automatically do it).
-James
Sent from my iPhone
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
Re: Minimum Replication Factor
Posted by James Cheng <wu...@gmail.com>.
I think the create.topic.policy stuff is only used as part of the new CreateTopic broker API that's coming in 0.11. That's one of the administrative APIs which let you create topics by talking directly to the broker, without needing to talk to zookeeper.
So this means your brokers will need to be at 0.11 for this to be applied. Kafka Streams will also need to be updated to use this API in order for the policy to be applied, and I don't recall seeing that Kafka Streams was updated to use it in 0.11.
And lastly, I think that it won't be applied if you use the kafka-topics.sh script, because that still talks directly to zookeeper.
For us, we plan to run regular auditing scripts that detect topics whose replication factor is too low and either notify us to increase it or increase it automatically.
-James
Sent from my iPhone
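The core of such an audit is a scan over each topic's partition replica lists. A minimal sketch in Java, assuming the topic metadata has already been fetched into a map from topic name to one replica list per partition; how that map gets filled, and the `ReplicationAudit` name, are assumptions, and this is not the actual script described above:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public final class ReplicationAudit {

    /**
     * Returns, in sorted order, the topics that have at least one partition
     * with fewer than `minimum` replicas. Each inner list holds the broker
     * ids that replicate one partition.
     */
    static List<String> underReplicatedTopics(Map<String, List<List<Integer>>> topics, int minimum) {
        List<String> offenders = new ArrayList<>();
        for (Map.Entry<String, List<List<Integer>>> topic : topics.entrySet()) {
            for (List<Integer> replicas : topic.getValue()) {
                if (replicas.size() < minimum) {
                    offenders.add(topic.getKey());
                    break; // one offending partition is enough to flag the topic
                }
            }
        }
        Collections.sort(offenders);
        return offenders;
    }
}
```

The returned list can then drive either a notification or an automated partition reassignment.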
> On Jun 23, 2017, at 12:32 PM, Stephane Maarek <st...@simplemachines.com.au> wrote:
>
> That’s the first time I’ve seen this setting; wow, it was buried!
> I think it makes sense to implement one to get full control.
>
> I wonder whether it would still be worth implementing a simple setting, or a few “simple” topic creation policies that users can just reference. I don’t see that interface being implemented anywhere.
Re: Minimum Replication Factor
Posted by Stephane Maarek <st...@simplemachines.com.au>.
That’s the first time I’ve seen this setting; wow, it was buried!
I think it makes sense to implement one to get full control.
I wonder whether it would still be worth implementing a simple setting, or a few “simple” topic creation policies that users can just reference. I don’t see that interface being implemented anywhere.
On 23/6/17, 6:43 pm, "Edoardo Comar" <EC...@uk.ibm.com> wrote:
Hi Stephane,
we enforce the constraint in a custom create topic policy (see
'create.topic.policy.class.name')
Re: Minimum Replication Factor
Posted by Edoardo Comar <EC...@uk.ibm.com>.
Hi Stephane,
we enforce the constraint in a custom create topic policy (see
'create.topic.policy.class.name')
--------------------------------------------------
Edoardo Comar
IBM Message Hub
ecomar@uk.ibm.com
IBM UK Ltd, Hursley Park, SO21 2JN
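The constraint itself takes only a few lines. A minimal sketch, assuming a minimum of the operator's choosing; it is plain Java so that it compiles without the Kafka jars, and `MinimumReplicationFactorCheck`, `PolicyViolation` and the `validate` signature are hypothetical names. In a broker, logic like this would sit inside a class implementing Kafka's `org.apache.kafka.server.policy.CreateTopicPolicy` interface and named in `create.topic.policy.class.name`. This is not the actual policy referred to above.

```java
import java.util.List;
import java.util.Map;

public final class MinimumReplicationFactorCheck {

    /** Hypothetical stand-in for the exception a real policy would throw. */
    static final class PolicyViolation extends RuntimeException {
        PolicyViolation(String message) {
            super(message);
        }
    }

    /**
     * Rejects a topic whose replication factor is below `minimum`.
     * Assumption: a request carries either an explicit replication factor
     * or a manual assignment (partition id -> replica broker ids).
     */
    static void validate(String topic, Short replicationFactor,
                         Map<Integer, List<Integer>> replicaAssignments, short minimum) {
        if (replicationFactor != null) {
            if (replicationFactor < minimum) {
                throw new PolicyViolation("Topic " + topic + " requests replication factor "
                        + replicationFactor + ", minimum is " + minimum);
            }
            return;
        }
        if (replicaAssignments == null || replicaAssignments.isEmpty()) {
            throw new PolicyViolation("Topic " + topic + " specifies no replication factor");
        }
        // With a manual assignment, every partition must meet the minimum.
        for (Map.Entry<Integer, List<Integer>> partition : replicaAssignments.entrySet()) {
            if (partition.getValue().size() < minimum) {
                throw new PolicyViolation("Topic " + topic + " partition " + partition.getKey()
                        + " has " + partition.getValue().size() + " replicas, minimum is " + minimum);
            }
        }
    }
}
```

Throwing from the check is what makes topic creation fail, so a topic with too few replicas is never created through the Kafka API.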