Posted to dev@kafka.apache.org by Stephane Maarek <st...@simplemachines.com.au> on 2017/06/23 00:48:09 UTC

Minimum Replication Factor

Hi all,

 

Interested in getting people’s opinion on something.

The problem I have is that some people launch Streams apps in our cluster but forget to set a replication factor > 1. It’s then a pain to increase the topic’s RF once we notice topic partitions going offline because we rebooted brokers. 

 

I have two solutions for this, and I’m interested in hearing your thoughts:
1) Make the replication.factor in Kafka Streams “opinionated / smart” by changing the default to a dynamic min(3, # brokers).
2) Create a “minimum.replication.factor” in Kafka broker settings. If anyone tries to create a topic with an RF below the minimum, Kafka says no and doesn’t create the topic. That would ensure no topics get “miscreated” in production clusters and ease the pain for devs, devops and support.
 

Thoughts? 

My preference goes towards 2). 

 

Cheers!

Stephane 
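
To make option 1 concrete, here is a minimal sketch of the min(3, # brokers) rule in plain Java. The class and method names are hypothetical, and the live-broker count is passed in as a parameter rather than fetched from a cluster; this is an illustration of the proposal, not how Kafka Streams computes its default.

```java
final class DynamicReplicationFactor {

    // Preferred replication factor from the proposal: min(3, # brokers).
    static final int PREFERRED = 3;

    // Returns the replication factor a new topic would get for the given
    // number of live brokers (hypothetical helper, not a Kafka API).
    static int defaultReplicationFactor(int liveBrokers) {
        if (liveBrokers < 1) {
            throw new IllegalArgumentException("need at least one live broker");
        }
        return Math.min(PREFERRED, liveBrokers);
    }

    public static void main(String[] args) {
        for (int brokers : new int[] {1, 2, 3, 5}) {
            System.out.println(brokers + " broker(s) -> RF " + defaultReplicationFactor(brokers));
        }
    }
}
```

Under this rule the result depends on how many brokers are live when the topic is created, so the same application could end up with different replication factors on different clusters.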


Re: Minimum Replication Factor

Posted by Eno Thereska <en...@gmail.com>.
Good discussion. Stephane, we did briefly think about your options 1 and 2 but didn’t get time to write a KIP and discuss them more broadly. A dynamic config is appealing; however, it has the drawback that you might end up with an unpredictable replication factor. A minimum replication factor also needs some discussion, since if users leave it large by default (e.g., 3), that has implications for how much storage and network bandwidth is used.

I’m curious to hear your experiences with Edo’s suggestion when you try it and any pros/cons.

Thanks
Eno
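
A rough back-of-the-envelope sketch of the storage and bandwidth point, in plain Java. It assumes every replica holds a full copy of the data and that each follower fetches everything the leader receives; it ignores compression, retention and consumer traffic. The class and method names are hypothetical.

```java
final class ReplicationCost {

    // Bytes stored cluster-wide: each of the rf replicas holds a full copy.
    static long storedBytes(long logicalBytes, int rf) {
        return logicalBytes * rf;
    }

    // Inter-broker replication traffic: each of the (rf - 1) followers
    // fetches what the leader receives from producers.
    static long replicationBytesPerSec(long produceBytesPerSec, int rf) {
        return produceBytesPerSec * (rf - 1);
    }

    public static void main(String[] args) {
        long logicalBytes = 100L;      // hypothetical data volume
        long produceBytesPerSec = 10L; // hypothetical ingest rate
        for (int rf = 1; rf <= 3; rf++) {
            System.out.println("RF " + rf
                    + ": stored=" + storedBytes(logicalBytes, rf)
                    + ", replication/s=" + replicationBytesPerSec(produceBytesPerSec, rf));
        }
    }
}
```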

> On Jun 27, 2017, at 10:15 AM, Edoardo Comar <EC...@uk.ibm.com> wrote:
> 
> James,
> the create.topic.policy exists in 0.10.2 and, for example, all intermediate 
> topics created by Kafka Streams (which has its own Admin Client) go through 
> it.
> 
> Topics created via ZooKeeper won't be subject to the policy, but that 
> stays the same in 0.11.
> 
> For our users, we have kept ZooKeeper inaccessible to clients via network 
> rules, and allow clients to create topics either through the Kafka API 
> (by which I mean the wire protocol), for which a client needs to write a 
> bit of code, e.g. hacking Streams' AdminClient, or through a REST 
> interface (which goes through Kafka's API), hence the policy applies.
> 
> ciao,
> Edo


Re: Minimum Replication Factor

Posted by Edoardo Comar <EC...@uk.ibm.com>.
James,
the create.topic.policy exists in 0.10.2 and, for example, all intermediate 
topics created by Kafka Streams (which has its own Admin Client) go through 
it.

Topics created via ZooKeeper won't be subject to the policy, but that 
stays the same in 0.11.

For our users, we have kept ZooKeeper inaccessible to clients via network 
rules, and allow clients to create topics either through the Kafka API 
(by which I mean the wire protocol), for which a client needs to write a 
bit of code, e.g. hacking Streams' AdminClient, or through a REST 
interface (which goes through Kafka's API), hence the policy applies.

ciao,
Edo
--------------------------------------------------

Edoardo Comar

IBM Message Hub

IBM UK Ltd, Hursley Park, SO21 2JN



From:   James Cheng <wu...@gmail.com>
To:     dev@kafka.apache.org
Date:   27/06/2017 09:15
Subject:        Re: Minimum Replication Factor



I think the create.topic.policy stuff is only used as part of the new 
CreateTopic broker API that's coming in 0.11. That's one of the 
administrative APIs that let you create topics by talking directly to the 
broker, without needing to talk to ZooKeeper. 

So this means your brokers will need to be at 0.11 for this to be applied. 
Kafka Streams will also need to be updated to use this API for the policy 
to be applied, and I don't recall seeing that Kafka Streams was updated to 
use it in 0.11. 

Lastly, I think it won't be applied if you use the kafka-topics.sh script, 
because that still talks directly to ZooKeeper.

For us, we plan to run regular auditing scripts to notice topics with 
replication factors that are too low, and notify us to increase them (or 
do it automatically).

-James





Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU


Re: Minimum Replication Factor

Posted by James Cheng <wu...@gmail.com>.
I think the create.topic.policy stuff is only used as part of the new CreateTopic broker API that's coming in 0.11. That's one of the administrative APIs that let you create topics by talking directly to the broker, without needing to talk to ZooKeeper. 

So this means your brokers will need to be at 0.11 for this to be applied. Kafka Streams will also need to be updated to use this API for the policy to be applied, and I don't recall seeing that Kafka Streams was updated to use it in 0.11. 

Lastly, I think it won't be applied if you use the kafka-topics.sh script, because that still talks directly to ZooKeeper.

For us, we plan to run regular auditing scripts to notice topics with replication factors that are too low, and notify us to increase them (or do it automatically).

-James
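
A minimal sketch of the core check such an auditing script might run, in plain Java. It assumes the per-partition replica lists have already been collected from the cluster by some other means; the input map, class name and method name are all hypothetical, and fetching the metadata is left out so the example runs without Kafka on the classpath.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ReplicationFactorAudit {

    // Returns topic -> lowest replica count, for every topic that has at
    // least one partition with fewer than minRf replicas.
    // Input: topic -> one replica list (broker ids) per partition.
    static Map<String, Integer> topicsBelow(Map<String, List<List<Integer>>> replicasByTopic,
                                            int minRf) {
        Map<String, Integer> offenders = new TreeMap<>();
        for (Map.Entry<String, List<List<Integer>>> topic : replicasByTopic.entrySet()) {
            int lowest = Integer.MAX_VALUE; // a topic with no partitions is never flagged
            for (List<Integer> replicas : topic.getValue()) {
                lowest = Math.min(lowest, replicas.size());
            }
            if (lowest < minRf) {
                offenders.put(topic.getKey(), lowest);
            }
        }
        return offenders;
    }

    public static void main(String[] args) {
        // Hypothetical snapshot of cluster metadata.
        Map<String, List<List<Integer>>> snapshot = new TreeMap<>();
        snapshot.put("orders", Arrays.asList(Arrays.asList(1, 2, 3), Arrays.asList(2, 3, 1)));
        snapshot.put("app-changelog", Arrays.asList(Arrays.asList(1)));
        System.out.println("Topics below RF 2: " + topicsBelow(snapshot, 2));
    }
}
```

The returned map could then feed a notification, or a reassignment step if the increase is to be automated.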


> On Jun 23, 2017, at 12:32 PM, Stephane Maarek <st...@simplemachines.com.au> wrote:
> 
> That’s the first time I’ve seen this setting; wow, it was buried!
> I think it makes sense to implement one to get full control. 
> 
> I wonder whether it’s still worth implementing a simple setting, or a few “simple” topic creation policies that users can just reference. I don’t see that interface being implemented anywhere.

Re: Minimum Replication Factor

Posted by Stephane Maarek <st...@simplemachines.com.au>.
That’s the first time I’ve seen this setting; wow, it was buried!
I think it makes sense to implement one to get full control. 

I wonder whether it’s still worth implementing a simple setting, or a few “simple” topic creation policies that users can just reference. I don’t see that interface being implemented anywhere.
 

On 23/6/17, 6:43 pm, "Edoardo Comar" <EC...@uk.ibm.com> wrote:

    Hi Stephane,
    we enforce the constraint in a custom create topic policy (see
    'create.topic.policy.class.name')



Re: Minimum Replication Factor

Posted by Edoardo Comar <EC...@uk.ibm.com>.
Hi Stephane,
we enforce the constraint in a custom create topic policy (see
'create.topic.policy.class.name')
--------------------------------------------------
Edoardo Comar
IBM Message Hub
ecomar@uk.ibm.com
IBM UK Ltd, Hursley Park, SO21 2JN
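
A minimal sketch of the kind of check such a policy might perform, in plain Java. In a real broker the logic would sit in an implementation of org.apache.kafka.server.policy.CreateTopicPolicy, whose validate method receives the request metadata and throws PolicyViolationException. Here MinReplicationFactorCheck, PolicyViolation and the minRf parameter are hypothetical stand-ins so the example runs without Kafka on the classpath, and it assumes the replication factor is null when the caller supplies explicit replica assignments. It is not the policy described in this thread, whose code is not shown.

```java
import java.util.List;
import java.util.Map;

final class MinReplicationFactorCheck {

    /** Hypothetical stand-in for Kafka's PolicyViolationException. */
    static final class PolicyViolation extends RuntimeException {
        PolicyViolation(String message) {
            super(message);
        }
    }

    // Rejects a topic creation request whose replication factor, or any
    // explicitly assigned partition, has fewer than minRf replicas.
    static void validate(String topic,
                         Short replicationFactor,
                         Map<Integer, List<Integer>> replicaAssignments,
                         int minRf) {
        if (replicationFactor != null) {
            if (replicationFactor < minRf) {
                throw new PolicyViolation("Topic " + topic + " requests RF "
                        + replicationFactor + ", minimum is " + minRf);
            }
            return;
        }
        if (replicaAssignments == null || replicaAssignments.isEmpty()) {
            throw new PolicyViolation("Topic " + topic
                    + " has neither a replication factor nor replica assignments");
        }
        for (Map.Entry<Integer, List<Integer>> partition : replicaAssignments.entrySet()) {
            int replicas = partition.getValue().size();
            if (replicas < minRf) {
                throw new PolicyViolation("Topic " + topic + " partition "
                        + partition.getKey() + " has " + replicas
                        + " replica(s), minimum is " + minRf);
            }
        }
    }

    public static void main(String[] args) {
        validate("orders", (short) 3, null, 2); // accepted, no exception
        try {
            validate("scratch", (short) 1, null, 2);
        } catch (PolicyViolation e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

A real policy class would be named in the broker's create.topic.policy.class.name setting and would need to be available on the broker's classpath.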



