Posted to dev@kafka.apache.org by Randall Hauch <rh...@gmail.com> on 2017/05/05 22:42:22 UTC

[DISCUSS] KIP-154 Add Kafka Connect configuration properties for creating internal topics

Hi, all.

I've been working on KAFKA-4667 to change the distributed worker of Kafka
Connect to look for the topics used to store connector and task
configurations, offsets, and status, and, if those topics do not exist, to
create them using the new AdminClient. To make this as useful as possible
and to minimize the need to create the topics manually, I propose
adding several new distributed worker configurations to specify the
partitions and replication factor for these topics, and have outlined them
in "KIP-154 Add Kafka Connect configuration properties for creating
internal topics".

https://cwiki.apache.org/confluence/display/KAFKA/KIP-154+Add+Kafka+Connect+configuration+properties+for+creating+internal+topics

Please take a look and provide feedback. Thanks!

Best regards,

Randall
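
As a rough illustration of the proposal above, here is a minimal Python sketch of how a worker might resolve the partition count and replication factor for its three internal topics. The property names and the partition defaults (25 and 5) are assumptions based on a reading of KIP-154 rather than anything stated in this message, and `internal_topic_settings` is a hypothetical helper, not part of Kafka Connect.

```python
# Hypothetical sketch: resolve the settings a distributed worker would use
# when creating its internal topics. Property names and defaults are
# assumptions based on KIP-154, not taken from this message.
DEFAULTS = {
    "config.storage.replication.factor": 3,
    "offset.storage.replication.factor": 3,
    "offset.storage.partitions": 25,
    "status.storage.replication.factor": 3,
    "status.storage.partitions": 5,
}


def internal_topic_settings(worker_props):
    """Return {topic kind: (partitions, replication factor)}."""
    # Only the properties known to this sketch are read; others are ignored.
    overrides = {k: int(v) for k, v in worker_props.items() if k in DEFAULTS}
    merged = {**DEFAULTS, **overrides}
    return {
        # The config topic is always created with a single partition,
        # so there is no partitions property to read for it.
        "config": (1, merged["config.storage.replication.factor"]),
        "offset": (merged["offset.storage.partitions"],
                   merged["offset.storage.replication.factor"]),
        "status": (merged["status.storage.partitions"],
                   merged["status.storage.replication.factor"]),
    }
```

Calling `internal_topic_settings({})` yields the defaults, while passing a worker property such as `{"offset.storage.partitions": "50"}` overrides only that one value.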

Re: [DISCUSS] KIP-154 Add Kafka Connect configuration properties for creating internal topics

Posted by BigData dev <bi...@gmail.com>.
Thank you, got it.


On Mon, May 8, 2017 at 8:34 PM, Randall Hauch <rh...@gmail.com> wrote:

> Yes, that's the approach I'm suggesting and that is mentioned in the KIP. I
> also propose that the distributed configuration provided in the examples
> set the replication factor to one but include a relevant comment.

Re: [DISCUSS] KIP-154 Add Kafka Connect configuration properties for creating internal topics

Posted by Randall Hauch <rh...@gmail.com>.
Yes, that's the approach I'm suggesting and that is mentioned in the KIP. I
also propose that the distributed configuration provided in the examples
set the replication factor to one but include a relevant comment.
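
To make the suggestion above concrete, here is a small Python sketch of what such an example worker configuration might contain, together with a tiny parser to show that the comment is ignored and the overrides are picked up. The property names are assumptions based on KIP-154, the comment wording is invented for illustration, and `parse_properties` is a hypothetical helper that handles only simple `key=value` lines.

```python
# Hypothetical contents for the example distributed worker config.
# Property names are assumed from KIP-154; the comment text is illustrative.
EXAMPLE_WORKER_PROPERTIES = """\
# Replication factors for the internal topics. The default is 3, which is
# appropriate for production. They are set to 1 here only so that this
# example can run against a single-broker cluster.
config.storage.replication.factor=1
offset.storage.replication.factor=1
status.storage.replication.factor=1
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props
```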

On Mon, May 8, 2017 at 11:14 PM, BigData dev <bi...@gmail.com>
wrote:

> So, when the Kafka cluster has fewer than 3 brokers and the user has not set
> the replication configuration, it will throw an error telling the user to
> correct the configuration to match their setup? Is this the approach you are
> suggesting here?

Re: [DISCUSS] KIP-154 Add Kafka Connect configuration properties for creating internal topics

Posted by BigData dev <bi...@gmail.com>.
So, when the Kafka cluster has fewer than 3 brokers and the user has not set
the replication configuration, it will throw an error telling the user to
correct the configuration to match their setup? Is this the approach you are
suggesting here?



On Mon, May 8, 2017 at 7:13 PM, Randall Hauch <rh...@gmail.com> wrote:

> One of the "Rejected Alternatives" was to do something "smarter" by
> automatically reducing the replication factor when the cluster size is
> smaller than the replication factor. However, this is extremely
> unintuitive, and in rare cases (e.g., during a partial outage) might even
> result in internal topics being created with too small of a replication
> factor. And defaulting to 1 is certainly bad for production use cases, so
> that's not an option, either.
>
> While defaulting to 3 and failing if the cluster doesn't have 3 nodes is a
> bit harsher than I'd like, it does appear to be the safer option: an error
> message (with instructions on how to correct) is better than inadvertently
> setting the replication factor too small and not knowing about it until it
> is too late.

Re: [DISCUSS] KIP-154 Add Kafka Connect configuration properties for creating internal topics

Posted by Randall Hauch <rh...@gmail.com>.
One of the "Rejected Alternatives" was to do something "smarter" by
automatically reducing the replication factor when the cluster size is
smaller than the replication factor. However, this is extremely
unintuitive, and in rare cases (e.g., during a partial outage) might even
result in internal topics being created with too small of a replication
factor. And defaulting to 1 is certainly bad for production use cases, so
that's not an option, either.

While defaulting to 3 and failing if the cluster doesn't have 3 nodes is a
bit harsher than I'd like, it does appear to be the safer option: an error
message (with instructions on how to correct) is better than inadvertently
setting the replication factor too small and not knowing about it until it
is too late.
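
The fail-fast behavior described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `TopicCreationError` and `check_replication_factor` are hypothetical names, and the real worker may simply surface the broker's rejection of the topic-creation request instead of checking the broker count itself.

```python
class TopicCreationError(Exception):
    """Raised when an internal topic cannot be created as configured."""


def check_replication_factor(property_name, replication_factor, broker_count):
    """Fail with corrective instructions rather than silently shrinking."""
    if replication_factor > broker_count:
        # Deliberately no fallback to a smaller factor: during a partial
        # outage that could create an under-replicated topic unnoticed.
        raise TopicCreationError(
            f"{property_name}={replication_factor} but only {broker_count} "
            f"broker(s) are available; set {property_name} to at most "
            f"{broker_count} or add brokers."
        )
    return replication_factor
```

With the default of 3 and a single-broker cluster, the call raises an error that names the property to change, which is the trade-off argued for above: a loud failure at startup instead of a quietly under-replicated topic.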

On Mon, May 8, 2017 at 6:12 PM, BigData dev <bi...@gmail.com> wrote:

> Hi,
> I like the KIP, as it will prevent many errors that users can make during
> setup.
> I have one question.
> The default replication factor is 3, so if the Kafka cluster is set up
> with a single node, the user needs to override the default configuration;
> until then, the topics will not be created.
> Is this the behavior we want?

Re: [DISCUSS] KIP-154 Add Kafka Connect configuration properties for creating internal topics

Posted by BigData dev <bi...@gmail.com>.
Hi,
I like the KIP, as it will prevent many errors that users can make during
setup.
I have one question.
The default replication factor is 3, so if the Kafka cluster is set up
with a single node, the user needs to override the default configuration;
until then, the topics will not be created.
Is this the behavior we want?

On Mon, May 8, 2017 at 2:25 PM, Konstantine Karantasis <
konstantine@confluent.io> wrote:

> Thanks a lot for the KIP, Randall. This improvement should simplify both
> regular deployments and testing!
>
> A minor comment: it might be nice to add a note about why there's no
> need for a config.storage.partitions property.
> I'm mentioning this for the sake of completeness, in case someone notices
> this slight asymmetry with respect to the newly introduced config
> properties.
>
> This is by no means a blocking comment.
>
> Thanks,
> Konstantine

Re: [DISCUSS] KIP-154 Add Kafka Connect configuration properties for creating internal topics

Posted by Konstantine Karantasis <ko...@confluent.io>.
Thanks a lot for the KIP, Randall. This improvement should simplify both
regular deployments and testing!

A minor comment: it might be nice to add a note about why there's no
need for a config.storage.partitions property.
I'm mentioning this for the sake of completeness, in case someone notices
this slight asymmetry with respect to the newly introduced config
properties.

This is by no means a blocking comment.

Thanks,
Konstantine

On Fri, May 5, 2017 at 7:18 PM, Randall Hauch <rh...@gmail.com> wrote:

> Thanks, Gwen.
>
> Switching to low-priority is a great idea.
>
> The default value for the replication factor configuration is 3, since
> that makes sense and is safe for production. Using the default values in
> the example would mean it could only be run against a Kafka cluster with a
> minimum of 3 nodes. I propose overriding the example's replication factor
> configurations to be 1 so that the examples can be run on a cluster of
> any size.
>
> The "Rejected Alternatives" section explains why the implementation doesn't
> try to be too smart by calculating the replication factor.
>
> Best regards,
>
> Randall

Re: [DISCUSS] KIP-154 Add Kafka Connect configuration properties for creating internal topics

Posted by Randall Hauch <rh...@gmail.com>.
Thanks, Gwen. 

Switching to low-priority is a great idea.

The default value for the replication factor configuration is 3, since that makes sense and is safe for production. Using the default values in the example would mean it could only be run against a Kafka cluster with a minimum of 3 nodes. I propose overriding the example's replication factor configurations to be 1 so that the examples can be run on a cluster of any size.

The "Rejected Alternatives" section explains why the implementation doesn't try to be too smart by calculating the replication factor.

Best regards, 

Randall

> On May 5, 2017, at 8:02 PM, Gwen Shapira <gw...@confluent.io> wrote:
> 
> Looks great to me :)
> 
> Just one note - configurations have levels (which are reflected in the docs) - I
> suggest marking all of them as LOW. Most users will never need to worry
> about these. For the same reason I recommend leaving them out of the example
> config files - we already have issues with users playing with configs
> without understanding what they are doing and not liking the results.

Re: [DISCUSS] KIP-154 Add Kafka Connect configuration properties for creating internal topics

Posted by Gwen Shapira <gw...@confluent.io>.
Looks great to me :)

Just one note - configurations have levels (which are reflected in the docs) - I
suggest marking all of them as LOW. Most users will never need to worry
about these. For the same reason I recommend leaving them out of the example
config files - we already have issues with users playing with configs
without understanding what they are doing and not liking the results.

On Fri, May 5, 2017 at 3:42 PM, Randall Hauch <rh...@gmail.com> wrote:

> Hi, all.
>
> I've been working on KAFKA-4667 to change the distributed worker of Kafka
> Connect to look for the topics used to store connector and task
> configurations, offsets, and status, and, if those topics do not exist, to
> create them using the new AdminClient. To make this as useful as possible
> and to minimize the need to create the topics manually, I propose
> adding several new distributed worker configurations to specify the
> partitions and replication factor for these topics, and have outlined them
> in "KIP-154 Add Kafka Connect configuration properties for creating
> internal topics".
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-154+Add+Kafka+Connect+configuration+properties+for+creating+internal+topics
>
> Please take a look and provide feedback. Thanks!
>
> Best regards,
>
> Randall
>



-- 
*Gwen Shapira*
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
<http://www.confluent.io/blog>