You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by Ismael Juma <is...@juma.me.uk> on 2017/05/19 12:46:34 UTC

Re: [VOTE] KIP-117: Add a public AdminClient API for Kafka admin operations

Hi all,

Feedback from people who tried the AdminClient is that auto topic creation
during describe is unexpected and confusing. This is consistent with the
reaction of most people when they learn that MetadataRequest can cause
topics to be created. We had assumed that we'd tackle this issue for all
the clients as part of deprecation of server-side auto topic creation in
favour of client-side auto-topic creation.

However, it would be better to do the right thing for the AdminClient from
the start. Users will be less confused and we won't have to deal with
compatibility concerns. Jason suggested a simple solution: make it possible
to disallow auto topic creation when sending the metadata request. The
AdminClient would take advantage of this now (i.e. 0.11.0.0) while the
producer and consumer would retain the existing behaviour. In a subsequent
release, we'll work out the details of how to move away from server-side
auto topic creation for the producer and consumer (taking into account the
compatibility impact).

Because of the protocol change, this solution would only help in cases
where the AdminClient is describing topics from a 0.11.0.0 or newer broker.

I submitted a PR for this and it's small and straightforward:

https://github.com/apache/kafka/pull/3098

Thoughts?

Ismael

On Sat, Mar 25, 2017 at 1:25 AM, Colin McCabe <cm...@apache.org> wrote:

> With binding +1 votes from Gwen Shapira, Sriram Subramanian, and Grant
> Henke, and a non-binding vote from Dong Lin, the vote passes.  There
> were no +0 or -1 votes.  As mentioned earlier, the interface will be
> unstable at first and we will continue to evolve it.
>
> thanks,
> Colin McCabe
>
>
> On Wed, Mar 22, 2017, at 10:21, Colin McCabe wrote:
> > On Fri, Mar 17, 2017, at 10:50, Jun Rao wrote:
> > > Hi, Colin,
> > >
> > > Thanks for the KIP. Looks good overall. A few comments below.
> > >
> > > 1. Sometimes we return
> > >     CompletableFuture<Map<String, TopicDescription>>
> > > and some other times we return
> > >     Map<String, CompletableFuture<Void>>
> > > , which doesn't seem consistent. Is that intentional?
> >
> > Yes, this is intentional.  We got feedback from some people that they
> > wanted a single future that would fail if anything failed.  Other people
> > wanted to be able to detect failures on individual elements of a batch.
> > This API lets us have both (you just choose which future you want to
> > wait on).
> >
> > >
> > > 2. We support batching in CreateTopic/DeleteTopic/ListTopic, but not
> in
> > > DescribeTopic. Should we add batching in DescribeTopic to make it
> > > consistent?
> >
> > Good idea.  Let's add batching to DescribeTopic(s).
> >
> > > Also, both ListTopic and DescribeTopic seem to return
> > > TopicDescription. Could we just consolidate the two by just keeping
> > > DescribeTopic?
> >
> > Sorry, that was a typo.  ListTopics is supposed to return TopicListing,
> > which tells you only the name of the topic and whether it is internal.
> > The idea is that later we will add another RPC which allows us to fetch
> > just this information, and not the other topic fields. That way, we can
> > be more efficient.  The idea is that ListTopics is like readdir()/ls and
> > DescribeTopics is like stat().  Getting detailed information about
> > 1,000s of topics could be quite a resource hog for cluster management
> > systems in a big cluster.
> >
> > >
> > > 3. listNodes: At the request protocol level, we can get things like
> > > clusterId and controller broker id. Both are useful info from an admin
> > > perspective, but are not exposed through the api. Perhaps we can
> > > generalize
> > > listNodes to sth like describeCluster so that we can return those
> > > additional info as well?
> >
> > Yeah, let's change listNodes -> describeCluster.
> >
> > >
> > > 4. Configurations: To support security, we will need to include all
> > > properties related to SSL and SASL.
> >
> > Yeah
> >
> > best,
> > Colin
> >
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > >
> > > On Thu, Mar 16, 2017 at 11:59 PM, Colin McCabe <cm...@apache.org>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > It seems like people agree with the basic direction of the proposal
> and
> > > > the API, including the operations that are included, the async and
> > > > batching support, and the mechanisms for extending it in the
> future.  If
> > > > there's no more votes, I'd like to close the vote and start progress
> on
> > > > this.
> > > >
> > > > I think the API should be unstable for a while (at least until the
> 0.11
> > > > release is made), so we can consider ways to improve it.  A few have
> > > > been suggested here: removing or adding functions, renaming things a
> > > > bit, or using request objects instead of options objects.  I think
> once
> > > > people try out the API a bit, it will be easier to evaluate these.
> > > >
> > > > best,
> > > > Colin
> > > >
> > > >
> > > > On Tue, Mar 14, 2017, at 10:12, Dong Lin wrote:
> > > > > +1
> > > > >
> > > > > On Tue, Mar 14, 2017 at 8:50 AM, Grant Henke <gh...@cloudera.com>
> > > > wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > On Tue, Mar 14, 2017 at 2:44 AM, Sriram Subramanian <
> ram@confluent.io>
> > > > > > wrote:
> > > > > >
> > > > > > > +1 (binding)
> > > > > > >
> > > > > > > Nice work in driving this.
> > > > > > >
> > > > > > > On Mon, Mar 13, 2017 at 10:31 PM, Gwen Shapira <
> gwen@confluent.io>
> > > > > > wrote:
> > > > > > >
> > > > > > > > +1 (binding)
> > > > > > > >
> > > > > > > > I expressed few concerns in the discussion thread, but in
> general
> > > > this
> > > > > > is
> > > > > > > > super important to get done.
> > > > > > > >
> > > > > > > > On Fri, Mar 10, 2017 at 10:38 AM, Colin McCabe <
> cmccabe@apache.org
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > I'd like to start voting on KIP-117
> > > > > > > > > (https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > > > > > 117%3A+Add+a+public+AdminClient+API+for+Kafka+
> admin+operations
> > > > > > > > > ).
> > > > > > > > >
> > > > > > > > > The discussion thread can be found here:
> > > > > > > > > https://www.mail-archive.com/
> dev@kafka.apache.org/msg65697.html
> > > > > > > > >
> > > > > > > > > best,
> > > > > > > > > Colin
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > *Gwen Shapira*
> > > > > > > > Product Manager | Confluent
> > > > > > > > 650.450.2760 | @gwenshap
> > > > > > > > Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
> > > > > > > > <http://www.confluent.io/blog>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Grant Henke
> > > > > > Software Engineer | Cloudera
> > > > > > grant@cloudera.com | twitter.com/gchenke |
> linkedin.com/in/granthenke
> > > > > >
> > > >
>

Re: [VOTE] KIP-117: Add a public AdminClient API for Kafka admin operations

Posted by Colin McCabe <cm...@apache.org>.
Hi all,

We're getting close to the final 0.11 release and we still haven't
gotten the version retrieval API for AdminClient in, as described by
KAFKA-5214.  Let's put this off until the next release so that we have
some time to let it soak before shipping.  This will just mean that we
will remove AdminClient#apiVersions for now, until the next release.  It
will still be possible to get version information with the existing
kafka-broker-api-versions.sh script.

best,
Colin


On Sat, Jun 3, 2017, at 01:08, Ismael Juma wrote:
> Hi Colin,
> 
> Thanks for the feedback. Regarding the behaviour for brokers older than
> 0.11.0, I had gone for the Javadoc note because it made it possible to
> avoid the inefficiency of getting all topics for users who have disabled
> auto topic creation.
> 
> After some thought and discussion, I agree that keeping the behaviour
> consistent across broker versions is the better option, so the PR was
> updated to do that.
> 
> Ismael
> 
> On Mon, May 22, 2017 at 7:42 PM, Colin McCabe <cm...@apache.org> wrote:
> 
> > > As you noted, though, we don't have a way to do this for the 0.10.x
> > > releases.  It seems a bit harsh to have such different behavior there.
> > > Is there a way that we can fudge this a bit so that it mostly works?
> > > For example, when communicating with 0.10.x brokers, describeTopics
> > > could do a MetadataRequest(topics=*) to filter out non-existent topics.
> > >
> > > This would obviously have a bit of a time-of-check, time-of-use race
> > > condition since we're making two calls.  And also a scalability problem
> > > since we're using topics=*.  Is it worth it to make the behavior saner
> > > on older brokers?  Or should we add a JavaDoc note and move on?
> > >
> > > best,
> > > Colin
> > >
> > >
> > > On Fri, May 19, 2017, at 05:46, Ismael Juma wrote:
> > > > Hi all,
> > > >
> > > > Feedback from people who tried the AdminClient is that auto topic
> > > > creation
> > > > during describe is unexpected and confusing. This is consistent with
> > the
> > > > reaction of most people when they learn that MetadataRequest can cause
> > > > topics to be created. We had assumed that we'd tackle this issue for
> > all
> > > > the clients as part of deprecation of server-side auto topic creation
> > in
> > > > favour of client-side auto-topic creation.
> > > >
> > > > However, it would be better to do the right thing for the AdminClient
> > > > from
> > > > the start. Users will be less confused and we won't have to deal with
> > > > compatibility concerns. Jason suggested a simple solution: make it
> > > > possible
> > > > to disallow auto topic creation when sending the metadata request. The
> > > > AdminClient would take advantage of this now (i.e. 0.11.0.0) while the
> > > > producer and consumer would retain the existing behaviour. In a
> > > > subsequent
> > > > release, we'll work out the details of how to move away from
> > server-side
> > > > auto topic creation for the producer and consumer (taking into account
> > > > the
> > > > compatibility impact).
> > > >
> > > > Because of the protocol change, this solution would only help in cases
> > > > where the AdminClient is describing topics from a 0.11.0.0 or newer
> > > > broker.
> > > >
> > > > I submitted a PR for this and it's small and straightforward:
> > > >
> > > > https://github.com/apache/kafka/pull/3098
> > > >
> > > > Thoughts?
> > > >
> > > > Ismael
> > > >
> > > > On Sat, Mar 25, 2017 at 1:25 AM, Colin McCabe <cm...@apache.org>
> > wrote:
> > > >
> > > > > With binding +1 votes from Gwen Shapira, Sriram Subramanian, and
> > Grant
> > > > > Henke, and a non-binding vote from Dong Lin, the vote passes.  There
> > > > > were no +0 or -1 votes.  As mentioned earlier, the interface will be
> > > > > unstable at first and we will continue to evolve it.
> > > > >
> > > > > thanks,
> > > > > Colin McCabe
> > > > >
> > > > >
> > > > > On Wed, Mar 22, 2017, at 10:21, Colin McCabe wrote:
> > > > > > On Fri, Mar 17, 2017, at 10:50, Jun Rao wrote:
> > > > > > > Hi, Colin,
> > > > > > >
> > > > > > > Thanks for the KIP. Looks good overall. A few comments below.
> > > > > > >
> > > > > > > 1. Sometimes we return
> > > > > > >     CompletableFuture<Map<String, TopicDescription>>
> > > > > > > and some other times we return
> > > > > > >     Map<String, CompletableFuture<Void>>
> > > > > > > , which doesn't seem consistent. Is that intentional?
> > > > > >
> > > > > > Yes, this is intentional.  We got feedback from some people that
> > they
> > > > > > wanted a single future that would fail if anything failed.  Other
> > people
> > > > > > wanted to be able to detect failures on individual elements of a
> > batch.
> > > > > > This API lets us have both (you just choose which future you want
> > to
> > > > > > wait on).
> > > > > >
> > > > > > >
> > > > > > > 2. We support batching in CreateTopic/DeleteTopic/ListTopic,
> > but not
> > > > > in
> > > > > > > DescribeTopic. Should we add batching in DescribeTopic to make it
> > > > > > > consistent?
> > > > > >
> > > > > > Good idea.  Let's add batching to DescribeTopic(s).
> > > > > >
> > > > > > > Also, both ListTopic and DescribeTopic seem to return
> > > > > > > TopicDescription. Could we just consolidate the two by just
> > keeping
> > > > > > > DescribeTopic?
> > > > > >
> > > > > > Sorry, that was a typo.  ListTopics is supposed to return
> > TopicListing,
> > > > > > which tells you only the name of the topic and whether it is
> > internal.
> > > > > > The idea is that later we will add another RPC which allows us to
> > fetch
> > > > > > just this information, and not the other topic fields. That way,
> > we can
> > > > > > be more efficient.  The idea is that ListTopics is like
> > readdir()/ls and
> > > > > > DescribeTopics is like stat().  Getting detailed information about
> > > > > > 1,000s of topics could be quite a resource hog for cluster
> > management
> > > > > > systems in a big cluster.
> > > > > >
> > > > > > >
> > > > > > > 3. listNodes: At the request protocol level, we can get things
> > like
> > > > > > > clusterId and controller broker id. Both are useful info from an
> > admin
> > > > > > > perspective, but are not exposed through the api. Perhaps we can
> > > > > > > generalize
> > > > > > > listNodes to sth like describeCluster so that we can return those
> > > > > > > additional info as well?
> > > > > >
> > > > > > Yeah, let's change listNodes -> describeCluster.
> > > > > >
> > > > > > >
> > > > > > > 4. Configurations: To support security, we will need to include
> > all
> > > > > > > properties related to SSL and SASL.
> > > > > >
> > > > > > Yeah
> > > > > >
> > > > > > best,
> > > > > > Colin
> > > > > >
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Mar 16, 2017 at 11:59 PM, Colin McCabe <
> > cmccabe@apache.org>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > It seems like people agree with the basic direction of the
> > proposal
> > > > > and
> > > > > > > > the API, including the operations that are included, the async
> > and
> > > > > > > > batching support, and the mechanisms for extending it in the
> > > > > future.  If
> > > > > > > > there's no more votes, I'd like to close the vote and start
> > progress
> > > > > on
> > > > > > > > this.
> > > > > > > >
> > > > > > > > I think the API should be unstable for a while (at least until
> > the
> > > > > 0.11
> > > > > > > > release is made), so we can consider ways to improve it.  A
> > few have
> > > > > > > > been suggested here: removing or adding functions, renaming
> > things a
> > > > > > > > bit, or using request objects instead of options objects.  I
> > think
> > > > > once
> > > > > > > > people try out the API a bit, it will be easier to evaluate
> > these.
> > > > > > > >
> > > > > > > > best,
> > > > > > > > Colin
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, Mar 14, 2017, at 10:12, Dong Lin wrote:
> > > > > > > > > +1
> > > > > > > > >
> > > > > > > > > On Tue, Mar 14, 2017 at 8:50 AM, Grant Henke <
> > ghenke@cloudera.com>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > +1
> > > > > > > > > >
> > > > > > > > > > On Tue, Mar 14, 2017 at 2:44 AM, Sriram Subramanian <
> > > > > ram@confluent.io>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > +1 (binding)
> > > > > > > > > > >
> > > > > > > > > > > Nice work in driving this.
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Mar 13, 2017 at 10:31 PM, Gwen Shapira <
> > > > > gwen@confluent.io>
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > +1 (binding)
> > > > > > > > > > > >
> > > > > > > > > > > > I expressed few concerns in the discussion thread, but
> > in
> > > > > general
> > > > > > > > this
> > > > > > > > > > is
> > > > > > > > > > > > super important to get done.
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Mar 10, 2017 at 10:38 AM, Colin McCabe <
> > > > > cmccabe@apache.org
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I'd like to start voting on KIP-117
> > > > > > > > > > > > > (https://cwiki.apache.org/conf
> > luence/display/KAFKA/KIP-
> > > > > > > > > > > > > 117%3A+Add+a+public+AdminClient+API+for+Kafka+
> > > > > admin+operations
> > > > > > > > > > > > > ).
> > > > > > > > > > > > >
> > > > > > > > > > > > > The discussion thread can be found here:
> > > > > > > > > > > > > https://www.mail-archive.com/
> > > > > dev@kafka.apache.org/msg65697.html
> > > > > > > > > > > > >
> > > > > > > > > > > > > best,
> > > > > > > > > > > > > Colin
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > *Gwen Shapira*
> > > > > > > > > > > > Product Manager | Confluent
> > > > > > > > > > > > 650.450.2760 | @gwenshap
> > > > > > > > > > > > Follow us: Twitter <https://twitter.com/ConfluentInc>
> > | blog
> > > > > > > > > > > > <http://www.confluent.io/blog>
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Grant Henke
> > > > > > > > > > Software Engineer | Cloudera
> > > > > > > > > > grant@cloudera.com | twitter.com/gchenke |
> > > > > linkedin.com/in/granthenke
> > > > > > > > > >
> > > > > > > >
> > > > >
> >

Re: [VOTE] KIP-117: Add a public AdminClient API for Kafka admin operations

Posted by Ismael Juma <is...@juma.me.uk>.
Hi Colin,

Thanks for the feedback. Regarding the behaviour for brokers older than
0.11.0, I had gone for the Javadoc note because it made it possible to
avoid the inefficiency of getting all topics for users who have disabled
auto topic creation.

After some thought and discussion, I agree that keeping the behaviour
consistent across broker versions is the better option, so the PR was
updated to do that.

Ismael

On Mon, May 22, 2017 at 7:42 PM, Colin McCabe <cm...@apache.org> wrote:

> > As you noted, though, we don't have a way to do this for the 0.10.x
> > releases.  It seems a bit harsh to have such different behavior there.
> > Is there a way that we can fudge this a bit so that it mostly works?
> > For example, when communicating with 0.10.x brokers, describeTopics
> > could do a MetadataRequest(topics=*) to filter out non-existent topics.
> >
> > This would obviously have a bit of a time-of-check, time-of-use race
> > condition since we're making two calls.  And also a scalability problem
> > since we're using topics=*.  Is it worth it to make the behavior saner
> > on older brokers?  Or should we add a JavaDoc note and move on?
> >
> > best,
> > Colin
> >
> >
> > On Fri, May 19, 2017, at 05:46, Ismael Juma wrote:
> > > Hi all,
> > >
> > > Feedback from people who tried the AdminClient is that auto topic
> > > creation
> > > during describe is unexpected and confusing. This is consistent with
> the
> > > reaction of most people when they learn that MetadataRequest can cause
> > > topics to be created. We had assumed that we'd tackle this issue for
> all
> > > the clients as part of deprecation of server-side auto topic creation
> in
> > > favour of client-side auto-topic creation.
> > >
> > > However, it would be better to do the right thing for the AdminClient
> > > from
> > > the start. Users will be less confused and we won't have to deal with
> > > compatibility concerns. Jason suggested a simple solution: make it
> > > possible
> > > to disallow auto topic creation when sending the metadata request. The
> > > AdminClient would take advantage of this now (i.e. 0.11.0.0) while the
> > > producer and consumer would retain the existing behaviour. In a
> > > subsequent
> > > release, we'll work out the details of how to move away from
> server-side
> > > auto topic creation for the producer and consumer (taking into account
> > > the
> > > compatibility impact).
> > >
> > > Because of the protocol change, this solution would only help in cases
> > > where the AdminClient is describing topics from a 0.11.0.0 or newer
> > > broker.
> > >
> > > I submitted a PR for this and it's small and straightforward:
> > >
> > > https://github.com/apache/kafka/pull/3098
> > >
> > > Thoughts?
> > >
> > > Ismael
> > >
> > > On Sat, Mar 25, 2017 at 1:25 AM, Colin McCabe <cm...@apache.org>
> wrote:
> > >
> > > > With binding +1 votes from Gwen Shapira, Sriram Subramanian, and
> Grant
> > > > Henke, and a non-binding vote from Dong Lin, the vote passes.  There
> > > > were no +0 or -1 votes.  As mentioned earlier, the interface will be
> > > > unstable at first and we will continue to evolve it.
> > > >
> > > > thanks,
> > > > Colin McCabe
> > > >
> > > >
> > > > On Wed, Mar 22, 2017, at 10:21, Colin McCabe wrote:
> > > > > On Fri, Mar 17, 2017, at 10:50, Jun Rao wrote:
> > > > > > Hi, Colin,
> > > > > >
> > > > > > Thanks for the KIP. Looks good overall. A few comments below.
> > > > > >
> > > > > > 1. Sometimes we return
> > > > > >     CompletableFuture<Map<String, TopicDescription>>
> > > > > > and some other times we return
> > > > > >     Map<String, CompletableFuture<Void>>
> > > > > > , which doesn't seem consistent. Is that intentional?
> > > > >
> > > > > Yes, this is intentional.  We got feedback from some people that
> they
> > > > > wanted a single future that would fail if anything failed.  Other
> people
> > > > > wanted to be able to detect failures on individual elements of a
> batch.
> > > > > This API lets us have both (you just choose which future you want
> to
> > > > > wait on).
> > > > >
> > > > > >
> > > > > > 2. We support batching in CreateTopic/DeleteTopic/ListTopic,
> but not
> > > > in
> > > > > > DescribeTopic. Should we add batching in DescribeTopic to make it
> > > > > > consistent?
> > > > >
> > > > > Good idea.  Let's add batching to DescribeTopic(s).
> > > > >
> > > > > > Also, both ListTopic and DescribeTopic seem to return
> > > > > > TopicDescription. Could we just consolidate the two by just
> keeping
> > > > > > DescribeTopic?
> > > > >
> > > > > Sorry, that was a typo.  ListTopics is supposed to return
> TopicListing,
> > > > > which tells you only the name of the topic and whether it is
> internal.
> > > > > The idea is that later we will add another RPC which allows us to
> fetch
> > > > > just this information, and not the other topic fields. That way,
> we can
> > > > > be more efficient.  The idea is that ListTopics is like
> readdir()/ls and
> > > > > DescribeTopics is like stat().  Getting detailed information about
> > > > > 1,000s of topics could be quite a resource hog for cluster
> management
> > > > > systems in a big cluster.
> > > > >
> > > > > >
> > > > > > 3. listNodes: At the request protocol level, we can get things
> like
> > > > > > clusterId and controller broker id. Both are useful info from an
> admin
> > > > > > perspective, but are not exposed through the api. Perhaps we can
> > > > > > generalize
> > > > > > listNodes to sth like describeCluster so that we can return those
> > > > > > additional info as well?
> > > > >
> > > > > Yeah, let's change listNodes -> describeCluster.
> > > > >
> > > > > >
> > > > > > 4. Configurations: To support security, we will need to include
> all
> > > > > > properties related to SSL and SASL.
> > > > >
> > > > > Yeah
> > > > >
> > > > > best,
> > > > > Colin
> > > > >
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > >
> > > > > > On Thu, Mar 16, 2017 at 11:59 PM, Colin McCabe <
> cmccabe@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > It seems like people agree with the basic direction of the
> proposal
> > > > and
> > > > > > > the API, including the operations that are included, the async
> and
> > > > > > > batching support, and the mechanisms for extending it in the
> > > > future.  If
> > > > > > > there's no more votes, I'd like to close the vote and start
> progress
> > > > on
> > > > > > > this.
> > > > > > >
> > > > > > > I think the API should be unstable for a while (at least until
> the
> > > > 0.11
> > > > > > > release is made), so we can consider ways to improve it.  A
> few have
> > > > > > > been suggested here: removing or adding functions, renaming
> things a
> > > > > > > bit, or using request objects instead of options objects.  I
> think
> > > > once
> > > > > > > people try out the API a bit, it will be easier to evaluate
> these.
> > > > > > >
> > > > > > > best,
> > > > > > > Colin
> > > > > > >
> > > > > > >
> > > > > > > On Tue, Mar 14, 2017, at 10:12, Dong Lin wrote:
> > > > > > > > +1
> > > > > > > >
> > > > > > > > On Tue, Mar 14, 2017 at 8:50 AM, Grant Henke <
> ghenke@cloudera.com>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > +1
> > > > > > > > >
> > > > > > > > > On Tue, Mar 14, 2017 at 2:44 AM, Sriram Subramanian <
> > > > ram@confluent.io>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > +1 (binding)
> > > > > > > > > >
> > > > > > > > > > Nice work in driving this.
> > > > > > > > > >
> > > > > > > > > > On Mon, Mar 13, 2017 at 10:31 PM, Gwen Shapira <
> > > > gwen@confluent.io>
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > +1 (binding)
> > > > > > > > > > >
> > > > > > > > > > > I expressed few concerns in the discussion thread, but
> in
> > > > general
> > > > > > > this
> > > > > > > > > is
> > > > > > > > > > > super important to get done.
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Mar 10, 2017 at 10:38 AM, Colin McCabe <
> > > > cmccabe@apache.org
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi all,
> > > > > > > > > > > >
> > > > > > > > > > > > I'd like to start voting on KIP-117
> > > > > > > > > > > > (https://cwiki.apache.org/conf
> luence/display/KAFKA/KIP-
> > > > > > > > > > > > 117%3A+Add+a+public+AdminClient+API+for+Kafka+
> > > > admin+operations
> > > > > > > > > > > > ).
> > > > > > > > > > > >
> > > > > > > > > > > > The discussion thread can be found here:
> > > > > > > > > > > > https://www.mail-archive.com/
> > > > dev@kafka.apache.org/msg65697.html
> > > > > > > > > > > >
> > > > > > > > > > > > best,
> > > > > > > > > > > > Colin
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > *Gwen Shapira*
> > > > > > > > > > > Product Manager | Confluent
> > > > > > > > > > > 650.450.2760 | @gwenshap
> > > > > > > > > > > Follow us: Twitter <https://twitter.com/ConfluentInc>
> | blog
> > > > > > > > > > > <http://www.confluent.io/blog>
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Grant Henke
> > > > > > > > > Software Engineer | Cloudera
> > > > > > > > > grant@cloudera.com | twitter.com/gchenke |
> > > > linkedin.com/in/granthenke
> > > > > > > > >
> > > > > > >
> > > >
>

Re: [VOTE] KIP-117: Add a public AdminClient API for Kafka admin operations

Posted by Colin McCabe <cm...@apache.org>.
Oops, I just realized if we do a call with topics=*, we don't need to
make a follow-up call.

:)

The question still holds, though: is it worth sacrificing some
scalability when talking to older brokers, to get saner semantics?

cheers,
Colin


On Mon, May 22, 2017, at 11:41, Colin McCabe wrote:
> I definitely agree that auto topic creation is unexpected and confusing
> (even with the JavaDoc note in the API).  The proposed solution of
> adding a flag to MetadataRequest seems pretty simple and reasonable. 
> +1.
> 
> As you noted, though, we don't have a way to do this for the 0.10.x
> releases.  It seems a bit harsh to have such different behavior there. 
> Is there a way that we can fudge this a bit so that it mostly works? 
> For example, when communicating with 0.10.x brokers, describeTopics
> could do a MetadataRequest(topics=*) to filter out non-existent topics.
> 
> This would obviously have a bit of a time-of-check, time-of-use race
> condition since we're making two calls.  And also a scalability problem
> since we're using topics=*.  Is it worth it to make the behavior saner
> on older brokers?  Or should we add a JavaDoc note and move on?
> 
> best,
> Colin
> 
> 
> On Fri, May 19, 2017, at 05:46, Ismael Juma wrote:
> > Hi all,
> > 
> > Feedback from people who tried the AdminClient is that auto topic
> > creation
> > during describe is unexpected and confusing. This is consistent with the
> > reaction of most people when they learn that MetadataRequest can cause
> > topics to be created. We had assumed that we'd tackle this issue for all
> > the clients as part of deprecation of server-side auto topic creation in
> > favour of client-side auto-topic creation.
> > 
> > However, it would be better to do the right thing for the AdminClient
> > from
> > the start. Users will be less confused and we won't have to deal with
> > compatibility concerns. Jason suggested a simple solution: make it
> > possible
> > to disallow auto topic creation when sending the metadata request. The
> > AdminClient would take advantage of this now (i.e. 0.11.0.0) while the
> > producer and consumer would retain the existing behaviour. In a
> > subsequent
> > release, we'll work out the details of how to move away from server-side
> > auto topic creation for the producer and consumer (taking into account
> > the
> > compatibility impact).
> > 
> > Because of the protocol change, this solution would only help in cases
> > where the AdminClient is describing topics from a 0.11.0.0 or newer
> > broker.
> > 
> > I submitted a PR for this and it's small and straightforward:
> > 
> > https://github.com/apache/kafka/pull/3098
> > 
> > Thoughts?
> > 
> > Ismael
> > 
> > On Sat, Mar 25, 2017 at 1:25 AM, Colin McCabe <cm...@apache.org> wrote:
> > 
> > > With binding +1 votes from Gwen Shapira, Sriram Subramanian, and Grant
> > > Henke, and a non-binding vote from Dong Lin, the vote passes.  There
> > > were no +0 or -1 votes.  As mentioned earlier, the interface will be
> > > unstable at first and we will continue to evolve it.
> > >
> > > thanks,
> > > Colin McCabe
> > >
> > >
> > > On Wed, Mar 22, 2017, at 10:21, Colin McCabe wrote:
> > > > On Fri, Mar 17, 2017, at 10:50, Jun Rao wrote:
> > > > > Hi, Colin,
> > > > >
> > > > > Thanks for the KIP. Looks good overall. A few comments below.
> > > > >
> > > > > 1. Sometimes we return
> > > > >     CompletableFuture<Map<String, TopicDescription>>
> > > > > and some other times we return
> > > > >     Map<String, CompletableFuture<Void>>
> > > > > , which doesn't seem consistent. Is that intentional?
> > > >
> > > > Yes, this is intentional.  We got feedback from some people that they
> > > > wanted a single future that would fail if anything failed.  Other people
> > > > wanted to be able to detect failures on individual elements of a batch.
> > > > This API lets us have both (you just choose which future you want to
> > > > wait on).
> > > >
> > > > >
> > > > > 2. We support batching in CreateTopic/DeleteTopic/ListTopic, but not
> > > in
> > > > > DescribeTopic. Should we add batching in DescribeTopic to make it
> > > > > consistent?
> > > >
> > > > Good idea.  Let's add batching to DescribeTopic(s).
> > > >
> > > > > Also, both ListTopic and DescribeTopic seem to return
> > > > > TopicDescription. Could we just consolidate the two by just keeping
> > > > > DescribeTopic?
> > > >
> > > > Sorry, that was a typo.  ListTopics is supposed to return TopicListing,
> > > > which tells you only the name of the topic and whether it is internal.
> > > > The idea is that later we will add another RPC which allows us to fetch
> > > > just this information, and not the other topic fields. That way, we can
> > > > be more efficient.  The idea is that ListTopics is like readdir()/ls and
> > > > DescribeTopics is like stat().  Getting detailed information about
> > > > 1,000s of topics could be quite a resource hog for cluster management
> > > > systems in a big cluster.
> > > >
> > > > >
> > > > > 3. listNodes: At the request protocol level, we can get things like
> > > > > clusterId and controller broker id. Both are useful info from an admin
> > > > > perspective, but are not exposed through the api. Perhaps we can
> > > > > generalize
> > > > > listNodes to sth like describeCluster so that we can return those
> > > > > additional info as well?
> > > >
> > > > Yeah, let's change listNodes -> describeCluster.
> > > >
> > > > >
> > > > > 4. Configurations: To support security, we will need to include all
> > > > > properties related to SSL and SASL.
> > > >
> > > > Yeah
> > > >
> > > > best,
> > > > Colin
> > > >
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > >
> > > > > On Thu, Mar 16, 2017 at 11:59 PM, Colin McCabe <cm...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > It seems like people agree with the basic direction of the proposal
> > > and
> > > > > > the API, including the operations that are included, the async and
> > > > > > batching support, and the mechanisms for extending it in the
> > > future.  If
> > > > > > there's no more votes, I'd like to close the vote and start progress
> > > on
> > > > > > this.
> > > > > >
> > > > > > I think the API should be unstable for a while (at least until the
> > > 0.11
> > > > > > release is made), so we can consider ways to improve it.  A few have
> > > > > > been suggested here: removing or adding functions, renaming things a
> > > > > > bit, or using request objects instead of options objects.  I think
> > > once
> > > > > > people try out the API a bit, it will be easier to evaluate these.
> > > > > >
> > > > > > best,
> > > > > > Colin
> > > > > >
> > > > > >
> > > > > > On Tue, Mar 14, 2017, at 10:12, Dong Lin wrote:
> > > > > > > +1
> > > > > > >
> > > > > > > On Tue, Mar 14, 2017 at 8:50 AM, Grant Henke <gh...@cloudera.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > > +1
> > > > > > > >
> > > > > > > > On Tue, Mar 14, 2017 at 2:44 AM, Sriram Subramanian <
> > > ram@confluent.io>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > +1 (binding)
> > > > > > > > >
> > > > > > > > > Nice work in driving this.
> > > > > > > > >
> > > > > > > > > On Mon, Mar 13, 2017 at 10:31 PM, Gwen Shapira <
> > > gwen@confluent.io>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > +1 (binding)
> > > > > > > > > >
> > > > > > > > > > I expressed few concerns in the discussion thread, but in
> > > general
> > > > > > this
> > > > > > > > is
> > > > > > > > > > super important to get done.
> > > > > > > > > >
> > > > > > > > > > On Fri, Mar 10, 2017 at 10:38 AM, Colin McCabe <
> > > cmccabe@apache.org
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi all,
> > > > > > > > > > >
> > > > > > > > > > > I'd like to start voting on KIP-117
> > > > > > > > > > > (https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > > > > > > > 117%3A+Add+a+public+AdminClient+API+for+Kafka+
> > > admin+operations
> > > > > > > > > > > ).
> > > > > > > > > > >
> > > > > > > > > > > The discussion thread can be found here:
> > > > > > > > > > > https://www.mail-archive.com/
> > > dev@kafka.apache.org/msg65697.html
> > > > > > > > > > >
> > > > > > > > > > > best,
> > > > > > > > > > > Colin
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > *Gwen Shapira*
> > > > > > > > > > Product Manager | Confluent
> > > > > > > > > > 650.450.2760 | @gwenshap
> > > > > > > > > > Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
> > > > > > > > > > <http://www.confluent.io/blog>
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Grant Henke
> > > > > > > > Software Engineer | Cloudera
> > > > > > > > grant@cloudera.com | twitter.com/gchenke |
> > > linkedin.com/in/granthenke
> > > > > > > >
> > > > > >
> > >

Re: [VOTE] KIP-117: Add a public AdminClient API for Kafka admin operations

Posted by Colin McCabe <cm...@apache.org>.
I definitely agree that auto topic creation is unexpected and confusing
(even with the JavaDoc note in the API).  The proposed solution of
adding a flag to MetadataRequest seems pretty simple and reasonable. 
+1.

As you noted, though, we don't have a way to do this for the 0.10.x
releases.  It seems a bit harsh to have such different behavior there. 
Is there a way that we can fudge this a bit so that it mostly works? 
For example, when communicating with 0.10.x brokers, describeTopics
could do a MetadataRequest(topics=*) to filter out non-existent topics.

This would obviously have a bit of a time-of-check, time-of-use race
condition since we're making two calls.  And also a scalability problem
since we're using topics=*.  Is it worth it to make the behavior saner
on older brokers?  Or should we add a JavaDoc note and move on?

best,
Colin


On Fri, May 19, 2017, at 05:46, Ismael Juma wrote:
> Hi all,
> 
> Feedback from people who tried the AdminClient is that auto topic
> creation
> during describe is unexpected and confusing. This is consistent with the
> reaction of most people when they learn that MetadataRequest can cause
> topics to be created. We had assumed that we'd tackle this issue for all
> the clients as part of deprecation of server-side auto topic creation in
> favour of client-side auto-topic creation.
> 
> However, it would be better to do the right thing for the AdminClient
> from
> the start. Users will be less confused and we won't have to deal with
> compatibility concerns. Jason suggested a simple solution: make it
> possible
> to disallow auto topic creation when sending the metadata request. The
> AdminClient would take advantage of this now (i.e. 0.11.0.0) while the
> producer and consumer would retain the existing behaviour. In a
> subsequent
> release, we'll work out the details of how to move away from server-side
> auto topic creation for the producer and consumer (taking into account
> the
> compatibility impact).
> 
> Because of the protocol change, this solution would only help in cases
> where the AdminClient is describing topics from a 0.11.0.0 or newer
> broker.
> 
> I submitted a PR for this and it's small and straightforward:
> 
> https://github.com/apache/kafka/pull/3098
> 
> Thoughts?
> 
> Ismael
> 
> On Sat, Mar 25, 2017 at 1:25 AM, Colin McCabe <cm...@apache.org> wrote:
> 
> > With binding +1 votes from Gwen Shapira, Sriram Subramanian, and Grant
> > Henke, and a non-binding vote from Dong Lin, the vote passes.  There
> > were no +0 or -1 votes.  As mentioned earlier, the interface will be
> > unstable at first and we will continue to evolve it.
> >
> > thanks,
> > Colin McCabe
> >
> >
> > On Wed, Mar 22, 2017, at 10:21, Colin McCabe wrote:
> > > On Fri, Mar 17, 2017, at 10:50, Jun Rao wrote:
> > > > Hi, Colin,
> > > >
> > > > Thanks for the KIP. Looks good overall. A few comments below.
> > > >
> > > > 1. Sometimes we return
> > > >     CompletableFuture<Map<String, TopicDescription>>
> > > > and some other times we return
> > > >     Map<String, CompletableFuture<Void>>
> > > > , which doesn't seem consistent. Is that intentional?
> > >
> > > Yes, this is intentional.  We got feedback from some people that they
> > > wanted a single future that would fail if anything failed.  Other people
> > > wanted to be able to detect failures on individual elements of a batch.
> > > This API lets us have both (you just choose which future you want to
> > > wait on).
> > >
> > > >
> > > > 2. We support batching in CreateTopic/DeleteTopic/ListTopic, but not
> > in
> > > > DescribeTopic. Should we add batching in DescribeTopic to make it
> > > > consistent?
> > >
> > > Good idea.  Let's add batching to DescribeTopic(s).
> > >
> > > > Also, both ListTopic and DescribeTopic seem to return
> > > > TopicDescription. Could we just consolidate the two by just keeping
> > > > DescribeTopic?
> > >
> > > Sorry, that was a typo.  ListTopics is supposed to return TopicListing,
> > > which tells you only the name of the topic and whether it is internal.
> > > The idea is that later we will add another RPC which allows us to fetch
> > > just this information, and not the other topic fields. That way, we can
> > > be more efficient.  The idea is that ListTopics is like readdir()/ls and
> > > DescribeTopics is like stat().  Getting detailed information about
> > > 1,000s of topics could be quite a resource hog for cluster management
> > > systems in a big cluster.
> > >
> > > >
> > > > 3. listNodes: At the request protocol level, we can get things like
> > > > clusterId and controller broker id. Both are useful info from an admin
> > > > perspective, but are not exposed through the api. Perhaps we can
> > > > generalize
> > > > listNodes to sth like describeCluster so that we can return those
> > > > additional info as well?
> > >
> > > Yeah, let's change listNodes -> describeCluster.
> > >
> > > >
> > > > 4. Configurations: To support security, we will need to include all
> > > > properties related to SSL and SASL.
> > >
> > > Yeah
> > >
> > > best,
> > > Colin
> > >
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > >
> > > > On Thu, Mar 16, 2017 at 11:59 PM, Colin McCabe <cm...@apache.org>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > It seems like people agree with the basic direction of the proposal
> > and
> > > > > the API, including the operations that are included, the async and
> > > > > batching support, and the mechanisms for extending it in the
> > future.  If
> > > > > there's no more votes, I'd like to close the vote and start progress
> > on
> > > > > this.
> > > > >
> > > > > I think the API should be unstable for a while (at least until the
> > 0.11
> > > > > release is made), so we can consider ways to improve it.  A few have
> > > > > been suggested here: removing or adding functions, renaming things a
> > > > > bit, or using request objects instead of options objects.  I think
> > once
> > > > > people try out the API a bit, it will be easier to evaluate these.
> > > > >
> > > > > best,
> > > > > Colin
> > > > >
> > > > >
> > > > > On Tue, Mar 14, 2017, at 10:12, Dong Lin wrote:
> > > > > > +1
> > > > > >
> > > > > > On Tue, Mar 14, 2017 at 8:50 AM, Grant Henke <gh...@cloudera.com>
> > > > > wrote:
> > > > > >
> > > > > > > +1
> > > > > > >
> > > > > > > On Tue, Mar 14, 2017 at 2:44 AM, Sriram Subramanian <
> > ram@confluent.io>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > +1 (binding)
> > > > > > > >
> > > > > > > > Nice work in driving this.
> > > > > > > >
> > > > > > > > On Mon, Mar 13, 2017 at 10:31 PM, Gwen Shapira <
> > gwen@confluent.io>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > +1 (binding)
> > > > > > > > >
> > > > > > > > > I expressed few concerns in the discussion thread, but in
> > general
> > > > > this
> > > > > > > is
> > > > > > > > > super important to get done.
> > > > > > > > >
> > > > > > > > > On Fri, Mar 10, 2017 at 10:38 AM, Colin McCabe <
> > cmccabe@apache.org
> > > > > >
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi all,
> > > > > > > > > >
> > > > > > > > > > I'd like to start voting on KIP-117
> > > > > > > > > > (https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > > > > > > 117%3A+Add+a+public+AdminClient+API+for+Kafka+
> > admin+operations
> > > > > > > > > > ).
> > > > > > > > > >
> > > > > > > > > > The discussion thread can be found here:
> > > > > > > > > > https://www.mail-archive.com/
> > dev@kafka.apache.org/msg65697.html
> > > > > > > > > >
> > > > > > > > > > best,
> > > > > > > > > > Colin
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > *Gwen Shapira*
> > > > > > > > > Product Manager | Confluent
> > > > > > > > > 650.450.2760 | @gwenshap
> > > > > > > > > Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
> > > > > > > > > <http://www.confluent.io/blog>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Grant Henke
> > > > > > > Software Engineer | Cloudera
> > > > > > > grant@cloudera.com | twitter.com/gchenke |
> > linkedin.com/in/granthenke
> > > > > > >
> > > > >
> >